After you reboot and restart all Pulp services, I expect there to be 0 messages on the 'resource_manager' queue. Can you show the queue depths of your broker in those situations for all queues? With Qpid you can do this with `qpid-stat -q`, if I remember correctly. RabbitMQ has a similar command, but I don't know it.
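If it helps to compare queue depths before and after a restart, here is a minimal sketch of a helper that turns a saved queue listing into a dictionary. It assumes the broker is RabbitMQ and that the listing was produced by `rabbitmqctl list_queues name messages`, which prints one tab-separated "name, message count" row per queue. The function name and the sample text are hypothetical and were not captured from a real broker.

```python
def parse_queue_depths(listing):
    """Return {queue_name: message_count} from tab-separated listing text."""
    depths = {}
    for line in listing.splitlines():
        fields = line.strip().split("\t")
        # Skip banner or header lines such as "Listing queues ..." or "name<TAB>messages".
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        depths[fields[0]] = int(fields[1])
    return depths


# Illustrative sample only; real output will list your broker's own queues.
sample = "Listing queues ...\ncelery\t0\nresource_manager\t2\n"
depths = parse_queue_depths(sample)
print(depths.get("resource_manager"))
```

A nonzero count for 'resource_manager' right after a reboot and service restart is the thing worth reporting back to the list.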
Also, for those "stuck" tasks that are not starting when you expect them to, can you see if they are "assigned" to a worker? This doesn't show up in the pulp-admin output[0] of a task detail, so instead use a command like this to show the task details by uuid:

pulp-admin -vv tasks details --task-id a83b32c4-4cb1-439e-a373-91797d0b185a

The -vv part shows the actual webserver response, which will contain a line like:

"worker_name": "reserved_resource_worker-0@dev",

This info would be helpful in resolving your issue.

[0]: https://pulp.plan.io/issues/1832

-Brian

On 04/08/2016 03:15 PM, Matthew Madey wrote:
> When I checked on the state, the status was "not started". I think
> that's why it remained stuck after reboots and bouncing the services.
>
> On Fri, Apr 8, 2016 at 1:27 PM, Brian Bouterse <[email protected]
> <mailto:[email protected]>> wrote:
>
> I'm not sure how you would get into this situation. When it occurs, can
> you check which worker is assigned the work, and verify that that worker
> is still running?
>
> Upon starting, a worker will move previous tasks it was handling that are
> still in the running state to cancelled. Also, pulp_celerybeat monitors
> pulp workers to determine if one has died, so it can move its tasks to a
> cancelled state. Both of these mechanisms would have to fail in order to
> have a task stay in the running state when it's not running. Could it
> still be running?
>
> You indicate you rebooted the box and those tasks didn't go to
> cancelled. Is it possible they are on another box connected to your
> broker, or that when killing them they didn't respond to the signal you
> sent?
>
> 2.7.1 doesn't have any known defects like the ones you're describing,
> so sending more info to the list would be good.
>
> -Brian
>
>
> On 03/30/2016 05:58 PM, Matthew Madey wrote:
> > I have a job that mistakenly thinks it's still running..
> >
> > # pulp-admin -u admin -p ************ rpm repo sync run
> > --repo-id=rhel-x86_64-server-7-base-tools
> > +----------------------------------------------------------------------+
> > Synchronizing Repository [rhel-x86_64-server-7-base-tools]
> > +----------------------------------------------------------------------+
> >
> > A sync task is already in progress for this repository. Its progress will be
> > tracked below.
> >
> > This command may be exited via ctrl+c without affecting the request.
> >
> > [/]
> > Waiting to begin...
> >
> >
> > I checked all running processes and there is no repo sync currently
> > running. I have even gone so far as to delete the repo and recreate it..
> > same issue. I have also tried running pulp-admin orphan remove --all,
> > which executes successfully, but does not fix the problem. I have also
> > tried rebooting the server, bouncing all pulp services.. still no joy.
> > I'm guessing there is a file somewhere that tracks pending tasks? How
> > can I clear this so I can successfully run the job again? I'm running
> > Pulp 2.7.1-1
> >
> >
> > _______________________________________________
> > Pulp-list mailing list
> > [email protected] <mailto:[email protected]>
> > https://www.redhat.com/mailman/listinfo/pulp-list
