+1 to excluding pulp_celerybeat.

Also, since we have the --ignore-running-workers flag and are ignoring celerybeat, I would like to propose that we stop prompting the user to continue and instead just display an error message when we detect running workers:

'Migration halted because there are still running workers, please stop all workers before re-running this command. If you believe this message was given in error please re-run the command with the --ignore-running-workers flag'
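A rough sketch of the check as proposed, to make the behavior concrete. This is not the actual pulp-manage-db code: the entry layout (a dict with "name" and "last_heartbeat"), the "scheduler" name prefix for pulp_celerybeat entries, and the function names are all assumptions made for illustration. Only the 5-minute staleness window, the flag, and the message text come from this thread.

```python
from datetime import datetime, timedelta

# Assumed for illustration only; the real name used by pulp_celerybeat
# entries in the worker collection may differ.
CELERYBEAT_PREFIX = "scheduler"

# Entries older than this are assumed to belong to dead processes (see [0]).
STALE_AFTER = timedelta(minutes=5)

HALT_MESSAGE = (
    "Migration halted because there are still running workers, please stop "
    "all workers before re-running this command. If you believe this message "
    "was given in error please re-run the command with the "
    "--ignore-running-workers flag"
)


def blocking_workers(entries, now):
    """Return the worker entries that should block the migration.

    pulp_celerybeat entries are skipped because they may be stale, and
    entries with a heartbeat older than STALE_AFTER are treated as dead.
    """
    return [
        entry for entry in entries
        if not entry["name"].startswith(CELERYBEAT_PREFIX)
        and now - entry["last_heartbeat"] <= STALE_AFTER
    ]


def check_workers(entries, now, ignore_running_workers=False):
    """Return the halt message if the migration must stop, else None.

    No interactive prompt: either the check passes, or the user has to stop
    the workers (or re-run with --ignore-running-workers).
    """
    if ignore_running_workers:
        return None
    if blocking_workers(entries, now):
        return HALT_MESSAGE
    return None
```

With this shape, a lone leftover pulp_celerybeat entry never halts the migration, while a recent entry from a normal worker or the resource manager does, and scripted upgrades never block on a prompt.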
On Wed, Dec 7, 2016 at 7:31 AM, Michael Hrivnak <[email protected]> wrote:

> Quick context... this story: https://pulp.plan.io/issues/2186
>
> ... was intended to make a best effort to help users avoid accidentally
> running pulp-manage-db while pulp services are running, because that is
> unsafe. It's not expected to be perfect, but just to help. It does that
> well.
>
> It's hard to do perfectly, because it's hard to know if the WSGI app might
> be running on other machines in a multi-machine deployment. Workers are
> easier to track, because they register their existence in the database.
>
> The implementation looks for worker entries in the database. If any are
> found, pulp-manage-db asks the user if they are sure they want to proceed.
> Or optionally, the user can preemptively force it to proceed with a
> command-line option. Here is the full scary message:
>
> "There are still running workers, continuing could corrupt your Pulp
> installation. Are you sure you wish to continue?"
>
> Problem: it was discovered that pulp_celerybeat does not clean up its
> entry in the worker collection. So in all cases of upgrading from < 2.11 to
> >= 2.11, a stale entry is present, and the user sees the scary message
> (unless they wait 5 minutes [0]).
>
> I suggest we modify that logic to simply ignore any worker entries from
> pulp_celerybeat. That would prevent a large number of users from getting an
> unwarranted scary message, and it would enable those who want to script or
> otherwise automate upgrades to still use this feature.
>
> The safety check would still look for entries from normal workers and the
> pulp_resource_manager. Thus it would continue to catch most cases where a
> user forgot to stop all pulp services. And since this feature was largely
> intended to help katello users, who run "katello-service stop" to stop all
> processes with one command, they are unlikely to have stopped the workers
> but forgotten pulp_celerybeat.
>
> Given the "known issue" release note [1] I think we could release 2.11.0
> with this problem and then fix it quickly in 2.11.1. But my concern is that
> the user experience is bad in the meantime. So unless anyone is especially
> itching to get 2.11.0 out the door, and doesn't mind a very quick 2.11.1
> hotfix, I propose we just make this change and let it briefly block the
> 2.11.0 release.
>
> If we *really* want to get pulp_celerybeat back into the test, there are
> more elaborate options [2] we could pursue later.
>
> Thoughts on the proposed change?
>
> Thanks,
> Michael
>
> [0] There is logic that ignores entries more than 5 minutes old, based on
> the assumption that those processes are no longer alive. That's helpful,
> but plenty of users will be able to "yum update" and then start
> pulp-manage-db within 5 minutes of having stopped services.
> [1] https://github.com/pulp/pulp/pull/2878
> [2] This kind of data quality problem can be solved by versioning the
> production of data. In this case, that would mean providing a way to know
> if a particular worker entry was created by pulp >= 2.11.0. For example, we
> could add a field to the worker collection that contains the version of
> pulp the worker was running. That would enable the pulp-manage-db safety
> check to ignore entries it knows are problematic, but take newer ones
> seriously. That said, I'm not convinced that checking for the
> pulp_celerybeat entry provides enough added value to justify such a change.
> But we can certainly consider it as a later improvement.
>
> _______________________________________________
> Pulp-dev mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/pulp-dev
