On Fri, Jul 22, 2016 at 1:48 AM, Adam Spiers <aspi...@suse.com> wrote:
> Ken Gaillot <kgail...@redhat.com> wrote:
>> On 07/20/2016 07:32 PM, Andrew Beekhof wrote:
>> > On Thu, Jul 21, 2016 at 2:47 AM, Adam Spiers <aspi...@suse.com> wrote:
>> >> Ken Gaillot <kgail...@redhat.com> wrote:
>> >>> Hello all,
>> >>>
>> >>> I've been meaning to address the implementation of "reload" in Pacemaker
>> >>> for a while now, and I think the next release will be a good time, as it
>> >>> seems to be coming up more frequently.
>> >>
>> >> [snipped]
>> >>
>> >> I don't want to comment directly on any of the excellent points which
>> >> have been raised in this thread, but it seems like a good time to make
>> >> a plea for easier reload / restart of individual instances of cloned
>> >> services, one node at a time. Currently, if nodes are all managed by
>> >> a configuration management system (such as Chef in our case),
>> >
>> > Puppet creates the same kinds of issues.
>> > Both seem designed for a magical world full of unrelated servers that
>> > require no co-ordination to update.
>> > Particularly when the timing of an update to some central store (cib,
>> > database, whatever) needs to be carefully ordered.
>> >
>> > When you say "restart" though, is that a traditional stop/start cycle
>> > in Pacemaker that also results in all the dependencies being stopped
>> > too?
>
> No, just the service reload or restart without causing any cascading
> effects in Pacemaker.
>
>> > I'm guessing you really want the "atomic reload" kind where nothing
>> > else is affected because we already have the other style covered by
>> > crm_resource --restart.
>>
>> crm_resource --restart isn't sufficient for his use case because it
>> affects all clone instances cluster-wide, whereas he needs to reload or
>> restart (depending on the service) the local instance only.

Isn't that what I said? That --restart does a version that he doesn't want?

> Exactly.
>
>> > I propose that we introduce a --force-restart option for crm_resource
>> > which:
>> >
>> > 1. disables any recurring monitor operations
>>
>> None of the other --force-* options disable monitors, so for
>> consistency, I think we should leave this to the user (or add it for
>> other --force-*).

No. There is no other way to reliably achieve a restart than to
disable the monitors first so that they don't detect a transient
state. Especially if the resource doesn't advertise a restart
command.

>>
>> > 2. calls a native restart action directly on the resource if it
>> > exists, otherwise calls the native stop+start actions
>>
>> What do you mean by native restart action? Systemd restart?

Whatever the agent supports.

>>
>> > 3. re-enables the recurring monitor operations regardless of whether
>> > the reload succeeds, fails, or times out, etc
>> >
>> > No maintenance mode required, and whatever state the resource ends up
>> > in is re-detected by the cluster in step 3.
>>
>> If you're lucky :-)
>>
>> The cluster may still mess with the resource even without monitors, e.g.
>> a dependency fails or a preferred node comes online.

Can you explain how neither of those results in a restart of the service?

>> Maintenance
>> mode/unmanaging would still be safer (though no --force-* option is
>> completely safe, besides check).
>
> I'm happy with whatever you gurus come up with ;-) I'm just hoping
> that it can be made possible to pinpoint an individual resource on an
> individual node, rather than having to toggle maintenance flag(s)
> across a whole set of clones, or a whole node.

Yep.
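
For illustration only, the control flow of the three-step --force-restart
proposal quoted above could be sketched roughly as below. This is not
Pacemaker code, and --force-restart is only a proposal in this thread.
Every name here (force_restart, disable_monitors, enable_monitors, stop,
start, restart) is a hypothetical placeholder; real code would have to
talk to the cluster and to the resource agent. The point is just that
step 3 has to sit in a "finally" so it runs however step 2 ends.

```python
from typing import Callable, Optional

Action = Callable[[], None]


def force_restart(
    disable_monitors: Action,
    enable_monitors: Action,
    stop: Action,
    start: Action,
    restart: Optional[Action] = None,
) -> bool:
    """Hypothetical sketch of the proposed steps; returns True on success.

    All callables are placeholders supplied by the caller (assumption);
    none of them correspond to an existing Pacemaker API.
    """
    # Step 1: disable recurring monitors so they cannot observe the
    # transient "stopped" state in the middle of the restart.
    disable_monitors()
    try:
        # Step 2: use the agent's native restart if it has one,
        # otherwise fall back to stop followed by start.
        if restart is not None:
            restart()
        else:
            stop()
            start()
        return True
    except Exception:
        # A failed action is reported to the caller, not raised.
        return False
    finally:
        # Step 3: re-enable monitors whether step 2 succeeded or failed,
        # so that the resulting state is re-detected.
        enable_monitors()
```

A timeout would need extra handling around the calls in step 2. Note
that this sketch does nothing about the objection raised above, that
the cluster may still act on the resource for other reasons while the
monitors are disabled.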

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org