On Mon, Oct 27, 2014 at 6:34 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>
>> On 27 Oct 2014, at 2:30 pm, Andrei Borzenkov <arvidj...@gmail.com> wrote:
>>
>> On Mon, 27 Oct 2014 11:09:08 +1100
>> Andrew Beekhof <and...@beekhof.net> wrote:
>>
>>>
>>>> On 25 Oct 2014, at 12:38 am, Andrei Borzenkov <arvidj...@gmail.com> wrote:
>>>>
>>>> On Fri, Oct 24, 2014 at 9:17 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>>>>>
>>>>>> On 16 Oct 2014, at 9:31 pm, Andrei Borzenkov <arvidj...@gmail.com> wrote:
>>>>>>
>>>>>> The primary goal is to transparently update software in a cluster. I
>>>>>> just did an HA suite update using plain RPM and observed that RPM
>>>>>> attempts to restart the stack (rcopenais try-restart). So
>>>>>>
>>>>>> a) if it worked, it would mean resources had been migrated from this
>>>>>> node - interruption
>>>>>>
>>>>>> b) it did not work - apparently the new versions of the installed utils
>>>>>> were incompatible with the running pacemaker, so the request to shut
>>>>>> down crm failed and openais hung forever.
>>>>>>
>>>>>> The usual workflow with the cluster products I worked with before was -
>>>>>> stop cluster processes without stopping resources; update; restart
>>>>>> cluster processes. They would detect that resources are started and
>>>>>> return to the same state as before stopping. Is something like this
>>>>>> possible with pacemaker?
>>>>>
>>>>> absolutely. this should be of some help:
>>>>>
>>>>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_disconnect_and_reattach.html
>>>>>
>>>>
>>>> Did not work. It ended up moving master to another node and leaving
>>>> slave on original node stopped after that.
>>>
>>> When you stopped the cluster or when you started it after an upgrade?
>>
>> When I started it
>>
>> crm_attribute -t crm_config -n is-managed-default -v false
>> rcopenais stop on both nodes
>> rcopenais start on both nodes; wait for them to stabilize
>> crm_attribute -t crm_config -n is-managed-default -v true
>>
>> It stopped running master/slave, moved master and left slave stopped.
>
> What did crm_mon say before you set is-managed-default back to true?
> Did the resource agent properly detect it as running in the master state?

You are right, it returned 0, not 8.

> Did the resource agent properly (re)set a preference for being promoted
> during the initial monitor operation?
>

It did, but it was too late - after it had already been demoted.

> Pacemaker can do it, but it is dependent on the resources behaving correctly.
>

I see. Well, this would be a problem ... The RA keeps track of the current
promoted/demoted status in the CIB as a transient attribute, which gets
reset after reboot. This would entail quite a bit of redesign ...

But what got me confused were these errors during initial probing, like

Oct 24 17:26:54 n1 crmd[32425]: warning: status_from_rc: Action 9
(rsc_ip_VIP_monitor_0) on n2 failed (target: 7 vs. rc: 0): Error

This looks like pacemaker expects the resource to be in the stopped state,
and a "running" state would be interpreted as an error? I mean, the normal
response to such a monitor result would be to stop the resource to bring
it into the target state, no?
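
To make the return codes above concrete, here is a minimal sketch of a monitor function that reports the master role as 8 rather than 0. It is only an illustration under assumptions, not the actual RA: `demo_monitor`, `current_role` and the role file are hypothetical names, and the file read stands in for asking the application itself which role it holds, instead of relying on a transient CIB attribute that is lost on restart. The numeric values are the standard OCF ones (0 running, 7 not running, 8 running as master).

```shell
#!/bin/sh
# OCF return codes relevant to monitor/probe results.
OCF_SUCCESS=0          # running (slave role for a master/slave resource)
OCF_NOT_RUNNING=7      # cleanly stopped; what a probe expects by default
OCF_RUNNING_MASTER=8   # running in the master role

# Hypothetical stand-in for querying the service about its own role.
# A real agent would ask the application instead of reading a file.
ROLE_FILE="${ROLE_FILE:-/tmp/demo_role}"

current_role() {
    if [ -r "$ROLE_FILE" ]; then
        cat "$ROLE_FILE"
    else
        echo stopped
    fi
}

demo_monitor() {
    case "$(current_role)" in
        master)
            # Assumption: a real agent would also (re)set its promotion
            # preference here, e.g. with crm_master, so that it is in
            # place before the cluster decides where to promote.
            return $OCF_RUNNING_MASTER
            ;;
        slave)
            return $OCF_SUCCESS
            ;;
        *)
            return $OCF_NOT_RUNNING
            ;;
    esac
}
```

With something along these lines, the first probe after reattaching would return 8 on the node that is still master, so the "target: 7 vs. rc: 0" style of mismatch would at least carry the right role.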