20.01.2015 02:44, Andrew Beekhof wrote:
>
>> On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>>
>> 16.01.2015 07:44, Andrew Beekhof wrote:
>>>
>>>> On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov <bub...@hoster-ok.com>
>>>> wrote:
>>>>
>>>> 13.01.2015 11:32, Andrei Borzenkov wrote:
>>>>> On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov
>>>>> <bub...@hoster-ok.com> wrote:
>>>>>> Hi Andrew, David, all.
>>>>>>
>>>>>> I found a little bit strange operation ordering during transition
>>>>>> execution.
>>>>>>
>>>>>> Could you please look at the following partial configuration (crmsh
>>>>>> syntax)?
>>>>>>
>>>>>> ===
>>>>>> ...
>>>>>> clone cl-broker broker \
>>>>>>     meta interleave=true target-role=Started
>>>>>> clone cl-broker-vips broker-vips \
>>>>>>     meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started
>>>>>> clone cl-ctdb ctdb \
>>>>>>     meta interleave=true target-role=Started
>>>>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>>>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>>>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>>>>> order broker-vips-after-broker 0: cl-broker cl-broker-vips
>>>>>> ...
>>>>>> ===
>>>>>>
>>>>>> After I put one node to standby and then back to online, I see the
>>>>>> following transition (relevant excerpt):
>>>>>>
>>>>>> ===
>>>>>> * Pseudo action: cl-broker-vips_stop_0
>>>>>> * Resource action: broker-vips:1 stop on c-pa-0
>>>>>> * Pseudo action: cl-broker-vips_stopped_0
>>>>>> * Pseudo action: cl-ctdb_start_0
>>>>>> * Resource action: ctdb start on c-pa-1
>>>>>> * Pseudo action: cl-ctdb_running_0
>>>>>> * Pseudo action: cl-broker_start_0
>>>>>> * Resource action: ctdb monitor=10000 on c-pa-1
>>>>>> * Resource action: broker start on c-pa-1
>>>>>> * Pseudo action: cl-broker_running_0
>>>>>> * Pseudo action: cl-broker-vips_start_0
>>>>>> * Resource action: broker monitor=10000 on c-pa-1
>>>>>> * Resource action: broker-vips:1 start on c-pa-1
>>>>>> * Pseudo action: cl-broker-vips_running_0
>>>>>> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
>>>>>> ===
>>>>>>
>>>>>> What could be a reason to stop unique clone instance so early for move?
>>>>>>
>>>>>
>>>>> Do not take it as definitive answer, but cl-broker-vips cannot run
>>>>> unless both other resources are started. So if you compute closure of
>>>>> all required transitions it looks rather logical. Having
>>>>> cl-broker-vips started while broker is still stopped would violate
>>>>> constraint.
>>>>
>>>> Problem is that broker-vips:1 is stopped on one (source) node
>>>> unnecessarily early.
>>>
>>> It looks to be moving from c-pa-0 to c-pa-1
>>> It might be unnecessarily early, but it is what you asked for... we have to
>>> unwind the resource stack before we can build it up.
>>
>> Yes, I understand that it is valid, but could its stop be delayed until
>> cluster is in the state when all dependencies are satisfied to start it on
>> another node (like migration?)?
>
> No, because "we have to unwind the resource stack before we can build it up."
> Doing anything else would be one of those things that is trivial for a human
> to identify but rather complex for a computer.

I believe there is also an issue with migration of clone instances.

I modified the pe-input to allow migration of cl-broker-vips (and also set an
inf score for broker-vips-after-broker and made cl-broker-vips interleaved).
The relevant part is:

clone cl-broker broker \
    meta interleave=true target-role=Started
clone cl-broker-vips broker-vips \
    meta clone-node-max=2 globally-unique=true interleave=true allow-migrate=true resource-stickiness=0 target-role=Started
clone cl-ctdb ctdb \
    meta interleave=true target-role=Started
colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
colocation broker-with-ctdb inf: cl-broker cl-ctdb
order broker-after-ctdb inf: cl-ctdb cl-broker
order broker-vips-after-broker inf: cl-broker cl-broker-vips

After that, (part of) the transition is:

 * Resource action: broker-vips:1 migrate_to on c-pa-0
 * Pseudo action: cl-broker-vips_stop_0
 * Resource action: broker-vips:1 migrate_from on c-pa-1
 * Resource action: broker-vips:1 stop on c-pa-0
 * Pseudo action: cl-broker-vips_stopped_0
 * Pseudo action: all_stopped
 * Pseudo action: cl-ctdb_start_0
 * Resource action: ctdb start on c-pa-1
 * Pseudo action: cl-ctdb_running_0
 * Pseudo action: cl-broker_start_0
 * Resource action: ctdb monitor=10000 on c-pa-1
 * Resource action: broker start on c-pa-1
 * Pseudo action: cl-broker_running_0
 * Pseudo action: cl-broker-vips_start_0
 * Resource action: broker monitor=10000 on c-pa-1
 * Pseudo action: broker-vips:1_start_0
 * Pseudo action: cl-broker-vips_running_0
 * Resource action: broker-vips:1 monitor=30000 on c-pa-1

But I would say that, at least from a human-logic PoV, the above breaks the
ordering rule broker-vips-after-broker (cl-broker-vips finishes migrating, and
thus runs on c-pa-1, before cl-broker has started there). Technically
broker-vips:1_start_0 is at the right position, but the resource is actually
"started" by migrate_to/migrate_from.

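To make the ordering complaint above concrete, here is a minimal Python sketch.
It is not Pacemaker code: the helper names first_active_index and
violates_order are made up for illustration, and it rests on two assumptions.
First, that the order in which the actions are listed approximates the order in
which they run (the real transition is a graph). Second, that a resource counts
as active on a node after either a real start or a real migrate_from there.

```python
import re

# Matches real (non-pseudo) actions such as
# "* Resource action: broker start on c-pa-1".
ACTION_RE = re.compile(r"Resource action:\s+(\S+)\s+(\S+)\s+on\s+(\S+)")

# Assumption: after either of these operations the resource is running on the node.
ACTIVATING_OPS = {"start", "migrate_from"}

# Transition excerpt copied from the message above.
TRANSITION = [
    "* Resource action: broker-vips:1 migrate_to on c-pa-0",
    "* Pseudo action: cl-broker-vips_stop_0",
    "* Resource action: broker-vips:1 migrate_from on c-pa-1",
    "* Resource action: broker-vips:1 stop on c-pa-0",
    "* Pseudo action: cl-broker-vips_stopped_0",
    "* Pseudo action: all_stopped",
    "* Pseudo action: cl-ctdb_start_0",
    "* Resource action: ctdb start on c-pa-1",
    "* Pseudo action: cl-ctdb_running_0",
    "* Pseudo action: cl-broker_start_0",
    "* Resource action: ctdb monitor=10000 on c-pa-1",
    "* Resource action: broker start on c-pa-1",
    "* Pseudo action: cl-broker_running_0",
    "* Pseudo action: cl-broker-vips_start_0",
    "* Resource action: broker monitor=10000 on c-pa-1",
    "* Pseudo action: broker-vips:1_start_0",
    "* Pseudo action: cl-broker-vips_running_0",
    "* Resource action: broker-vips:1 monitor=30000 on c-pa-1",
]


def first_active_index(transition, resource, node):
    """Position of the first real action that brings `resource` up on `node`, else None."""
    for index, line in enumerate(transition):
        match = ACTION_RE.search(line)
        if match is None:
            continue  # pseudo actions are skipped
        name, op, where = match.groups()
        if name == resource and where == node and op in ACTIVATING_OPS:
            return index
    return None


def violates_order(transition, first, then, node):
    """True if `then` becomes active on `node` before `first` does in the listing."""
    first_at = first_active_index(transition, first, node)
    then_at = first_active_index(transition, then, node)
    if first_at is None or then_at is None:
        return False  # one of them is not brought up on this node in this transition
    return then_at < first_at


if __name__ == "__main__":
    print(violates_order(TRANSITION, "broker", "broker-vips:1", "c-pa-1"))
```

Under those assumptions the check flags broker-vips:1 (activated by
migrate_from) as coming up on c-pa-1 ahead of broker, while the ctdb/broker
pair is in the expected order.
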
I also went further and injected a pair of non-clone IPaddr2 resources into
the same pe-input, and enabled migration for them as well (returning
interleave for cl-broker-vips to false and setting the ordering score for
broker-vips-after-broker back to 0, so that all three order constraints after
cl-broker have the same score):

clone cl-broker broker \
    meta interleave=true target-role=Started
clone cl-broker-vips broker-vips \
    meta clone-node-max=2 globally-unique=true interleave=false allow-migrate=true resource-stickiness=0 target-role=Started
clone cl-ctdb ctdb \
    meta interleave=true target-role=Started
primitive broker-vip1 IPaddr2 \
    params ip=192.168.122.70 cidr_netmask=24 nic=eth0 \
    op start interval=0 timeout=20 \
    op stop interval=0 timeout=20 \
    op monitor interval=30
primitive broker-vip2 IPaddr2 \
    params ip=192.168.122.71 cidr_netmask=24 nic=eth0 \
    op start interval=0 timeout=20 \
    op stop interval=0 timeout=20 \
    op monitor interval=30
colocation broker-with-ctdb inf: cl-broker cl-ctdb
colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
colocation broker-vip1-with-broker inf: broker-vip1 cl-broker
colocation broker-vip2-with-broker inf: broker-vip2 cl-broker
colocation broker-vip2-not-with-vip1 -100: broker-vip2 broker-vip1
order broker-after-ctdb inf: cl-ctdb cl-broker
order broker-vips-after-broker 0: cl-broker cl-broker-vips
order broker-vip1-after-broker 0: cl-broker broker-vip1
order broker-vip2-after-broker 0: cl-broker broker-vip2

For broker-vip2 I see completely different output (compare with
broker-vips:1):

 * Resource action: broker-vips:1 migrate_to on c-pa-0
 * Pseudo action: cl-broker-vips_stop_0
 * Resource action: broker-vips:1 migrate_from on c-pa-1
 * Resource action: broker-vips:1 stop on c-pa-0
 * Pseudo action: cl-broker-vips_stopped_0
 * Pseudo action: cl-ctdb_start_0
 * Resource action: ctdb start on c-pa-1
 * Pseudo action: cl-ctdb_running_0
 * Pseudo action: cl-broker_start_0
 * Resource action: ctdb monitor=10000 on c-pa-1
 * Resource action: broker start on c-pa-1
 * Pseudo action: cl-broker_running_0
 * Resource action: broker-vip2 migrate_to on c-pa-0
 * Pseudo action: cl-broker-vips_start_0
 * Resource action: broker monitor=10000 on c-pa-1
 * Resource action: broker-vip2 migrate_from on c-pa-1
 * Resource action: broker-vip2 stop on c-pa-0
 * Pseudo action: broker-vips:1_start_0
 * Pseudo action: cl-broker-vips_running_0
 * Pseudo action: all_stopped
 * Pseudo action: broker-vip2_start_0
 * Resource action: broker-vips:1 monitor=30000 on c-pa-1
 * Resource action: broker-vip2 monitor=30000 on c-pa-1

broker-vip2 is migrated much later than broker-vips:1, exactly at the point
where I would expect it. To me that means some logic already exists that
allows a resource move to be postponed until everything is ready for it at the
destination.

I also tried disabling migration for broker-vip2, and in that case it was
stopped too early as well.

So there are four cases, and for only one of them do I get the expected result:

*) globally-unique (g-u) clone, migration disabled - early stop
*) g-u clone, migration enabled - early stop
*) ordinary resource, migration disabled - early stop
*) ordinary resource, migration enabled - stop at the expected point

The question is: is it strictly impossible to make non-migratable resources
behave the same way as the migratable broker-vip2?

(I'm pretty sure I didn't mess up any of the details, but I want to recheck
everything once more.)

Best,
Vladislav

>
> Better to look at why broker-vips:1 needed to be moved.
>
>>
>> Like:
>> ===
>> * Pseudo action: cl-ctdb_start_0
>> * Resource action: ctdb start on c-pa-1
>> * Pseudo action: cl-ctdb_running_0
>> * Pseudo action: cl-broker_start_0
>> * Resource action: ctdb monitor=10000 on c-pa-1
>> * Resource action: broker start on c-pa-1
>> * Pseudo action: cl-broker_running_0
>> * Pseudo action: cl-broker-vips_start_0
>> * Resource action: broker monitor=10000 on c-pa-1
>> * Pseudo action: cl-broker-vips_stop_0
>> * Resource action: broker-vips:1 stop on c-pa-0
>> * Pseudo action: cl-broker-vips_stopped_0
>> * Resource action: broker-vips:1 start on c-pa-1
>> * Pseudo action: cl-broker-vips_running_0
>> * Resource action: broker-vips:1 monitor=30000 on c-pa-1
>> ===
>> That would be the great optimization toward five nines...
>>
>> Best,
>> Vladislav
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org