Thanks very much. The advisory ordering, in conjunction with the collocation constraints, seems to achieve the desired goal, albeit in a somewhat roundabout way. We have now scaled the test environment up to 4 nodes and will start to address the next problems.
Best regards,

Simon

Simon Talbot MEng, ACGI
(Chief Engineer)
Net Solutions Europe
Tel: 020 3161 6001
Fax: 020 3161 6011

The information contained in this e-mail and any attachments are private
and confidential and may be legally privileged. It is intended for the
named addressee(s) only. If you are not the intended recipient(s), you
must not read, copy or use the information contained in any way. If you
receive this email or any attachments in error, please notify us
immediately by e-mail and destroy any copy you have of it. We accept no
responsibility for any loss or damages whatsoever arising in any way
from receipt or use of this e-mail or any attachments. This e-mail is
not intended to create legally binding commitments on our behalf, nor do
its comments reflect our corporate views or policies.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Beekhof
Sent: 15 September 2008 19:08
To: High-Availability Linux Development List
Subject: Re: [Linux-ha-dev] Order Constraints and Clone Groups

On Mon, Sep 15, 2008 at 17:33, Simon Talbot <[EMAIL PROTECTED]> wrote:
> Thanks very much for that Andrew, we had come to the same conclusion
> with score=0 and it does prevent the symptom, but causes other problems
> which your suggestion of a collocation constraint may help with.

It will.

>
> When you say the cluster isn't that smart yet, what does 'yet' mean in
> terms of timeframe?

I want to solve this for 1.2 (1.0 being practically done already)
So depending on how many bugs people find in 1.0, probably later this
year or early next.

> -- I assume this is a somewhat non-trivial modification?

indeed

>
> Is collocation smart enough to work out that the rule you are talking
> about refers to a clone instance, rather than the clone set itself

yes

> (I
> assume you configure the collocation with reference to the whole
> cloneset, rather than any particular clone member)?

right

>
> And finally, for the benefit of myself and the list as a whole, what
> exactly does an advisory order rule (Score=0) mean -- I get a feeling
> for what it means, but I have not found it described in any
> documentation?

http://clusterlabs.org/mw/Image:Ordering_Explained.pdf

>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Beekhof
> Sent: 15 September 2008 14:42
> To: High-Availability Linux Development List
> Subject: Re: [Linux-ha-dev] Order Constraints and Clone Groups
>
> On Fri, Sep 5, 2008 at 15:08, Simon Talbot <[EMAIL PROTECTED]> wrote:
>> Hi All,
>>
>> Before I dig any deeper looking for the root cause, I thought I would
>> run a scenario past the list to make sure I was not missing something
>> obvious that has already been addressed/fixed.
>>
>> Take the following scenario:
>>
>> Clariion iSCSI SAN
>> Two Hosts Connected
>> SLES10 SP2 Heartbeat + OCFS2 + Xen, Identical configuration
>>
>> We have OCFS2 configured as standard with user mode cluster control from
>> heartbeat and all working fine.
>>
>> Next we have Xen VM Instances reading their configurations from the
>> Clustered OCFS2 Filesystem and running against physical volumes
>> presented (shared) from the SAN, live migration etc.
>>
>> Again all fine
>>
>> So effectively we have:
>> an anonymous clone group running stonith (external/SSH)
>> an anonymous clone group running the OCFS2 configuration store
>> and a normal VM resource for each XEN Virtual machine
>>
>> The problem comes with the specific interactions of the order
>> constraints.
>>
>> We have a simple order constraint saying the XEN VM instance depends
>> upon the OCFS2 configuration store
>>
>> With both nodes running, we can live migrate the Xen VM between the
>> hosts as we like and all works fine, but the problem starts if for
>> example we perform the following:
>>
>> Start up VM, Let us say it starts on
>
> I assume you meant to follow up with "Host A" here :)
>
>> Live Migrate a VM from Host A to Host B
>> Instruct Host A to go into standby
>>
>> At this point, the Xen VM received a shutdown command, even though the
>> instruction to put Host A into standby has absolutely nothing to do with
>> Host B or the OCFS2 Clone instance on host B.
>
> You'd think that. But the cluster isn't quite so smart (yet).
>
> It just knows that you told it to restart Xen if OCFS2 changes state
> (which it did, but not on the node you care about).
> The solution is to use score=0 for the constraint (preventing the
> restart) and add a colocation constraint between them (so that it won't
> try to run anywhere that OCFS2 isn't running).
>
>>
>> The Xen VM on B shuts down and then immediately restarts again on the
>> same host (B)
>>
>> If I then take Host A back out of standby and make it active, again the
>> Xen host is shut down and re-started (again on the same host)
>>
>> If I take the order constraint away from the configuration, the
>> behaviour does not happen and host A can be taken into standby correctly
>> without affecting host B or the running VM.
>>
>> My feeling is that something in the order clause logic is getting
>> confused and thinking that the OCFS2 clone on Host B is shutting down,
>> which causes the VM to be shut down, then the CRM realises that the
>> OCFS2 clone is alive and well and starts the VM back up again.
>>
>> Does this ring any bells, or should I start digging?
>>
>> Config as follows:
>
> Please supply as an attachment next time. Mail clients tend to mangle
> long lines.
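
For reference, the workaround described in the quoted thread (an advisory order constraint with score=0, plus a colocation constraint against the OCFS2 clone) might look roughly like the fragment below. This is only a sketch, not the configuration actually in use: the resource IDs xen_vm1 and ocfs2_clone and the constraint IDs are hypothetical, the INFINITY colocation score is an assumption, and the attribute names assume the Heartbeat 2.1.x-style CIB syntax (from/to). Later Pacemaker releases use first/then on rsc_order and rsc/with-rsc on rsc_colocation instead, so check the DTD or schema shipped with your version.

```xml
<constraints>
  <!-- Advisory ordering: start the VM after the OCFS2 clone, but with
       score="0" so that a state change of the clone on another node
       does not force a restart of the VM.
       Resource and constraint IDs are hypothetical. -->
  <rsc_order id="order_xen_vm1_after_ocfs2"
             from="xen_vm1" type="after" to="ocfs2_clone"
             score="0"/>

  <!-- Colocation against the whole clone set (not a single clone
       member), so the VM is only placed on a node where an OCFS2
       instance is running. The INFINITY score is an assumption. -->
  <rsc_colocation id="colo_xen_vm1_with_ocfs2"
                  from="xen_vm1" to="ocfs2_clone"
                  score="INFINITY"/>
</constraints>
```

One such pair of constraints would be needed for each Xen VM resource that reads its configuration from the OCFS2 store.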
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
