Hi Andy,

Just a quick follow-up on a two-node cluster.
bash-3.00# clrg create RG1
bash-3.00# clrg create RG2
bash-3.00# clrg create RG3
bash-3.00# clrg set -p RG_affinities=--RG1 RG2
bash-3.00# clrg set -p RG_affinities=--RG2 RG3
bash-3.00# clrg set -p RG_affinities=--RG3 RG1
clrg: (C908891) Request failed on resource group RG1 because the combination of resource group dependencies and/or strong affinities would create a cycle
bash-3.00#

Regards
Neil

On 13 Nov 2008, at 16:48, Neil Garthwaite wrote:

> Hi Andy,
>
> I had an email discussion with Marty on this a while ago and sent that email to this alias. It was entitled "Re: RG_affinities with xVM live migration". Anyway, the point that Marty and I concurred on was point 2, i.e.:
>
> <snip>
> 2) We could define a new RG property which would be set on the affinity target resource group [i.e., the RG containing the xVM resource, say rg1]. This property, if set on rg1, would modify the switchover behavior when another RG (say, rg2) declares a strong negative (SN) affinity upon rg1. If rg1 is being switched from nodeA to nodeB, and rg2 is currently online on nodeB, the RGM would offline rg2 from nodeB before *stopping* rg1 on nodeA, rather than waiting until rg1 begins to come online on nodeB.
>
> I can see some potential complications to approach 2, mostly in regard to error behavior. Suppose we have the above scenario of rg1 and rg2, in which we initiate a switchover of rg1 from nodeA to nodeB.
>
> Then we might have to deal with various failure scenarios, for example:
>
> - rg2 encounters a stop failure on nodeB. Depending on the Failover_mode, it might end up in ERROR_STOP_FAILED state on nodeB. We would then have to abort the switchover of rg1 from nodeA to nodeB.
>
> - rg2 succeeds in going offline on nodeB. I think that we need to prevent it coming online immediately on nodeA, until rg1 has finished going offline from nodeA (this is the prospective bug mentioned in my other email). If rg1 fails to go offline on nodeA and goes to ERROR_STOP_FAILED, then rg2 might end up remaining offline.
>
> - rg1 might succeed in going offline on nodeA, but then it might encounter a start method failure on nodeB. This could cause it to fail over back to nodeA, on which rg2 is presumably PENDING_ONLINE or ONLINE. I think this case would be handled by the current rgmd logic -- it would take rg2 offline from nodeA before switching rg1 online on nodeA. I suspect that rg2 would then attempt to fail over back onto nodeB.
>
> It might be that existing rgmd logic handles all of the above cases (except for the prospective bug fix mentioned in my previous email), but we'd have to confirm that.
> <snip>
>
> Also please see some comments below.
>
> Regards
> Neil
>
> On 13 Nov 2008, at 16:27, Andrew Hisgen wrote:
>
>> Neil,
>>
>> I am looking at the diagram in section 2.2.4.3, page 11, of the requirements specification.
>>
>> It appears that part of the desired behavior is to change the order of stopping of the two RGs, such that RG2 is stopped before the stop of RG1.
>>
>> It is not completely clear to me what the order of invocation of the stops is. Is the stop of RG1 called first, and then somehow it requests the stop of RG2? Which software component is expected to call the stop of RG2: is the RGM daemon expected to do that, based on somehow knowing that RG1 contains an R of type HA-xVM-Agent? Or perhaps the STOP method of the Agent is invoking, through a private interface, the STOP of RG2?
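For reference, the switchover scenario under discussion would be set up and driven from the CLI with something like this (rg1, rg2, nodeA and nodeB are placeholder names; rg1 is the RG containing the xVM resource):

clrg set -p RG_affinities=--rg1 rg2   # rg2 declares a strong negative (SN) affinity on rg1
clrg switch -n nodeB rg1              # request a switchover of rg1, and its xVM resource, from nodeA to nodeB

The ordering of the STOP processing that this switchover triggers is exactly what is being asked about above.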
>>
>> In any case, regarding the ordering, the STOP of RG2 seems to run to completion before the STOP method of the Agent in RG1 has finished; said method then invokes the live migration.
>>
>> Also, what happens if RG2 also contains an R of type HA-xVM-Agent? And what if it needs to migrate to node1, e.g. because that is the only other physical host it can go to? Would that give us a nested call to RG1 telling it that it needs to stop? Seemingly, that would be a deadlock?
>
> I don't believe you can code such a circular or nested affinity, i.e. if RG2 has an SN affinity with RG1, then RG2 cannot occupy the same node where RG1 is still or currently online. For example, the following is not possible:
>
> RG1 -pushes-> RG2 -pushes-> RG3 -pushes-> RG1
>
> or
>
> RG2 has RG_affinities=--RG1
> RG3 has RG_affinities=--RG2
> RG1 has RG_affinities=--RG3  <- I believe this would not be allowed by the RGM
>
>> If it is a deadlock, how would we detect that -- could we somehow detect and forbid this config in advance? But notice that in the 2.2.4.1 diagram (without resources of type HA-xVM-Agent) we can effectively swap which host is running which RG (albeit with no migration)? On the other hand, if we are unable to detect/prevent the nested call that would cause a deadlock, perhaps the other solution is to allow both domUs to be running on the same physical host, temporarily, at the same time. Alas, this may cause an overload condition.
>>
>> thanks,
>> Andy
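On Andy's last point about letting both domUs run on the same physical host temporarily: one option, just as a sketch (we have not tested this), would be for rg2 to declare a weak rather than strong negative affinity, since a weak negative affinity only expresses a preference to keep the two RGs apart and does not force rg2 offline, e.g.

clrg set -p RG_affinities=-rg1 rg2   # single "-": weak negative affinity; RGM prefers, but does not require, rg2 to avoid rg1's node

The obvious downside is the overload condition Andy mentions while both domUs share a host.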