Hi Andy,

Just a quick follow-up on a two-node cluster.
bash-3.00# clrg create RG1
bash-3.00# clrg create RG2
bash-3.00# clrg create RG3
bash-3.00# clrg set -p RG_affinities=--RG1 RG2
bash-3.00# clrg set -p RG_affinities=--RG2 RG3
bash-3.00# clrg set -p RG_affinities=--RG3 RG1
clrg: (C908891) Request failed on resource group RG1 because the combination of resource group dependencies and/or strong affinities would create a cycle
bash-3.00#

Regards
Neil

On 13 Nov 2008, at 16:48, Neil Garthwaite wrote:

> Hi Andy,
>
> I had an email discussion with Marty on this a while ago and sent that email to this alias. It was entitled "Re: RG_affinities with xVM live migration". Anyway, the point that Marty and I concurred on was point 2, i.e.:
>
> <snip>
> 2) We could define a new RG property which would be set on the affinity target resource group [i.e., the RG containing the xVM resource, say rg1]. This property, if set on rg1, would modify the switchover behavior when another RG (say, rg2) declares a strong negative (SN) affinity upon rg1. If rg1 is being switched from nodeA to nodeB, and rg2 is currently online on nodeB, the RGM would offline rg2 from nodeB before *stopping* rg1 on nodeA, rather than waiting until rg1 begins to come online on nodeB.
>
> I can see some potential complications to approach 2, mostly in regard to error behavior. Suppose we have the above scenario of rg1 and rg2, in which we initiate a switchover of rg1 from nodeA to nodeB.
>
> Then we might have to deal with various failure scenarios, for example:
>
> - rg2 encounters a stop failure on nodeB. Depending on the Failover_mode, it might end up in ERROR_STOP_FAILED state on nodeB. We would then have to abort the switchover of rg1 from nodeA to nodeB.
>
> - rg2 succeeds in going offline on nodeB. I think that we need to prevent it coming online immediately on nodeA, until rg1 has finished going offline from nodeA (this is the prospective bug mentioned in my other email). If rg1 fails to go offline on nodeA and goes to ERROR_STOP_FAILED, then rg2 might end up remaining offline.
>
> - rg1 might succeed in going offline on nodeA, but then it might encounter a start method failure on nodeB. This could cause it to fail over back to nodeA, on which rg2 is presumably PENDING_ONLINE or ONLINE. I think this case would be handled by the current rgmd logic -- it would take rg2 offline from nodeA before switching rg1 online on nodeA. I suspect that rg2 would then attempt to fail over back onto nodeB.
>
> It might be that existing rgmd logic handles all of the above cases (except for the prospective bug fix mentioned in my previous email), but we'd have to confirm that.
> <snip>
>
> Also please see some comments below.
>
> Regards
> Neil
>
> On 13 Nov 2008, at 16:27, Andrew Hisgen wrote:
>
>> Neil,
>>
>> I am looking at the diagram in section 2.2.4.3, page 11, of the requirements specification.
>>
>> It appears that part of the desired behavior is to change the order of stopping of the two RGs, such that RG2 is stopped before the stop of RG1.
>>
>> It is not completely clear to me what the order of invocation of the stops is. Is the stop of RG1 called first, and then somehow it requests the stop of RG2? Which software component is expected to call the stop of RG2: is the RGM daemon expected to do that, based on somehow knowing that RG1 contains an R of type HA-xVM-Agent? Or perhaps the STOP method of the Agent is invoking, through a private interface, the STOP of RG2?
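For reference, the switchover scenario under discussion would be set up and driven from the CLI with something like this (rg1, rg2, nodeA and nodeB are placeholder names; rg1 is the RG containing the xVM resource):

clrg set -p RG_affinities=--rg1 rg2   # rg2 declares a strong negative (SN) affinity on rg1
clrg switch -n nodeB rg1              # request a switchover of rg1, and its xVM resource, from nodeA to nodeB

The ordering of the STOP processing that this switchover triggers is exactly what is being asked about above.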
>>
>> In any case, regarding the ordering, the STOP of RG2 seems to run to completion before the STOP method of the Agent in RG1 has finished; said method then invokes the live migration.
>>
>> Also, what happens if RG2 also contains an R of type HA-xVM-Agent? And what if it needs to migrate to node1, e.g. because that is the only other physical host it can go to? Would that give us a nested call to RG1 telling it that it needs to stop? Seemingly, that would be a deadlock?
>
> I don't believe you can code such a circular or nested affinity, i.e. if RG2 has an SN affinity with RG1, then RG2 cannot occupy the same node where RG1 is still or currently online. For example, the following is not possible:
>
> RG1 -pushes-> RG2 -pushes-> RG3 -pushes-> RG1
>
> or
>
> RG2 has RG_affinities=--RG1
> RG3 has RG_affinities=--RG2
> RG1 has RG_affinities=--RG3  <- I believe this would not be allowed by the RGM
>
>> If it is a deadlock, how would we detect that -- could we somehow detect and forbid this config in advance? But notice that in the 2.2.4.1 diagram (without resources of type HA-xVM-Agent) we can effectively swap which host is running which RG (albeit with no migration)? On the other hand, if we are unable to detect/prevent the nested call that would cause a deadlock, perhaps the other solution is to allow both domUs to be running on the same physical host, temporarily, at the same time. Alas, this may cause an overload condition.
>>
>> thanks,
>> Andy
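On Andy's last point about letting both domUs run on the same physical host temporarily: one option, just as a sketch (we have not tested this), would be for rg2 to declare a weak rather than strong negative affinity, since a weak negative affinity only expresses a preference to keep the two RGs apart and does not force rg2 offline, e.g.

clrg set -p RG_affinities=-rg1 rg2   # single "-": weak negative affinity; RGM prefers, but does not require, rg2 to avoid rg1's node

The obvious downside is the overload condition Andy mentions while both domUs share a host.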