Two points to consider if we think about having such a preference rule:

(1) We will need to define a "distinction" (or "preferred node") rule. We could follow an "all resource groups or none" approach: if our node/partition does not host *any* resource groups (which means the other node/partition probably hosts them all), we introduce a delay in our reconfiguration to give the other partition a chance to win. Or we could let the node/partition hosting fewer resource groups than the other introduce a delay in its CMM reconfiguration (the race to acquire quorum). Another thing we would need to think over: with multiple rules for the delay (size of partition, resource groups hosted by partition), how do we prioritize them as deciders? So there are multiple deciding parameters to be considered. The cluster administrator might also want a say in which partition delays itself, so this might need to be tunable. Theoretically, it seems we can have such a "preference" rule; a rough sketch of how the deciders could be combined follows below.
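Just to make that concrete, here is a minimal sketch (in C) of how such a delay decision might be computed. It is only an illustration of the idea, not how CMM actually works: the structure names, fields, weights and base delay are all made up for the example and would have to come from tunables and from whatever view of the resource groups CMM can safely obtain.

#include <stdint.h>

/*
 * Hypothetical view of one partition at reconfiguration time.
 * Field names are illustrative only.
 */
typedef struct {
    uint32_t num_nodes;        /* nodes in this partition     */
    uint32_t num_rgs_hosted;   /* resource groups hosted here */
} partition_info_t;

/*
 * Tunables an administrator could set to prioritize the deciders.
 * A weight of 0 disables that rule.
 */
typedef struct {
    uint32_t weight_partition_size;  /* "bigger partition wins"  */
    uint32_t weight_rgs_hosted;      /* "active partition wins"  */
    uint32_t base_delay_ms;          /* delay unit for the loser */
} delay_tunables_t;

/*
 * Return the delay (in ms) that *our* partition adds before racing for
 * the quorum device.  0 means "race immediately".  The delay is bounded,
 * so if the preferred partition is actually dead we still go on to
 * acquire quorum once the delay expires.
 */
static uint32_t
compute_reconf_delay(const partition_info_t *mine,
    const partition_info_t *peer, const delay_tunables_t *tun)
{
    int32_t score = 0;

    /* Rule 1: the larger partition is preferred. */
    if (mine->num_nodes < peer->num_nodes)
        score += (int32_t)tun->weight_partition_size;
    else if (mine->num_nodes > peer->num_nodes)
        score -= (int32_t)tun->weight_partition_size;

    /* Rule 2: the partition hosting more resource groups is preferred. */
    if (mine->num_rgs_hosted < peer->num_rgs_hosted)
        score += (int32_t)tun->weight_rgs_hosted;
    else if (mine->num_rgs_hosted > peer->num_rgs_hosted)
        score -= (int32_t)tun->weight_rgs_hosted;

    /* A positive score means we are the less preferred side: back off. */
    return (score > 0) ? (uint32_t)score * tun->base_delay_ms : 0;
}

Of course, whether CMM can even know the peer partition's resource group count at that point is exactly the concern in (2) below.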
(2) One important point to note is that the resource/resource group information lives with the RGM. We will obviously want to check whether making CMM obtain information from RGM for its "delay" decision in a reconfiguration is a good idea or not, knowing that RGM itself depends on CMM for membership information.

Thanks & Regards,
Sambit

Hartmut Streppel wrote:
> Isn't there an algorithm already in place that adapts the priority to
> start the race for quorum based on:
> - node id (??)
> - size of partition (i.e. a higher number of nodes gets priority)?
> If yes, this could easily be enhanced to include active/passive as
> well - maybe this should be made configurable.
>
> Regards
> Hartmut
>
>
> On 07/13/09 10:32, Tirthankar wrote:
>>
>> On 07/13/09 13:54, Sergei Kolodka wrote:
>>> Thanks Hartmut,
>>>
>>> It took me a couple of minutes to actually read about split-brain
>>> clusters, and I've edited my initial post a bit :-)
>>>
>>> Anyway, how about defining the inactive node in an active/passive
>>> cluster as the node without any resource groups running, and
>>> introducing some kind of check for resource groups before the nodes
>>> start eliminating each other? I.e. if I'm not running anything, I'm
>>> inactive and can wait a second or two before deleting the active
>>> node's key, deliberately letting the active node win the shoot-out.
>>
>> In a cluster all nodes are considered equal. Now say you have a
>> 2-node cluster: node1 is running all the RGs and node2 isn't. There
>> are a couple of scenarios:
>>
>> 1. A real split brain happens.
>> Both nodes are up but there is a network disconnect. In this case,
>> giving priority to node1 makes sense, and node2 commits suicide.
>>
>> 2. node1 panics.
>> To node2, this still looks like a split brain, as it cannot contact
>> node1. If the algorithm is modified to give priority to node1, you
>> will have a full cluster outage.
>>
>> Hence, in order to make the algorithm work in all scenarios, we do
>> not give priority based on which groups are hosted on which nodes.
>>
>> Though I guess the algorithm could be enhanced.
>>
>>> I'm not sure if timestamps are saved in the quorum database, but if
>>> they are, a simple check of whether the primary node of the
>>> split-brain cluster was still reachable from the quorum server after
>>> the standby node lost contact with it could tell the standby node
>>> that the primary is not so dead and might be able to continue
>>> working. In a real-life situation, a node under load has no chance
>>> of winning this race against a node that is just sitting and doing
>>> nothing. But I'm pretty sure I'm missing something important here.
>>>
>>> Regards,
>>> Sergei
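Just to illustrate why a *bounded* delay (as opposed to an absolute preference for the node hosting the resource groups) should not run into the full-cluster-outage problem Tirthankar describes in scenario 2: if the preferred node is actually dead, the delayed node still races for the quorum device once its back-off expires. The toy model below is purely an illustration under that assumption - abstract time ticks, no real quorum device protocol, made-up names:

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model of the quorum race in a 2-node cluster.  The node that is
 * alive and has the smallest back-off "wins" (reserves the quorum device).
 */
typedef struct {
    bool     alive;        /* is this node actually up?         */
    unsigned delay_ticks;  /* back-off before racing for quorum */
} node_t;

/* Returns the id (0 or 1) of the node that acquires quorum, or -1 if none. */
static int
race_for_quorum(const node_t n[2])
{
    int winner = -1;
    unsigned best = (unsigned)-1;

    for (int i = 0; i < 2; i++) {
        if (n[i].alive && n[i].delay_ticks < best) {
            best = n[i].delay_ticks;
            winner = i;
        }
    }
    return winner;
}

int
main(void)
{
    /* Scenario 1: real split brain - both up, the passive node backs off. */
    node_t split[2] = { { true, 0 }, { true, 5 } };   /* node0 is active */
    printf("split brain: node%d survives\n", race_for_quorum(split));

    /*
     * Scenario 2: the active node panicked - the passive node still wins
     * once its delay expires, so there is no full cluster outage.
     */
    node_t dead_active[2] = { { false, 0 }, { true, 5 } };
    printf("active node dead: node%d survives\n", race_for_quorum(dead_active));
    return 0;
}

Running this prints node0 as the survivor in the split-brain case and node1 when the active node is dead, which is the behaviour we would want from the delay rule sketched above.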