Hi,

From my point of view, the problem is not so much what the "bad node" does while it is down, but what happens when communications are restored. Let me explain.

1. Let's start with a clean 2-node cluster, each node in a different site. Data 
replication is done from the host using an md or LVM mirror. There is a service 
running on each node, and a qdisk (or a third node in a third site) provides quorum.

2. Communications are lost at site B (where node B runs). What happens? I'm not 
sure, but this is my understanding:

           I- Node B will continue working for some time until it realizes it is 
inquorate (depending on timeouts, let's say 1 minute). Writes during this time 
are only made to the disks at site B; those modifications are not written to the 
disks at site A.
            II- Eventually, node B detects it has lost the qdisk, declares itself 
inquorate, and rgmanager stops all services running on node B.
            III- Node A takes some time to detect that node B is dead, but it 
will never become inquorate. Services running on node A continue working, but 
writes are only made to the disks at site A. The mirror is broken.
            IV- Finally, node A detects that node B is dead and tries to fence it 
(probably it will need manual fencing for confirmation).
            V- Until fencing succeeds, the services originally running on node B 
are not transferred to node A, so a service is never running simultaneously on 
both nodes.
            VI- After fencing succeeds, the service starts on node A using the 
disks at site A, without any of the modifications made between the outage and the 
moment node B detected the failure (from I to II). Data modifications made from 
node A go only to these disks.
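The size of the window in step I, during which node B keeps writing to its local disks only, can be estimated from the qdisk parameters: qdiskd declares a node dead after roughly interval * tko seconds. A back-of-the-envelope sketch (the function name and values are illustrative assumptions, not an official API):

```python
def qdisk_detection_window(interval_s: float, tko: int) -> float:
    """Approximate seconds before a node declares itself inquorate
    after losing access to the quorum disk (roughly interval * tko)."""
    return interval_s * tko

# Example: interval=1 s, tko=10 gives roughly a 10 second window
# during which writes land only on the isolated site's disks.
print(qdisk_detection_window(1.0, 10))
```

Any data written during this window is exactly what gets discarded when the mirror is later resynchronized from site A (step 3 below).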

3. Communications are restored at site B. At this point node B rejoins the 
cluster, and node A regains access to the disks at site B. The mirror should then 
always be resynchronized from the site A disks to the site B disks, so that we 
have a coherent view of the data on both sides; the changes made from node B 
during the qdisk timeout window (from I to II) are definitively lost.

I think this is the expected behavior for a multi-site cluster in this scenario.

Best regards,

Alfredo



________________________________
From: [email protected] 
[mailto:[email protected]] On Behalf Of brem belguebli
Sent: Thursday, September 10, 2009 11:23 PM
To: linux clustering
Subject: [Linux-cluster] Re: Fencing question in geo cluster (dual sites 
clustering)

Hi,

No comments on this, RHCS gurus? Am I trying to set up something (a multisite 
cluster) that will never be supported?

Or is the qdiskd reboot action considered sufficient? (The reboot action should 
be a dirty power reset to prevent data from syncing.)

If so, all I/Os on the wrong nodes (at the isolated site) should be frozen 
until quorum is eventually regained. If not, it will end up with a (dirty) reboot.

Brem

2009/8/21 brem belguebli 
<[email protected]<mailto:[email protected]>>
Hi,

I'm trying to find out what fencing solution could best fit a dual-site 
cluster.

The cluster is equally sized on each site (2 nodes/site), with each site hosting 
a SAN array so that each node at either site can see both arrays.

The quorum disk (an iSCSI LUN) is hosted at a 3rd site.

SAN and LAN use the same telco infrastructure (2 redundant DWDM loops).

If something happens at the telco level (both DWDM loops broken) that leaves 
one of the two sites completely isolated from the rest of the world, the nodes 
at the good site (the one still operational) won't be able to fence any node at 
the wrong site (the isolated one), as there is no way for them to reach its 
iLOs or to do any SAN fencing, since the switches at the wrong site are no 
longer reachable.

As the quorum disk is not reachable from the wrong nodes, they end up being 
rebooted by qdiskd, but there is a short window (a few seconds) during which 
the wrong nodes still see their local SAN array storage and may potentially 
have written data to it.

Any ideas or comments on how to ensure data integrity in such a setup?

Regards

Brem

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
