Laurent,

  I am not a developer and I am not very familiar with the inner workings of 
OCFS2, so some of what follows is assumed from generic cluster design.

  There are two heartbeats, the network heartbeat and the disk heartbeat.

   If I am not mistaken, the disk heartbeat is done on the block device that is 
mounted as an OCFS2 filesystem. So you can decide which node will be fenced by 
cutting off its access to the disk device.

   When using a SAN this is fairly simple, since there is an external disk 
device: one node eventually locks the device and forces the other node to be 
evicted.

    Since you are using DRBD, you need to make sure that the node your cluster 
manager evicts can no longer access the DRBD device. As there are two paths to 
the DRBD device on each node (one local and one remote), I am not exactly sure 
how you will accomplish this, or whether DRBD already has this kind of control 
to prevent a split brain, but what you need to do is block access to the shared 
disk device on the evicted node before the OCFS2 timeout.
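    For illustration only, here is a rough sketch of what a cluster manager 
could run on the node being evicted to cut it off from the DRBD device. The 
resource name "r0" and the dry-run wrapper are my own assumptions, not 
anything OCFS2 or DRBD mandates; adapt it to your setup.

```shell
#!/bin/sh
# Hypothetical sketch: cut a node off from its DRBD resource before the
# OCFS2 heartbeat timeout expires.  "r0" is an assumed resource name.
# RUN=echo makes this a dry run that only prints the commands it would run;
# set RUN= (empty) to actually execute them.
RUN=echo

cut_off_drbd() {
    res="$1"
    # Stop replicating with the peer...
    $RUN drbdadm disconnect "$res"
    # ...and mark the local data outdated so this node cannot be promoted.
    $RUN drbdadm outdate "$res"
}

cut_off_drbd r0
```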

Regards,
Luis

Laurent Neiger <[EMAIL PROTECTED]> wrote:
 Hi guys,
 
 Here is an update on my issue...
 
 
 Luis Freitas wrote:
 Laurent,
   
     What you need to be able to decide is which node still has network 
connectivity. If both have network connectivity you could fence either of them. 
If both lost connectivity (someone turned the switch off), then you are in 
trouble.
   
    You will need to plug the backend network into a switch and monitor the 
interface status, so that when one machine is shut down or you disconnect its 
network cable, you still see the up status on the other machine. If you don't 
want to use two switches, plug them into the same switch and use different 
VLANs.
  
 Yes, I managed to do that. In my cluster manager, I can tell which node is 
still up before the ocfs2 timers fence all nodes but the lowest one, even when 
it is node0 that is off the network and node1 that is still connected.
 
    To deal with OCFS2, I think the easiest approach is to increase its 
timeouts to let your cluster manager decide which node will survive before the 
OCFS2 heartbeat fences the node. I wouldn't mess with its inner workings, 
YMMV...
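  (For reference, on many installs the O2CB timeouts live in 
/etc/sysconfig/o2cb; the path and available variables vary by distro and 
ocfs2-tools version, and the values below are illustrative only, not a 
recommendation.)

```
# /etc/sysconfig/o2cb (path varies by distro) -- illustrative values only.
# Disk heartbeat: a node self-fences after roughly (threshold - 1) * 2 seconds.
O2CB_HEARTBEAT_THRESHOLD=31
# Network idle timeout in ms; raising it gives an external cluster manager
# more time to decide before o2net declares the peer dead.
O2CB_IDLE_TIMEOUT_MS=30000
```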
  
 I think I managed to leave enough time for my cluster manager to decide 
without having to increase the ocfs2 timeouts.
 
 But my problem is not there.
 It is _HOW_ to cancel ocfs2 self-fencing on node1 once I have worked out that 
node0 has to be fenced and not node1.
 
 I tried this:
 node0 and node1 are OK, in the ocfs2 cluster, the shared disk is mounted, all 
is fine.
 I assume both of them are writing their timestamps every two seconds to their 
blocks in the "heartbeat system file", as mentioned in the FAQ.
 
 But what/where is this "heartbeat system file", by the way?
 
 When I unplug node0's network link, both nodes report that they have lost 
network communication with their peer. Within the first five seconds, my 
cluster manager works out that node0 is off the network and node1 is OK. So the 
decision is taken to fence node0 and to cancel the fencing of node1 (node1 is 
the one ocfs2 would fence, since its policy is to fence the higher node numbers 
and leave the lowest one alive).
 
 So the cluster manager runs "ocfs2_hb_ctl -K -d /dev/drbd0", which stops the 
heartbeat on node1.
 
 But this does not prevent node1 from self-fencing 28 seconds after the network 
communication was lost, nor node0 from staying alive with its dead network 
card. My entire cluster is down: no service, no data access.
 
 In hindsight this is logical: the heartbeat was stopped, but the timers kept 
counting down and nothing reset them.
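 For what it's worth, the decision step my cluster manager takes can be 
sketched as pure logic. The function name and argument convention below are 
mine, not from ocfs2 or any cluster manager:

```shell
#!/bin/sh
# Hypothetical sketch of the cluster manager's choice: given whether each
# node still has network link ("up" or "down"), print the node that should
# be fenced.  Arguments: state of node0, then state of node1.
choose_victim() {
    node0="$1"; node1="$2"
    if [ "$node0" = "down" ] && [ "$node1" = "up" ]; then
        echo node0
    elif [ "$node1" = "down" ] && [ "$node0" = "up" ]; then
        echo node1
    else
        # Both up or both down: fall back to ocfs2's own rule of keeping
        # the lowest node number alive.
        echo node1
    fi
}
```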
 
 Sunil Mushran <[EMAIL PROTECTED]> wrote:
 Each of those pings will require a timeout - short timeouts. So short that you 
may not even be able to distinguish between errors and overloaded run-queue, 
transmit queue, router, etc.
 Once more, I think I have achieved that. My problem is cancelling the 
self-fencing of node1, not deciding to do so.
 
 
 I am sorry to bother you; you may find this trivial, but I have probably 
missed something.
 
 You wrote "one does not have to have 3 nodes when one only wants 2 nodes".
 Great, this suits me, as I don't (and can't) have a SAN, and DRBD allows at 
most 2 nodes for disk sharing.
 
 I also read that fencing all nodes but the lowest one is the intended behavior 
of ocfs2.
 
 So let me rephrase my question:
 
 How can I make a 2-node cluster work with high availability, i.e. keep access 
to the remaining node in the event of _ANY_ single-node failure? The cluster 
would run degraded, with only one node remaining until we repair and power up 
the failed node, but with no loss of service.
 Even if node0 fails, node1 keeps serving rather than self-fencing.
 
 Once more thanks a lot for your help.
 
 Have a good day,
 
 best regards,
 
 Laurent.
 
 

_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users

       
