Laurent,
I am not a developer and I am not very familiar with the inner workings of
OCFS2, so some of the points below are assumptions based on generic cluster
design.
There are two heartbeats: the network heartbeat and the disk heartbeat.
If I am not mistaken, the disk heartbeat is done on the block device that is
mounted as an OCFS2 filesystem, so you can decide which node will fence by
cutting its access to the disk device.
When using a SAN this is fairly simple, since there is an external disk
device: one node eventually locks the device and forces the other node to be
evicted.
Since you are using DRBD, you need to make sure that the node your cluster
manager evicts can no longer access the DRBD device. As there are two paths to
the DRBD device on each node (one local device and one remote device), I am not
exactly sure how you will accomplish this, or whether DRBD already has this
kind of control to prevent a split brain, but what you need to do is block
access to the shared disk device on the evicted node before the OCFS2 timeout.
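On the DRBD side, one place to hook this in is DRBD's own fencing support. A
hedged sketch of a resource fragment (the resource name "r0" and the handler
script path are assumptions; check the drbd.conf man page for your DRBD
version):

```
# /etc/drbd.conf fragment -- sketch only, names are assumptions
resource r0 {
  disk {
    # Suspend I/O and require the peer to be fenced/outdated before
    # writes resume, instead of risking a split brain.
    fencing resource-and-stonith;
  }
  handlers {
    # Script DRBD calls when it loses the peer; it should outdate or
    # power off the other node before I/O is resumed.
    fence-peer "/usr/lib/drbd/outdate-peer.sh";
  }
}
```

With something like this in place, the evicted node's copy is marked outdated,
so it cannot be promoted and written to behind the survivor's back.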
Regards,
Luis
Laurent Neiger <[EMAIL PROTECTED]> wrote: Hi guys,
Here is an update on my issue...
Luis Freitas wrote: Laurent,
What you need to be able to decide is which node still has network
connectivity. If both have network connectivity, you could fence either of
them. If both lost connectivity (someone turned the switch off), then you are
in trouble.
You will need to plug the backend network into a switch and monitor the
interface status, so that when one machine is shut down or you disconnect its
network cable, you still get an "up" status on the other machine. If you don't
want to use two switches, plug both nodes into the same switch and use
different VLANs.
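As a sketch of that monitoring, the kernel exposes the link (carrier) state
under /sys, so the check can be a tiny shell function (the interface name
"eth1" below is an assumption; substitute your backend NIC):

```shell
#!/bin/sh
# Sketch: report whether an interface's link (carrier) is up, using
# /sys/class/net/<if>/carrier (1 = link up; 0 or unreadable = down).

link_is_up() {
    # $1 = interface name; prints "up" or "down"
    if [ "$(cat "/sys/class/net/$1/carrier" 2>/dev/null)" = "1" ]; then
        echo up
    else
        echo down
    fi
}

# Example poll loop (eth1 is an assumed backend interface):
# while [ "$(link_is_up eth1)" = "up" ]; do sleep 1; done
```

Note that reading `carrier` on an administratively down interface fails with
EINVAL, which the redirect above simply treats as "down".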
Yes, I managed to do that. In my cluster manager, I am able to know which node
is still up before the OCFS2 timers fence all nodes but the lowest one, even if
it is node0 that is off the network and node1 that is still connected.
To deal with OCFS2, I think the easiest approach is to increase its timeouts to
let your cluster manager decide which node will survive before the OCFS2
heartbeat fences the node. I wouldn't be messing with its inner workings, YMMV...
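For reference, those timeouts live in the o2cb init configuration. A sketch
with illustrative values (the *_MS parameters only exist in newer O2CB
releases; the disk heartbeat fence fires after roughly (threshold - 1) * 2
seconds):

```
# /etc/sysconfig/o2cb (Debian: /etc/default/o2cb) -- illustrative values
O2CB_ENABLED=true
O2CB_BOOTCLUSTER=ocfs2
# Disk heartbeat threshold: 31 -> self-fence after ~60 s without disk I/O
O2CB_HEARTBEAT_THRESHOLD=31
# Network timeouts, in milliseconds
O2CB_IDLE_TIMEOUT_MS=30000
O2CB_KEEPALIVE_DELAY_MS=2000
O2CB_RECONNECT_DELAY_MS=2000
```

Raising these buys the cluster manager time to decide before OCFS2 does.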
I think I managed to give my cluster manager enough time to decide without
having to increase the OCFS2 timeouts.
But my problem is not there.
It's _HOW_ to cancel OCFS2 self-fencing on node1 once I've worked out that
node0 has to be fenced and not node1.
I tried this:
node0 and node1 are OK, in the OCFS2 cluster, the shared disk is mounted, all
is fine.
I guess both of them are writing their timestamps every two seconds to their
blocks in the "heartbeat system file", as mentioned in the FAQ.
But what/where is this "heartbeat system file", by the way?
When I unplug node0's network link, both nodes report they lost network
communication with the peer.
Within the first five seconds, my cluster manager works out that node0 is off
the network and node1 is OK. So the decision is taken to fence node0 and cancel
the fencing of node1 (whereas, by the OCFS2 rule of fencing the higher node
numbers and leaving the lowest one alive, node1 would have been fenced).
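That selection rule is simple enough to sketch as a shell function (all names
here are invented for illustration): survivors are the nodes whose backend
link is still up, and among them only the lowest node number is kept,
mirroring the OCFS2 quorum rule mentioned above.

```shell
#!/bin/sh
# Sketch of the survivor-selection rule (function name invented).

pick_survivor() {
    # stdin:  one "node_number up|down" pair per line
    # stdout: the lowest node number whose link is up
    awk '$2 == "up" && (min == "" || $1 + 0 < min + 0) { min = $1 }
         END { if (min != "") print min }'
}

# Example: node0 lost its link, node1 is fine -> node1 survives.
# printf '0 down\n1 up\n' | pick_survivor
```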
So the cluster manager runs "ocfs2_hb_ctl -K -d /dev/drbd0", which stops the
heartbeat on node1.
But this does not prevent node1 from self-fencing 28 seconds after the network
communication is lost, while node0 stays alive with its dead network card. My
entire cluster is down: no service and no data access left.
Logical, in hindsight: the heartbeat was stopped, but the timers kept counting
down and nothing reset them.
Sunil Mushran <[EMAIL PROTECTED]> wrote: Each of those pings will require a
timeout - short timeouts. So short that you may not even be able to
distinguish between errors and an overloaded run queue, transmit queue,
router, etc.
Once more, I think I achieved that. My problem is to cancel node1's
self-fencing, not to decide to do so.
I'm sorry to bother you; you might find it trivial, but I probably missed
something.
You wrote "one does not have to have 3 nodes when one only wants 2 nodes".
Great, this is fine for me, as I don't (and can't) have a SAN, and DRBD allows
at most 2 nodes for disk sharing.
I also read that fencing all nodes but the lowest one is the intended behavior
of OCFS2.
So I rephrase my question:
How can I make a 2-node cluster work with high availability, i.e. still have
access to the remaining node in the event of _ANY_ single-node failure? The
cluster will be degraded, with only one node remaining until we repair and
power up the failed node, but with no loss of service.
Even if node0 fails, node1 should keep running the tasks rather than
self-fencing.
Once more thanks a lot for your help.
Have a good day,
best regards,
Laurent.
Laurent Neiger
Administrateur Systèmes & Réseaux
CNRS Grenoble - Centre Réseau & Informatique Commun
B.P. 166, 25, avenue des Martyrs, 38042 Grenoble, France
tel (work): (0033) (0)4 76 88 79 91 - fax: (0033) (0)4 76 88 12 95
Certificats : http://igc.services.cnrs.fr/Doc/General/trust.html
http://cric.grenoble.cnrs.fr
_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users