Hi,

I'm still testing a two-node setup in which each machine holds the
primary for one drbd partition and the secondary for the other
machine's partition. I'm running drbd8 with Heartbeat 2.1.3 + CRM.
Failover works fine if one of the machines fails completely (holding
in the power button until it shuts off, or stopping Heartbeat).
I'm running into a problem when I just pull the network cable on one
machine and then plug it in again after a while. Obviously nothing has
failed as far as Heartbeat is concerned, so the "failed" node takes
back its primary drbd partition and causes a split brain when I plug
the cable back in. Would I need to add something like a pingd
primitive and base the promotion of a drbd partition on the result
from pingd?
Or would STONITH be what I need to look at? I've not looked at STONITH
at all yet.

If/when I use this in a production environment, the machines are going
to be administered remotely, so they need to resolve this kind of
thing from rules rather than through intervention from me.

I have one other stupid question. Once I've brought a failed node back
up and checked that it's OK, how do I switch the primary partition
back to the recovered node? Should I make the change through drbdadm,
or with crm_resource and similar programs? Like I said, the machines
would be administered remotely, so I can't use a GUI. I need to be
able to do it all from the command line.

Thanks
Guy

-- 
Don't just do something...sit there!
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
