On Sun, Oct 5, 2008 at 3:54 AM, Daniel Asplund <[EMAIL PROTECTED]> wrote:
> > Hey all:
> >
> > It seems like my question is related to ha, drbd and xen. Hence
> > posting to all of them at once.
> >
> > I have two nodes set up with xen 3.0.3, drbd82, heartbeat 2 under
> > centos 5.2. As I was testing this cluster for high availability, I
> > noticed some issues:
> >
> > 1) domA is running under node1. When I manually shut down node1,
> > sometimes it is migrated automatically to node2 and sometimes it is
> > restarted in node2. Why is this happening?
> >
> > 2) domA is running under node1. When I pull off the network cable,
> > domA is restarted in node2 with no problem. But when node1 comes
> > back, domA is not migrated to node1 and if I do 'xm list' under
> > node1, I see "migrating-domain". This is complicating everything.
>
> 1) Most likely live migration fails for some reason and therefore the
> domA is restarted in node2. Could be a timer issue or a problem with
> release of resources. You should be able to see something from the
> logs during shutdown on node1.
>
> 2) heartbeat on node1 will sense an error and try to migrate domA to
> node2 when node1 is up again. But the node2 has already started domA
> and you basically have domA running on both nodes. To avoid split
> situations like this you should really use a STONITH device that can
> reboot the other node, a hardware device connected via serial cable is
> most secure, but a cheaper alternative is to use soft stonith device
> that can reboot the other node via SSH or telnet. You probably need to
> tweak heartbeat as well to allow it to do further checks, for example
> test connectivity to your gateway.

Yes, it seems I need STONITH. At least for now I want to use stonith
ssh for testing purposes. One thing I am confused about is how to
configure STONITH and what the typical practice is. In the above
scenario, should node1 be rebooted, or node2? What I did is: under
node1, I added "stonith_host * ssh node2" to ha.cf, and under node2,
"stonith_host * ssh node1". But this is not working.
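
For reference, the two entries described above would look like this.
Only the stonith_host lines are shown and the rest of ha.cf is left
out; node1 and node2 are the node names used in this thread.

```
# ha.cf on node1: meant to let this node fence node2 over ssh
stonith_host * ssh node2

# ha.cf on node2: meant to let this node fence node1 over ssh
stonith_host * ssh node1
```
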
Is that the way to configure STONITH? I have checked linux-ha.org and
Google, but this confusion persists. What I want is: if there is a
network outage on node1, it should be automatically rebooted or shut
down, migrating all domUs to node2.

> Do you have two NICs in both nodes or are you running DRBD, HA and
> data traffic over same NIC?

Daniel,
Yes, I have 2 NICs in both nodes.

> Regards, Daniel
> http://www.asplund.nu/xencluster.html

Thanks
Paras.
