Re: [Linux-HA] what to do on loss of network

Steve Wray Thu, 24 Jan 2008 12:04:02 -0800

Forgive top posting but I just noted this in some documentation:

"Provided both HA nodes can communicate with each other, ipfail can reliably detect when one of their network links has become unusable, and compensate."

In the example which I give this is not the case; the loss of connectivity is complete. The nodes cannot communicate with one another.

One of the nodes can still contact its 'ping' node but not the other node in the cluster. It is still on the network and can still provide NFS service.

The other node cannot contact its 'ping' node and also cannot contact the other node in the cluster. It is not on the network at all. It has a dead network connection.

I need the node with *zero* connectivity to *not* take over as the active node, as this makes no sense at all: it's not on the network, so it is pointless bringing up NFS. It should just sit and wait for connectivity to be restored and do nothing but monitor the state of its network connection.
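
To make the behaviour I am after concrete, here is a rough sketch of the decision as a small Python function. This only illustrates the policy I want; it is not how heartbeat or ipfail is implemented, and the names (Action, desired_action) are made up for the example.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes, named only for this illustration."""
    DEFER_TO_IPFAIL = "nodes can talk; let ipfail compare link state"
    SERVE = "peer unreachable but still on the network; provide NFS"
    STAND_BY = "no connectivity at all; do not take over, just wait"


def desired_action(peer_reachable: bool, ping_node_reachable: bool) -> Action:
    """Return what a node should do given what it can currently reach."""
    if peer_reachable:
        # Both HA nodes can communicate, which is the case the
        # documentation quoted above says ipfail handles.
        return Action.DEFER_TO_IPFAIL
    if ping_node_reachable:
        # Cut off from the peer, but the 'ping' node still answers,
        # so this node is on the network and can serve clients.
        return Action.SERVE
    # Neither the peer nor the 'ping' node is reachable: the network
    # connection is dead, so bringing up NFS would be pointless.
    return Action.STAND_BY


if __name__ == "__main__":
    for peer in (True, False):
        for ping in (True, False):
            print(peer, ping, desired_action(peer, ping).name)
```

The last case (peer unreachable and 'ping' node unreachable) is the one where my setup currently goes primary, and where I want it to stand by instead.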



Steve Wray wrote:

Dejan Muhamedagic wrote:
Hi,

On Thu, Jan 24, 2008 at 09:39:05AM +1300, Steve Wray wrote:
Well I posted my config and I've tried various things and tested this setup... and it still behaves incorrectly: going primary in the event of a complete loss of network connectivity.
I mean... it's an NFS server... *network* filesystem. If it can't connect to the network *at* *all* it makes no sense to become the primary NFS server...
I'd really appreciate some comment on what may be wrong in the config files that I've posted. If there's any further info that I need to post please mention it.
Did you check if ipfail is running? If not, then you have to
check the user in the respawn line. Otherwise, please post the
logs.
Thanks for your reply!

ipfail is running, the user in the respawn line is correct.
I just ran a test failure of the network interface in the non-primary node. Here are the logs from this test run, from the 'failed' node only.
ipfail determines that "We are dead" and then heartbeat decides to take over as primary.
Could this be a problem with "/etc/ha.d/rc.d/status status"?


------------------------------------------------------------------------

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


