[Linux-HA] Primary node restart on Standby node link failure?

Michael Moon Wed, 07 Sep 2011 08:55:00 -0700

I searched high and low and couldn't find an answer to this issue. We are running 
heartbeat 2.1.4 in a two-node configuration. If I stop both Ethernet 
interfaces on the Standby node, the application on the Primary node 
restarts. Does anyone know why this would be happening? I would think there would be 
a setting for what to do upon a failure (i.e., restart, nothing, etc.), but I 
can't find it for this version of heartbeat.


Thanks in advance.


The ha.cf looks like this:

logfile         /var/log/ha-log
logfacility     local0
ucast           eth0 10.xxx.xx.16
ucast           eth0 10.xx.xx.17
ucast           eth1 169.254.10.1
ucast           eth1 169.254.10.2
auto_failback   off
node            customer-site-node1
node            customer-site-node2
deadtime        60
keepalive       1
warntime        5


Here is the ha-log:

heartbeat[13008]: 2011/09/06_17:41:30 info: Link customer-site-node2:eth0 dead.
heartbeat[13008]: 2011/09/06_17:41:31 WARN: node customer-site-node2: is dead
heartbeat[13008]: 2011/09/06_17:41:31 WARN: No STONITH device configured.
heartbeat[13008]: 2011/09/06_17:41:31 WARN: Shared disks are not protected.
heartbeat[13008]: 2011/09/06_17:41:31 info: Resources being acquired from customer-site-node2.
heartbeat[13008]: 2011/09/06_17:41:31 info: Link customer-site-node2:eth1 dead.
harc[20623]:    2011/09/06_17:41:31 info: Running /etc/ha.d/rc.d/status status
mach_down[20653]:       2011/09/06_17:41:31 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[20653]:       2011/09/06_17:41:31 info: mach_down takeover complete for node customer-site-node2.
heartbeat[13008]: 2011/09/06_17:41:31 info: mach_down takeover complete.
IPaddr[20697]:  2011/09/06_17:41:31 INFO:  Running OK
heartbeat[20624]: 2011/09/06_17:41:31 info: Local Resource acquisition completed.
heartbeat[13008]: 2011/09/06_17:41:47 CRIT: Cluster node customer-site-node2 returning after partition.
heartbeat[13008]: 2011/09/06_17:41:47 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[13008]: 2011/09/06_17:41:47 WARN: Deadtime value may be too small.
heartbeat[13008]: 2011/09/06_17:41:47 info: See FAQ for information on tuning deadtime.
heartbeat[13008]: 2011/09/06_17:41:47 info: URL: http://linux-ha.org/FAQ#heavy_load
heartbeat[13008]: 2011/09/06_17:41:47 info: Link customer-site-node2:eth0 up.
heartbeat[13008]: 2011/09/06_17:41:47 WARN: Late heartbeat: Node customer-site-node2: interval 76500 ms
heartbeat[13008]: 2011/09/06_17:41:47 info: Status update for node customer-site-node2: status active
harc[20765]:    2011/09/06_17:41:47 info: Running /etc/ha.d/rc.d/status status
heartbeat[13008]: 2011/09/06_17:41:49 info: Link customer-site-node2:eth1 up.
heartbeat[13008]: 2011/09/06_17:41:49 WARN: Shutdown delayed until current resource activity finishes.
heartbeat[13008]: 2011/09/06_17:41:50 info: Heartbeat shutdown in progress. (13008)
heartbeat[13008]: 2011/09/06_17:41:50 info: Received shutdown notice from 'customer-site-node2'.
heartbeat[13008]: 2011/09/06_17:41:50 info: Resource takeover cancelled - shutdown in progress.
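
To make the timing in the log above easier to read against the ha.cf values, here is a minimal Python sketch. It is not part of heartbeat; the helper names parse_timings and late_heartbeats are made up for illustration, and it assumes the timing directives are plain numbers of seconds, as they are in the ha.cf shown. It only compares the "Late heartbeat" interval with deadtime and does not attempt to explain the restart itself.

```python
import re

# Timing directives copied from the ha.cf in this post (plain seconds assumed).
HA_CF = """\
deadtime        60
keepalive       1
warntime        5
"""

# Two lines copied from the ha-log in this post, each written on a single line.
HA_LOG = """\
heartbeat[13008]: 2011/09/06_17:41:31 WARN: node customer-site-node2: is dead
heartbeat[13008]: 2011/09/06_17:41:47 WARN: Late heartbeat: Node customer-site-node2: interval 76500 ms
"""


def parse_timings(text):
    """Return {directive: seconds} for the timing directives in an ha.cf snippet."""
    timings = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("deadtime", "keepalive", "warntime"):
            timings[parts[0]] = float(parts[1])
    return timings


def late_heartbeats(text):
    """Return [(node, interval_seconds)] for every 'Late heartbeat' log line."""
    pattern = re.compile(r"Late heartbeat: Node (\S+): interval (\d+) ms")
    return [(m.group(1), int(m.group(2)) / 1000.0) for m in pattern.finditer(text)]


timings = parse_timings(HA_CF)
for node, gap in late_heartbeats(HA_LOG):
    exceeded = gap > timings["deadtime"]
    print(f"{node}: silent for {gap:.1f}s, deadtime {timings['deadtime']:.0f}s, exceeded={exceeded}")
```

With the values above, the reported interval (76500 ms) is longer than the configured deadtime of 60 seconds, which matches the "node customer-site-node2: is dead" warning in the log.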
