Hello, I am now running Heartbeat 2.1.1, compiled into RPMs from tip.tar.gz, on RHEL 4 WS. I have an issue where, when I bring up the second host, it decides the first is dead and takes over the common IP address. Does anyone have an idea what is causing this? I increased the multicast heartbeat setting to 2 to see if that would resolve the issue, but there was no change. Both machines have their firewalls disabled for testing.
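One thing worth checking before digging further into heartbeat itself is whether multicast actually flows between the two hosts: node002 declaring "node node001: is dead" right at startup, while node001's message-hist queue fills up, is consistent with node002 never receiving node001's multicast heartbeats. Below is a minimal Python sketch of such a probe — not part of heartbeat, just a standalone check. Port 9694 is a hypothetical stand-in (heartbeat's real udpport 694 needs root); in practice you would run the receiver half on one node and the sender half on the other, in both directions.

```python
import socket
import struct

GROUP = "225.0.0.1"   # multicast group from the ha.cf mcast directive
PORT = 9694           # stand-in for udpport 694, which requires root to bind

def make_receiver(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join the multicast group, as heartbeat would."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    rx.bind(("", port))
    # ip_mreq: group address + INADDR_ANY (let the kernel pick the interface)
    mreq = struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))
    rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return rx

def send_probe(group: str, port: int, payload: bytes, ttl: int = 2) -> None:
    """Send one multicast datagram; ttl=2 mirrors the ha.cf mcast line."""
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    tx.sendto(payload, (group, port))
    tx.close()

if __name__ == "__main__":
    # Single-host smoke test (loopback). For the real check, run
    # make_receiver() on node002 and send_probe() from node001, and vice versa.
    rx = make_receiver(GROUP, PORT)
    rx.settimeout(5)
    send_probe(GROUP, PORT, b"hb-probe")
    data, addr = rx.recvfrom(1024)
    print("received %r from %s" % (data, addr[0]))
    rx.close()
```

If the probe gets through in both directions but heartbeat still declares the peer dead, the problem is more likely in the heartbeat layer; if it times out, look at switch IGMP snooping or interface multicast settings.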
I'm not too sure where to get more debug information; any suggestions? Here is the relevant information:

ha.cf:

    use_logd yes
    keepalive 1
    deadtime 60
    warntime 20
    initdead 120
    udpport 694
    mcast eth0 225.0.0.1 694 2 0
    auto_failback off
    node node001
    node node002
    ping 10.1.1.1
    respawn hacluster /usr/lib/heartbeat/ipfail
    crm no

haresources:

    node001 IPaddr2::10.1.1.9/24/eth0/10.1.1.255

------------------------------------------------------------------
Node #1

heartbeat: [1766]: info: mach_down takeover complete.
IPaddr2[1843]: [1911]: INFO: Running OK
heartbeat: [1781]: info: Local Resource acquisition completed.
heartbeat: [1766]: info: Local Resource acquisition completed. (none)
heartbeat: [1766]: info: local resource transition completed.
heartbeat: [1766]: info: Link node002:eth0 up.
heartbeat: [1766]: info: Status update for node node002: status init
heartbeat: [1766]: info: Status update for node node002: status up
ipfail: [1779]: info: Link Status update: Link node002/eth0 now has status up
ipfail: [1779]: info: Status update: Node node002 now has status init
ipfail: [1779]: info: Status update: Node node002 now has status up
harc[1914]: [1921]: info: Running /etc/ha.d/rc.d/status status
harc[1927]: [1933]: info: Running /etc/ha.d/rc.d/status status
heartbeat: [1766]: info: all clients are now paused
heartbeat: [1766]: ERROR: Message hist queue is filling up (151 messages in queue)
heartbeat: [1766]: ERROR: Message hist queue is filling up (152 messages in queue)
heartbeat: [1766]: ERROR: Message hist queue is filling up (153 messages in queue)
heartbeat: [1766]: ERROR: Message hist queue is filling up (154 messages in queue)

--------
Node #2

heartbeat: [2588]: WARN: node node001: is dead
heartbeat: [2588]: info: Comm_now_up(): updating status to active
heartbeat: [2588]: info: Local status now set to: 'active'
heartbeat: [2588]: info: Starting child client "/usr/lib/heartbeat/ipfail" (90,90)
heartbeat: [2588]: WARN: No STONITH device configured.
heartbeat: [2588]: WARN: Shared disks are not protected.
heartbeat: [2588]: info: Resources being acquired from node001.
heartbeat: [2597]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 90 gid 90 (pid 2597)
harc[2598]: [2611]: info: Running /etc/ha.d/rc.d/status status
heartbeat: [2599]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys node002] to acquire.
heartbeat: [2588]: info: Initial resource acquisition complete (T_RESOURCES(us))
mach_down[2623]: [2644]: info: Taking over resource group IPaddr2::10.1.1.9/24/eth0/10.1.1.255
ResourceManager[2645]: [2656]: info: Acquiring resource group: node001 IPaddr2::10.1.1.9/24/eth0/10.1.1.255
IPaddr2[2668]: [2725]: INFO: Resource is stopped
ResourceManager[2645]: [2739]: info: Running /etc/ha.d/resource.d/IPaddr2 10.1.1.9/24/eth0/10.1.1.255 start
IPaddr2[2770]: [2805]: INFO: ip -f inet addr add 10.1.1.9/24 brd 10.1.1.255 dev eth0
IPaddr2[2770]: [2807]: INFO: ip link set eth0 up
IPaddr2[2770]: [2809]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.9 eth0 10.1.1.9 auto not_used not_used
IPaddr2[2741]: [2813]: INFO: Success
mach_down[2623]: [2815]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[2623]: [2819]: info: mach_down takeover complete for node node001.
heartbeat: [2588]: info: mach_down takeover complete.
heartbeat: [2588]: info: Local Resource acquisition completed. (none)
heartbeat: [2588]: info: local resource transition completed.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
