Hello, I am now running Heartbeat 2.1.1, compiled into RPMs from tip.tar.gz, on RHEL 4 WS. I have an issue where, when I bring up the second host, it decides the first is dead and takes over the common IP address. Does anyone have an idea what is causing this? I increased the multicast heartbeat setting to 2 to see if that would resolve the issue, but there was no change. Both machines have their firewalls disabled for testing.
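One thing worth checking before digging further into heartbeat itself is whether multicast actually flows between the two hosts: node002 declaring "node node001: is dead" right at startup, while node001's message-hist queue fills up, is consistent with node002 never receiving node001's multicast heartbeats. Below is a minimal Python sketch of such a probe — not part of heartbeat, just a standalone check. Port 9694 is a hypothetical stand-in (heartbeat's real udpport 694 needs root); in practice you would run the receiver half on one node and the sender half on the other, in both directions.

```python
import socket
import struct

GROUP = "225.0.0.1"   # multicast group from the ha.cf mcast directive
PORT = 9694           # stand-in for udpport 694, which requires root to bind

def make_receiver(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join the multicast group, as heartbeat would."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    rx.bind(("", port))
    # ip_mreq: group address + INADDR_ANY (let the kernel pick the interface)
    mreq = struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))
    rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return rx

def send_probe(group: str, port: int, payload: bytes, ttl: int = 2) -> None:
    """Send one multicast datagram; ttl=2 mirrors the ha.cf mcast line."""
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    tx.sendto(payload, (group, port))
    tx.close()

if __name__ == "__main__":
    # Single-host smoke test (loopback). For the real check, run
    # make_receiver() on node002 and send_probe() from node001, and vice versa.
    rx = make_receiver(GROUP, PORT)
    rx.settimeout(5)
    send_probe(GROUP, PORT, b"hb-probe")
    data, addr = rx.recvfrom(1024)
    print("received %r from %s" % (data, addr[0]))
    rx.close()
```

If the probe gets through in both directions but heartbeat still declares the peer dead, the problem is more likely in the heartbeat layer; if it times out, look at switch IGMP snooping or interface multicast settings.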
I'm not too sure where to get more debug information; any suggestions? Here is the relevant information:

ha.cf:

    use_logd yes
    keepalive 1
    deadtime 60
    warntime 20
    initdead 120
    udpport 694
    mcast eth0 225.0.0.1 694 2 0
    auto_failback off
    node node001
    node node002
    ping 10.1.1.1
    respawn hacluster /usr/lib/heartbeat/ipfail
    crm no

haresources:

    node001 IPaddr2::10.1.1.9/24/eth0/10.1.1.255

------------------------------------------------------------------
Node #1

heartbeat: [1766]: info: mach_down takeover complete.
IPaddr2[1843]: [1911]: INFO: Running OK
heartbeat: [1781]: info: Local Resource acquisition completed.
heartbeat: [1766]: info: Local Resource acquisition completed. (none)
heartbeat: [1766]: info: local resource transition completed.
heartbeat: [1766]: info: Link node002:eth0 up.
heartbeat: [1766]: info: Status update for node node002: status init
heartbeat: [1766]: info: Status update for node node002: status up
ipfail: [1779]: info: Link Status update: Link node002/eth0 now has status up
ipfail: [1779]: info: Status update: Node node002 now has status init
ipfail: [1779]: info: Status update: Node node002 now has status up
harc[1914]: [1921]: info: Running /etc/ha.d/rc.d/status status
harc[1927]: [1933]: info: Running /etc/ha.d/rc.d/status status
heartbeat: [1766]: info: all clients are now paused
heartbeat: [1766]: ERROR: Message hist queue is filling up (151 messages in queue)
heartbeat: [1766]: ERROR: Message hist queue is filling up (152 messages in queue)
heartbeat: [1766]: ERROR: Message hist queue is filling up (153 messages in queue)
heartbeat: [1766]: ERROR: Message hist queue is filling up (154 messages in queue)

--------
Node #2

heartbeat: [2588]: WARN: node node001: is dead
heartbeat: [2588]: info: Comm_now_up(): updating status to active
heartbeat: [2588]: info: Local status now set to: 'active'
heartbeat: [2588]: info: Starting child client "/usr/lib/heartbeat/ipfail" (90,90)
heartbeat: [2588]: WARN: No STONITH device configured.
heartbeat: [2588]: WARN: Shared disks are not protected.
heartbeat: [2588]: info: Resources being acquired from node001.
heartbeat: [2597]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 90 gid 90 (pid 2597)
harc[2598]: [2611]: info: Running /etc/ha.d/rc.d/status status
heartbeat: [2599]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys node002] to acquire.
heartbeat: [2588]: info: Initial resource acquisition complete (T_RESOURCES(us))
mach_down[2623]: [2644]: info: Taking over resource group IPaddr2::10.1.1.9/24/eth0/10.1.1.255
ResourceManager[2645]: [2656]: info: Acquiring resource group: node001 IPaddr2::10.1.1.9/24/eth0/10.1.1.255
IPaddr2[2668]: [2725]: INFO: Resource is stopped
ResourceManager[2645]: [2739]: info: Running /etc/ha.d/resource.d/IPaddr2 10.1.1.9/24/eth0/10.1.1.255 start
IPaddr2[2770]: [2805]: INFO: ip -f inet addr add 10.1.1.9/24 brd 10.1.1.255 dev eth0
IPaddr2[2770]: [2807]: INFO: ip link set eth0 up
IPaddr2[2770]: [2809]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.9 eth0 10.1.1.9 auto not_used not_used
IPaddr2[2741]: [2813]: INFO: Success
mach_down[2623]: [2815]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[2623]: [2819]: info: mach_down takeover complete for node node001.
heartbeat: [2588]: info: mach_down takeover complete.
heartbeat: [2588]: info: Local Resource acquisition completed. (none)
heartbeat: [2588]: info: local resource transition completed.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
