Hi everyone,

I'm running into trouble configuring Heartbeat 2.1.3 on CentOS 5.6.
I'm new to Linux high availability in general, so I apologize if the
answer is obvious. Thanks in advance for taking the time to read my
message.

Basically, when I reboot both of my nodes simultaneously, neither one
brings up the IP address (IPaddr2::172.22.4.1/24/eth0). However, as soon
as I shut down one of the nodes while both are up, the other one
takes over and brings up the IP address.

Let me show you my configuration files and some logs:

# [/etc/ha.d/ha.cf] on both nodes:
#
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
auto_failback on
udpport 694
bcast eth0
node PBX02.BRM
node PBX03.BRM
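
For readers who want to check the timing and node settings above, here is a minimal Python sketch that parses the ha.cf text into a dictionary. The helper name `parse_ha_cf` is hypothetical and not part of Heartbeat. It assumes one directive per line, with `keepalive` and `deadtime` given as plain seconds as in the file above.

```python
def parse_ha_cf(text):
    """Parse ha.cf-style text; each directive maps to a list of its values."""
    config = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        config.setdefault(key, []).append(value)
    return config


# The ha.cf contents quoted in this message.
HA_CF = """\
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
auto_failback on
udpport 694
bcast eth0
node PBX02.BRM
node PBX03.BRM
"""

cfg = parse_ha_cf(HA_CF)
keepalive = int(cfg["keepalive"][0])  # seconds between heartbeats
deadtime = int(cfg["deadtime"][0])    # seconds of silence before a peer is declared dead

print("nodes:", cfg["node"])
print("heartbeat intervals per deadtime:", deadtime // keepalive)
```

With these values a peer would be declared dead after about five missed heartbeats, and the node names come out exactly as written in the file, in uppercase.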

# [/etc/ha.d/haresources] on both nodes:
#
PBX02.BRM IPaddr2::172.22.4.1/24/eth0 asterisk
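
As a reading aid, the haresources line above can be split into its parts: the first field is the preferred node, and the remaining fields are resources, with `::` separating a resource script from its arguments. The sketch below is only an illustration of that layout. The function name `parse_haresources_line` is hypothetical, and the `ip/prefix/interface` split assumes the three-part IPaddr2 argument used here.

```python
def parse_haresources_line(line):
    """Split one haresources line into (preferred_node, [(resource, args), ...])."""
    fields = line.split()
    node, resources = fields[0], fields[1:]
    parsed = []
    for res in resources:
        name, *args = res.split("::")  # script name, then its arguments
        parsed.append((name, args))
    return node, parsed


node, resources = parse_haresources_line(
    "PBX02.BRM IPaddr2::172.22.4.1/24/eth0 asterisk"
)

# Assumption: the IPaddr2 argument has the form address/prefix/interface.
ip, prefix, iface = resources[0][1][0].split("/")

print(node)             # preferred node, exactly as written in the file
print(resources)        # resources in start order
print(ip, prefix, iface)
```

Note that the node name comes back exactly as written in the file (uppercase), whereas the log excerpts in this message show the node names in lowercase.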

# [/var/log/ha-log] PROBLEMATIC SITUATION
# This is what happens from the very beginning of a simultaneous boot
# of my nodes. They won't bring up the IP address.

# SYSTEM A: PBX02.BRM
#
heartbeat[3151]: 2011/07/08_11:34:50 info: Version 2 support: false
heartbeat[3151]: 2011/07/08_11:34:50 WARN: Logging daemon is disabled
--enabling logging daemon is recommended
heartbeat[3151]: 2011/07/08_11:34:50 info: **************************
heartbeat[3151]: 2011/07/08_11:34:50 info: Configuration validated.
Starting heartbeat 2.1.3
heartbeat[3152]: 2011/07/08_11:34:50 info: heartbeat: version 2.1.3
heartbeat[3152]: 2011/07/08_11:34:50 info: Heartbeat generation: 1310070701
heartbeat[3152]: 2011/07/08_11:34:50 info: glib: UDP Broadcast
heartbeat started on port 694 (694) interface eth0
heartbeat[3152]: 2011/07/08_11:34:50 info: glib: UDP Broadcast
heartbeat closed on port 694 interface eth0 - Status: 1
heartbeat[3152]: 2011/07/08_11:34:50 info: G_main_add_TriggerHandler:
Added signal manual handler
heartbeat[3152]: 2011/07/08_11:34:50 info: G_main_add_TriggerHandler:
Added signal manual handler
heartbeat[3152]: 2011/07/08_11:34:50 info: G_main_add_SignalHandler:
Added signal handler for signal 17
heartbeat[3152]: 2011/07/08_11:34:50 info: Local status now set to: 'up'
heartbeat[3152]: 2011/07/08_11:34:51 info: Link pbx02.brm:eth0 up.
heartbeat[3152]: 2011/07/08_11:34:52 info: Link pbx03.brm:eth0 up.
heartbeat[3152]: 2011/07/08_11:34:52 info: Status update for node
pbx03.brm: status active
heartbeat[3152]: 2011/07/08_11:34:52 info: Comm_now_up(): updating
status to active
heartbeat[3152]: 2011/07/08_11:34:52 info: Local status now set to: 'active'
harc[3287]:     2011/07/08_11:34:52 info: Running /etc/ha.d/rc.d/status status
heartbeat[3152]: 2011/07/08_11:35:03 info: local resource transition completed.
heartbeat[3152]: 2011/07/08_11:35:03 info: Initial resource
acquisition complete (T_RESOURCES(us))
heartbeat[3651]: 2011/07/08_11:35:03 info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys pbx02.brm] to acquire.
heartbeat[3152]: 2011/07/08_11:35:04 info: remote resource transition completed.

# SYSTEM B: PBX03.BRM
#
heartbeat[3151]: 2011/07/08_11:34:48 info: Version 2 support: false
heartbeat[3151]: 2011/07/08_11:34:48 WARN: Logging daemon is disabled
--enabling logging daemon is recommended
heartbeat[3151]: 2011/07/08_11:34:48 info: **************************
heartbeat[3151]: 2011/07/08_11:34:48 info: Configuration validated.
Starting heartbeat 2.1.3
heartbeat[3152]: 2011/07/08_11:34:48 info: heartbeat: version 2.1.3
heartbeat[3152]: 2011/07/08_11:34:48 info: Heartbeat generation: 1310070791
heartbeat[3152]: 2011/07/08_11:34:48 info: glib: UDP Broadcast
heartbeat started on port 694 (694) interface eth0
heartbeat[3152]: 2011/07/08_11:34:48 info: glib: UDP Broadcast
heartbeat closed on port 694 interface eth0 - Status: 1
heartbeat[3152]: 2011/07/08_11:34:48 info: G_main_add_TriggerHandler:
Added signal manual handler
heartbeat[3152]: 2011/07/08_11:34:48 info: G_main_add_TriggerHandler:
Added signal manual handler
heartbeat[3152]: 2011/07/08_11:34:48 info: G_main_add_SignalHandler:
Added signal handler for signal 17
heartbeat[3152]: 2011/07/08_11:34:48 info: Local status now set to: 'up'
heartbeat[3152]: 2011/07/08_11:34:49 info: Link pbx03.brm:eth0 up.
heartbeat[3152]: 2011/07/08_11:34:52 info: Link pbx02.brm:eth0 up.
heartbeat[3152]: 2011/07/08_11:34:52 info: Status update for node
pbx02.brm: status up
harc[3416]:     2011/07/08_11:34:52 info: Running /etc/ha.d/rc.d/status status
heartbeat[3152]: 2011/07/08_11:34:52 info: Comm_now_up(): updating
status to active
heartbeat[3152]: 2011/07/08_11:34:52 info: Local status now set to: 'active'
heartbeat[3152]: 2011/07/08_11:34:53 info: Status update for node
pbx02.brm: status active
harc[3435]:     2011/07/08_11:34:53 info: Running /etc/ha.d/rc.d/status status
heartbeat[3152]: 2011/07/08_11:35:04 info: remote resource transition completed.
heartbeat[3152]: 2011/07/08_11:35:04 info: remote resource transition completed.
heartbeat[3152]: 2011/07/08_11:35:04 info: Initial resource
acquisition complete (T_RESOURCES(us))
heartbeat[3688]: 2011/07/08_11:35:04 info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys pbx03.brm] to acquire.
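
Because the two excerpts above are interleaved in time, it can help to merge them into a single timeline. The following is a minimal Python sketch, not a Heartbeat tool. The names `parse_ha_log` and `merged_timeline` are hypothetical, the regular expression is fitted only to the line shapes visible in this message, and lines without a timestamp are treated as mail-wrapped continuations of the previous entry. It also assumes the clocks on both nodes are reasonably in sync.

```python
import re
from datetime import datetime

# Matches lines such as:
#   heartbeat[3152]: 2011/07/08_11:34:52 info: Link pbx03.brm:eth0 up.
ENTRY = re.compile(
    r"^(?P<proc>[\w.]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<ts>\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>\w+):\s*(?P<msg>.*)$"
)


def parse_ha_log(text, host):
    """Return a list of log entries, re-joining wrapped continuation lines."""
    entries = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            entries.append({
                "host": host,
                "time": datetime.strptime(m["ts"], "%Y/%m/%d_%H:%M:%S"),
                "level": m["level"].lower(),
                "msg": m["msg"].strip(),
            })
        elif entries and line.strip() and not line.startswith("#"):
            # No timestamp: assume this is the wrapped tail of the previous entry.
            joined = entries[-1]["msg"] + " " + line.strip()
            entries[-1]["msg"] = joined.strip()
    return entries


def merged_timeline(*logs):
    """Merge several parsed logs into one list ordered by timestamp."""
    return sorted((e for log in logs for e in log), key=lambda e: e["time"])


# A few lines copied from the excerpts above.
LOG_A = """heartbeat[3152]: 2011/07/08_11:34:52 info: Status update for node
pbx03.brm: status active
heartbeat[3651]: 2011/07/08_11:35:03 info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys pbx02.brm] to acquire.
"""
LOG_B = """heartbeat[3152]: 2011/07/08_11:34:48 info: Local status now set to: 'up'
heartbeat[3688]: 2011/07/08_11:35:04 info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys pbx03.brm] to acquire.
"""

timeline = merged_timeline(
    parse_ha_log(LOG_A, "pbx02.brm"), parse_ha_log(LOG_B, "pbx03.brm")
)
for e in timeline:
    print(e["time"].time(), e["host"], e["level"], e["msg"])
```

Read this way, both nodes report "No local resources ... to acquire" within about a second of each other.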

# [/var/log/ha-log] WORKING STATE
# This is what happens if I shut down one of the two nodes. The IP address
# comes up instantly.

# SYSTEM A: PBX02.BRM while PBX03.BRM is shut down (note that it also
# works when I forcibly shut down the other node)
heartbeat[3152]: 2011/07/08_11:45:41 info: Received shutdown notice
from 'pbx03.brm'.
heartbeat[3152]: 2011/07/08_11:45:41 info: Resources being acquired
from pbx03.brm.
heartbeat[4237]: 2011/07/08_11:45:41 info: acquire local HA resources (standby).
heartbeat[4238]: 2011/07/08_11:45:41 info: No local resources
[/usr/share/heartbeat/ResourceManager listkeys pbx02.brm] to acquire.
heartbeat[4237]: 2011/07/08_11:45:41 info: local HA resource
acquisition completed (standby).
heartbeat[3152]: 2011/07/08_11:45:41 info: Standby resource
acquisition done [all].
harc[4263]:     2011/07/08_11:45:41 info: Running /etc/ha.d/rc.d/status status
mach_down[4279]:        2011/07/08_11:45:41 info: Taking over resource
group IPaddr2::172.22.4.1/24/eth0
ResourceManager[4305]:  2011/07/08_11:45:41 info: Acquiring resource
group: pbx03.brm IPaddr2::172.22.4.1/24/eth0 asterisk
IPaddr2[4332]:  2011/07/08_11:45:41 INFO:  Resource is stopped
ResourceManager[4305]:  2011/07/08_11:45:41 info: Running
/etc/ha.d/resource.d/IPaddr2 172.22.4.1/24/eth0 start
IPaddr2[4444]:  2011/07/08_11:45:41 INFO: ip -f inet addr add
172.22.4.1/24 brd 172.22.4.255 dev eth0
IPaddr2[4444]:  2011/07/08_11:45:41 INFO: ip link set eth0 up
IPaddr2[4444]:  2011/07/08_11:45:41 INFO:
/usr/lib64/heartbeat/send_arp -i 200 -r 5 -p
/var/run/heartbeat/rsctmp/send_arp/send_arp-172.22.4.1 eth0 172.22.4.1
auto not_used not_used
IPaddr2[4415]:  2011/07/08_11:45:41 INFO:  Success
mach_down[4279]:        2011/07/08_11:45:41 info:
/usr/share/heartbeat/mach_down: nice_failback: foreign resources
acquired
mach_down[4279]:        2011/07/08_11:45:41 info: mach_down takeover
complete for node pbx03.brm.
heartbeat[3152]: 2011/07/08_11:45:41 info: mach_down takeover complete.

Thank you very much,
--
Gregory A. Lussier