Hi, I am using v2.0.7 but with a v1-style config, that is, without "crm on".
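For anyone who wants to double-check which mode a given ha.cf selects, here is a minimal sketch in Python. It only encodes the rule quoted further down in this thread (auto_failback is ignored once the CRM is enabled). The helper names and the set of values treated as "CRM enabled" are my own assumptions for illustration, not something heartbeat ships.

```python
# Assumption: these values are treated as "CRM enabled". Only "crm on"
# is mentioned in this thread; the others are guesses for illustration.
CRM_ENABLED_VALUES = {"on", "yes", "true", "1", "respawn"}


def parse_ha_cf(text):
    """Map each ha.cf directive name to the list of its argument strings."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        name = parts[0]
        args = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(name, []).append(args)
    return directives


def crm_enabled(directives):
    """True if the last 'crm' directive carries an enabling value."""
    values = directives.get("crm", [])
    return bool(values) and values[-1].lower() in CRM_ENABLED_VALUES


def auto_failback_applies(directives):
    """auto_failback only matters for a v1-style (non-CRM) configuration."""
    return "auto_failback" in directives and not crm_enabled(directives)


# The test configuration from this post.
SAMPLE = """\
logfacility local0
keepalive 2
deadtime 20
warntime 10
bcast eth0
auto_failback no
node atlanta boston
respawn hacluster /usr/lib/heartbeat/ipfail
ping 192.168.15.53
"""

if __name__ == "__main__":
    cfg = parse_ha_cf(SAMPLE)
    print("crm enabled:", crm_enabled(cfg))
    print("auto_failback applies:", auto_failback_applies(cfg))
```

With the config from this post, the sketch reports that the CRM is not enabled, so the auto_failback setting should be the one that counts.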
Here is my very simple test config:

ha.cf:
logfacility local0
keepalive 2
deadtime 20
warntime 10
bcast eth0
auto_failback no
node atlanta boston
respawn hacluster /usr/lib/heartbeat/ipfail
ping 192.168.15.53

haresources:
atlanta 10.180.225.99 httpd

And here is the log output from a failover (sorry for the verbosity):

ATLANTA:
------------------------------------------------------------------------------------
Aug 29 12:44:32 atlanta heartbeat: [21685]: info: Link atlanta:eth0 up. Aug 29 12:44:37 atlanta heartbeat: [21685]: info: Link boston:eth0 up. Aug 29 12:44:37 atlanta heartbeat: [21685]: info: Status update for node boston: status up Aug 29 12:44:37 atlanta harc[21696]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:44:38 atlanta heartbeat: [21685]: info: Comm_now_up(): updating status to active Aug 29 12:44:38 atlanta heartbeat: [21685]: info: Local status now set to: 'active' Aug 29 12:44:38 atlanta heartbeat: [21685]: info: Starting child client "/usr/lib/heartbeat/ipfail" (40002,40002) Aug 29 12:44:38 atlanta heartbeat: [21685]: info: Status update for node boston: status active Aug 29 12:44:38 atlanta heartbeat: [21707]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 40002 gid 40002 (pid 21707) Aug 29 12:44:38 atlanta harc[21708]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:44:46 atlanta ipfail: [21707]: info: Ping node count is balanced. Aug 29 12:44:48 atlanta heartbeat: [21685]: info: local resource transition completed. Aug 29 12:44:48 atlanta heartbeat: [21685]: info: Initial resource acquisition complete (T_RESOURCES(us)) Aug 29 12:44:48 atlanta IPaddr[21747]: INFO: IPaddr Resource is stopped Aug 29 12:44:48 atlanta heartbeat: [21720]: info: Local Resource acquisition completed. 
Aug 29 12:44:48 atlanta harc[21833]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp Aug 29 12:44:48 atlanta ip-request-resp[21833]: received ip-request-resp 10.180.225.99 OK yes Aug 29 12:44:48 atlanta ResourceManager[21848]: info: Acquiring resource group: atlanta 10.180.225.99 httpd Aug 29 12:44:48 atlanta IPaddr[21872]: INFO: IPaddr Resource is stopped Aug 29 12:44:48 atlanta ResourceManager[21848]: info: Running /etc/ha.d/resource.d/IPaddr 10.180.225.99 start Aug 29 12:44:48 atlanta IPaddr[22048]: INFO: eval /sbin/ifconfig eth0:0 10.180.225.99 netmask 255.255.240.0 broadcast 10.180.239.255 Aug 29 12:44:48 atlanta IPaddr[22048]: INFO: Sending Gratuitous Arp for 10.180.225.99 on eth0:0 [eth0] Aug 29 12:44:48 atlanta IPaddr[22048]: INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-10.180.225.99 eth0 10.180.225.99 auto 10.180.225.99 ffffffffffff Aug 29 12:44:48 atlanta IPaddr[21978]: INFO: IPaddr Success Aug 29 12:44:48 atlanta ResourceManager[21848]: info: Running /etc/init.d/httpd start Aug 29 12:44:48 atlanta httpd: httpd startup succeeded Aug 29 12:44:48 atlanta heartbeat: [21685]: info: remote resource transition completed. 
Aug 29 12:49:21 atlanta heartbeat: [21690]: ERROR: write failure on bcast eth0.: No such device Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: glib: Unable to send bcast [-1] packet(len=170): No such device Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG: Dumping message with 12 fields Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[0] : [t=status] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[1] : [st=active] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[2] : [dt=4e20] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[3] : [protocol=1] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[4] : [src=atlanta] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[5] : [(1)srcuuid=0x8a68368(36 27)] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[6] : [seq=aa] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[7] : [hg=e] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[8] : [ts=46d56b53] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[9] : [ld=0.00 0.00 0.00 1/78 23095] Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[10] : [ttl=4] Aug 29 12:49:23 atlanta heartbeat: [21692]: ERROR: glib: Error sending packet: Network is unreachable Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: MSG[11] : [auth=1 2fbd6303] Aug 29 12:49:23 atlanta heartbeat: [21692]: ERROR: write failure on ping 192.168.15.53.: Network is unreachable Aug 29 12:49:23 atlanta heartbeat: [21690]: ERROR: write failure on bcast eth0.: No such device Aug 29 12:49:23 atlanta kernel: bnx2: eth0 NIC Link is Up, 1000 Mbps full duplex Aug 29 12:49:24 atlanta network: Bringing up interface eth0: succeeded Aug 29 12:49:25 atlanta heartbeat: [21685]: info: Link 192.168.15.53:192.168.15.53 up. 
Aug 29 12:49:25 atlanta heartbeat: [21685]: WARN: Late heartbeat: Node 192.168.15.53: interval 65610 ms Aug 29 12:49:25 atlanta heartbeat: [21685]: info: Status update for node 192.168.15.53: status ping Aug 29 12:49:25 atlanta ipfail: [21707]: info: Link Status update: Link 192.168.15.53/192.168.15.53 now has status up Aug 29 12:49:25 atlanta ipfail: [21707]: info: Status update: Node 192.168.15.53 now has status ping Aug 29 12:49:25 atlanta ipfail: [21707]: info: A ping node just came up. Aug 29 12:49:25 atlanta heartbeat: [21685]: CRIT: Cluster node boston returning after partition. Aug 29 12:49:25 atlanta heartbeat: [21685]: info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain Aug 29 12:49:25 atlanta heartbeat: [21685]: WARN: Deadtime value may be too small. Aug 29 12:49:25 atlanta heartbeat: [21685]: info: See FAQ for information on tuning deadtime. Aug 29 12:49:25 atlanta heartbeat: [21685]: info: URL: http://linux-ha.org/FAQ#heavy_load Aug 29 12:49:25 atlanta heartbeat: [21685]: info: Link boston:eth0 up. Aug 29 12:49:25 atlanta heartbeat: [21685]: WARN: Late heartbeat: Node boston: interval 67140 ms Aug 29 12:49:25 atlanta heartbeat: [21685]: info: Status update for node boston: status active Aug 29 12:49:25 atlanta harc[23158]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:49:25 atlanta ipfail: [21707]: info: Asking other side for ping node count. Aug 29 12:49:25 atlanta ipfail: [21707]: info: Link Status update: Link boston/eth0 now has status up Aug 29 12:49:25 atlanta ipfail: [21707]: info: Status update: Node boston now has status active Aug 29 12:49:27 atlanta heartbeat: [21685]: WARN: Shutdown delayed until current resource activity finishes. Aug 29 12:49:28 atlanta heartbeat: [21685]: info: Heartbeat shutdown in progress. (21685) Aug 29 12:49:28 atlanta heartbeat: [21685]: info: Received shutdown notice from 'boston'. 
Aug 29 12:49:28 atlanta heartbeat: [21685]: info: Resource takeover cancelled - shutdown in progress. Aug 29 12:49:28 atlanta ipfail: [21707]: info: No giveup timer to abort. Aug 29 12:49:28 atlanta heartbeat: [23169]: info: Giving up all HA resources. Aug 29 12:49:28 atlanta ResourceManager[23179]: info: Releasing resource group: atlanta 10.180.225.99 httpd Aug 29 12:49:28 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:28 atlanta httpd: httpd shutdown failed Aug 29 12:49:28 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:29 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:29 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:29 atlanta httpd: httpd shutdown failed Aug 29 12:49:29 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:30 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:30 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:30 atlanta httpd: httpd shutdown failed Aug 29 12:49:30 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:31 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:31 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:31 atlanta httpd: httpd shutdown failed Aug 29 12:49:31 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:32 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:32 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:32 atlanta httpd: httpd shutdown failed Aug 29 12:49:32 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:33 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:33 atlanta 
ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:33 atlanta httpd: httpd shutdown failed Aug 29 12:49:33 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:34 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:34 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:34 atlanta httpd: httpd shutdown failed Aug 29 12:49:34 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:35 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:35 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:35 atlanta httpd: httpd shutdown failed Aug 29 12:49:35 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:36 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:36 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:36 atlanta httpd: httpd shutdown failed Aug 29 12:49:36 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:37 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:37 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:37 atlanta httpd: httpd shutdown failed Aug 29 12:49:37 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:38 atlanta ResourceManager[23179]: info: Retrying failed stop operation [httpd] Aug 29 12:49:38 atlanta ResourceManager[23179]: info: Running /etc/init.d/httpd stop Aug 29 12:49:38 atlanta httpd: httpd shutdown failed Aug 29 12:49:38 atlanta ResourceManager[23179]: ERROR: Return code 1 from /etc/init.d/httpd Aug 29 12:49:38 atlanta ResourceManager[23179]: ERROR: Resource script for httpd probably not LSB-compliant. 
Aug 29 12:49:38 atlanta ResourceManager[23179]: WARN: it (httpd) MUST succeed on a stop when already stopped Aug 29 12:49:38 atlanta ResourceManager[23179]: WARN: Machine reboot narrowly avoided! Aug 29 12:49:38 atlanta ResourceManager[23179]: info: Running /etc/ha.d/resource.d/IPaddr 10.180.225.99 stop Aug 29 12:49:38 atlanta IPaddr[23666]: INFO: IPaddr Success Aug 29 12:49:38 atlanta heartbeat: [23169]: info: All HA resources relinquished. Aug 29 12:49:39 atlanta heartbeat: [21685]: info: killing /usr/lib/heartbeat/ipfail process group 21707 with signal 15 Aug 29 12:49:41 atlanta heartbeat: [21685]: info: killing HBWRITE process 21692 with signal 15 Aug 29 12:49:41 atlanta heartbeat: [21685]: info: killing HBREAD process 21693 with signal 15 Aug 29 12:49:41 atlanta heartbeat: [21685]: info: killing HBFIFO process 21689 with signal 15 Aug 29 12:49:41 atlanta heartbeat: [21685]: info: killing HBWRITE process 21690 with signal 15 Aug 29 12:49:41 atlanta heartbeat: [21685]: info: killing HBREAD process 21691 with signal 15 Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Core process 21690 exited. 5 remaining Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Core process 21689 exited. 4 remaining Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Core process 21692 exited. 3 remaining Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Core process 21691 exited. 2 remaining Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Core process 21693 exited. 1 remaining Aug 29 12:49:41 atlanta heartbeat: [21685]: info: atlanta Heartbeat shutdown complete. Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Heartbeat restart triggered. Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Restarting heartbeat. Aug 29 12:49:41 atlanta heartbeat: [21685]: info: Performing heartbeat restart exec. 
Aug 29 12:50:02 atlanta heartbeat: [21685]: WARN: Core dumps could be lost if multiple dumps occur Aug 29 12:50:02 atlanta heartbeat: [21685]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability Aug 29 12:50:02 atlanta heartbeat: [21685]: WARN: Logging daemon is disabled --enabling logging daemon is recommended Aug 29 12:50:02 atlanta heartbeat: [21685]: info: ************************** Aug 29 12:50:02 atlanta heartbeat: [21685]: info: Configuration validated. Starting heartbeat 2.0.7 Aug 29 12:50:02 atlanta heartbeat: [23754]: info: heartbeat: version 2.0.7 Aug 29 12:50:02 atlanta heartbeat: [23754]: info: Heartbeat generation: 15 Aug 29 12:50:02 atlanta heartbeat: [23754]: info: G_main_add_TriggerHandler: Added signal manual handler Aug 29 12:50:02 atlanta heartbeat: [23754]: info: G_main_add_TriggerHandler: Added signal manual handler Aug 29 12:50:02 atlanta heartbeat: [23754]: info: Removing /var/run/heartbeat/rsctmp failed, recreating. Aug 29 12:50:02 atlanta heartbeat: [23754]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0 Aug 29 12:50:02 atlanta heartbeat: [23754]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1 Aug 29 12:50:02 atlanta heartbeat: [23754]: info: glib: ping heartbeat started. Aug 29 12:50:02 atlanta heartbeat: [23754]: info: G_main_add_SignalHandler: Added signal handler for signal 17 Aug 29 12:50:02 atlanta heartbeat: [23754]: info: Local status now set to: 'up' Aug 29 12:50:03 atlanta heartbeat: [23754]: info: Link 192.168.15.53:192.168.15.53 up. Aug 29 12:50:03 atlanta heartbeat: [23754]: info: Status update for node 192.168.15.53: status ping Aug 29 12:50:03 atlanta heartbeat: [23754]: info: Link atlanta:eth0 up. Aug 29 12:50:03 atlanta heartbeat: [23754]: info: Link boston:eth0 up. 
Aug 29 12:50:04 atlanta heartbeat: [23754]: info: Status update for node boston: status up Aug 29 12:50:04 atlanta heartbeat: [23754]: info: Comm_now_up(): updating status to active Aug 29 12:50:04 atlanta heartbeat: [23754]: info: Local status now set to: 'active' Aug 29 12:50:04 atlanta heartbeat: [23754]: info: Starting child client "/usr/lib/heartbeat/ipfail" (40002,40002) Aug 29 12:50:04 atlanta heartbeat: [23754]: info: Status update for node boston: status active Aug 29 12:50:04 atlanta heartbeat: [23766]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 40002 gid 40002 (pid 23766) Aug 29 12:50:04 atlanta harc[23763]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:50:04 atlanta harc[23774]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:50:14 atlanta heartbeat: [23754]: info: local resource transition completed. Aug 29 12:50:14 atlanta heartbeat: [23754]: info: Initial resource acquisition complete (T_RESOURCES(us)) Aug 29 12:50:14 atlanta ipfail: [23766]: info: Ping node count is balanced. Aug 29 12:50:14 atlanta IPaddr[23812]: INFO: IPaddr Resource is stopped Aug 29 12:50:14 atlanta heartbeat: [23785]: info: Local Resource acquisition completed. Aug 29 12:50:14 atlanta heartbeat: [23754]: info: remote resource transition completed. 
Aug 29 12:50:14 atlanta harc[23898]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp Aug 29 12:50:14 atlanta ip-request-resp[23898]: received ip-request-resp 10.180.225.99 OK yes Aug 29 12:50:14 atlanta ResourceManager[23913]: info: Acquiring resource group: atlanta 10.180.225.99 httpd Aug 29 12:50:14 atlanta IPaddr[23937]: INFO: IPaddr Resource is stopped Aug 29 12:50:14 atlanta ResourceManager[23913]: info: Running /etc/ha.d/resource.d/IPaddr 10.180.225.99 start Aug 29 12:50:14 atlanta IPaddr[24113]: INFO: eval /sbin/ifconfig eth0:0 10.180.225.99 netmask 255.255.240.0 broadcast 10.180.239.255 Aug 29 12:50:14 atlanta IPaddr[24113]: INFO: Sending Gratuitous Arp for 10.180.225.99 on eth0:0 [eth0] Aug 29 12:50:14 atlanta IPaddr[24113]: INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-10.180.225.99 eth0 10.180.225.99 auto 10.180.225.99 ffffffffffff Aug 29 12:50:14 atlanta IPaddr[24043]: INFO: IPaddr Success Aug 29 12:50:14 atlanta ResourceManager[23913]: info: Running /etc/init.d/httpd start Aug 29 12:50:14 atlanta httpd: httpd startup succeeded BOSTON: ------------------------------------------ Aug 29 12:44:37 boston harc[30555]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:44:37 boston heartbeat: [30546]: info: Comm_now_up(): updating status to active Aug 29 12:44:37 boston heartbeat: [30546]: info: Local status now set to: 'active' Aug 29 12:44:37 boston heartbeat: [30546]: info: Starting child client "/usr/lib/heartbeat/ipfail" (40002,40002) Aug 29 12:44:37 boston heartbeat: [30566]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 40002 gid 40002 (pid 30566) Aug 29 12:44:38 boston heartbeat: [30546]: info: Status update for node atlanta: status active Aug 29 12:44:38 boston harc[30567]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:44:42 boston ipfail: [30566]: info: Status update: Node atlanta now has status active Aug 29 12:44:44 boston ipfail: [30566]: info: Asking other side for ping 
node count. Aug 29 12:44:47 boston ipfail: [30566]: info: No giveup timer to abort. Aug 29 12:44:48 boston heartbeat: [30546]: info: remote resource transition completed. Aug 29 12:44:48 boston heartbeat: [30546]: info: remote resource transition completed. Aug 29 12:44:48 boston heartbeat: [30546]: info: Initial resource acquisition complete (T_RESOURCES(us)) Aug 29 12:44:48 boston heartbeat: [30578]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys boston] to acquire. Aug 29 12:48:39 boston heartbeat: [30546]: WARN: node atlanta: is dead Aug 29 12:48:39 boston ipfail: [30566]: info: Status update: Node atlanta now has status dead Aug 29 12:48:39 boston heartbeat: [30546]: WARN: No STONITH device configured. Aug 29 12:48:39 boston heartbeat: [30546]: WARN: Shared disks are not protected. Aug 29 12:48:39 boston heartbeat: [30546]: info: Resources being acquired from atlanta. Aug 29 12:48:39 boston heartbeat: [30546]: info: Link atlanta:eth0 dead. Aug 29 12:48:39 boston harc[30657]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:48:39 boston heartbeat: [30658]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys boston] to acquire. Aug 29 12:48:39 boston ipfail: [30566]: info: NS: We are still alive! 
Aug 29 12:48:39 boston mach_down[30677]: info: Taking over resource group 10.180.225.99 Aug 29 12:48:39 boston ResourceManager[30697]: info: Acquiring resource group: atlanta 10.180.225.99 httpd Aug 29 12:48:39 boston IPaddr[30721]: INFO: IPaddr Resource is stopped Aug 29 12:48:39 boston ResourceManager[30697]: info: Running /etc/ha.d/resource.d/IPaddr 10.180.225.99 start Aug 29 12:48:39 boston ipfail: [30566]: info: Link Status update: Link atlanta/eth0 now has status dead Aug 29 12:48:39 boston IPaddr[30898]: INFO: eval /sbin/ifconfig eth0:0 10.180.225.99 netmask 255.255.240.0 broadcast 10.180.239.255 Aug 29 12:48:39 boston IPaddr[30898]: INFO: Sending Gratuitous Arp for 10.180.225.99 on eth0:0 [eth0] Aug 29 12:48:39 boston IPaddr[30898]: INFO: /usr/lib/heartbeat/send_arp -i 500 -r 10 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-10.180.225.99 eth0 10.180.225.99 auto 10.180.225.99 ffffffffffff Aug 29 12:48:39 boston IPaddr[30828]: INFO: IPaddr Success Aug 29 12:48:39 boston ResourceManager[30697]: info: Running /etc/init.d/httpd start Aug 29 12:48:40 boston httpd: httpd startup succeeded Aug 29 12:48:40 boston mach_down[30677]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired Aug 29 12:48:40 boston mach_down[30677]: info: mach_down takeover complete for node atlanta. Aug 29 12:48:40 boston heartbeat: [30546]: info: mach_down takeover complete. Aug 29 12:48:40 boston ipfail: [30566]: info: Asking other side for ping node count. Aug 29 12:48:40 boston ipfail: [30566]: info: Checking remote count of ping nodes. Aug 29 12:49:25 boston heartbeat: [30546]: CRIT: Cluster node atlanta returning after partition. Aug 29 12:49:25 boston heartbeat: [30546]: info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain Aug 29 12:49:25 boston heartbeat: [30546]: WARN: Deadtime value may be too small. Aug 29 12:49:25 boston heartbeat: [30546]: info: See FAQ for information on tuning deadtime. 
Aug 29 12:49:25 boston heartbeat: [30546]: info: URL: http://linux-ha.org/FAQ#heavy_load Aug 29 12:49:25 boston heartbeat: [30546]: info: Link atlanta:eth0 up. Aug 29 12:49:25 boston heartbeat: [30546]: WARN: Late heartbeat: Node atlanta: interval 66150 ms Aug 29 12:49:25 boston heartbeat: [30546]: info: Status update for node atlanta: status active Aug 29 12:49:25 boston ipfail: [30566]: info: Link Status update: Link atlanta/eth0 now has status up Aug 29 12:49:25 boston ipfail: [30566]: info: Status update: Node atlanta now has status active Aug 29 12:49:25 boston harc[31071]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:49:27 boston heartbeat: [30546]: info: Heartbeat shutdown in progress. (30546) Aug 29 12:49:27 boston ipfail: [30566]: info: Ping node count is balanced. Aug 29 12:49:27 boston heartbeat: [31081]: info: Giving up all HA resources. Aug 29 12:49:27 boston ResourceManager[31091]: info: Releasing resource group: atlanta 10.180.225.99 httpd Aug 29 12:49:27 boston ResourceManager[31091]: info: Running /etc/init.d/httpd stop Aug 29 12:49:27 boston httpd: httpd shutdown succeeded Aug 29 12:49:27 boston ResourceManager[31091]: info: Running /etc/ha.d/resource.d/IPaddr 10.180.225.99 stop Aug 29 12:49:27 boston IPaddr[31224]: INFO: /sbin/route -n del -host 10.180.225.99 Aug 29 12:49:27 boston IPaddr[31224]: INFO: /sbin/ifconfig eth0:0 10.180.225.99 down Aug 29 12:49:27 boston IPaddr[31224]: INFO: IP Address 10.180.225.99released Aug 29 12:49:27 boston IPaddr[31154]: INFO: IPaddr Success Aug 29 12:49:27 boston heartbeat: [31081]: info: All HA resources relinquished. 
Aug 29 12:49:28 boston heartbeat: [30546]: info: killing /usr/lib/heartbeat/ipfail process group 30566 with signal 15 Aug 29 12:49:30 boston heartbeat: [30546]: info: killing HBFIFO process 30549 with signal 15 Aug 29 12:49:30 boston heartbeat: [30546]: info: killing HBWRITE process 30550 with signal 15 Aug 29 12:49:30 boston heartbeat: [30546]: info: killing HBREAD process 30551 with signal 15 Aug 29 12:49:30 boston heartbeat: [30546]: info: killing HBWRITE process 30552 with signal 15 Aug 29 12:49:30 boston heartbeat: [30546]: info: killing HBREAD process 30553 with signal 15 Aug 29 12:49:30 boston heartbeat: [30546]: info: Core process 30550 exited. 5 remaining Aug 29 12:49:30 boston heartbeat: [30546]: info: Core process 30551 exited. 4 remaining Aug 29 12:49:30 boston heartbeat: [30546]: info: Core process 30552 exited. 3 remaining Aug 29 12:49:30 boston heartbeat: [30546]: info: Core process 30549 exited. 2 remaining Aug 29 12:49:30 boston heartbeat: [30546]: info: Core process 30553 exited. 1 remaining Aug 29 12:49:30 boston heartbeat: [30546]: info: boston Heartbeat shutdown complete. Aug 29 12:49:30 boston heartbeat: [30546]: info: Heartbeat restart triggered. Aug 29 12:49:30 boston heartbeat: [30546]: info: Restarting heartbeat. Aug 29 12:49:30 boston heartbeat: [30546]: info: Performing heartbeat restart exec. Aug 29 12:49:51 boston heartbeat: [30546]: WARN: Core dumps could be lost if multiple dumps occur Aug 29 12:49:51 boston heartbeat: [30546]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability Aug 29 12:49:51 boston heartbeat: [30546]: WARN: Logging daemon is disabled --enabling logging daemon is recommended Aug 29 12:49:51 boston heartbeat: [30546]: info: ************************** Aug 29 12:49:51 boston heartbeat: [30546]: info: Configuration validated. 
Starting heartbeat 2.0.7 Aug 29 12:49:51 boston heartbeat: [31259]: info: heartbeat: version 2.0.7 Aug 29 12:49:51 boston heartbeat: [31259]: info: Heartbeat generation: 14 Aug 29 12:49:51 boston heartbeat: [31259]: info: G_main_add_TriggerHandler: Added signal manual handler Aug 29 12:49:51 boston heartbeat: [31259]: info: G_main_add_TriggerHandler: Added signal manual handler Aug 29 12:49:51 boston heartbeat: [31259]: info: Removing /var/run/heartbeat/rsctmp failed, recreating. Aug 29 12:49:51 boston heartbeat: [31259]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0 Aug 29 12:49:51 boston heartbeat: [31259]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1 Aug 29 12:49:51 boston heartbeat: [31259]: info: glib: ping heartbeat started. Aug 29 12:49:51 boston heartbeat: [31259]: info: G_main_add_SignalHandler: Added signal handler for signal 17 Aug 29 12:49:51 boston heartbeat: [31259]: info: Local status now set to: 'up' Aug 29 12:49:53 boston heartbeat: [31259]: info: Link 192.168.15.53:192.168.15.53 up. Aug 29 12:49:53 boston heartbeat: [31259]: info: Status update for node 192.168.15.53: status ping Aug 29 12:49:53 boston heartbeat: [31259]: info: Link boston:eth0 up. Aug 29 12:50:03 boston heartbeat: [31259]: info: Link atlanta:eth0 up. Aug 29 12:50:03 boston heartbeat: [31259]: info: Status update for node atlanta: status up Aug 29 12:50:03 boston harc[31269]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:50:03 boston heartbeat: [31259]: WARN: 1 lost packet(s) for [atlanta] [3:5] Aug 29 12:50:03 boston heartbeat: [31259]: info: No pkts missing from atlanta! 
Aug 29 12:50:03 boston heartbeat: [31259]: info: Comm_now_up(): updating status to active Aug 29 12:50:03 boston heartbeat: [31259]: info: Local status now set to: 'active' Aug 29 12:50:03 boston heartbeat: [31259]: info: Starting child client "/usr/lib/heartbeat/ipfail" (40002,40002) Aug 29 12:50:03 boston heartbeat: [31280]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 40002 gid 40002 (pid 31280) Aug 29 12:50:04 boston heartbeat: [31259]: info: Status update for node atlanta: status active Aug 29 12:50:04 boston harc[31281]: info: Running /etc/ha.d/rc.d/status status Aug 29 12:50:08 boston ipfail: [31280]: info: Status update: Node atlanta now has status active Aug 29 12:50:11 boston ipfail: [31280]: info: Asking other side for ping node count. Aug 29 12:50:14 boston heartbeat: [31259]: info: remote resource transition completed. Aug 29 12:50:14 boston heartbeat: [31259]: info: remote resource transition completed. Aug 29 12:50:14 boston heartbeat: [31259]: info: Initial resource acquisition complete (T_RESOURCES(us)) Aug 29 12:50:14 boston ipfail: [31280]: info: No giveup timer to abort. Aug 29 12:50:14 boston heartbeat: [31306]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys boston] to acquire.

On 8/29/07, Max Hofer <[EMAIL PROTECTED]> wrote:
>
> According to the homepage:
>
> auto_failback does not have any effect on a Release 2 CRM-style cluster
> (one configured with crm on). For CRM-style clusters, this has been
> replaced with the default_resource_stickiness attribute in the CIB.
>
> so if "crm on" ---> auto_failback = ignored
>
> the values you have to look in are:
> resource_stickiness
> resource_failure_stickiness
>
> if you use heartbeat version 1 (no CIB) configuration set
> auto_failback no
>
> On Tuesday 28 August 2007, Michael Dengler wrote:
> > GRRRRRR.....
> >
> > OK...i have *auto_failback* set to no.
> >
> > and
> >
> > do a "service network stop" on the primary node....fine
> >
> > the standby comes up and starts all the resources...fine
> >
> > then i do a "service network start" on the primary node....
> >
> > heartbeat FAILS BACK! why?
> >
> > I'm very confused...Please help....
> >
> > Mike
> >
> > On 8/10/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> > > On 8/10/07, Mark Eisenblaetter <[EMAIL PROTECTED]> wrote:
> > > > Hi,
> > > >
> > > > you must configure autofailback no in your ha.cf.
> > > >
> > > > so the system will stop to failback on the preferred node.
> > >
> > > the crm/cib.xml equivalent is default_resource_stickiness
> > > _______________________________________________
> > > Linux-HA mailing list
> > > [email protected]
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > See also: http://linux-ha.org/ReportingProblems
>
> --
> Max Hofer
> APUS Software G.m.b.H.
> A-8074 Raaba, Bahnhofstraße 1/1
> T| +43 316 401629 11
> F| +43 316 401629 9
> W| www.apus.co.at
> E| [EMAIL PROTECTED]
