David and Dmitri,

Here's one more try and one more set of log files. I now see that heartbeat itself is shutting down, which did not happen before.
Some interesting lines I saw:

Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown notice from 'pfs-srv3'.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Received shutdown notice from 'pfs-srv4'.

Now it seems that each node tells the other to shut down. Here are the logs:

pfs-srv3:

Aug 10 17:46:04 pfs-srv3 logd: [899]: info: logd started with /etc/logd.cf.
Aug 10 17:46:04 pfs-srv3 logd: [899]: WARN: Core dumps could be lost if multiple dumps occur.
Aug 10 17:46:04 pfs-srv3 logd: [899]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
Aug 10 17:46:04 pfs-srv3 logd: [899]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Aug 10 17:46:04 pfs-srv3 logd: [899]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Aug 10 17:46:04 pfs-srv3 logd: [945]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Enabling logging daemon
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Version 2 support: off
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: AUTH: i=1: key = 0x97bcb30, auth=0xb7282034, authname=md5
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: WARN: Core dumps could be lost if multiple dumps occur.
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: **************************
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Configuration validated. Starting heartbeat 3.0.2
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Heartbeat Hg Version: node: ed844d11ea2b603f7d01cce1700d6c1fcb404d29
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: heartbeat: version 3.0.2
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Heartbeat generation: 1279723766
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: glib: UDP Broadcast heartbeat started on port 12694 (12694) interface eth1
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: glib: UDP Broadcast heartbeat closed on port 12694 interface eth1 - Status: 1
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: G_main_add_TriggerHandler: Added signal manual handler
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: G_main_add_TriggerHandler: Added signal manual handler
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Local status now set to: 'up'
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Link pfs-srv3:eth1 up.
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Managed write_hostcachedata process 1201 exited with return code 0.
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Link pfs-srv4:eth1 up.
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Status update for node pfs-srv4: status active
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Managed write_hostcachedata process 1206 exited with return code 0.
Aug 10 17:46:09 pfs-srv3 harc[1205]: [1212]: info: Running /etc/ha.d//rc.d/status status
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Managed status process 1205 exited with return code 0.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Comm_now_up(): updating status to active
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Local status now set to: 'active'
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Managed write_hostcachedata process 1218 exited with return code 0.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Managed write_delcachedata process 1219 exited with return code 0.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: remote resource transition completed.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: STATE 1 => 3
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 3
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: remote resource transition completed.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Local Resource acquisition completed. (none)
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES(them)' (0))
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: STATE 3 => 4
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 3
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: pfs-srv4 wants to go standby [foreign]
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: standby: other_holds_resources: 3
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: New standby state: 2
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: New standby state: 2
Aug 10 17:46:12 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 1
Aug 10 17:46:12 pfs-srv3 heartbeat: [1162]: info: standby: acquire [foreign] resources from pfs-srv4
Aug 10 17:46:12 pfs-srv3 heartbeat: [1162]: info: New standby state: 3
Aug 10 17:46:12 pfs-srv3 heartbeat: [1295]: info: acquire local HA resources (standby).
Aug 10 17:46:12 pfs-srv3 heartbeat: [1295]: info: go_standby: who: 2 resource set: local
Aug 10 17:46:12 pfs-srv3 heartbeat: [1295]: info: go_standby: (query/action): (ourkeys/takegroup)
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1320]: info: Acquiring resource group: pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:46:12 pfs-srv3 Filesystem[1347]: [1386]: INFO: Resource is stopped
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1401]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 start
Aug 10 17:46:12 pfs-srv3 Filesystem[1410]: [1438]: INFO: Running start for /dev/drbd0 on /pfs
Aug 10 17:46:12 pfs-srv3 Filesystem[1404]: [1455]: INFO: Success
Aug 10 17:46:12 pfs-srv3 IPaddr[1469]: [1498]: INFO: Resource is stopped
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1515]: info: Running /etc/ha.d/resource.d/IPaddr 10.1.8.45/24 start
Aug 10 17:46:12 pfs-srv3 IPaddr[1539]: [1564]: INFO: Using calculated nic for 10.1.8.45: eth0
Aug 10 17:46:12 pfs-srv3 IPaddr[1539]: [1570]: INFO: Using calculated netmask for 10.1.8.45: 255.255.255.0
Aug 10 17:46:12 pfs-srv3 IPaddr[1539]: [1594]: INFO: eval ifconfig eth0:0 10.1.8.45 netmask 255.255.255.0 broadcast 10.1.8.255
Aug 10 17:46:12 pfs-srv3 IPaddr[1518]: [1615]: INFO: Success
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1635]: info: Running /etc/init.d/nfs-kernel-server start
Aug 10 17:46:13 pfs-srv3 ResourceManager[1309]: [1680]: info: Running /etc/init.d/smbd start
Aug 10 17:46:13 pfs-srv3 heartbeat: [1295]: info: local HA resource acquisition completed (standby).
Aug 10 17:46:13 pfs-srv3 heartbeat: [1295]: info: FIFO message [type ask_resources] written rc=51
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: Standby resource acquisition done [foreign].
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'auto_failback' (0))
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: Initial resource acquisition complete (auto_failback)
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: New standby state: 0
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: Managed go_standby process 1295 exited with return code 0.
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: remote resource transition completed.
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 1
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 1
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 0
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Received shutdown notice from 'pfs-srv4'.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Resources being acquired from pfs-srv4.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: acquire local HA resources (standby).
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: go_standby: who: 2 resource set: local
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: go_standby: (query/action): (ourkeys/takegroup)
Aug 10 17:49:08 pfs-srv3 ResourceManager[1943]: [1964]: info: Acquiring resource group: pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:49:08 pfs-srv3 heartbeat: [1916]: info: 1 local resources from [/usr/share/heartbeat/ResourceManager listkeys pfs-srv3]
Aug 10 17:49:08 pfs-srv3 heartbeat: [1916]: info: Local Resource acquisition completed.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1916]: info: FIFO message [type resource] written rc=79
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Managed req_our_resources(ask) process 1916 exited with return code 0.
Aug 10 17:49:08 pfs-srv3 Filesystem[2006]: [2045]: INFO: Running OK
Aug 10 17:49:08 pfs-srv3 IPaddr[2057]: [2086]: INFO: Running OK
Aug 10 17:49:08 pfs-srv3 ResourceManager[1943]: [2113]: info: Running /etc/init.d/smbd start
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: local HA resource acquisition completed (standby).
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: FIFO message [type ask_resources] written rc=51
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Standby resource acquisition done [foreign].
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'auto_failback' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: New standby state: 0
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Managed go_standby process 1915 exited with return code 0.
Aug 10 17:49:08 pfs-srv3 harc[2127]: [2136]: info: Running /etc/ha.d//rc.d/status status
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: WARN: Shutdown delayed until current resource activity finishes.
Aug 10 17:49:08 pfs-srv3 mach_down[2149]: [2175]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Aug 10 17:49:08 pfs-srv3 mach_down[2149]: [2180]: info: mach_down takeover complete for node pfs-srv4.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: mach_down takeover complete.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: AnnounceTakeover(local 1, foreign 1, reason 'mach_down' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Managed status process 2127 exited with return code 0.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: hb_giveup_resources(): current status: active
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Heartbeat shutdown in progress. (1162)
Aug 10 17:49:08 pfs-srv3 heartbeat: [2181]: info: Giving up all HA resources.
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2206]: info: Releasing resource group: pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2217]: info: Running /etc/init.d/smbd stop
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2236]: info: Running /etc/init.d/nfs-kernel-server stop
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2263]: info: Running /etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:49:08 pfs-srv3 IPaddr[2287]: [2298]: INFO: ifconfig eth0:0 down
Aug 10 17:49:08 pfs-srv3 IPaddr[2266]: [2302]: INFO: Success
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2319]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:49:08 pfs-srv3 Filesystem[2328]: [2356]: INFO: Running stop for /dev/drbd0 on /pfs
Aug 10 17:49:08 pfs-srv3 Filesystem[2328]: [2371]: INFO: Trying to unmount /pfs
Aug 10 17:49:09 pfs-srv3 Filesystem[2328]: [2379]: INFO: unmounted /pfs successfully
Aug 10 17:49:09 pfs-srv3 Filesystem[2322]: [2386]: INFO: Success
Aug 10 17:49:09 pfs-srv3 ResourceManager[2195]: [2403]: info: Running /etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:49:09 pfs-srv3 heartbeat: [2181]: info: All HA resources relinquished.
Aug 10 17:49:09 pfs-srv3 heartbeat: [2181]: info: FIFO message [type shutdone] written rc=27
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: killing HBFIFO process 1198 with signal 15
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: killing HBWRITE process 1199 with signal 15
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: killing HBREAD process 1200 with signal 15
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: Core process 1198 exited. 3 remaining
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: Core process 1199 exited. 2 remaining
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: Core process 1200 exited. 1 remaining
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: pfs-srv3 Heartbeat shutdown complete.
Aug 10 17:49:11 pfs-srv3 logd: [945]: info: logd_term_write_action: received SIGTERM
Aug 10 17:49:11 pfs-srv3 logd: [945]: info: Exiting write process

pfs-srv4:

Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Link pfs-srv3:eth1 up.
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Status update for node pfs-srv3: status init
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Status update for node pfs-srv3: status up
Aug 10 17:46:09 pfs-srv4 harc[1916]: [1922]: info: Running /etc/ha.d//rc.d/status status
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Managed status process 1916 exited with return code 0.
Aug 10 17:46:09 pfs-srv4 harc[1927]: [1933]: info: Running /etc/ha.d//rc.d/status status
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Managed status process 1927 exited with return code 0.
Aug 10 17:46:10 pfs-srv4 heartbeat: [1276]: info: Status update for node pfs-srv3: status active
Aug 10 17:46:10 pfs-srv4 heartbeat: [1276]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:10 pfs-srv4 harc[1938]: [1944]: info: Running /etc/ha.d//rc.d/status status
Aug 10 17:46:10 pfs-srv4 heartbeat: [1276]: info: Managed status process 1938 exited with return code 0.
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 0
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: remote resource transition completed.
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: pfs-srv4 wants to go standby [foreign]
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: i_hold_resources: 3
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: New standby state: 1
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 0
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: standby: pfs-srv3 can take our foreign resources
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: New standby state: 1
Aug 10 17:46:11 pfs-srv4 heartbeat: [1949]: info: give up foreign HA resources (standby).
Aug 10 17:46:11 pfs-srv4 heartbeat: [1949]: info: go_standby: who: 1 resource set: foreign
Aug 10 17:46:11 pfs-srv4 heartbeat: [1949]: info: go_standby: (query/action): (otherkeys/givegroup)
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [1974]: info: Releasing resource group: pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [1985]: info: Running /etc/init.d/smbd stop
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [2004]: info: Running /etc/init.d/nfs-kernel-server stop
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [2031]: info: Running /etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:46:11 pfs-srv4 IPaddr[2055]: [2066]: INFO: ifconfig eth0:0 down
Aug 10 17:46:11 pfs-srv4 IPaddr[2034]: [2070]: INFO: Success
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [2087]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:46:12 pfs-srv4 Filesystem[2096]: [2124]: INFO: Running stop for /dev/drbd0 on /pfs
Aug 10 17:46:12 pfs-srv4 Filesystem[2096]: [2139]: INFO: Trying to unmount /pfs
Aug 10 17:46:12 pfs-srv4 Filesystem[2096]: [2147]: INFO: unmounted /pfs successfully
Aug 10 17:46:12 pfs-srv4 Filesystem[2090]: [2154]: INFO: Success
Aug 10 17:46:12 pfs-srv4 ResourceManager[1963]: [2171]: info: Running /etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:46:12 pfs-srv4 heartbeat: [1949]: info: foreign HA resource release completed (standby).
Aug 10 17:46:12 pfs-srv4 heartbeat: [1949]: info: FIFO message [type ask_resources] written rc=51
Aug 10 17:46:12 pfs-srv4 heartbeat: [1276]: info: Local standby process completed [foreign].
Aug 10 17:46:12 pfs-srv4 heartbeat: [1276]: info: New standby state: 3
Aug 10 17:46:12 pfs-srv4 heartbeat: [1276]: info: Managed go_standby process 1949 exited with return code 0.
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: WARN: 1 lost packet(s) for [pfs-srv3] [12:14]
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: remote resource transition completed.
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 1
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: No pkts missing from pfs-srv3!
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: Other node completed standby takeover of foreign resources.
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: New standby state: 0
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 1
Aug 10 17:49:07 pfs-srv4 heartbeat: [1276]: info: hb_giveup_resources(): current status: active
Aug 10 17:49:07 pfs-srv4 heartbeat: [1276]: info: Heartbeat shutdown in progress. (1276)
Aug 10 17:49:07 pfs-srv4 heartbeat: [2203]: info: Giving up all HA resources.
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2228]: info: Releasing resource group: pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2239]: info: Running /etc/init.d/smbd stop
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2258]: info: Running /etc/init.d/nfs-kernel-server stop
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2285]: info: Running /etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:49:07 pfs-srv4 IPaddr[2288]: [2317]: INFO: Success
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2334]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:49:08 pfs-srv4 Filesystem[2343]: [2371]: INFO: Running stop for /dev/drbd0 on /pfs
Aug 10 17:49:08 pfs-srv4 Filesystem[2337]: [2383]: INFO: Success
Aug 10 17:49:08 pfs-srv4 ResourceManager[2217]: [2400]: info: Running /etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:49:08 pfs-srv4 heartbeat: [2203]: info: All HA resources relinquished.
Aug 10 17:49:08 pfs-srv4 heartbeat: [2203]: info: FIFO message [type shutdone] written rc=27
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 3
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: WARN: 1 lost packet(s) for [pfs-srv3] [193:195]
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 3
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: No pkts missing from pfs-srv3!
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 3
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 0
Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown notice from 'pfs-srv3'.
Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Resource takeover cancelled - shutdown in progress.
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: killing HBFIFO process 1302 with signal 15
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: killing HBWRITE process 1303 with signal 15
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: killing HBREAD process 1304 with signal 15
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: Core process 1302 exited. 3 remaining
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: Core process 1303 exited. 2 remaining
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: Core process 1304 exited. 1 remaining
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: pfs-srv4 Heartbeat shutdown complete.
Aug 10 17:49:11 pfs-srv4 logd: [994]: info: logd_term_write_action: received SIGTERM
Aug 10 17:49:11 pfs-srv4 logd: [994]: info: Exiting write process
Aug 10 17:51:14 pfs-srv4 logd: [898]: info: logd started with /etc/logd.cf.
Aug 10 17:51:14 pfs-srv4 logd: [898]: WARN: Core dumps could be lost if multiple dumps occur.
Aug 10 17:51:14 pfs-srv4 logd: [898]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability
Aug 10 17:51:14 pfs-srv4 logd: [898]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability
Aug 10 17:51:14 pfs-srv4 logd: [898]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Aug 10 17:51:14 pfs-srv4 logd: [953]: info: G_main_add_SignalHandler: Added signal handler for signal 15
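
To see which node started shutting down first, the two logs can be merged into a single timeline and filtered for shutdown messages. Below is a minimal Python sketch, assuming each log holds one syslog entry per line. The helper names (parse_entry, merge_logs, shutdown_events) and the placeholder year are made up for illustration and are not part of heartbeat.

```python
import re
from datetime import datetime

# Syslog prefix as it appears in the logs above: "Aug 10 17:49:08 pfs-srv3 <rest>"
ENTRY = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) (.*)$")


def parse_entry(line, year=2000):
    """Return (timestamp, host, message), or None if the line does not match.

    Syslog timestamps carry no year, so `year` is an arbitrary placeholder
    (an assumption); it only has to be the same for every log being merged.
    """
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    stamp, host, message = m.groups()
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, host, message


def merge_logs(*logs, year=2000):
    """Merge several logs (iterables of lines) into one list sorted by time."""
    entries = []
    for log in logs:
        for line in log:
            entry = parse_entry(line, year)
            if entry is not None:
                entries.append(entry)
    # sorted() is stable: entries within the same second keep their input order.
    return sorted(entries, key=lambda e: e[0])


def shutdown_events(entries):
    """Keep only entries whose message mentions a shutdown."""
    return [e for e in entries if "shutdown" in e[2].lower()]


if __name__ == "__main__":
    # Three lines copied from the logs above.
    srv3 = [
        "Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Received shutdown notice from 'pfs-srv4'.",
    ]
    srv4 = [
        "Aug 10 17:49:07 pfs-srv4 heartbeat: [1276]: info: Heartbeat shutdown in progress. (1276)",
        "Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown notice from 'pfs-srv3'.",
    ]
    for when, host, message in shutdown_events(merge_logs(srv3, srv4)):
        print(when.strftime("%H:%M:%S"), host, message)
```

The timestamps only have one-second resolution, and the clocks on the two nodes may not be perfectly in sync, so the order of entries that fall within the same second should not be trusted.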
