David and Dmitri,

Here is one more attempt and one more set of log files. I now see
heartbeat shutting down on both nodes, which goes further than what
happened before.

Some interesting lines I saw:

Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown
notice from 'pfs-srv3'.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Received shutdown
notice from 'pfs-srv4'.

It now looks as though each node is telling the other to shut down.
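
To check the ordering rather than eyeball it, the two logs can be merged
by timestamp. Below is a minimal Python sketch, not a heartbeat tool: the
function names (parse_entries, shutdown_timeline) are made up for
illustration, the year is an assumption because the log lines carry none,
and month names are assumed to be English. It rejoins the lines that the
mail client wrapped before filtering on the word "shutdown".

```python
import re
from datetime import datetime

# A syslog-style entry starts with e.g. "Aug 10 17:49:09 pfs-srv4 ".
STAMP = re.compile(r"^([A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) (\S+) (.*)$")


def parse_entries(text, year=2010):
    """Return (timestamp, host, message) tuples from one node's log.

    Lines without a leading timestamp are treated as mail-wrapped
    continuations of the previous entry. The year is an assumption.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = STAMP.match(line)
        if m:
            when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            entries.append([when, m.group(2), m.group(3)])
        elif entries:
            entries[-1][2] += " " + line
    return [tuple(e) for e in entries]


def shutdown_timeline(*logs, keyword="shutdown"):
    """Merge several logs by time and keep entries mentioning the keyword."""
    merged = [entry for log in logs for entry in parse_entries(log)]
    merged.sort(key=lambda entry: entry[0])  # stable: ties keep input order
    return [entry for entry in merged if keyword in entry[2].lower()]


if __name__ == "__main__":
    srv3 = (
        "Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Received shutdown\n"
        "notice from 'pfs-srv4'.\n"
        "Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Heartbeat shutdown\n"
        "in progress. (1162)\n"
    )
    srv4 = (
        "Aug 10 17:49:07 pfs-srv4 heartbeat: [1276]: info: Heartbeat shutdown\n"
        "in progress. (1276)\n"
        "Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown\n"
        "notice from 'pfs-srv3'.\n"
    )
    for when, host, message in shutdown_timeline(srv3, srv4):
        print(f"{when:%H:%M:%S} {host} {message}")
```

The timestamps only have one-second resolution, so entries within the
same second cannot be ordered reliably across nodes. With the excerpts
used here, pfs-srv4 logs "Heartbeat shutdown in progress" at 17:49:07,
before either node logs a shutdown notice from the other.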

Here are the logs:

pfs-srv3:



Aug 10 17:46:04 pfs-srv3 logd: [899]: info: logd started with /etc/logd.cf.
Aug 10 17:46:04 pfs-srv3 logd: [899]: WARN: Core dumps could be lost
if multiple dumps occur.
Aug 10 17:46:04 pfs-srv3 logd: [899]: WARN: Consider setting
non-default value in /proc/sys/kernel/core_pattern (or equivalent) for
maximum supportability
Aug 10 17:46:04 pfs-srv3 logd: [899]: WARN: Consider setting
/proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
supportability
Aug 10 17:46:04 pfs-srv3 logd: [899]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Aug 10 17:46:04 pfs-srv3 logd: [945]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Enabling logging daemon
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: logfile and debug
file are those specified in logd config file (default /etc/logd.cf)
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Version 2 support: off
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: AUTH: i=1: key =
0x97bcb30, auth=0xb7282034, authname=md5
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: WARN: Core dumps could be
lost if multiple dumps occur.
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: WARN: Consider setting
non-default value in /proc/sys/kernel/core_pattern (or equivalent) for
maximum supportability
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: WARN: Consider setting
/proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
supportability
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: **************************
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Configuration
validated. Starting heartbeat 3.0.2
Aug 10 17:46:08 pfs-srv3 heartbeat: [1161]: info: Heartbeat Hg
Version: node: ed844d11ea2b603f7d01cce1700d6c1fcb404d29
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: heartbeat: version 3.0.2
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Heartbeat
generation: 1279723766
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: glib: UDP Broadcast
heartbeat started on port 12694 (12694) interface eth1
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: glib: UDP Broadcast
heartbeat closed on port 12694 interface eth1 - Status: 1
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info:
G_main_add_TriggerHandler: Added signal manual handler
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info:
G_main_add_TriggerHandler: Added signal manual handler
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info:
G_main_add_SignalHandler: Added signal handler for signal 17
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Local status now set to: 'up'
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Link pfs-srv3:eth1 up.
Aug 10 17:46:08 pfs-srv3 heartbeat: [1162]: info: Managed
write_hostcachedata process 1201 exited with return code 0.
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Link pfs-srv4:eth1 up.
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Status update for
node pfs-srv4: status active
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Managed
write_hostcachedata process 1206 exited with return code 0.
Aug 10 17:46:09 pfs-srv3 harc[1205]: [1212]: info: Running
/etc/ha.d//rc.d/status status
Aug 10 17:46:09 pfs-srv3 heartbeat: [1162]: info: Managed status
process 1205 exited with return code 0.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Comm_now_up():
updating status to active
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Local status now set
to: 'active'
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Managed
write_hostcachedata process 1218 exited with return code 0.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Managed
write_delcachedata process 1219 exited with return code 0.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: remote resource
transition completed.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: STATE 1 => 3
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 3
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: remote resource
transition completed.
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: Local Resource
acquisition completed. (none)
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 0, foreign 1, reason 'T_RESOURCES(them)' (0))
Aug 10 17:46:10 pfs-srv3 heartbeat: [1162]: info: STATE 3 => 4
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 3
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: pfs-srv4 wants to go
standby [foreign]
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: standby:
other_holds_resources: 3
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: New standby state: 2
Aug 10 17:46:11 pfs-srv3 heartbeat: [1162]: info: New standby state: 2
Aug 10 17:46:12 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 1
Aug 10 17:46:12 pfs-srv3 heartbeat: [1162]: info: standby: acquire
[foreign] resources from pfs-srv4
Aug 10 17:46:12 pfs-srv3 heartbeat: [1162]: info: New standby state: 3
Aug 10 17:46:12 pfs-srv3 heartbeat: [1295]: info: acquire local HA
resources (standby).
Aug 10 17:46:12 pfs-srv3 heartbeat: [1295]: info: go_standby: who: 2
resource set: local
Aug 10 17:46:12 pfs-srv3 heartbeat: [1295]: info: go_standby:
(query/action): (ourkeys/takegroup)
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1320]: info:
Acquiring resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:46:12 pfs-srv3 Filesystem[1347]: [1386]: INFO:  Resource is stopped
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1401]: info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 start
Aug 10 17:46:12 pfs-srv3 Filesystem[1410]: [1438]: INFO: Running start
for /dev/drbd0 on /pfs
Aug 10 17:46:12 pfs-srv3 Filesystem[1404]: [1455]: INFO:  Success
Aug 10 17:46:12 pfs-srv3 IPaddr[1469]: [1498]: INFO:  Resource is stopped
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1515]: info: Running
/etc/ha.d/resource.d/IPaddr 10.1.8.45/24 start
Aug 10 17:46:12 pfs-srv3 IPaddr[1539]: [1564]: INFO: Using calculated
nic for 10.1.8.45: eth0
Aug 10 17:46:12 pfs-srv3 IPaddr[1539]: [1570]: INFO: Using calculated
netmask for 10.1.8.45: 255.255.255.0
Aug 10 17:46:12 pfs-srv3 IPaddr[1539]: [1594]: INFO: eval ifconfig
eth0:0 10.1.8.45 netmask 255.255.255.0 broadcast 10.1.8.255
Aug 10 17:46:12 pfs-srv3 IPaddr[1518]: [1615]: INFO:  Success
Aug 10 17:46:12 pfs-srv3 ResourceManager[1309]: [1635]: info: Running
/etc/init.d/nfs-kernel-server  start
Aug 10 17:46:13 pfs-srv3 ResourceManager[1309]: [1680]: info: Running
/etc/init.d/smbd  start
Aug 10 17:46:13 pfs-srv3 heartbeat: [1295]: info: local HA resource
acquisition completed (standby).
Aug 10 17:46:13 pfs-srv3 heartbeat: [1295]: info: FIFO message [type
ask_resources] written rc=51
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: Standby resource
acquisition done [foreign].
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'auto_failback' (0))
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: Initial resource
acquisition complete (auto_failback)
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: New standby state: 0
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: Managed go_standby
process 1295 exited with return code 0.
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: remote resource
transition completed.
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 1
Aug 10 17:46:13 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 1
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: other_holds_resources: 0
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Received shutdown
notice from 'pfs-srv4'.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Resources being
acquired from pfs-srv4.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: acquire local HA
resources (standby).
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: go_standby: who: 2
resource set: local
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: go_standby:
(query/action): (ourkeys/takegroup)
Aug 10 17:49:08 pfs-srv3 ResourceManager[1943]: [1964]: info:
Acquiring resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:49:08 pfs-srv3 heartbeat: [1916]: info: 1 local resources
from [/usr/share/heartbeat/ResourceManager listkeys pfs-srv3]
Aug 10 17:49:08 pfs-srv3 heartbeat: [1916]: info: Local Resource
acquisition completed.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1916]: info: FIFO message [type
resource] written rc=79
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Managed
req_our_resources(ask) process 1916 exited with return code 0.
Aug 10 17:49:08 pfs-srv3 Filesystem[2006]: [2045]: INFO:  Running OK
Aug 10 17:49:08 pfs-srv3 IPaddr[2057]: [2086]: INFO:  Running OK
Aug 10 17:49:08 pfs-srv3 ResourceManager[1943]: [2113]: info: Running
/etc/init.d/smbd  start
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: local HA resource
acquisition completed (standby).
Aug 10 17:49:08 pfs-srv3 heartbeat: [1915]: info: FIFO message [type
ask_resources] written rc=51
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Standby resource
acquisition done [foreign].
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'auto_failback' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: New standby state: 0
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Managed go_standby
process 1915 exited with return code 0.
Aug 10 17:49:08 pfs-srv3 harc[2127]: [2136]: info: Running
/etc/ha.d//rc.d/status status
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: WARN: Shutdown delayed
until current resource activity finishes.
Aug 10 17:49:08 pfs-srv3 mach_down[2149]: [2175]: info:
/usr/share/heartbeat/mach_down: nice_failback: foreign resources
acquired
Aug 10 17:49:08 pfs-srv3 mach_down[2149]: [2180]: info: mach_down
takeover complete for node pfs-srv4.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: mach_down takeover complete.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info:
AnnounceTakeover(local 1, foreign 1, reason 'mach_down' (1))
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Managed status
process 2127 exited with return code 0.
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info:
hb_giveup_resources(): current status: active
Aug 10 17:49:08 pfs-srv3 heartbeat: [1162]: info: Heartbeat shutdown
in progress. (1162)
Aug 10 17:49:08 pfs-srv3 heartbeat: [2181]: info: Giving up all HA resources.
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2206]: info:
Releasing resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2217]: info: Running
/etc/init.d/smbd  stop
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2236]: info: Running
/etc/init.d/nfs-kernel-server  stop
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2263]: info: Running
/etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:49:08 pfs-srv3 IPaddr[2287]: [2298]: INFO: ifconfig eth0:0 down
Aug 10 17:49:08 pfs-srv3 IPaddr[2266]: [2302]: INFO:  Success
Aug 10 17:49:08 pfs-srv3 ResourceManager[2195]: [2319]: info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:49:08 pfs-srv3 Filesystem[2328]: [2356]: INFO: Running stop
for /dev/drbd0 on /pfs
Aug 10 17:49:08 pfs-srv3 Filesystem[2328]: [2371]: INFO: Trying to unmount /pfs
Aug 10 17:49:09 pfs-srv3 Filesystem[2328]: [2379]: INFO: unmounted
/pfs successfully
Aug 10 17:49:09 pfs-srv3 Filesystem[2322]: [2386]: INFO:  Success
Aug 10 17:49:09 pfs-srv3 ResourceManager[2195]: [2403]: info: Running
/etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:49:09 pfs-srv3 heartbeat: [2181]: info: All HA resources relinquished.
Aug 10 17:49:09 pfs-srv3 heartbeat: [2181]: info: FIFO message [type
shutdone] written rc=27
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: killing HBFIFO
process 1198 with signal 15
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: killing HBWRITE
process 1199 with signal 15
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: killing HBREAD
process 1200 with signal 15
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: Core process 1198
exited. 3 remaining
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: Core process 1199
exited. 2 remaining
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: Core process 1200
exited. 1 remaining
Aug 10 17:49:11 pfs-srv3 heartbeat: [1162]: info: pfs-srv3 Heartbeat
shutdown complete.
Aug 10 17:49:11 pfs-srv3 logd: [945]: info: logd_term_write_action:
received SIGTERM
Aug 10 17:49:11 pfs-srv3 logd: [945]: info: Exiting write process


pfs-srv4:


Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Link pfs-srv3:eth1 up.
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Status update for
node pfs-srv3: status init
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Status update for
node pfs-srv3: status up
Aug 10 17:46:09 pfs-srv4 harc[1916]: [1922]: info: Running
/etc/ha.d//rc.d/status status
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Managed status
process 1916 exited with return code 0.
Aug 10 17:46:09 pfs-srv4 harc[1927]: [1933]: info: Running
/etc/ha.d//rc.d/status status
Aug 10 17:46:09 pfs-srv4 heartbeat: [1276]: info: Managed status
process 1927 exited with return code 0.
Aug 10 17:46:10 pfs-srv4 heartbeat: [1276]: info: Status update for
node pfs-srv3: status active
Aug 10 17:46:10 pfs-srv4 heartbeat: [1276]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:10 pfs-srv4 harc[1938]: [1944]: info: Running
/etc/ha.d//rc.d/status status
Aug 10 17:46:10 pfs-srv4 heartbeat: [1276]: info: Managed status
process 1938 exited with return code 0.
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 0
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: remote resource
transition completed.
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: pfs-srv4 wants to go
standby [foreign]
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: i_hold_resources: 3
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: New standby state: 1
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 0
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: standby: pfs-srv3
can take our foreign resources
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:11 pfs-srv4 heartbeat: [1276]: info: New standby state: 1
Aug 10 17:46:11 pfs-srv4 heartbeat: [1949]: info: give up foreign HA
resources (standby).
Aug 10 17:46:11 pfs-srv4 heartbeat: [1949]: info: go_standby: who: 1
resource set: foreign
Aug 10 17:46:11 pfs-srv4 heartbeat: [1949]: info: go_standby:
(query/action): (otherkeys/givegroup)
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [1974]: info:
Releasing resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [1985]: info: Running
/etc/init.d/smbd  stop
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [2004]: info: Running
/etc/init.d/nfs-kernel-server  stop
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [2031]: info: Running
/etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:46:11 pfs-srv4 IPaddr[2055]: [2066]: INFO: ifconfig eth0:0 down
Aug 10 17:46:11 pfs-srv4 IPaddr[2034]: [2070]: INFO:  Success
Aug 10 17:46:11 pfs-srv4 ResourceManager[1963]: [2087]: info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:46:12 pfs-srv4 Filesystem[2096]: [2124]: INFO: Running stop
for /dev/drbd0 on /pfs
Aug 10 17:46:12 pfs-srv4 Filesystem[2096]: [2139]: INFO: Trying to unmount /pfs
Aug 10 17:46:12 pfs-srv4 Filesystem[2096]: [2147]: INFO: unmounted
/pfs successfully
Aug 10 17:46:12 pfs-srv4 Filesystem[2090]: [2154]: INFO:  Success
Aug 10 17:46:12 pfs-srv4 ResourceManager[1963]: [2171]: info: Running
/etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:46:12 pfs-srv4 heartbeat: [1949]: info: foreign HA resource
release completed (standby).
Aug 10 17:46:12 pfs-srv4 heartbeat: [1949]: info: FIFO message [type
ask_resources] written rc=51
Aug 10 17:46:12 pfs-srv4 heartbeat: [1276]: info: Local standby
process completed [foreign].
Aug 10 17:46:12 pfs-srv4 heartbeat: [1276]: info: New standby state: 3
Aug 10 17:46:12 pfs-srv4 heartbeat: [1276]: info: Managed go_standby
process 1949 exited with return code 0.
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: WARN: 1 lost packet(s) for
[pfs-srv3] [12:14]
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: remote resource
transition completed.
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 1
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: No pkts missing from pfs-srv3!
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: Other node completed
standby takeover of foreign resources.
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: New standby state: 0
Aug 10 17:46:13 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 1
Aug 10 17:49:07 pfs-srv4 heartbeat: [1276]: info:
hb_giveup_resources(): current status: active
Aug 10 17:49:07 pfs-srv4 heartbeat: [1276]: info: Heartbeat shutdown
in progress. (1276)
Aug 10 17:49:07 pfs-srv4 heartbeat: [2203]: info: Giving up all HA resources.
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2228]: info:
Releasing resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2239]: info: Running
/etc/init.d/smbd  stop
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2258]: info: Running
/etc/init.d/nfs-kernel-server  stop
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2285]: info: Running
/etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:49:07 pfs-srv4 IPaddr[2288]: [2317]: INFO:  Success
Aug 10 17:49:07 pfs-srv4 ResourceManager[2217]: [2334]: info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:49:08 pfs-srv4 Filesystem[2343]: [2371]: INFO: Running stop
for /dev/drbd0 on /pfs
Aug 10 17:49:08 pfs-srv4 Filesystem[2337]: [2383]: INFO:  Success
Aug 10 17:49:08 pfs-srv4 ResourceManager[2217]: [2400]: info: Running
/etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:49:08 pfs-srv4 heartbeat: [2203]: info: All HA resources relinquished.
Aug 10 17:49:08 pfs-srv4 heartbeat: [2203]: info: FIFO message [type
shutdone] written rc=27
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 3
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: WARN: 1 lost packet(s) for
[pfs-srv3] [193:195]
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 3
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: No pkts missing from pfs-srv3!
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 3
Aug 10 17:49:08 pfs-srv4 heartbeat: [1276]: info: other_holds_resources: 0
Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Received shutdown
notice from 'pfs-srv3'.
Aug 10 17:49:09 pfs-srv4 heartbeat: [1276]: info: Resource takeover
cancelled - shutdown in progress.
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: killing HBFIFO
process 1302 with signal 15
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: killing HBWRITE
process 1303 with signal 15
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: killing HBREAD
process 1304 with signal 15
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: Core process 1302
exited. 3 remaining
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: Core process 1303
exited. 2 remaining
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: Core process 1304
exited. 1 remaining
Aug 10 17:49:10 pfs-srv4 heartbeat: [1276]: info: pfs-srv4 Heartbeat
shutdown complete.
Aug 10 17:49:11 pfs-srv4 logd: [994]: info: logd_term_write_action:
received SIGTERM
Aug 10 17:49:11 pfs-srv4 logd: [994]: info: Exiting write process
Aug 10 17:51:14 pfs-srv4 logd: [898]: info: logd started with /etc/logd.cf.
Aug 10 17:51:14 pfs-srv4 logd: [898]: WARN: Core dumps could be lost
if multiple dumps occur.
Aug 10 17:51:14 pfs-srv4 logd: [898]: WARN: Consider setting
non-default value in /proc/sys/kernel/core_pattern (or equivalent) for
maximum supportability
Aug 10 17:51:14 pfs-srv4 logd: [898]: WARN: Consider setting
/proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
supportability
Aug 10 17:51:14 pfs-srv4 logd: [898]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Aug 10 17:51:14 pfs-srv4 logd: [953]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
