On Tue, Aug 10, 2010 at 3:25 PM, David Lang
<[email protected]> wrote:
> could you re-post the files (log files, ha.cf and haresources from each box)
>

Log file from pfs-srv3:


Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: other_holds_resources: 0
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Received shutdown
notice from 'pfs-srv4'.
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Resources being
acquired from pfs-srv4.
Aug 10 17:08:28 pfs-srv3 heartbeat: [1528]: info: acquire local HA
resources (standby).
Aug 10 17:08:28 pfs-srv3 heartbeat: [1528]: info: go_standby: who: 2
resource set: local
Aug 10 17:08:28 pfs-srv3 heartbeat: [1528]: info: go_standby:
(query/action): (ourkeys/takegroup)
Aug 10 17:08:28 pfs-srv3 ResourceManager[1558]: [1577]: info:
Acquiring resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:08:28 pfs-srv3 heartbeat: [1529]: info: 1 local resources
from [/usr/share/heartbeat/ResourceManager listkeys pfs-srv3]
Aug 10 17:08:28 pfs-srv3 heartbeat: [1529]: info: Local Resource
acquisition completed.
Aug 10 17:08:28 pfs-srv3 heartbeat: [1529]: info: FIFO message [type
resource] written rc=79
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Managed
req_our_resources(ask) process 1529 exited with return code 0.
Aug 10 17:08:28 pfs-srv3 Filesystem[1619]: [1658]: INFO:  Resource is stopped
Aug 10 17:08:28 pfs-srv3 ResourceManager[1558]: [1673]: info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 start
Aug 10 17:08:28 pfs-srv3 Filesystem[1682]: [1710]: INFO: Running start
for /dev/drbd0 on /pfs
Aug 10 17:08:28 pfs-srv3 Filesystem[1676]: [1727]: INFO:  Success
Aug 10 17:08:28 pfs-srv3 IPaddr[1741]: [1770]: INFO:  Resource is stopped
Aug 10 17:08:28 pfs-srv3 ResourceManager[1558]: [1787]: info: Running
/etc/ha.d/resource.d/IPaddr 10.1.8.45/24 start
Aug 10 17:08:28 pfs-srv3 IPaddr[1811]: [1836]: INFO: Using calculated
nic for 10.1.8.45: eth0
Aug 10 17:08:28 pfs-srv3 IPaddr[1811]: [1842]: INFO: Using calculated
netmask for 10.1.8.45: 255.255.255.0
Aug 10 17:08:28 pfs-srv3 IPaddr[1811]: [1866]: INFO: eval ifconfig
eth0:0 10.1.8.45 netmask 255.255.255.0 broadcast 10.1.8.255
Aug 10 17:08:28 pfs-srv3 IPaddr[1790]: [1887]: INFO:  Success
Aug 10 17:08:28 pfs-srv3 ResourceManager[1558]: [1907]: info: Running
/etc/init.d/nfs-kernel-server  start
Aug 10 17:08:28 pfs-srv3 ResourceManager[1558]: [1967]: info: Running
/etc/init.d/smbd  start
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: WARN: Shutdown delayed
until current resource activity finishes.
Aug 10 17:08:28 pfs-srv3 heartbeat: [1528]: info: local HA resource
acquisition completed (standby).
Aug 10 17:08:28 pfs-srv3 heartbeat: [1528]: info: FIFO message [type
ask_resources] written rc=47
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Standby resource
acquisition done [all].
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info:
AnnounceTakeover(local 1, foreign 1, reason 'auto_failback' (1))
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: New standby state: 0
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Managed go_standby
process 1528 exited with return code 0.
Aug 10 17:08:28 pfs-srv3 harc[1982]: [1990]: info: Running
/etc/ha.d//rc.d/status status
Aug 10 17:08:28 pfs-srv3 mach_down[1995]: [2015]: info:
/usr/share/heartbeat/mach_down: nice_failback: foreign resources
acquired
Aug 10 17:08:28 pfs-srv3 mach_down[1995]: [2020]: info: mach_down
takeover complete for node pfs-srv4.
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info:
AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: mach_down takeover complete.
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info:
AnnounceTakeover(local 1, foreign 1, reason 'mach_down' (1))
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Managed status
process 1982 exited with return code 0.
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info:
hb_giveup_resources(): current status: active
Aug 10 17:08:28 pfs-srv3 heartbeat: [1216]: info: Heartbeat shutdown
in progress. (1216)
Aug 10 17:08:28 pfs-srv3 heartbeat: [2021]: info: Giving up all HA resources.
Aug 10 17:08:28 pfs-srv3 ResourceManager[2035]: [2046]: info:
Releasing resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:08:28 pfs-srv3 ResourceManager[2035]: [2057]: info: Running
/etc/init.d/smbd  stop
Aug 10 17:08:29 pfs-srv3 ResourceManager[2035]: [2080]: info: Running
/etc/init.d/nfs-kernel-server  stop
Aug 10 17:08:29 pfs-srv3 ResourceManager[2035]: [2107]: info: Running
/etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:08:29 pfs-srv3 IPaddr[2131]: [2146]: INFO: ifconfig eth0:0 down
Aug 10 17:08:29 pfs-srv3 IPaddr[2110]: [2150]: INFO:  Success
Aug 10 17:08:29 pfs-srv3 ResourceManager[2035]: [2167]: info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:08:29 pfs-srv3 Filesystem[2176]: [2204]: INFO: Running stop
for /dev/drbd0 on /pfs
Aug 10 17:08:29 pfs-srv3 Filesystem[2176]: [2219]: INFO: Trying to unmount /pfs
Aug 10 17:08:29 pfs-srv3 Filesystem[2176]: [2227]: INFO: unmounted
/pfs successfully
Aug 10 17:08:29 pfs-srv3 Filesystem[2170]: [2234]: INFO:  Success
Aug 10 17:08:29 pfs-srv3 ResourceManager[2035]: [2251]: info: Running
/etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:08:29 pfs-srv3 heartbeat: [2021]: info: All HA resources relinquished.
Aug 10 17:08:29 pfs-srv3 heartbeat: [2021]: info: FIFO message [type
shutdone] written rc=27
Aug 10 17:08:31 pfs-srv3 heartbeat: [1216]: info: killing HBFIFO
process 1255 with signal 15
Aug 10 17:08:31 pfs-srv3 heartbeat: [1216]: info: killing HBWRITE
process 1256 with signal 15
Aug 10 17:08:31 pfs-srv3 heartbeat: [1216]: info: killing HBREAD
process 1257 with signal 15
Aug 10 17:08:31 pfs-srv3 heartbeat: [1216]: info: Core process 1255
exited. 3 remaining
Aug 10 17:08:31 pfs-srv3 heartbeat: [1216]: info: Core process 1257
exited. 2 remaining
Aug 10 17:08:31 pfs-srv3 heartbeat: [1216]: info: Core process 1256
exited. 1 remaining
Aug 10 17:08:31 pfs-srv3 heartbeat: [1216]: info: pfs-srv3 Heartbeat
shutdown complete.
Aug 10 17:08:32 pfs-srv3 logd: [979]: info: logd_term_write_action:
received SIGTERM
Aug 10 17:08:33 pfs-srv3 logd: [979]: info: Exiting write process

Log file from pfs-srv4:


Aug 10 17:08:28 pfs-srv4 heartbeat: [1168]: info: Heartbeat shutdown
in progress. (1168)
Aug 10 17:08:28 pfs-srv4 heartbeat: [1340]: info: Giving up all HA resources.
Aug 10 17:08:28 pfs-srv4 ResourceManager[1354]: [1365]: info:
Releasing resource group: pfs-srv3 drbddisk::r0
Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
Aug 10 17:08:28 pfs-srv4 ResourceManager[1354]: [1376]: info: Running
/etc/init.d/smbd  stop
Aug 10 17:08:28 pfs-srv4 ResourceManager[1354]: [1395]: info: Running
/etc/init.d/nfs-kernel-server  stop
Aug 10 17:08:28 pfs-srv4 ResourceManager[1354]: [1421]: info: Running
/etc/ha.d/resource.d/IPaddr 10.1.8.45/24 stop
Aug 10 17:08:28 pfs-srv4 IPaddr[1424]: [1453]: INFO:  Success
Aug 10 17:08:28 pfs-srv4 ResourceManager[1354]: [1470]: info: Running
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /pfs ext3 stop
Aug 10 17:08:28 pfs-srv4 Filesystem[1479]: [1507]: INFO: Running stop
for /dev/drbd0 on /pfs
Aug 10 17:08:28 pfs-srv4 Filesystem[1473]: [1519]: INFO:  Success
Aug 10 17:08:28 pfs-srv4 ResourceManager[1354]: [1536]: info: Running
/etc/ha.d/resource.d/drbddisk r0 stop
Aug 10 17:08:28 pfs-srv4 heartbeat: [1340]: info: All HA resources relinquished.
Aug 10 17:08:28 pfs-srv4 heartbeat: [1340]: info: FIFO message [type
shutdone] written rc=27
Aug 10 17:08:28 pfs-srv4 heartbeat: [1168]: info: other_holds_resources: 3
Aug 10 17:08:29 pfs-srv4 heartbeat: [1168]: WARN: 1 lost packet(s) for
[pfs-srv3] [2631:2633]
Aug 10 17:08:29 pfs-srv4 heartbeat: [1168]: info: other_holds_resources: 3
Aug 10 17:08:29 pfs-srv4 heartbeat: [1168]: info: No pkts missing from pfs-srv3!
Aug 10 17:08:29 pfs-srv4 heartbeat: [1168]: info: other_holds_resources: 3
Aug 10 17:08:29 pfs-srv4 heartbeat: [1168]: info: other_holds_resources: 0
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: Received shutdown
notice from 'pfs-srv3'.
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: Resource takeover
cancelled - shutdown in progress.
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: killing HBREAD
process 1196 with signal 15
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: killing HBFIFO
process 1194 with signal 15
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: killing HBWRITE
process 1195 with signal 15
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: Core process 1194
exited. 3 remaining
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: Core process 1195
exited. 2 remaining
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: Core process 1196
exited. 1 remaining
Aug 10 17:08:30 pfs-srv4 heartbeat: [1168]: info: pfs-srv4 Heartbeat
shutdown complete.
Aug 10 17:08:31 pfs-srv4 logd: [1002]: info: logd_term_write_action:
received SIGTERM
Aug 10 17:08:31 pfs-srv4 logd: [1002]: info: Exiting write process
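Both logs above were wrapped by the mail client, so most entries span two or three lines. In case it helps anyone reading along, here is a minimal sketch for joining them back into one line per entry. It assumes every entry starts with a syslog-style "Mon DD HH:MM:SS host " prefix like the ones above; the function name unwrap is only an illustration and not part of heartbeat.

```python
import re

# Syslog-style prefix as seen in the logs above, e.g. "Aug 10 17:08:28 pfs-srv3 ".
# Assumption: every real log entry starts with this pattern.
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ ")


def unwrap(lines):
    """Join mail-wrapped continuation lines back onto the entry they belong to."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines between pasted sections
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            # No timestamp prefix: treat it as the wrapped tail of the previous entry.
            entries[-1] += " " + line.strip()
    return entries
```

Feeding it the lines of a saved copy of this mail (for example open("mail.txt").readlines(), where mail.txt is a hypothetical file name) should give one string per log entry, which makes it easier to sort the two nodes' entries by timestamp.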

haresources file from both:


r...@pfs-srv3:~# cat /etc/ha.d/haresources
pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd


r...@pfs-srv4:~# cat /etc/ha.d/haresources
pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 10.1.8.45/24 nfs-kernel-server smbd
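
To make the haresources entry easier to read, here is a rough sketch that splits it into the preferred node and the list of resources with their "::" arguments. The function name parse_haresources_line is made up for illustration. Treating a bare address as an IPaddr resource is a simplification based on the log above, where 10.1.8.45/24 is run through /etc/ha.d/resource.d/IPaddr.

```python
def parse_haresources_line(line):
    """Split one haresources entry into (preferred_node, [(resource, args), ...])."""
    fields = line.split()
    node, resources = fields[0], fields[1:]
    parsed = []
    for res in resources:
        if res[0].isdigit():
            # Simplification: a bare address is handled as an IPaddr resource,
            # matching the "resource.d/IPaddr 10.1.8.45/24 start" log entry.
            parsed.append(("IPaddr", [res]))
            continue
        name, *args = res.split("::")
        parsed.append((name, args))
    return node, parsed


entry = ("pfs-srv3 drbddisk::r0 Filesystem::/dev/drbd0::/pfs::ext3 "
         "10.1.8.45/24 nfs-kernel-server smbd")
node, resources = parse_haresources_line(entry)
for name, args in resources:
    print(node, name, args)
```

The order matters: the logs show the group being started left to right (Filesystem, IPaddr, nfs-kernel-server, smbd) and released in the reverse order.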

ha.cf from both:


r...@pfs-srv3:~# cat /etc/ha.d/ha.cf
use_logd on
udpport 12694
keepalive 1
warntime 15
deadtime 20
debug 1
initdead 180
bcast eth1
node pfs-srv3
node pfs-srv4
auto_failback on
crm off

r...@pfs-srv4:~# cat /etc/ha.d/ha.cf
use_logd on
udpport 12694
keepalive 1
warntime 15
deadtime 20
debug 1
initdead 180
bcast eth1
node pfs-srv3
node pfs-srv4
auto_failback on
crm off
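
For completeness, a small sketch that reads an ha.cf like the two above and sanity-checks the timers. The helper names parse_ha_cf and check_timers are hypothetical. The checks assume plain values in seconds with no unit suffix, and follow the usual ha.cf guidance as I understand it: keepalive below warntime, warntime below deadtime, and initdead at least twice deadtime.

```python
def parse_ha_cf(text):
    """Return {directive: [values]}; repeated directives such as 'node' accumulate."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        key, _, value = line.partition(" ")
        directives.setdefault(key, []).append(value.strip())
    return directives


def check_timers(directives):
    """Return a list of human-readable problems; an empty list means none found.

    Assumption: all four timers are present and given as whole seconds.
    """
    keepalive = int(directives["keepalive"][0])
    warntime = int(directives["warntime"][0])
    deadtime = int(directives["deadtime"][0])
    initdead = int(directives["initdead"][0])
    problems = []
    if not keepalive < warntime:
        problems.append("keepalive should be smaller than warntime")
    if not warntime < deadtime:
        problems.append("warntime should be smaller than deadtime")
    if initdead < 2 * deadtime:
        problems.append("initdead should be at least twice deadtime")
    return problems
```

With the values posted above (keepalive 1, warntime 15, deadtime 20, initdead 180) none of these checks trips, so the timers themselves do not look like the cause of the shutdown shown in the logs.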
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
