Hello,
we set up heartbeat with STONITH support for the APC SMART 750 via serial
cable.
The connection works and the APC can be contacted via "apctest".
When heartbeat starts, it connects to the UPS (see the log below).
When we kill the heartbeat daemon on the slave node, the master node tries
to power cycle the slave's UPS, but that fails.
We observe that the UPS switches to battery mode, then the battery test LED
flashes for a short time, and the UPS reconnects to the mains.
Unfortunately the server never loses power during that time, because the UPS
keeps running on battery. So there is no hard reboot.
Since the slave node is not power cycled, the master retries the power cycle
every few seconds.
What is the intended behaviour of the stonith apcsmart module?
What can be configured to get the SMART 750 to actually power cycle?
Thanks and Regards
Heiko
ha.cf:
# logfacility local7
# logfile /var/log/ha-log
# debugfile /var/log/ha-debug
use_logd yes
udpport 694
keepalive 1 # 1 second
deadtime 30
initdead 80
bcast eth1
serial /dev/ttyS0 #if you use serial
baud 19200 #if you use serial
node mds1
node mds2
# crm yes
crm no
# crm respawn
auto_failback yes
# watchdog /dev/watchdog
stonith_host mds1 apcsmart /dev/ttyS1 mds2
stonith_host mds2 apcsmart /dev/ttyS1 mds1
failover trigger:
Aug 5 15:35:46 mds1 heartbeat: [5188]: info: all clients are now resumed
Aug 5 15:35:46 mds1 heartbeat: [5188]: WARN: 1 lost packet(s) for [mds2] [12:14]
Aug 5 15:35:46 mds1 heartbeat: [5188]: info: remote resource transition completed.
Aug 5 15:35:46 mds1 heartbeat: [5188]: info: No pkts missing from mds2!
Aug 5 15:35:46 mds1 heartbeat: [5188]: info: Other node completed standby takeover of foreign resources.
Aug 5 15:37:00 mds1 heartbeat: [5188]: info: Link mds2:/dev/ttyS0 dead.
Aug 5 15:37:01 mds1 heartbeat: [5188]: WARN: node mds2: is dead
Aug 5 15:37:01 mds1 heartbeat: [5188]: info: Link mds2:eth1 dead.
Aug 5 15:37:01 mds1 heartbeat: [6182]: info: Resetting node mds2 with [APCSmart]
Aug 5 15:37:06 mds1 heartbeat: [6182]: info: node mds2 now reset.
{{{EDIT: that is not true ... }}}
Aug 5 15:37:06 mds1 heartbeat: [5188]: info: Exiting STONITH mds2 process 6182 returned rc 0.
Aug 5 15:37:06 mds1 heartbeat: [5188]: info: Resources being acquired from mds2.
Aug 5 15:37:06 mds1 heartbeat: [6183]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
Aug 5 15:37:06 mds1 harc[6183]: [6190]: info: Running /etc/ha.d/rc.d/status status
Aug 5 15:37:06 mds1 mach_down[6200]: [6228]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
Aug 5 15:37:06 mds1 mach_down[6200]: [6235]: info: mach_down takeover complete for node mds2.
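
To quantify how often the master retries the reset, a small script along
these lines could count the STONITH attempts per node in the log. This is
only a sketch: the function name count_reset_attempts and the sample lines
are assumptions for illustration, and the pattern is based solely on the
"Resetting node ... with" message in the log excerpt.

```python
import re
from collections import Counter

# Pattern taken from the "Resetting node <name> with [<device>]" message in
# the excerpt. The device name may be wrapped onto the next line by the mail
# client, so only the node name is captured.
RESET_RE = re.compile(r"heartbeat: \[\d+\]: info: Resetting node (\S+) with")


def count_reset_attempts(lines):
    """Return a Counter mapping node name to number of reset attempts."""
    attempts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            attempts[match.group(1)] += 1
    return attempts


if __name__ == "__main__":
    # Hypothetical sample input; in practice the lines would come from the
    # heartbeat log file.
    sample = [
        "Aug 5 15:37:01 mds1 heartbeat: [6182]: info: Resetting node mds2 with",
        "[APCSmart]",
        "Aug 5 15:37:06 mds1 heartbeat: [6182]: info: node mds2 now reset.",
    ]
    for node, count in count_reset_attempts(sample).items():
        print(node, count)
```

A count that keeps growing for the same node would confirm that the reset
is reported as done but the node never actually goes down.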
--
-----------------------------------------------------------------------
Dipl.-Ing. Heiko Schröter
Institute of Environmental Physics (IUP) phone: ++49-(0)421-218-4080
Institute of Remote Sensing (IFE) fax: ++49-(0)421-218-4555
University of Bremen (FB1)
P.O. Box 330440 email: [EMAIL PROTECTED]
Otto-Hahn-Allee 1
28359 Bremen
Germany
-----------------------------------------------------------------------