Hello,
Sorry for the lack of information. You guys are so good that sometimes I
think you have a crystal ball. ;o)
As the following shows, I am running Heartbeat and STONITH version 2.0.8
release 1 on Fedora 7.
[root@shemshak~]# rpm -qa | grep -i heartbeat
heartbeat-2.0.8-1.fc7
[root@shemshak~]# rpm -qa | grep -i stonith
stonith-2.0.8-1.fc7
This is an old system which I built over 2 years ago, and it still runs like
clockwork. I have recently added two STONITH devices (APC9225 MasterSwitch Plus
with APC9617 Network Management Card).
Here is the heartbeat configuration file "/etc/ha.d/ha.cf":
# Heartbeat logging configuration
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
# Heartbeat cluster members
node shemshak
node dizin
# Heartbeat communication timing
keepalive 2
deadtime 32
initdead 64
# Heartbeat communication paths
udpport 694
bcast eth1
#ucast eth1 192.168.1.21
#ucast eth1 192.168.1.22
#ucast eth0 192.168.1.81
#ucast eth0 192.168.1.82
baud 19200
serial /dev/ttyS0
# Don't fail back automatically - on/off
auto_failback on
# Monitoring of network connection to default gateway
ping 192.168.1.1
#respawn hacluster /usr/lib64/heartbeat/ipfail
#STONITH
stonith_host Testing apcmaster 192.168.1.56 apc apc
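A quick sanity check on the "stonith_host" line above can be sketched in shell.
This is only a sketch, and it assumes the syntax described in the ha.cf
documentation, "stonith_host {hostfrom} {stonith_type} {params...}", where
{hostfrom} is the cluster node the STONITH device is reachable from, or "*"
for any node. The function name "check_stonith_hosts" is hypothetical and not
part of Heartbeat; it only reads the file and reports whether the first field
of each active "stonith_host" line is "*" or matches a "node" entry.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Heartbeat): flag stonith_host lines
# whose first field is neither "*" nor a name declared with "node" in the
# same file. Commented-out lines are ignored because their first field
# starts with "#".
# Usage: check_stonith_hosts /etc/ha.d/ha.cf
check_stonith_hosts() {
    awk '
        # Collect every node name declared in the file.
        $1 == "node" { for (i = 2; i <= NF; i++) nodes[$i] = 1 }
        # Remember the first field of each active stonith_host line.
        $1 == "stonith_host" { from[++n] = $2; line[n] = NR }
        END {
            for (i = 1; i <= n; i++) {
                if (from[i] == "*" || (from[i] in nodes))
                    printf "line %d: ok (%s)\n", line[i], from[i]
                else
                    printf "line %d: \"%s\" is not a cluster node or *\n", line[i], from[i]
            }
        }
    ' "$1"
}
```

Run against the configuration above, this would flag "Testing", since only
shemshak and dizin are declared as nodes.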
Here is also my log file "/var/log/ha-log" after stopping heartbeat on the
primary host by issuing the "service heartbeat stop" command at
"2011/07/22_08:30:48":
[root@shemshak ~]# tail -f /var/log/ha-log
heartbeat[4741]: 2011/07/21_18:36:04 info: Current arena value: 0
heartbeat[4741]: 2011/07/21_18:36:04 info: MSG stats: 0/190108 ms age 10 [pid4749/HBWRITE]
heartbeat[4741]: 2011/07/21_18:36:04 info: ha_malloc stats: 379/5069800 38076/18447 [pid4749/HBWRITE]
heartbeat[4741]: 2011/07/21_18:36:04 info: RealMalloc stats: 50112 total malloc bytes. pid [4749/HBWRITE]
heartbeat[4741]: 2011/07/21_18:36:04 info: Current arena value: 0
heartbeat[4741]: 2011/07/21_18:36:04 info: MSG stats: 0/86408 ms age 20 [pid4750/HBREAD]
heartbeat[4741]: 2011/07/21_18:36:04 info: ha_malloc stats: 380/1815007 38160/18491 [pid4750/HBREAD]
heartbeat[4741]: 2011/07/21_18:36:04 info: RealMalloc stats: 39660 total malloc bytes. pid [4750/HBREAD]
heartbeat[4741]: 2011/07/21_18:36:04 info: Current arena value: 0
heartbeat[4741]: 2011/07/21_18:36:04 info: These are nothing to worry about.
heartbeat[4741]: 2011/07/22_08:30:48 info: Heartbeat shutdown in progress. (4741)
heartbeat[17136]: 2011/07/22_08:30:48 info: Giving up all HA resources.
ResourceManager[17146]: 2011/07/22_08:30:48 info: Releasing resource group: shemshak 192.168.1.8/24/eth0
ResourceManager[17146]: 2011/07/22_08:30:48 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.8/24/eth0 stop
IPaddr[17204]: 2011/07/22_08:30:48 INFO: /sbin/ifconfig eth0:0 192.168.1.8 down
IPaddr[17183]: 2011/07/22_08:30:48 INFO: Success
heartbeat[17136]: 2011/07/22_08:30:48 info: All HA resources relinquished.
heartbeat[4741]: 2011/07/22_08:30:49 WARN: 1 lost packet(s) for [dizin] [134127:134129]
heartbeat[4741]: 2011/07/22_08:30:49 info: No pkts missing from dizin!
heartbeat[4741]: 2011/07/22_08:30:50 info: killing HBFIFO process 4744 with signal 15
heartbeat[4741]: 2011/07/22_08:30:50 info: killing HBWRITE process 4745 with signal 15
heartbeat[4741]: 2011/07/22_08:30:50 info: killing HBREAD process 4746 with signal 15
heartbeat[4741]: 2011/07/22_08:30:50 info: killing HBWRITE process 4747 with signal 15
heartbeat[4741]: 2011/07/22_08:30:50 info: killing HBREAD process 4748 with signal 15
heartbeat[4741]: 2011/07/22_08:30:50 info: killing HBWRITE process 4749 with signal 15
heartbeat[4741]: 2011/07/22_08:30:50 info: killing HBREAD process 4750 with signal 15
heartbeat[4741]: 2011/07/22_08:30:50 info: Core process 4749 exited. 7 remaining
heartbeat[4741]: 2011/07/22_08:30:50 info: Core process 4747 exited. 6 remaining
heartbeat[4741]: 2011/07/22_08:30:50 info: Core process 4746 exited. 5 remaining
heartbeat[4741]: 2011/07/22_08:30:50 info: Core process 4745 exited. 4 remaining
heartbeat[4741]: 2011/07/22_08:30:50 info: Core process 4744 exited. 3 remaining
heartbeat[4741]: 2011/07/22_08:30:50 info: Core process 4750 exited. 2 remaining
heartbeat[4741]: 2011/07/22_08:30:50 info: Core process 4748 exited. 1 remaining
heartbeat[4741]: 2011/07/22_08:30:51 info: shemshak Heartbeat shutdown complete.
When I check the log file, I don't see any sign of the "stonith_host"
directive taking effect!
I know the STONITH daemon and the device are working, as I am able to control
the device by issuing the STONITH commands directly, such as:
stonith -t apcmaster -p "192.168.1.56 apc apc" -T off Testing
stonith -t apcmaster -p "192.168.1.56 apc apc" -T on Testing
Thank you for your help.
Avestan
Nikita Michalko wrote:
>
> Hi Avestan,
>
> do you really use V1/haresources? What version of HA? What is your config?
> We have no crystal ball anymore ;-)
>
>
> Nikita Michalko
>
>
> On Wednesday, 20 July 2011 18:08:56, Avestan wrote:
>> Hello everyone,
>>
>> I am trying to add a STONITH device to my Linux-HA setup. I have added the
>> "stonith_host" directive to the configuration file "ha.cf" as follows:
>>
>> #stonith_host lashgarak apcmaster 192.168.1.55 apc apc
>> #stonith_host dizin apcmaster 192.168.1.56 apc apc
>>
>> The format of the command that I am using is:
>>
>> stonith_host {host_name} {stonith_type} {ipadress_stonith} {user} {password}
>>
>> When I shut down heartbeat on the primary host, nothing happens. I have
>> checked both log files, "/var/log/ha-log" and "/var/log/messages", and I
>> don't see anything related to the stonith directive.
>>
>> I should also mention that the resources listed in the "haresources" file
>> are moved from the primary host "lashgarak" to the secondary host "dizin"
>> with no issue. Currently the only resource that I have in the
>> "haresources" file is the floating IP address.
>>
>> Thanks,
>>
>> Avestan
>>
>
> --
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
>
--
View this message in context:
http://old.nabble.com/need-your-help%3A-%22stonith_host%22-directive-is-not-happending%21-tp32100585p32116479.html
Sent from the Linux-HA mailing list archive at Nabble.com.