Hi,

On Fri, Jun 06, 2008 at 03:05:53PM +0200, Andreas Reschke wrote:
> Hi,
> stonith with external/ibmrsa doesn't work yet.
> My ha.cf:
> bgstsapgtsls1:~ # cat /etc/ha.d/ha.cf
> # /etc/ha.d/ha.cf
> logfile /var/log/ha-log
> keepalive 2
> deadtime 30
> warntime 10
> initdead 120
> auto_failback off
> bcast eth2
> bcast eth3
> node bgstsapgtsls1
> node bgstsapgtsls2
> ping 10.20.94.1
> keepalive 10
> stonith_host * external/ibmrsa bgstsapgtsls2 10.20.12.24 root password ibm
>
> There is no output from stonith in the logfile:
> heartbeat[15212]: 2008/06/06_14:12:58 info: Received shutdown notice from 'bgstsapgtsls1'.
> heartbeat[15212]: 2008/06/06_14:12:58 info: Resources being acquired from bgstsapgtsls1.
> heartbeat[574]: 2008/06/06_14:12:58 info: acquire all HA resources (standby).
> heartbeat[575]: 2008/06/06_14:12:58 info: No local resources [/usr/lib64/heartbeat/ResourceManager listkeys bgstsapgtsls2] to acquire.
> ResourceManager[594]: 2008/06/06_14:12:58 info: Acquiring resource group: bgstsapgtsls1 10.20.94.200/32/255.255.255.255/bond0:1 sap
> IPaddr[617]: 2008/06/06_14:12:58 INFO: Resource is stopped
> ResourceManager[594]: 2008/06/06_14:12:58 info: Running /etc/ha.d/resource.d/IPaddr 10.20.94.200/32/255.255.255.255/bond0:1 start
> IPaddr[690]: 2008/06/06_14:12:58 INFO: Using calculated nic for 10.20.94.200: bond0
> IPaddr[690]: 2008/06/06_14:12:58 INFO: Using calculated netmask for 10.20.94.200: 255.255.255.255
> IPaddr[690]: 2008/06/06_14:12:58 INFO: eval /sbin/ifconfig bond0:0 10.20.94.200 netmask 32 broadcast 255.255.255.255
> IPaddr[690]: 2008/06/06_14:12:58 DEBUG: Sending Gratuitous Arp for 10.20.94.200 on bond0:0 [bond0]
> IPaddr[670]: 2008/06/06_14:12:58 INFO: Success
> ResourceManager[594]: 2008/06/06_14:12:58 info: Running /etc/ha.d/resource.d/sap start
> heartbeat[15212]: 2008/06/06_14:13:29 WARN: node bgstsapgtsls1: is dead
> heartbeat[15212]: 2008/06/06_14:13:29 info: Cancelling pending standby operation
> heartbeat[15212]: 2008/06/06_14:13:29 info: Dead node bgstsapgtsls1 gave up resources.
> heartbeat[15212]: 2008/06/06_14:13:29 info: Link bgstsapgtsls1:eth2 dead.
> heartbeat[15212]: 2008/06/06_14:13:29 info: Link bgstsapgtsls1:eth3 dead.
> heartbeat[574]: 2008/06/06_14:14:09 info: all HA resource acquisition completed (standby).
> heartbeat[15212]: 2008/06/06_14:14:09 ERROR: Ignored standby message 'done' from bgstsapgtsls2 in state 0
> harc[1545]: 2008/06/06_14:14:09 info: Running /etc/ha.d/rc.d/status status
> mach_down[1554]: 2008/06/06_14:14:09 info: Taking over resource group 10.20.94.200/32/255.255.255.255/bond0:1
> ResourceManager[1574]: 2008/06/06_14:14:09 info: Acquiring resource group: bgstsapgtsls1 10.20.94.200/32/255.255.255.255/bond0:1 sap
> IPaddr[1597]: 2008/06/06_14:14:09 INFO: Running OK
> ResourceManager[1574]: 2008/06/06_14:14:09 info: Running /etc/ha.d/resource.d/sap start
> mach_down[1554]: 2008/06/06_14:14:50 info: /usr/lib64/heartbeat/mach_down: nice_failback: foreign resources acquired
> mach_down[1554]: 2008/06/06_14:14:50 info: mach_down takeover complete for node bgstsapgtsls1.
> heartbeat[15212]: 2008/06/06_14:14:50 info: mach_down takeover complete.
>
> This was the takeover from bgstsapgtsls1 (the primary node) to
> bgstsapgtsls2 (the failback node). Normally the second node should
> shut down the first node with the stonith plugin.

No, it shouldn't. The log above shows bgstsapgtsls1 sending a shutdown
notice and giving up its resources, i.e. a clean shutdown, so there is
nothing to fence. Try 'killall -9 heartbeat' to test stonith. You can
also test it manually, using the stonith program (see stonith(8)).

Thanks,

Dejan

>
> Regards
> Andreas Reschke
> ________________________________________________________________
> BG-IM173
> Unix/Linux-Administration
>
> Behr GmbH & Co. KG
> ST B29, 3.OG
>
> Tel.: +49 711 896-4598
> Fax: ++49 711-8902-4598
> Mobile: 0173-3197397
> [EMAIL PROTECTED]
>
> Dejan Muhamedagic <[EMAIL PROTECTED]>
> Sent by: [EMAIL PROTECTED]
> 28.05.2008 13:59
> Please reply to
> General Linux-HA mailing list <[email protected]>
>
> To
> General Linux-HA mailing list <[email protected]>
> Cc
>
> Subject
> Re: Antwort: Re: [Linux-HA] Question to stonith and external/ibmrsa
>
> Hi,
>
> On Wed, May 28, 2008 at 01:41:07PM +0200, Andreas Reschke wrote:
> > Hi Dejan,
> > it doesn't work for me:
> > bgstsapgtsls1:~ # /etc/init.d/heartbeat start
> > Starting High-Availability services2008/05/28_12:05:47 INFO: Resource is stopped
> > heartbeat[18617]: 2008/05/28_12:05:47 ERROR: Invalid Stonith configuration parameter [ bgstsapgtsls2 10.20.12.24 root password]
>
> Oops.
>
> > heartbeat[18617]: 2008/05/28_12:05:47 ERROR: Heartbeat not started: configuration error.
> > heartbeat[18617]: 2008/05/28_12:05:47 ERROR: Configuration error, heartbeat not started.
> >
> > That's my new config:
> > bgstsapgtsls1:~ # cat /etc/ha.d/ha.cf | grep -v \#
> >
> > debugfile /var/log/ha-debug
> > logfile /var/log/ha-log
> > keepalive 2
> > deadtime 30
> > warntime 10
> > initdead 120
> > auto_failback off
> > bcast eth2
> > bcast eth3
> > node bgstsapgtsls1
> > node bgstsapgtsls2
> > stonith_host * external/ibmrsa bgstsapgtsls2 10.20.12.24 root password
>
> Can you please try:
>
> stonith_host * external/ibmrsa bgstsapgtsls2 10.20.12.24 root password ibm
>
> Thanks,
>
> Dejan
>
> > What's wrong?
> >
> > Andreas
> >
> > Dejan Muhamedagic <[EMAIL PROTECTED]>
> > Sent by: [EMAIL PROTECTED]
> > 28.05.2008 11:55
> > Please reply to
> > General Linux-HA mailing list <[email protected]>
> >
> > To
> > General Linux-HA mailing list <[email protected]>
> > Cc
> >
> > Subject
> > Re: [Linux-HA] Question to stonith and external/ibmrsa
> >
> > Hi,
> >
> > On Wed, May 28, 2008 at 08:32:51AM +0200, Andreas Reschke wrote:
> > > Hello,
> > > I've created an HA cluster with 2 IBM 3650 with RSA-Adapter and SuSE
> > > Linux Enterprise 10 SP1 (64Bit). When the slave server takes over the
> > > service (SAP), it must power off the master server over the RSA-Adapter
> > > with the Stonith-Module external/ibmrsa. But what is the syntax for the
> > > poweroff and how can I include this in the ha.cf?
> > > That's my ha.cf
> > >
> > > debugfile /var/log/ha-debug
> > > logfile /var/log/ha-log
> > > keepalive 2
> > > deadtime 30
> > > warntime 10
> > > initdead 120
> > > auto_failback off
> > > bcast eth2
> > > bcast eth3
> > > node bgstsapgtsls1
> > > node bgstsapgtsls2
> >
> > It's all described in the ha.cf delivered with the distribution.
> >
> > Add a line like this:
> >
> > stonith_host * external/ibmrsa node rsaip user pass
> >
> > node: node to be managed
> > rsaip: ip of the rsa
> >
> > If you're worried about security then try with
> >
> > stonith external/ibmrsa /etc/ha.d/ibmrsa.cf
> >
> > where ibmrsa.cf contains the configuration.
> >
> > If you're running v2 then the story's different. Take a look at
> > the linux-ha.org web pages.
> >
> > Thanks,
> >
> > Dejan
> >
> > > Thanks for your help
> > >
> > > Andreas Reschke
> > >
> > > _______________________________________________
> > > Linux-HA mailing list
> > > [email protected]
> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > > See also: http://linux-ha.org/ReportingProblems
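
A note on the configuration error earlier in the thread ("Invalid Stonith
configuration parameter"): the four-value stonith_host line was rejected and
the five-value one (with "ibm" appended) was accepted, presumably because the
plugin expects a fifth value. The sketch below is only an illustration of that
check, not part of heartbeat. The labels node, rsaip, user and pass come from
the thread; "type" is a hypothetical label for the fifth value, and
EXPECTED_PARAMS and parse_stonith_host are made-up names. For the real
parameter list, see stonith(8).

```python
# Sketch: check a heartbeat v1 `stonith_host` line from ha.cf.
# EXPECTED_PARAMS is an assumption for illustration: the first four labels
# come from the thread ("node rsaip user pass"); "type" is a hypothetical
# label for the fifth value ("ibm") that the accepted line carries.
EXPECTED_PARAMS = {
    "external/ibmrsa": ["node", "rsaip", "user", "pass", "type"],
}


def parse_stonith_host(line):
    """Split a stonith_host line and name its parameters.

    Layout assumed from the thread:
        stonith_host <from-host> <stonith-type> <param> [<param> ...]
    Raises ValueError if the line is malformed or the number of
    parameters does not match EXPECTED_PARAMS for that stonith type.
    """
    fields = line.split()
    if len(fields) < 3 or fields[0] != "stonith_host":
        raise ValueError("not a stonith_host line: %r" % line)
    from_host, stonith_type, params = fields[1], fields[2], fields[3:]
    names = EXPECTED_PARAMS.get(stonith_type)
    if names is None:
        raise ValueError("no parameter list known for %s" % stonith_type)
    if len(params) != len(names):
        raise ValueError(
            "%s expects %d parameters (%s), got %d"
            % (stonith_type, len(names), " ".join(names), len(params))
        )
    return {
        "from_host": from_host,
        "stonith_type": stonith_type,
        "params": dict(zip(names, params)),
    }


if __name__ == "__main__":
    good = ("stonith_host * external/ibmrsa "
            "bgstsapgtsls2 10.20.12.24 root password ibm")
    bad = ("stonith_host * external/ibmrsa "
           "bgstsapgtsls2 10.20.12.24 root password")
    print(parse_stonith_host(good)["params"])
    try:
        parse_stonith_host(bad)
    except ValueError as err:
        print("rejected:", err)
```

Running a check like this against ha.cf before restarting heartbeat would
catch a missing value without having to wait for the start-up error.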
