Thanks Lars, problem solved. I changed the asterisk init script to be idempotent.
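For readers landing on this thread later: "idempotent" here means that `start` on an already-running service, and `stop` on an already-stopped one, return 0 instead of an error. As the quoted log shows, heartbeat gave up all resources after a single RC=1 from `/etc/init.d/asterisk start`. What follows is only a minimal sketch of that pattern, not the actual asterisk init script and not the exact change made here. The daemon path, its arguments, the pidfile location, and the function names are hypothetical placeholders, and the sketch assumes a daemon that stays in the foreground so the script can record its PID itself.

```shell
#!/bin/sh
# Sketch of an idempotent init script (start/stop/status).
# All names below are hypothetical placeholders, not asterisk's real settings.
DAEMON=${DAEMON:-/usr/local/bin/myservice}   # assumed: runs in the foreground
DAEMON_ARGS=${DAEMON_ARGS:-}
PIDFILE=${PIDFILE:-/var/run/myservice.pid}

# True if the pidfile exists and names a live process.
is_running() {
    [ -f "$PIDFILE" ] || return 1
    pid=$(cat "$PIDFILE" 2>/dev/null)
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

do_start() {
    if is_running; then
        echo "already running"
        return 0                 # idempotent: not an error
    fi
    # Launch in the background and record the PID ourselves.
    # DAEMON_ARGS is left unquoted on purpose so it splits into words.
    "$DAEMON" $DAEMON_ARGS >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    sleep 1                      # give an immediately failing start time to die
    is_running                   # non-zero only if the start really failed
}

do_stop() {
    if ! is_running; then
        rm -f "$PIDFILE"         # clean up a stale pidfile, if any
        echo "already stopped"
        return 0                 # idempotent: not an error
    fi
    kill "$(cat "$PIDFILE")" 2>/dev/null
    i=0
    while is_running && [ "$i" -lt 10 ]; do   # wait up to about 10 seconds
        sleep 1
        i=$((i + 1))
    done
    if is_running; then
        return 1                 # still alive: report a real failure
    fi
    rm -f "$PIDFILE"
    return 0
}

do_status() {
    if is_running; then
        echo "running"
        return 0
    fi
    echo "stopped"
    return 3                     # LSB convention: 3 = program is not running
}

# Dispatch only when an action is given, so the file can also be sourced.
if [ $# -gt 0 ]; then
    case "$1" in
        start)  do_start;  exit $? ;;
        stop)   do_stop;   exit $? ;;
        status) do_status; exit $? ;;
        *)      echo "Usage: $0 {start|stop|status}" >&2; exit 2 ;;
    esac
fi
```

With a script shaped like this, running `start` twice in a row should exit 0 both times, and likewise for `stop`; a non-zero exit is reserved for a start or stop that actually failed.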
Regards,
Douglas

On Wed, May 2, 2012 at 9:25 AM, Lars Ellenberg <lars.ellenb...@linbit.com> wrote:

> On Mon, Apr 30, 2012 at 01:52:05PM -0300, Douglas Pasqua wrote:
> > Hi friends,
> >
> > I created a Linux HA solution using 2 nodes: node-a and node-b.
> >
> > My /etc/ha.d/ha.cf:
> >
> > use_logd yes
> > keepalive 1
> > deadtime 90
> > warntime 5
> > initdead 120
> > bcast eth6
> > node node-a
> > node node-b
> > crm off
> > auto_failback off
> >
> > My /etc/ha.d/haresources:
> > node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 service1 service2 service3
> >
> > I booted the two nodes together. node-a became master and node-b became
> > slave. Afterwards, I rebooted node-a, and node-b became master. When node-a
> > came back from the reboot, it became slave, because *auto_failback is off*,
> > I think. All as expected up to this point.
> >
> > With node-a as slave, I decided to halt node-a (using the halt command).
> > Then heartbeat on node-b went standby and my cluster was down. The virtual
> > IPs were down too. I expected node-b to stay up. Why did this happen?
> >
> > Some log from node-b:
> >
> > Apr 30 00:02:57 node-b heartbeat: [3082]: info: Received shutdown notice from 'node-a'.
> > Apr 30 00:02:57 node-b heartbeat: [3082]: info: Resources being acquired from node-a.
> > Apr 30 00:02:57 node-b heartbeat: [4414]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
> > Apr 30 00:02:57 node-b harc[4414]: [4428]: info: Running /etc/ha.d/rc.d/status status
> > Apr 30 00:02:57 node-b heartbeat: [4416]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys node-b] to acquire.
> > Apr 30 00:02:57 node-b heartbeat: [3082]: debug: StartNextRemoteRscReq(): child count 1
> >
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4657]: debug: /etc/init.d/asterisk start done. RC=1
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4658]: ERROR: Return code 1 from /etc/init.d/asterisk
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4659]: CRIT: Giving up resources due to failure of asterisk
>
> Because of the above error when starting asterisk. Maybe your asterisk
> init script is simply not idempotent. Maybe it is broken, or maybe
> there really was some problem trying to start asterisk.
>
> >
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4660]: info: Releasing resource group: node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 asterisk sincronismo notificacao
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4670]: info: Running /etc/init.d/notificacao stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4671]: debug: Starting /etc/init.d/notificacao stop
> >
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4694]: debug: /etc/init.d/notificacao stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4704]: info: Running /etc/init.d/sincronismo stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4705]: debug: Starting /etc/init.d/sincronismo stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4711]: debug: /etc/init.d/sincronismo stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4720]: info: Running /etc/init.d/asterisk stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4721]: debug: Starting /etc/init.d/asterisk stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4725]: debug: /etc/init.d/asterisk stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4741]: info: Running /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4742]: debug: Starting /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
> >
> > Apr 30 00:03:29 node-b heartbeat: [3082]: info: node-b wants to go standby [foreign]
> > Apr 30 00:03:39 node-b heartbeat: [3082]: WARN: No reply to standby request. Standby request cancelled.
> > Apr 30 00:04:29 node-b heartbeat: [3082]: WARN: node node-a: is dead
> > Apr 30 00:04:29 node-b heartbeat: [3082]: info: Dead node node-a gave up resources.
> > Apr 30 00:04:29 node-b heartbeat: [3082]: info: Link node-a:eth6 dead.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems