Thanks, Lars.

Problem solved. I changed the asterisk init script to be idempotent.
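
For the archives, a minimal sketch of what "idempotent" means for a script that heartbeat calls with start and stop: start returns 0 when the service is already running, and stop returns 0 when it is already stopped. This is an illustration only, not the actual asterisk init script. DAEMON, PIDFILE and the function names are hypothetical placeholders, and the sketch assumes a daemon that stays in the foreground.

```shell
#!/bin/sh
# Sketch of idempotent start/stop handling for an init script.
# DAEMON and PIDFILE are hypothetical placeholders, not real asterisk paths.
DAEMON=${DAEMON:-/usr/local/bin/exampled}
PIDFILE=${PIDFILE:-/var/run/exampled.pid}

is_running() {
    # True only if the pid file is non-empty and names a live process.
    [ -s "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

start_service() {
    # Already running counts as success, not as an error.
    if is_running; then
        return 0
    fi
    # Assumption: the daemon stays in the foreground, so we background it
    # and record its pid ourselves.
    $DAEMON >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    is_running
}

stop_service() {
    # Already stopped counts as success too; clean up a stale pid file.
    if ! is_running; then
        rm -f "$PIDFILE"
        return 0
    fi
    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
}

# Dispatch on the first argument; with no argument nothing happens.
case "${1:-}" in
    start)  start_service ;;
    stop)   stop_service ;;
    status) if is_running; then echo "running"; else echo "stopped"; fi ;;
esac
```

A quick check is to run the script by hand with start twice and then stop twice, and look at $? after each call: all four should be 0. In the log quoted below, the second start presumably returned 1, which made heartbeat give up the whole resource group.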

Regards,
Douglas

On Wed, May 2, 2012 at 9:25 AM, Lars Ellenberg <lars.ellenb...@linbit.com> wrote:

> On Mon, Apr 30, 2012 at 01:52:05PM -0300, Douglas Pasqua wrote:
> > Hi friends,
> >
> > I created a Linux-HA solution using 2 nodes: node-a and node-b.
> >
> > My /etc/ha.d/ha.cf:
> >
> > use_logd yes
> > keepalive 1
> > deadtime 90
> > warntime 5
> > initdead 120
> > bcast eth6
> > node node-a
> > node node-b
> > crm off
> > auto_failback off
> >
> > My /etc/ha.d/haresources:
> > node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 service1 service2 service3
> >
> > I booted the two nodes together. node-a became master and node-b became
> > slave. Afterwards, I rebooted node-a, and node-b became master. When node-a
> > came back up, it became slave, because *auto_failback is off*, I think.
> > All as expected up to here.
> >
> > With node-a as slave, I decided to halt node-a (using the halt command).
> > Then heartbeat on node-b went to standby and my cluster was down. The
> > virtual IPs were down too. I expected node-b to stay up. Why did this happen?
> >
> > Some log output from node-b:
> >
> > Apr 30 00:02:57 node-b heartbeat: [3082]: info: Received shutdown notice from 'node-a'.
> > Apr 30 00:02:57 node-b heartbeat: [3082]: info: Resources being acquired from node-a.
> > Apr 30 00:02:57 node-b heartbeat: [4414]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
> > Apr 30 00:02:57 node-b harc[4414]: [4428]: info: Running /etc/ha.d/rc.d/status status
> > Apr 30 00:02:57 node-b heartbeat: [4416]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys node-b] to acquire.
> > Apr 30 00:02:57 node-b heartbeat: [3082]: debug: StartNextRemoteRscReq(): child count 1
> >
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4657]: debug: /etc/init.d/asterisk  start done. RC=1
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4658]: ERROR: Return code 1 from /etc/init.d/asterisk
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4659]: CRIT: Giving up resources due to failure of asterisk
>
> Because of the above error when starting asterisk.  Maybe your asterisk
> init script is simply not idempotent.  Maybe it is broken, or maybe
> there really was some problem trying to start asterisk.
>
>
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4660]: info: Releasing resource group: node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 asterisk sincronismo notificacao
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4670]: info: Running /etc/init.d/notificacao  stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4671]: debug: Starting /etc/init.d/notificacao  stop
> >
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4694]: debug: /etc/init.d/notificacao  stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4704]: info: Running /etc/init.d/sincronismo  stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4705]: debug: Starting /etc/init.d/sincronismo  stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4711]: debug: /etc/init.d/sincronismo  stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4720]: info: Running /etc/init.d/asterisk  stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4721]: debug: Starting /etc/init.d/asterisk  stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4725]: debug: /etc/init.d/asterisk  stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4741]: info: Running /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4742]: debug: Starting /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
> >
> > Apr 30 00:03:29 node-b heartbeat: [3082]: info: node-b wants to go standby [foreign]
> > Apr 30 00:03:39 node-b heartbeat: [3082]: WARN: No reply to standby request.  Standby request cancelled.
> > Apr 30 00:04:29 node-b heartbeat: [3082]: WARN: node node-a: is dead
> > Apr 30 00:04:29 node-b heartbeat: [3082]: info: Dead node node-a gave up resources.
> > Apr 30 00:04:29 node-b heartbeat: [3082]: info: Link node-a:eth6 dead.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>