Re: [Linux-HA] heartbeat strange behavior

2012-05-07 Thread Douglas Pasqua
Thanks, Lars.

Problem solved: I changed the asterisk init script to be idempotent.

Regards,
Douglas

On Wed, May 2, 2012 at 9:25 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote:

 On Mon, Apr 30, 2012 at 01:52:05PM -0300, Douglas Pasqua wrote:
  Hi friends,
 
  I created a Linux-HA solution using two nodes: node-a and node-b.
 
  My /etc/ha.d/ha.cf:
 
  use_logd yes
  keepalive 1
  deadtime 90
  warntime 5
  initdead 120
  bcast eth6
  node node-a
  node node-b
  crm off
  auto_failback off
 
  My /etc/ha.d/haresources
  node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 service1 service2 service3
 
  I booted the two nodes together. node-a became master and node-b became
  slave. Afterwards, I rebooted node-a, and node-b became master. When node-a
  came back up, it became slave, because *auto_failback is off*, I think.
  All as expected up to here.
 
  With node-a as the slave, I decided to halt node-a (using the halt command).
  Then heartbeat on node-b went to standby and my cluster was down. The virtual
  IPs were down too. I expected node-b to stay on. Why did this happen?
 
  Some log from node-b:
 
  Apr 30 00:02:57 node-b heartbeat: [3082]: info: Received shutdown notice
  from 'node-a'.
  Apr 30 00:02:57 node-b heartbeat: [3082]: info: Resources being acquired
  from node-a.
  Apr 30 00:02:57 node-b heartbeat: [4414]: debug: notify_world: setting
  SIGCHLD Handler to SIG_DFL
  Apr 30 00:02:57 node-b harc[4414]: [4428]: info: Running
  /etc/ha.d/rc.d/status status
  Apr 30 00:02:57 node-b heartbeat: [4416]: info: No local resources
  [/usr/share/heartbeat/ResourceManager listkeys node-b] to acquire.
  Apr 30 00:02:57 node-b heartbeat: [3082]: debug: StartNextRemoteRscReq():
  child count 1
 
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4657]: debug:
  /etc/init.d/asterisk  start done. RC=1
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4658]: ERROR: Return code 1
  from /etc/init.d/asterisk
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4659]: CRIT: Giving up
  resources due to failure of asterisk

 Because of the above error when starting asterisk.  Maybe your asterisk
 init script is simply not idempotent.  Maybe it is broken, or maybe
 there really was some problem trying to start asterisk.
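
For readers hitting the same failure: an init script is idempotent when
"start" on an already running service and "stop" on an already stopped one
both return 0, which is what the LSB init script conventions ask for. Below is
a minimal sketch of that shape, not the actual Asterisk script. DAEMON,
DAEMON_ARGS and PIDFILE are hypothetical placeholders, and the daemon is
assumed to stay in the foreground so the script can record its PID itself.

```shell
#!/bin/sh
# Sketch of an idempotent LSB-style init script (hypothetical service).
# DAEMON, DAEMON_ARGS and PIDFILE are placeholders, not real Asterisk paths.
DAEMON="${DAEMON:-/usr/sbin/exampled}"
DAEMON_ARGS="${DAEMON_ARGS:-}"
PIDFILE="${PIDFILE:-/var/run/exampled.pid}"

# Return 0 if PIDFILE names a process that is still alive.
is_running() {
    [ -s "$PIDFILE" ] || return 1
    kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

do_start() {
    # Already running is a success, not an error: this is the idempotent part.
    if is_running; then
        echo "already running"
        return 0
    fi
    # Unquoted on purpose so DAEMON_ARGS is split into separate words.
    $DAEMON $DAEMON_ARGS </dev/null >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    sleep 1          # give the daemon a moment to fail early
    is_running       # non-zero only if the start really failed
}

do_stop() {
    # Already stopped is a success too.
    if ! is_running; then
        rm -f "$PIDFILE"
        echo "not running"
        return 0
    fi
    kill "$(cat "$PIDFILE")"
    tries=0
    while is_running && [ "$tries" -lt 10 ]; do
        sleep 1
        tries=$((tries + 1))
    done
    if is_running; then
        return 1     # still alive after 10 seconds: report failure
    fi
    rm -f "$PIDFILE"
    return 0
}

case "${1:-}" in
    start)  do_start; exit $? ;;
    stop)   do_stop; exit $? ;;
    status) if is_running; then echo "running"; exit 0
            else echo "stopped"; exit 3; fi ;;
    "")     ;;  # no argument: only define the functions (file can be sourced)
    *)      echo "Usage: $0 {start|stop|status}" >&2; exit 2 ;;
esac
```

With a script of this shape, a repeated "start" like the one in the log above
would be expected to return 0 rather than 1, so the resource group would not
be given up for that reason.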


  Apr 30 00:02:58 node-b ResourceManager[4462]: [4660]: info: Releasing
  resource group: node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 asterisk
  sincronismo notificacao
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4670]: info: Running
  /etc/init.d/notificacao  stop
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4671]: debug: Starting
  /etc/init.d/notificacao  stop
 
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4694]: debug:
  /etc/init.d/notificacao  stop done. RC=0
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4704]: info: Running
  /etc/init.d/sincronismo  stop
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4705]: debug: Starting
  /etc/init.d/sincronismo  stop
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4711]: debug:
  /etc/init.d/sincronismo  stop done. RC=0
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4720]: info: Running
  /etc/init.d/asterisk  stop
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4721]: debug: Starting
  /etc/init.d/asterisk  stop
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4725]: debug:
  /etc/init.d/asterisk  stop done. RC=0
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4741]: info: Running
  /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
  Apr 30 00:02:58 node-b ResourceManager[4462]: [4742]: debug: Starting
  /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
 
  Apr 30 00:03:29 node-b heartbeat: [3082]: info: node-b wants to go standby
  [foreign]
  Apr 30 00:03:39 node-b heartbeat: [3082]: WARN: No reply to standby
  request.  Standby request cancelled.
  Apr 30 00:04:29 node-b heartbeat: [3082]: WARN: node node-a: is dead
  Apr 30 00:04:29 node-b heartbeat: [3082]: info: Dead node node-a gave up
  resources.
  Apr 30 00:04:29 node-b heartbeat: [3082]: info: Link node-a:eth6 dead.

 --
 : Lars Ellenberg
 : LINBIT | Your Way to High Availability
 : DRBD/HA support and consulting http://www.linbit.com

 DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems



[Linux-HA] Robert Koeppl is out of office

2012-05-07 Thread Robert . Koeppl

I will be out of the office from 06.05.2012 and will return on 15.05.2012.

I will answer your message after my return.



[Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?

2012-05-07 Thread Robinson, Eric
Hi guys, we rebooted a standby node of a healthy cluster and suddenly all the
resources on the primary node restarted. What's up with that? Before
rebooting the standby node, we did the normal stuff to verify that all was well.

crm_mon showed all nodes online, in their expected roles, with correct quorum votes
cat /proc/drbd showed correct drbd status
corosync-cfgtool -s showed all rings active without faults
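
The three checks above can be scripted so they run the same way before every
reboot. The following is only a sketch: the function names are made up for
illustration, and the patterns assume the usual DRBD 8.x /proc/drbd layout,
the "active with no faults" wording of corosync-cfgtool -s, and the
"partition with quorum" and "OFFLINE:" lines of crm_mon -1. Passing these
checks says the cluster looked healthy; it does not explain or prevent the
restart described in this message.

```shell
#!/bin/sh
# Pre-reboot sanity checks. Each function reads command output on stdin,
# so it can be fed live output or a saved sample.

# Succeeds if every DRBD device line is Connected and UpToDate/UpToDate.
drbd_healthy() {
    awk '/^ *[0-9]+: / { n++
             if ($0 !~ /cs:Connected/ || $0 !~ /ds:UpToDate\/UpToDate/) bad++ }
         END { exit (n > 0 && bad == 0) ? 0 : 1 }'
}

# Succeeds if every ring status line reports "active with no faults".
rings_healthy() {
    awk '/status/ { n++; if ($0 !~ /active with no faults/) bad++ }
         END { exit (n > 0 && bad == 0) ? 0 : 1 }'
}

# Succeeds if the partition has quorum and no node is listed as OFFLINE.
cluster_healthy() {
    out=$(cat)
    echo "$out" | grep -q "partition with quorum" || return 1
    if echo "$out" | grep -q "^OFFLINE:"; then
        return 1
    fi
    return 0
}

run_checks() {
    crm_mon -1 | cluster_healthy || { echo "cluster: NOT healthy"; return 1; }
    drbd_healthy < /proc/drbd || { echo "drbd: NOT healthy"; return 1; }
    corosync-cfgtool -s | rings_healthy || { echo "rings: NOT healthy"; return 1; }
    echo "all checks passed"
}

# Run the live checks only when asked to: ./precheck.sh run
if [ "${1:-}" = "run" ]; then
    run_checks
fi
```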

When we rebooted the standby node (ha08c), crm_mon on the primary node (ha08a) 
showed that all the resources stopped and then restarted, resulting in brief 
loss of availability to customers.

Following is what crm_mon showed before server ha08c was rebooted and after it 
came back up. Following that is our crm configuration.


Last updated: Mon May  7 11:13:32 2012
Stack: openais
Current DC: ha08a.mycharts.md - partition with quorum
Version: 1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe
3 Nodes configured, 3 expected votes
4 Resources configured.


Online: [ ha08a.mycharts.md ha08b.mycharts.md ha08c.mycharts.md ]

 Master/Slave Set: ms_drbd0
     Masters: [ ha08a.mycharts.md ]
     Slaves: [ ha08c.mycharts.md ]
 Master/Slave Set: ms_drbd1
     Masters: [ ha08b.mycharts.md ]
     Slaves: [ ha08c.mycharts.md ]
 Resource Group: g_clust06
     p_fs_clust06    (ocf::heartbeat:Filesystem):    Started ha08a.mycharts.md
     p_vip_clust06    (ocf::heartbeat:IPaddr2):    Started ha08a.mycharts.md
     p_mysql_371    (lsb:mysql_371):    Started ha08a.mycharts.md
     p_mysql_372    (lsb:mysql_372):    Started ha08a.mycharts.md
     p_mysql_373    (lsb:mysql_373):    Started ha08a.mycharts.md
     p_mysql_374    (lsb:mysql_374):    Started ha08a.mycharts.md
     p_mysql_375    (lsb:mysql_375):    Started ha08a.mycharts.md
     p_mysql_376    (lsb:mysql_376):    Started ha08a.mycharts.md
     p_mysql_047    (lsb:mysql_047):    Started ha08a.mycharts.md
     p_mysql_100    (lsb:mysql_100):    Started ha08a.mycharts.md
     p_mysql_379    (lsb:mysql_379):    Started ha08a.mycharts.md
     p_mysql_377    (lsb:mysql_377):    Started ha08a.mycharts.md
     p_mysql_378    (lsb:mysql_378):    Started ha08a.mycharts.md
     p_mysql_380    (lsb:mysql_380):    Started ha08a.mycharts.md
     p_mysql_381    (lsb:mysql_381):    Started ha08a.mycharts.md
     p_mysql_382    (lsb:mysql_382):    Started ha08a.mycharts.md
     p_mysql_383    (lsb:mysql_383):    Started ha08a.mycharts.md
     p_mysql_384    (lsb:mysql_384):    Started ha08a.mycharts.md
     p_mysql_385    (lsb:mysql_385):    Started ha08a.mycharts.md
     p_mysql_386    (lsb:mysql_386):    Started ha08a.mycharts.md
     p_mysql_387    (lsb:mysql_387):    Started ha08a.mycharts.md
     p_mysql_002    (lsb:mysql_002):    Started ha08a.mycharts.md
     p_mysql_035    (lsb:mysql_035):    Started ha08a.mycharts.md
     p_mysql_049    (lsb:mysql_049):    Started ha08a.mycharts.md
     p_mysql_097    (lsb:mysql_097):    Started ha08a.mycharts.md
     p_mysql_024    (lsb:mysql_024):    Started ha08a.mycharts.md
     p_mysql_077    (lsb:mysql_077):    Started ha08a.mycharts.md
     p_mysql_084    (lsb:mysql_084):    Started ha08a.mycharts.md
     p_mysql_113    (lsb:mysql_113):    Started ha08a.mycharts.md
     p_mysql_116    (lsb:mysql_116):    Started ha08a.mycharts.md
     p_mysql_388    (lsb:mysql_388):    Started ha08a.mycharts.md
     p_mysql_389    (lsb:mysql_389):    Started ha08a.mycharts.md
     p_mysql_390    (lsb:mysql_390):    Started ha08a.mycharts.md
     p_mysql_391    (lsb:mysql_391):    Started ha08a.mycharts.md
     p_mysql_392    (lsb:mysql_392):    Started ha08a.mycharts.md
     p_mysql_393    (lsb:mysql_393):    Started ha08a.mycharts.md
     p_mysql_394    (lsb:mysql_394):    Started ha08a.mycharts.md
     p_mysql_395    (lsb:mysql_395):    Started ha08a.mycharts.md
     p_mysql_396    (lsb:mysql_396):    Started ha08a.mycharts.md
     p_mysql_397    (lsb:mysql_397):    Started ha08a.mycharts.md
     p_mysql_398    (lsb:mysql_398):    Started ha08a.mycharts.md
     p_mysql_399    (lsb:mysql_399):    Started ha08a.mycharts.md
     p_mysql_400    (lsb:mysql_400):    Started ha08a.mycharts.md
     p_mysql_401    (lsb:mysql_401):    Started ha08a.mycharts.md
     p_mysql_402    (lsb:mysql_402):    Started ha08a.mycharts.md
     p_mysql_403    (lsb:mysql_403):    Started ha08a.mycharts.md
     p_mysql_404    (lsb:mysql_404):    Started ha08a.mycharts.md
     p_mysql_405    (lsb:mysql_405):    Started ha08a.mycharts.md
     p_mysql_104    (lsb:mysql_104):    Started ha08a.mycharts.md
     p_mysql_406    (lsb:mysql_406):    Started