Re: [Linux-HA] heartbeat strange behavior
Thanks, Lars. Problem solved: I changed the asterisk init script to be idempotent.

Regards,
Douglas

On Wed, May 2, 2012 at 9:25 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote:

> On Mon, Apr 30, 2012 at 01:52:05PM -0300, Douglas Pasqua wrote:
> > Hi friends,
> >
> > I created a Linux HA solution using 2 nodes: node-a and node-b.
> >
> > My /etc/ha.d/ha.cf:
> >
> > use_logd yes
> > keepalive 1
> > deadtime 90
> > warntime 5
> > initdead 120
> > bcast eth6
> > node node-a
> > node node-b
> > crm off
> > auto_failback off
> >
> > My /etc/ha.d/haresources:
> >
> > node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 service1 service2 service3
> >
> > I booted the two nodes together. node-a became master and node-b became
> > slave. Afterwards, I rebooted node-a, and node-b became master. When node-a
> > came back from the reboot, it became slave, because *auto_failback is off*,
> > I think. All as expected up to here.
> >
> > With node-a as the slave, I decided to halt node-a (using the halt command).
> > Then heartbeat on node-b went standby and my cluster was down. The virtual
> > IPs were down too. I expected node-b to stay up. Why did this happen?
> >
> > Some log from node-b:
> >
> > Apr 30 00:02:57 node-b heartbeat: [3082]: info: Received shutdown notice from 'node-a'.
> > Apr 30 00:02:57 node-b heartbeat: [3082]: info: Resources being acquired from node-a.
> > Apr 30 00:02:57 node-b heartbeat: [4414]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
> > Apr 30 00:02:57 node-b harc[4414]: [4428]: info: Running /etc/ha.d/rc.d/status status
> > Apr 30 00:02:57 node-b heartbeat: [4416]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys node-b] to acquire.
> > Apr 30 00:02:57 node-b heartbeat: [3082]: debug: StartNextRemoteRscReq(): child count 1
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4657]: debug: /etc/init.d/asterisk start done. RC=1
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4658]: ERROR: Return code 1 from /etc/init.d/asterisk
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4659]: CRIT: Giving up resources due to failure of asterisk
>
> Because of the above error when starting asterisk.
>
> Maybe your asterisk init script is simply not idempotent.
> Maybe it is broken, or maybe there really was some problem trying to
> start asterisk.
>
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4660]: info: Releasing resource group: node-a x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 asterisk sincronismo notificacao
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4670]: info: Running /etc/init.d/notificacao stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4671]: debug: Starting /etc/init.d/notificacao stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4694]: debug: /etc/init.d/notificacao stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4704]: info: Running /etc/init.d/sincronismo stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4705]: debug: Starting /etc/init.d/sincronismo stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4711]: debug: /etc/init.d/sincronismo stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4720]: info: Running /etc/init.d/asterisk stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4721]: debug: Starting /etc/init.d/asterisk stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4725]: debug: /etc/init.d/asterisk stop done. RC=0
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4741]: info: Running /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
> > Apr 30 00:02:58 node-b ResourceManager[4462]: [4742]: debug: Starting /etc/ha.d/resource.d/IPaddr x.x.x.x/24 stop
> > Apr 30 00:03:29 node-b heartbeat: [3082]: info: node-b wants to go standby [foreign]
> > Apr 30 00:03:39 node-b heartbeat: [3082]: WARN: No reply to standby request. Standby request cancelled.
> > Apr 30 00:04:29 node-b heartbeat: [3082]: WARN: node node-a: is dead
> > Apr 30 00:04:29 node-b heartbeat: [3082]: info: Dead node node-a gave up resources.
> > Apr 30 00:04:29 node-b heartbeat: [3082]: info: Link node-a:eth6 dead.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
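The fix in this thread hinges on idempotency: the LSB convention is that "start" on a service that is already running, and "stop" on one that is already stopped, both count as success (exit code 0). The log above suggests the asterisk script returned 1 on such a repeated start. A minimal sketch of how one might check a script for this, assuming it can safely be started and stopped on a test machine; the names check_idempotency and run_init_script are hypothetical helpers for illustration, not part of heartbeat:

```python
import subprocess
from typing import Callable, List, Tuple

# Signature of a function that runs "<script> <action>" and returns its exit code.
Runner = Callable[[str, str], int]


def run_init_script(script: str, action: str) -> int:
    """Run the init script with one action and return its exit code."""
    return subprocess.run([script, action], check=False).returncode


def check_idempotency(script: str, runner: Runner = run_init_script) -> List[Tuple[str, int]]:
    """Call start, start, stop, stop and collect every non-zero exit code.

    An empty list means the repeated start and the repeated stop both
    reported success, which is the behaviour the LSB convention asks for.
    """
    steps = [
        ("first start", "start"),
        ("second start", "start"),   # service is already running here
        ("first stop", "stop"),
        ("second stop", "stop"),     # service is already stopped here
    ]
    failures = []
    for label, action in steps:
        rc = runner(script, action)
        if rc != 0:
            failures.append((label, rc))
    return failures
```

Calling check_idempotency("/etc/init.d/asterisk") as root on a test node would return an empty list for a well-behaved script. A script with the problem suspected above would typically show up as a failure on "second start".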
[Linux-HA] Robert Koeppl is out of office
I will be out of the office from 06.05.2012 and will return on 15.05.2012. I will answer your message after my return.
[Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?
Hi guys,

We rebooted a standby node of a healthy cluster and suddenly all the resources on the primary node restarted. What's up with that?

Before rebooting the standby node, we did the normal stuff to verify that all was well:

  crm_mon showed all nodes online, in their expected roles, with correct quorum votes
  cat /proc/drbd showed correct DRBD status
  corosync-cfgtool -s showed all rings active without faults

When we rebooted the standby node (ha08c), crm_mon on the primary node (ha08a) showed that all the resources stopped and then restarted, resulting in a brief loss of availability to customers.

Following is what crm_mon showed before server ha08c was rebooted and after it came back up. Following that is our crm configuration.

Last updated: Mon May 7 11:13:32 2012
Stack: openais
Current DC: ha08a.mycharts.md - partition with quorum
Version: 1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe
3 Nodes configured, 3 expected votes
4 Resources configured.

Online: [ ha08a.mycharts.md ha08b.mycharts.md ha08c.mycharts.md ]

 Master/Slave Set: ms_drbd0
     Masters: [ ha08a.mycharts.md ]
     Slaves: [ ha08c.mycharts.md ]
 Master/Slave Set: ms_drbd1
     Masters: [ ha08b.mycharts.md ]
     Slaves: [ ha08c.mycharts.md ]
 Resource Group: g_clust06
     p_fs_clust06 (ocf::heartbeat:Filesystem): Started ha08a.mycharts.md
     p_vip_clust06 (ocf::heartbeat:IPaddr2): Started ha08a.mycharts.md
     p_mysql_371 (lsb:mysql_371): Started ha08a.mycharts.md
     p_mysql_372 (lsb:mysql_372): Started ha08a.mycharts.md
     p_mysql_373 (lsb:mysql_373): Started ha08a.mycharts.md
     p_mysql_374 (lsb:mysql_374): Started ha08a.mycharts.md
     p_mysql_375 (lsb:mysql_375): Started ha08a.mycharts.md
     p_mysql_376 (lsb:mysql_376): Started ha08a.mycharts.md
     p_mysql_047 (lsb:mysql_047): Started ha08a.mycharts.md
     p_mysql_100 (lsb:mysql_100): Started ha08a.mycharts.md
     p_mysql_379 (lsb:mysql_379): Started ha08a.mycharts.md
     p_mysql_377 (lsb:mysql_377): Started ha08a.mycharts.md
     p_mysql_378 (lsb:mysql_378): Started ha08a.mycharts.md
     p_mysql_380 (lsb:mysql_380): Started ha08a.mycharts.md
     p_mysql_381 (lsb:mysql_381): Started ha08a.mycharts.md
     p_mysql_382 (lsb:mysql_382): Started ha08a.mycharts.md
     p_mysql_383 (lsb:mysql_383): Started ha08a.mycharts.md
     p_mysql_384 (lsb:mysql_384): Started ha08a.mycharts.md
     p_mysql_385 (lsb:mysql_385): Started ha08a.mycharts.md
     p_mysql_386 (lsb:mysql_386): Started ha08a.mycharts.md
     p_mysql_387 (lsb:mysql_387): Started ha08a.mycharts.md
     p_mysql_002 (lsb:mysql_002): Started ha08a.mycharts.md
     p_mysql_035 (lsb:mysql_035): Started ha08a.mycharts.md
     p_mysql_049 (lsb:mysql_049): Started ha08a.mycharts.md
     p_mysql_097 (lsb:mysql_097): Started ha08a.mycharts.md
     p_mysql_024 (lsb:mysql_024): Started ha08a.mycharts.md
     p_mysql_077 (lsb:mysql_077): Started ha08a.mycharts.md
     p_mysql_084 (lsb:mysql_084): Started ha08a.mycharts.md
     p_mysql_113 (lsb:mysql_113): Started ha08a.mycharts.md
     p_mysql_116 (lsb:mysql_116): Started ha08a.mycharts.md
     p_mysql_388 (lsb:mysql_388): Started ha08a.mycharts.md
     p_mysql_389 (lsb:mysql_389): Started ha08a.mycharts.md
     p_mysql_390 (lsb:mysql_390): Started ha08a.mycharts.md
     p_mysql_391 (lsb:mysql_391): Started ha08a.mycharts.md
     p_mysql_392 (lsb:mysql_392): Started ha08a.mycharts.md
     p_mysql_393 (lsb:mysql_393): Started ha08a.mycharts.md
     p_mysql_394 (lsb:mysql_394): Started ha08a.mycharts.md
     p_mysql_395 (lsb:mysql_395): Started ha08a.mycharts.md
     p_mysql_396 (lsb:mysql_396): Started ha08a.mycharts.md
     p_mysql_397 (lsb:mysql_397): Started ha08a.mycharts.md
     p_mysql_398 (lsb:mysql_398): Started ha08a.mycharts.md
     p_mysql_399 (lsb:mysql_399): Started ha08a.mycharts.md
     p_mysql_400 (lsb:mysql_400): Started ha08a.mycharts.md
     p_mysql_401 (lsb:mysql_401): Started ha08a.mycharts.md
     p_mysql_402 (lsb:mysql_402): Started ha08a.mycharts.md
     p_mysql_403 (lsb:mysql_403): Started ha08a.mycharts.md
     p_mysql_404 (lsb:mysql_404): Started ha08a.mycharts.md
     p_mysql_405 (lsb:mysql_405): Started ha08a.mycharts.md
     p_mysql_104 (lsb:mysql_104): Started ha08a.mycharts.md
     p_mysql_406 (lsb:mysql_406): Started