Hi,

On Thu, Nov 08, 2007 at 11:32:07AM +0900, HIDEO YAMAUCHI wrote:
> Hi,
>
> I tested the behavior of Heartbeat related to split-brain.
> I just checked recovery from split-brain.
>
> I assume the following situation.
>
> 1) A two-node Active/Standby cluster group.
> 2) Heartbeat started while there was a problem in the LAN used for
>    Heartbeat communication.
> 3) A DC starts on each node within a few minutes.
> 4) A resource starts on each node.
> 5) Heartbeat communication recovers.
>
> After this, the nodes' view of each other was strange.
> I then tried to stop the Heartbeat service on each node.
> Heartbeat stopped on one node, but did not stop on the other node.
>
> Version 2.1.2 and the development version gave the same results.
>
> I think it is a problem that Heartbeat does not stop on both nodes.

Not sure, but this looks suspicious:

dl380g5c/ha-log:crmd[31979]: 2007/11/08_10:40:16 info: do_shutdown_req: Sending shutdown request to DC: <null>

After that, crmd makes no effort to exit. Another issue could be
that for about two minutes after the split brain healed, that
node couldn't set the DC:

crmd[31979]: 2007/11/08_10:38:42 info: update_dc: Set DC to <null> (<null>)
...

There's also an unusual period of inactivity:

crmd[31979]: 2007/11/08_10:38:48 notice: populate_cib_nodes: Node: dl380g5c (uuid: a9abdd7e-0a39-40cd-bea5-74494ad97f89)
crmd[31979]: 2007/11/08_10:40:11 notice: crmd_client_status_callback: Status update: Client dl380g5d/crmd now has status [offline]

Thanks,

Dejan

> Regards,
> Hideo Yamauchi.
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
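
For anyone who wants to look for the same symptoms in their own ha-log, here is a rough Python sketch that flags long silences from crmd and lines where the DC shows up as <null>. It is not part of Heartbeat. The function names (parse, find_gaps, null_dc_lines) and the 60-second threshold are made up for illustration, and the regular expression assumes the line layout seen in the log excerpts quoted in this thread.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the excerpts in this thread:
#   crmd[31979]: 2007/11/08_10:38:48 notice: populate_cib_nodes: ...
LINE = re.compile(
    r"(\w+)\[(\d+)\]:\s+"
    r"(\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2})\s+"
    r"(\w+):\s+(.*)"
)


def parse(line):
    """Return (daemon, timestamp, message), or None if the line does not match."""
    m = LINE.search(line)
    if m is None:
        return None
    daemon, _pid, stamp, _level, message = m.groups()
    return daemon, datetime.strptime(stamp, "%Y/%m/%d_%H:%M:%S"), message


def find_gaps(lines, daemon="crmd", threshold=timedelta(seconds=60)):
    """List (start, end, length) for each silence of at least `threshold`."""
    gaps = []
    previous = None
    for line in lines:
        record = parse(line)
        if record is None or record[0] != daemon:
            continue
        stamp = record[1]
        if previous is not None and stamp - previous >= threshold:
            gaps.append((previous, stamp, stamp - previous))
        previous = stamp
    return gaps


def null_dc_lines(lines):
    """Return the lines that mention the DC together with <null>."""
    return [line for line in lines if "DC" in line and "<null>" in line]


if __name__ == "__main__":
    sample = [
        "crmd[31979]: 2007/11/08_10:38:42 info: update_dc: Set DC to <null> (<null>)",
        "crmd[31979]: 2007/11/08_10:38:48 notice: populate_cib_nodes: Node: dl380g5c (uuid: a9abdd7e-0a39-40cd-bea5-74494ad97f89)",
        "crmd[31979]: 2007/11/08_10:40:11 notice: crmd_client_status_callback: Status update: Client dl380g5d/crmd now has status [offline]",
        "dl380g5c/ha-log:crmd[31979]: 2007/11/08_10:40:16 info: do_shutdown_req: Sending shutdown request to DC: <null>",
    ]
    for start, end, length in find_gaps(sample):
        print("crmd silent from", start, "to", end, "for", length)
    for line in null_dc_lines(sample):
        print("null DC:", line)
```

Run against the four lines quoted above, this reports one silence of 83 seconds (10:38:48 to 10:40:11) and two lines with a <null> DC. The threshold is arbitrary; a busy cluster may need a smaller value and an idle one a larger value.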
