RE: [Linux-HA] fail count was initialized after recovering from SplitBrain

Junko IKEDA Thu, 27 Sep 2007 02:07:57 -0700

> On 9/13/07, Junko IKEDA <[EMAIL PROTECTED]> wrote:
> > > > once again something about SplitBrain...
> > > > During SplitBrain, I wrecked the resource on the both nodes.
> > > > fail count was increased at this time.
> > > > But after recovering from SplitBrain, fail count returned to zero on both!
> > > > Is this due to the restart of crmd or pengine/tengine?
> > >
> > > Most probably. The fail count belongs to the status section which
> > > is not saved.
> >
> > Where is the status section saved at?
> > I thought that CIB kept the status.
> > cib process seems not to be restarted in this case...
> 
> it's reset whenever a node joins the cluster


Sorry to keep repeating myself, but resetting CIB information whenever
a node joins might cause confusion.
Besides, when I tried the following case, the return code of the start
action was not reset.

1) There are two nodes: an active node and a standby node.
2) One resource is running on the active node.
3) SplitBrain occurs.
4) The resource tries to start on both nodes.
   I make it fail on purpose on the standby node,
   so the return code of the start action is -1 on standby.
   (This worked as expected.)
5) After recovering from SplitBrain, the return code on the standby node
   was "-2", and crm_mon on the active node also showed it as -2.
   
Why did it change from -1 to -2?
The return code is kept in <status>, but it isn't reset when a node joins.
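
For anyone who wants to repeat the comparison in steps 4) and 5), here is
a minimal sketch of how two snapshots of the <status> section could be
diffed. It is only an illustration: the two XML snippets are hand-written,
and the element and attribute names (node_state, lrm_resource, lrm_rsc_op,
operation, rc_code) are assumptions about the layout, not output copied
from my cluster. The helper names are made up as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written snapshots. Element and attribute names are
# assumptions about the <status> layout, not real cluster output.
BEFORE = """
<status>
  <node_state uname="standby">
    <lrm_resource id="rsc1">
      <lrm_rsc_op operation="start" rc_code="-1"/>
    </lrm_resource>
  </node_state>
</status>
""".strip()

# Same snapshot with the value reported after the SplitBrain recovery.
AFTER = BEFORE.replace('rc_code="-1"', 'rc_code="-2"')


def start_return_codes(status_xml):
    """Map (node, resource) to the recorded return code of its start action."""
    codes = {}
    root = ET.fromstring(status_xml)
    for node in root.iter("node_state"):
        for rsc in node.iter("lrm_resource"):
            for op in rsc.iter("lrm_rsc_op"):
                if op.get("operation") == "start":
                    key = (node.get("uname"), rsc.get("id"))
                    codes[key] = int(op.get("rc_code"))
    return codes


def changed_codes(before_xml, after_xml):
    """Return {(node, resource): (old, new)} for start codes that differ."""
    before = start_return_codes(before_xml)
    after = start_return_codes(after_xml)
    return {
        key: (before[key], after[key])
        for key in before
        if key in after and before[key] != after[key]
    }


if __name__ == "__main__":
    print(changed_codes(BEFORE, AFTER))
```

If the value were really reset when a node joins, the entry for the
standby node would disappear from the second snapshot instead of showing
up here with a different value.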

Thanks,
Junko

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
