Junko IKEDA wrote:
Once again, something about SplitBrain...
During the SplitBrain, I wrecked the resource on both nodes, and the
fail count was increased at that time.
But after recovering from the SplitBrain, the fail count returned to
zero on both nodes!
Is this due to the restart of crmd or pengine/tengine?
Most probably. The fail count belongs to the status section, which
is not saved.
Where is the status section saved?
The status section is never saved to disk. When the cluster is stopped,
the status section disappears altogether.
I thought that CIB kept the status.
Yes, it does. But status has no meaning once the cluster is stopped, so
it isn't kept. Hence fail counts are reset when the cluster is restarted.
As well, the failcount for a specific node will be reset when _that_
node is restarted. How else could resources be allowed to start after a
STONITH operation?
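To illustrate the point, here is a minimal sketch of the idea: fail counts live as transient per-node attributes inside the CIB's status section, so rebuilding the CIB without that section necessarily loses them. The element and attribute names below are simplified for illustration and are not the real CIB schema, which varies by version.

```python
import xml.etree.ElementTree as ET

# Simplified CIB-like document: the configuration section is persistent,
# while the status section (holding fail counts) is transient.
# Names here are illustrative, not the actual CIB schema.
cib_xml = """
<cib>
  <configuration>
    <resources>
      <primitive id="my_resource"/>
    </resources>
  </configuration>
  <status>
    <node_state uname="node1">
      <transient_attributes>
        <nvpair name="fail-count-my_resource" value="2"/>
      </transient_attributes>
    </node_state>
  </status>
</cib>
"""

def fail_count(cib, node, resource):
    """Look up a fail count in the transient status section, defaulting to 0."""
    for state in cib.iter("node_state"):
        if state.get("uname") != node:
            continue
        for nv in state.iter("nvpair"):
            if nv.get("name") == "fail-count-" + resource:
                return int(nv.get("value"))
    return 0

cib = ET.fromstring(cib_xml)
print(fail_count(cib, "node1", "my_resource"))  # 2 while the cluster is up

# A cluster restart rebuilds the CIB without the old status section,
# so the fail count falls back to its default of zero:
cib.remove(cib.find("status"))
print(fail_count(cib, "node1", "my_resource"))  # 0
```

The configuration section survives the round trip untouched; only the status (and with it the fail count) is gone, which matches the behaviour described above.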
The cib process does not seem to be restarted in this case...
There is no 'cib' process. If I understand things right, the crmd
process handles all core CIB maintenance operations. Try pstree -p and
look for the group of processes whose parent is "heartbeat".
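What pstree shows is just the parent/child relationship recorded in /proc. As a rough sketch of doing the same lookup programmatically (assuming a Linux /proc filesystem; the actual process names under heartbeat vary with the version):

```python
import os

def children_of(ppid):
    """Return the PIDs whose parent is ppid, by reading /proc/<pid>/stat (Linux)."""
    kids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % entry) as f:
                stat = f.read()
        except OSError:
            continue  # process exited while we were scanning
        # Field 4 of /proc/<pid>/stat is the parent PID; the comm field
        # (in parentheses) may contain spaces, so split after the last ')'.
        fields = stat.rsplit(")", 1)[1].split()
        if int(fields[1]) == ppid:
            kids.append(int(entry))
    return kids

# Pointing this at the heartbeat master process's PID would list its
# managed child processes; here we just show that the current process
# appears among the children of its own parent.
print(os.getpid() in children_of(os.getppid()))  # True on Linux
```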
HTH
Yan
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems