On 9/13/07, Yan Fitterer <[EMAIL PROTECTED]> wrote:
>
> Junko IKEDA wrote:
> >>> Once again, something about SplitBrain...
> >>> During SplitBrain, I wrecked the resource on both nodes.
> >>> The fail count was increased at that time.
> >>> But after recovering from SplitBrain, the fail count returned to
> >>> zero on both!
> >>> Is this due to the restart of crmd or pengine/tengine?
> >> Most probably. The fail count belongs to the status section, which
> >> is not saved.
> >
> > Where is the status section saved?
>
> The status section is never saved to disk. When the cluster is stopped,
> the status section disappears altogether.
>
> > I thought that the CIB kept the status.
>
> Yes, it does. But status has no meaning once the cluster is stopped, so
> it isn't kept. Hence fail counts are reset when the cluster is
> restarted. Likewise, the fail count for a specific node is reset when
> _that_ node is restarted. How else could resources be allowed to start
> after a STONITH operation?
>
> > The cib process seems not to be restarted in this case...
>
> There is no 'cib' process.
Actually, there is :-)

> If I understand things right, the crmd process handles all core CIB
> maintenance operations.

Nope, that's all done by the CIB process.

> Try pstree -p and look for the group of processes where the parent is
> "heartbeat".
>
> HTH
> Yan
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
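P.S. For anyone following along at home, both points above can be checked on a running cluster node. A sketch, assuming the heartbeat 2.x command-line tools are installed (output will of course depend on your cluster):

```shell
# Dump the in-memory status section of the CIB -- the part that is never
# written to disk -- and look for the fail-count node attributes in it:
cibadmin -Q -o status | grep fail-count

# Confirm that cib and crmd run as separate processes, both children of
# the heartbeat master process:
pstree -p | grep -E 'heartbeat|cib|crmd'
```

Once the cluster (or that node) is stopped and restarted, the first command will show the fail-count attributes gone, which is exactly the reset behaviour described above.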
