Thanks, Andrew. I'll try the release you've pointed to.

2009/2/3 Andrew Beekhof <[email protected]>

> On Thu, Jan 29, 2009 at 20:42, Alexander Timofeev
> <[email protected]> wrote:
> > All,
> >
> > I have recently run into strange behavior of the CRM. I have an
> > OCF-compliant RA (ocf-tester considers it to be such).
> > It is supposed to fail over to the other node on the very first failure,
> > and it does. I noticed that my resource has its fail counter set to
> > INFINITY on the node from which it failed over, and crm_mon reports its
> > start as a failed action. I looked into the logs and found that for some
> > reason "start" was called again after "monitor" returned OCF_ERR_GENERIC
> > and "stop" had executed successfully.
> > I assumed that after "monitor" returns an error, HA would call "stop" once.
> > After that the PE should recalculate scores according to the fail counter
> > value, stickiness and so on, and decide what action should be taken.
> > Instead "start" was called on the failed resource and the resource was
> > fenced. I had to run crm_resource -C manually to allow my resource to run
> > again on this node.
> >
> > Could anybody suggest how I could debug this to find out what's going on?
>
> This was fixed in
> http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/9919f48d3313
> Alas the release of 1.0.2 is taking longer than hoped due to other
> contractual priorities.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
