Thanks Andrew.

I'll try the release you've pointed me to.

2009/2/3 Andrew Beekhof <[email protected]>

> On Thu, Jan 29, 2009 at 20:42, Alexander Timofeev
> <[email protected]> wrote:
> > All,
> >
> > I have recently run into strange behavior of the CRM. I have an
> > OCF-compliant RA (ocf-tester considers it to be such).
> > It is supposed to fail over to another node on the very first failure,
> > and it does. However, I noticed that my resource has its fail counter set
> > to INFINITY on the node from which it failed over, and crm_mon reports
> > its start as a failed action. I looked into the logs and found that, for
> > some reason, "start" was called again after "monitor" returned
> > OCF_ERR_GENERIC and "stop" had executed successfully.
> > I assumed that after "monitor" returns an error, HA would call "stop"
> > once. After that, the PE should recalculate scores according to the fail
> > counter value, stickiness and so on, and decide what action should be
> > taken. Instead, "start" was called on the failed resource and the
> > resource was fenced. I had to run crm_resource -C manually to allow my
> > resource to run on this node again.
> >
> > Could anybody suggest how I could debug this to find out what's going
> > on?
>
> This was fixed in
> http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/9919f48d3313
> Alas the release of 1.0.2 is taking longer than hoped due to other
> contractual priorities.
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
