Hi,

On Thu, Jan 29, 2009 at 10:42:36PM +0300, Alexander Timofeev wrote:
> All,
> 
> I have recently run into strange behavior in the CRM. I have an OCF-compliant
> RA (ocf-tester considers it to be such).
> It is supposed to fail over to the other node on the very first failure, and it
> does. I noticed that my resource has its fail counter set to INFINITY on the
> node from which it failed over, and crm_mon reports its start as a failed
> action. I looked into the logs and found that for some reason "start" was
> called again after "monitor" returned OCF_ERR_GENERIC and "stop" had
> executed successfully.
> I assumed that after "monitor" returns an error, HA would call "stop" once.
> After that the PE should recalculate scores according to the fail counter
> value, stickiness and so on, and decide what action should be taken.

Right.

> Instead, "start" was called on the failed resource and the resource was fenced.

Fenced? There's no resource-level fencing.

I guess that this is a case of start-failure-is-fatal.
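
A minimal sketch of how one might check this from the command line,
assuming the Pacemaker CLI tools (crm_attribute, crm_failcount,
crm_resource) are installed. The resource and node names are
placeholders, not taken from the original report, and the exact flag
spellings can differ between versions, so check each tool's --help first.

```shell
#!/bin/sh
# Placeholder names (assumptions): substitute your own resource and node.
RSC="my_resource"
NODE="node1"

# Query the cluster-wide start-failure-is-fatal property.
show_start_failure_policy() {
    crm_attribute -t crm_config -n start-failure-is-fatal -G
}

# Show the fail count recorded for the resource on the given node.
show_failcount() {
    crm_failcount -G -U "$NODE" -r "$RSC"
}

# Clear the failed state so the resource may run on that node again
# (the manual crm_resource -C step mentioned in the quoted message).
clear_failed_state() {
    crm_resource -C -r "$RSC" -H "$NODE"
}

# Only query the cluster if the tools are actually present.
if command -v crm_attribute >/dev/null 2>&1; then
    show_start_failure_policy
    show_failcount
else
    echo "Pacemaker CLI tools not found; nothing queried."
fi
```

If the property is reported as true, a single failed "start" is enough to
set the fail count to INFINITY on that node, which would match the
behavior described above.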

> I had
> to run crm_resource -C manually to allow my resource to run again on this node.
> 
> Could anybody suggest how I could debug this to find out what's going on?

Please post your configuration and logs. Try with hb_report.

Thanks,

Dejan

> 
> TIA
> 
> Alex T
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
