On Fri, May 21, 2010 at 1:05 PM, Andrew Beekhof <[email protected]> wrote:

>
> >
> > Yes, he said that In 1.0 it becomes ignored after the specified
> > interval. I wasn't sure what he meant by that. I thought perhaps he
> > meant it would ignore future failures and not fail over.
>
> No, sorry. In 1.0 you have to clear out the fail-counts manually.
> Yes, its not ideal.
> _______________________________________________
>
>
Yes, but how do I get crm_mon to reflect that?
In my case (1.0.8) I have:

crm_mon -fr

============
Last updated: Fri May 21 17:13:56 2010
Stack: openais
Current DC: ha1 - partition with quorum
Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ ha1 ha2 ]

Full list of resources:
...

Migration summary:
* Node ha1:  pingd=200
* Node ha2:  pingd=200
   SitoWeb:1: migration-threshold=1000000 fail-count=3

Then I run:

# crm resource failcount SitoWeb delete ha2

which returns 0 as its exit code, and indeed:
# crm resource failcount SitoWeb show ha2
scope=status  name=fail-count-SitoWeb value=0

But the value shown by crm_mon doesn't get reset: even if I stop and re-run
crm_mon, I still see "fail-count=3".
Is this a cumulative number of failures since startup, rather than the current
value of the counter inside the cluster?
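For what it's worth, crm_mon reads the fail-count from a node attribute in the
CIB status section, so it can also be inspected and cleared directly with
crm_attribute. A sketch, with the caveats that the exact option letters vary
between Pacemaker versions and that the attribute name for a clone instance
(fail-count-SitoWeb:1 rather than fail-count-SitoWeb) is my assumption:

```shell
# Query the raw status attribute for the clone instance on ha2
# (attribute name "fail-count-SitoWeb:1" is assumed, not confirmed):
crm_attribute -t status -U ha2 -n fail-count-SitoWeb:1 -G

# Delete that attribute from the status section, then re-check crm_mon:
crm_attribute -t status -U ha2 -n fail-count-SitoWeb:1 -D
```

If the clone instance really does store its counter under the ":1" suffix,
that would explain why deleting fail-count-SitoWeb reports value=0 while
crm_mon keeps showing fail-count=3 for SitoWeb:1.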

Gianluca
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
