On Fri, May 21, 2010 at 5:21 PM, Gianluca Cecchi
<gianluca.cec...@gmail.com> wrote:

> On Fri, May 21, 2010 at 1:05 PM, Andrew Beekhof <and...@beekhof.net> wrote:
>
>>
>> >
>> > Yes, he said that in 1.0 it becomes ignored after the specified
>> > interval. I wasn't sure what he meant by that. I thought perhaps he
>> > meant it would ignore future failures and not fail over.
>>
>> No, sorry. In 1.0 you have to clear out the fail-counts manually.
>> Yes, it's not ideal.
>>
> Yes, but how do I get crm_mon to reflect this?
> In my case (1.0.8) I have:
>
> crm_mon -fr
>
> ============
> Last updated: Fri May 21 17:13:56 2010
> Stack: openais
> Current DC: ha1 - partition with quorum
> Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
> 2 Nodes configured, 2 expected votes
> 4 Resources configured.
> ============
>
> Online: [ ha1 ha2 ]
>
> Full list of resources:
> ...
>
> Migration summary:
> * Node ha1:  pingd=200
> * Node ha2:  pingd=200
>    SitoWeb:1: migration-threshold=1000000 fail-count=3
>
> Then I run:
>
> # crm resource failcount SitoWeb delete ha2
>
> which exits with return code 0.
> And indeed:
> # crm resource failcount SitoWeb show ha2
> scope=status  name=fail-count-SitoWeb value=0
>
> But the value shown by crm_mon doesn't get reset.
> Even if I stop crm_mon and run it again, I still get "fail-count=3".
> Is this a cumulative number of failures since startup rather than the
> current value of the counter inside the cluster?
>
> Gianluca
>

Sorry, my mistake.
The correct command is:
[r...@ha1 ~]# crm resource failcount SitoWeb:1 delete ha2

and now the fail-count line in crm_mon is cleared as expected.
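
To double-check from the shell (the clone instance name SitoWeb:1 and the
node ha2 are of course specific to my setup):

[r...@ha1 ~]# crm resource failcount SitoWeb:1 show ha2
[r...@ha1 ~]# crm_mon -1 -fr

The show command should now report value=0 for the clone instance, and the
one-shot crm_mon no longer lists the fail-count=3 line in the migration
summary.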

Gianluca
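
P.S. If I read the crm shell help correctly, there is also a cleanup command
that removes the resource's operation history on a node and resets its
fail-count in one go:

[r...@ha1 ~]# crm resource cleanup SitoWeb ha2

I haven't tried it for this particular case, so take it as a pointer rather
than a tested recipe.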
