On Fri, May 21, 2010 at 5:21 PM, Gianluca Cecchi <gianluca.cec...@gmail.com> wrote:
> On Fri, May 21, 2010 at 1:05 PM, Andrew Beekhof <and...@beekhof.net> wrote:
>>
>> > Yes, he said that in 1.0 it becomes ignored after the specified
>> > interval. I wasn't sure what he meant by that. I thought perhaps he
>> > meant it would ignore future failures and not fail over.
>>
>> No, sorry. In 1.0 you have to clear out the fail-counts manually.
>> Yes, it's not ideal.
>
> Yes, but how do I notify crm_mon of this?
> In my case (1.0.8) I have:
>
> crm_mon -fr
>
> ============
> Last updated: Fri May 21 17:13:56 2010
> Stack: openais
> Current DC: ha1 - partition with quorum
> Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
> 2 Nodes configured, 2 expected votes
> 4 Resources configured.
> ============
>
> Online: [ ha1 ha2 ]
>
> Full list of resources:
> ...
>
> Migration summary:
> * Node ha1: pingd=200
> * Node ha2: pingd=200
>    SitoWeb:1: migration-threshold=1000000 fail-count=3
>
> Then I run:
>
> # crm resource failcount SitoWeb delete ha2
>
> which returns 0 as its exit code. In fact:
>
> # crm resource failcount SitoWeb show ha2
> scope=status  name=fail-count-SitoWeb value=0
>
> But crm_mon doesn't get reset.
> Even if I stop and re-run crm_mon, I still get "fail-count=3".
> Is this a cumulative number of failures since startup, rather than the
> current value of the counter inside the cluster?
>
> Gianluca

Sorry, the correct command to give is:

[r...@ha1 ~]# crm resource failcount SitoWeb:1 delete ha2

and now all is OK with the fail-count line in crm_mon.

Gianluca
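
For anyone hitting the same thing later: the gotcha is that the fail-count
is tracked per clone instance (here "SitoWeb:1", exactly as it appears in
the crm_mon migration summary), so the instance name has to be passed to
the failcount command. A minimal sketch of the check-and-clear sequence,
reusing the resource and node names from this thread; the final "cleanup"
shortcut is an assumption about this crm shell version, so verify it
locally before relying on it:

  # Query the fail-count for the instance that actually failed
  # (note the ":1" suffix taken from the migration summary):
  crm resource failcount SitoWeb:1 show ha2

  # Clear it; the line should disappear from crm_mon's migration summary:
  crm resource failcount SitoWeb:1 delete ha2

  # Assumed alternative: cleanup clears the failed-op history and the
  # fail-count in one step (check that your crm shell supports it):
  crm resource cleanup SitoWeb ha2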