Gianluca Cecchi wrote:
> On Thu, May 20, 2010 at 2:45 PM, mike <[email protected]> wrote:
>
>> ok, I actually went ahead and did a test on my cluster. The results did
>> not occur as I would have expected.
>>
>> I failed ldirectord twice on the main node. I waited 20 minutes and saw
>> this entry in the log file:
>> May 20 08:23:10 lvsuat1a.intranet.mydomain.com pengine: [6589]: notice:
>> get_failcount: Failcount for ldirectord on
>> lvsuat1a.intranet.mydomain.com has expired (limit was 900s)
>>
>> So now I kill ldirectord again, fully expecting it to restart on the
>> same node but instead a failover occurs:
>> May 20 08:36:15 lvsuat1a.intranet.mydomain.com pengine: [6589]: WARN:
>> common_apply_stickiness: Forcing ldirectord away from
>> lvsuat1a.intranet.mydomain.com after 3 failures (max=3)
>>
> So your version of pacemaker should be a 1.0.x one.
> In fact Andrew wrote that the reset is not automatic for that version, while
> it should be for upcoming 1.1
>
> Gianluca
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
Yes, he said that in 1.0 the failcount is ignored after the specified interval. I wasn't sure what he meant by that; I thought perhaps he meant it would ignore future failures and not fail over.
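For what it's worth, here is how I now read the difference between "ignored" and "reset", written as a toy model in Python. This is only a sketch of the behaviour the two log lines suggest, not Pacemaker's actual code: the `FailTracker` class, its field names, and the event times in `replay` are all made up for illustration. The 900s timeout and the threshold of 3 come from the logs above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailTracker:
    """Toy model of a per-resource failcount (hypothetical, not Pacemaker code)."""
    failure_timeout: int       # seconds; 900 in the log above
    migration_threshold: int   # 3 in the log above ("max=3")
    auto_reset: bool           # False: assumed 1.0 behaviour, True: assumed 1.1 behaviour
    count: int = 0
    last_failure: Optional[float] = None

    def expired(self, now: float) -> bool:
        # The stored count is "expired" once the last failure is older than the timeout.
        return self.last_failure is not None and now - self.last_failure > self.failure_timeout

    def effective_count(self, now: float) -> int:
        # An expired count is ignored for placement, but the stored value is untouched.
        return 0 if self.expired(now) else self.count

    def record_failure(self, now: float) -> None:
        # Only the auto-reset variant clears the stale count before counting again.
        if self.auto_reset and self.expired(now):
            self.count = 0
        self.count += 1
        self.last_failure = now

    def forced_away(self, now: float) -> bool:
        return self.effective_count(now) >= self.migration_threshold


def replay(auto_reset: bool):
    """Replay my test: two failures, a 20 minute wait, then a third failure."""
    tracker = FailTracker(failure_timeout=900, migration_threshold=3, auto_reset=auto_reset)
    tracker.record_failure(0)      # illustrative timestamps, in seconds
    tracker.record_failure(60)
    ignored_after_wait = tracker.effective_count(1260) == 0
    tracker.record_failure(2040)
    return ignored_after_wait, tracker.count, tracker.forced_away(2040)


print(replay(auto_reset=False))
print(replay(auto_reset=True))
```

With `auto_reset=False` the old count of 2 is ignored while it is expired but never cleared, so the next failure brings it to 3 and the resource is forced away, which matches what I saw. With `auto_reset=True` the same failure would only count as 1 and the resource would restart in place.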
