On Thu, Dec 9, 2010 at 12:14 PM, Evgeniy Ivanov <lolkaanti...@gmail.com> wrote:
> Hi,
>
> What is the best way to check whether PM is still alive?

"ps axf | grep crmd" is one approach
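
A minimal sketch of that check, for scripting it. Note that the plain
"grep crmd" pipeline can also match the grep process itself, so this version
uses pgrep with an exact name match instead. Assumptions: pgrep (from procps)
is installed, and process_alive is a hypothetical helper name, not part of
Pacemaker. It only tells you the process exists, not that it is responsive.

```shell
#!/bin/sh
# process_alive NAME: succeed (exit 0) if at least one process whose
# name is exactly NAME is running. Assumes pgrep is available.
process_alive() {
    pgrep -x "$1" >/dev/null 2>&1
}

# crmd is the daemon name from the one-liner above.
if process_alive crmd; then
    echo "crmd is running"
else
    echo "crmd is not running"
fi
```

The function's exit status can be used directly in a monitoring script,
e.g. "process_alive crmd || do_something".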

>
> We tried the following approach: there is a softdog timer (max value is
> 300s + an extra 60s to give PM another chance) that is initially started
> and checked by a third party. A clone named HA_alive fails in monitor
> (except the first time); the monitor interval is 200s. HA_alive:start
> should reset that softdog timer. It looks like PM sometimes doesn't restart
> the failed resource within those 360s for no apparent reason: the system
> is almost idle.

Strange.  Should work. Details?

> Another approach we used was based on "crmadmin -S this_node": start a
> timer if there are any problems, then compare the output of "crm resource
> status" at different times to see whether anything is happening on the
> system (i.e. PM is working and the bad crmadmin -S result was caused by
> high load on PM). That doesn't work well either.
>
> --
> Evgeniy Ivanov
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>
