On Thu, Dec 9, 2010 at 12:14 PM, Evgeniy Ivanov <lolkaanti...@gmail.com> wrote:
> Hi,
>
> What is the best way to check if PM is still alive?

"ps axf | grep crmd" is one approach.

> We tried the following approach: a softdog timer (max value is
> 300s, plus an extra 60s to give PM another chance) is started initially
> and checked by a third party. A clone named HA_alive fails in monitor
> (except the first time); the monitor interval is 200s. HA_alive:start
> should reset that softdog timer. It looks like PM sometimes doesn't
> restart the failed resource within those 360s for no reason: the system
> is almost idle.

Strange. Should work. Details?

> Another approach we used was based on "crmadmin -S this_node": start a
> timer if there are any problems, then compare "crm resource status" at
> different times to see whether something is happening on the system
> (i.e. PM works and the bad result of crmadmin -S is caused by high load
> on PM). It doesn't work well either.
>
> --
> Evgeniy Ivanov
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
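
For reference, the timing in the softdog approach quoted above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the poster's implementation and not a Pacemaker or softdog API: DeadlineTimer and recovery_budget are hypothetical names, and the sketch assumes the failing monitor runs one full interval (200s) after HA_alive:start last reset the timer.

```python
import time


class DeadlineTimer:
    """Software stand-in for the softdog timer described in the thread.

    Hypothetical helper for illustration; not a Pacemaker or softdog API.
    """

    def __init__(self, timeout_s=300, grace_s=60, clock=time.monotonic):
        self.limit_s = timeout_s + grace_s  # 300s + 60s = 360s in the thread
        self.clock = clock                  # injectable so the logic can be tested
        self.last_reset = clock()

    def reset(self):
        # What HA_alive:start is expected to do on every (re)start.
        self.last_reset = self.clock()

    def remaining(self):
        # Seconds left before the third party would consider PM dead.
        return self.limit_s - (self.clock() - self.last_reset)

    def expired(self):
        return self.remaining() <= 0


def recovery_budget(timeout_s=300, grace_s=60, monitor_interval_s=200):
    """Seconds left for stop + start once a monitor failure is detected.

    Assumption: the failing monitor runs one full interval after the reset.
    """
    return timeout_s + grace_s - monitor_interval_s
```

With the numbers from the thread, recovery_budget() returns 160, so PM would have roughly 160s to notice the failed monitor and finish stop and start before the timer fires. Any delay in scheduling the monitor shrinks that budget.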