Hi Andreas,

On 05/29/2012 04:14 PM, Stallmann, Andreas wrote:
> Hi there,
>
> we have here a corosync/pacemaker cluster running tomcat. Sometimes our
> application running inside tomcat fails and tomcat dies.
>
> This, for some reason I don’t understand, leads to an “unmanaged
> failed” state for tomcat displayed in crm_mon. This would not be too
> bad, but at this point the cluster “decides” not to fail over the
> resource to the second node.
>
> My questions:
>
> 1. Is this standard behaviour? Should a failover stop (or not
>    take place at all) if a resource runs into an unmanaged failed state?

Yes, that is the default behaviour. Pacemaker tries to stop the resource; the stop fails, so Pacemaker must assume (worst case) that the resource is still running. At this point STONITH would be triggered to make sure the node, including the resource, is definitely down. Without STONITH, the resource stays unmanaged until the failure is cleared.

> 2. What conditions have to apply before a resource is called
>    “unmanaged failed”?

e.g. stop failures ;-)

> 3. Is there any way of an “automatic recovery” of a resource that
>    ran into an “unmanaged failed” state?

The first attempt should be to fix your application. There is also the "failure-timeout" resource meta-attribute; in combination with the cluster-recheck-interval cluster property, it clears resource failures on a regular basis.

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

> Cheers,
>
> Andreas
>
> --
> CONET Solutions GmbH
> Andreas Stallmann,
> Theodor-Heuss-Allee 19, 53773 Hennef
> Tel.: +49 2242 939-677, Fax: +49 2242 939-393
> Mobil: +49 172 2455051
> Internet: http://www.conet.de, mailto: [email protected]
> <mailto:[email protected]>
>
> ----------------------------
> CONET Solutions GmbH, Theodor-Heuss-Allee 19, 53773 Hennef.
> Registergericht/Registration Court: Amtsgericht Siegburg (HRB Nr. 9136)
> Geschäftsführer/Managing Director: Anke Höfer
> ----------------------------
>
> _______________________________________________
> Pacemaker mailing list: [email protected]
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
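
As a rough illustration of the "failure-timeout" suggestion in the reply above, the two settings could be applied with the crm shell along these lines. The resource name `tomcat` and both time values are placeholder assumptions, not taken from the original poster's configuration, so treat this as a sketch to adapt rather than a recommended setup.

```shell
# Assumption: the resource is called "tomcat"; all values are placeholders.

# Have the cluster re-evaluate its state (and expire old failures) every 5 minutes.
crm configure property cluster-recheck-interval="5min"

# Treat recorded failures of the tomcat resource as expired after 2 minutes.
crm resource meta tomcat set failure-timeout 120s

# Manual alternative: clear the failed state by hand once the cause is fixed.
crm resource cleanup tomcat
```

Because expiry is only evaluated when the cluster rechecks its state, the effective delay before a failure is cleared can be up to the failure-timeout plus one recheck interval.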
