On 02/06/17 01:51 AM, Attila Megyeri wrote:
> Thanks.
>
> We have had several clusters working fine for several years without STONITH,
> so we did not really bother to implement it.

I have been driving a car for 20+ years, and I have never needed my
seatbelt, thankfully. That you haven't needed stonith yet is not an
indication that stonith is not required. You've just been lucky so far.

> As for the failed actions - I cannot recall since when these are not cleared,
> but they aren't.
>
> When I check the pengine log on the DC, at every recheck interval I see lines
> like:
>
> ... pengine: warning: unpack_rsc_op: Processing failed op start for
> jboss_admin1 on ctadmin1: unknown error (1)
>
> or
> ... pengine: warning: unpack_rsc_op: Processing failed op monitor for
> jboss_abssrv2 on ctabs2: unknown error (1)
>
>
> And these are the failed actions visible in the crm_mon -f as well:
>
>
> Failed actions:
> jboss_admin1_start_0 (node=ctadmin1, call=120, rc=1, status=Timed Out,
> last-rc-change=Thu Jun 1 14:17:31 2017
> , queued=40001ms, exec=0ms
> ): unknown error
> jboss_abssrv2_monitor_10000 (node=ctabs2, call=106, rc=1,
> status=complete, last-rc-change=Thu Jun 1 14:13:36 2017
> , queued=0ms, exec=0ms
> ): unknown error
>
>
> If I do a resource cleanup, the errors are gone.
>
> At the same time I see no actions on the mentioned nodes - this log is from
> the DC...
>
> On the mentioned nodes a regular monitoring operation is performed, and
> results in 0 - no error.
>
> What am I missing here?
>
>
>
>> -----Original Message-----
>> From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensburg.de]
>> Sent: Thursday, June 1, 2017 8:34 AM
>> To: users@clusterlabs.org
>> Subject: [ClusterLabs] Antw: Re: Antw: clearing failed actions
>>
>>>>> Digimer <li...@alteeve.ca> wrote on 01.06.2017 at 00:03 in message
>> <50aad2be-185b-0348-6a93-987034c9c...@alteeve.ca>:
>> [...]
>>> I don't know, but according to Ken's last email, what you're seeing is
>>> expected. I replied because of the misunderstanding of the roles
>>> quorum and fencing play. Running a cluster without fencing is dangerous.
>>
>> I'd recommend this: Enable a working STONITH.
>> Then if you see your cluster never uses STONITH, and everything works
>> fine, and you feel you don't need it, then you can get rid of it.
>> But don't try the other way 'round: Omit STONITH, expecting the cluster
>> would work flawlessly, then (not) add STONITH.
>>
>> Regards,
>> Ulrich
>>
>> _______________________________________________
>> Users mailing list: Users@clusterlabs.org
>> http://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould