On 10/05/2016 04:22 PM, renayama19661...@ybb.ne.jp wrote:
> Hi All,
>
>>> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>>
>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping
>> crmd will reboot the node (unless the watchdog fails).
>
> Thank you for the comment.
>
> We are examining a watchdog for crmd, too.
> I will comment further once the examination has advanced.

I was thinking of doing a small test implementation, going a little in the
direction Lars Ellenberg had been pointing out.

A couple of thoughts I have had so far:

- Add an API (via DBus or libqb; favoring libqb at the moment) to sbd that
  an application can use to create a watchdog within sbd.
- To start with, the parameters would be a name and a timeout.
- The first use case would be crmd observation.
- Later on we could think of removing the pacemaker dependencies from sbd
  by moving the actual implementation of pacemaker-watcher, and probably
  cluster-watcher as well, into pacemaker, using the new API.
- This of course creates an sbd dependency within pacemaker, so it would
  make sense to offer a simpler, self-contained implementation within
  pacemaker as an alternative. It would therefore be favorable to keep the
  dependency in a non-compulsory pacemaker rpm, so that we can offer an
  alternative that doesn't use sbd, maybe at the cost of being less
  reliable, or one that owns a hardware watchdog by itself on systems
  where it is still unused.
  - e.g. via some kind of plugin (Andrew forgive me - no pils ;-) )
  - or via an additional daemon

What did you have in mind? Maybe it makes sense to synchronize...

Regards,
Klaus

>
> Best Regards,
> Hideo Yamauchi.
>
>
> ----- Original Message -----
>> From: Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de>
>> To: users@clusterlabs.org; renayama19661...@ybb.ne.jp
>> Cc:
>> Date: 2016/10/5, Wed 23:08
>> Subject: Antw: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen,
>> cluster decisions are delayed infinitely
>>
>>>>> <renayama19661...@ybb.ne.jp> wrote on 21.09.2016 at 11:52
>> in message
>> <876439.61305...@web200311.mail.ssk.yahoo.co.jp>:
>>> Hi All,
>>>
>>> Was a final conclusion reached about this problem?
>>>
>>> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping
>> crmd will reboot the node (unless the watchdog fails).
>>
>>> We are interested in this problem, too.
>>>
>>> Best Regards,
>>>
>>> Hideo Yamauchi.
>>>
>>>
>>> _______________________________________________
>>> Users mailing list: Users@clusterlabs.org
>>> http://clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
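
To make the watchdog idea from this thread a bit more concrete (a client
such as crmd registers a watchdog under a name with a timeout, then has to
feed it), here is a rough sketch of just the bookkeeping. It is not sbd
code and not a proposed interface: the class WatchdogRegistry and its
methods register, feed, expired and may_feed_hardware_watchdog are
hypothetical names made up for illustration, and the DBus/libqb transport
is left out entirely.

```python
import time


class WatchdogRegistry:
    """Toy model of named software watchdogs (hypothetical, not sbd's API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock so the logic can be tested
        self._timeouts = {}      # name -> timeout in seconds
        self._last_fed = {}      # name -> clock value at the last feed

    def register(self, name, timeout):
        """Create a watchdog that must be fed at least every `timeout` seconds."""
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        self._timeouts[name] = timeout
        self._last_fed[name] = self._clock()

    def feed(self, name):
        """Record a sign of life from the client that owns `name`."""
        if name not in self._timeouts:
            raise KeyError(name)
        self._last_fed[name] = self._clock()

    def expired(self):
        """Return the sorted names of all watchdogs that were not fed in time."""
        now = self._clock()
        return sorted(
            name
            for name, timeout in self._timeouts.items()
            if now - self._last_fed[name] > timeout
        )

    def may_feed_hardware_watchdog(self):
        """True only while no registered client is overdue."""
        return not self.expired()
```

Under these assumptions, a crmd frozen by SIGSTOP simply stops calling
feed("crmd"); once its timeout has passed, may_feed_hardware_watchdog()
turns False, the hardware watchdog would no longer be fed, and the node
would reboot (unless the watchdog fails).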