On 10/07/2016 11:10 PM, [email protected] wrote:
> Hi All,
>
> Our user may not necessarily use sbd.
>
> I confirmed that there was a method using WD service of corosync as one
> method not to use sbd.
> Pacemaker watches the process of pacemaker by WD service using CMAP and can
> carry out watchdog.
Have to have a look at that...
But if we establish some in-between-layer in pacemaker we could have this
as one of the possibilities besides e.g. sbd (with enhanced API), going for
a watchdog-device directly, ...
>
>
> We can set up a patch of pacemaker.
Always helpful to discuss/clarify an idea once some code is available ...
> Was the discussion of using WD service over so far?
Not from my pov. Just a day off ;-)
>
>
> Best Regards,
> Hideo Yamauchi.
>
>
> ----- Original Message -----
>> From: Klaus Wenninger <[email protected]>
>> To: Ulrich Windl <[email protected]>; [email protected]
>> Cc:
>> Date: 2016/10/7, Fri 17:47
>> Subject: Re: [ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is
>> frozen, cluster decisions are delayed infinitely
>>
>> On 10/07/2016 08:14 AM, Ulrich Windl wrote:
>>> Klaus Wenninger <[email protected]> wrote on 06.10.2016 at 18:03 in
>>> message <[email protected]>:
>>>> On 10/05/2016 04:22 PM, [email protected] wrote:
>>>>> Hi All,
>>>>>
>>>>>>> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>>>>>>
>>>>>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>>>>>> will reboot the node (unless the watchdog fails).
>>>>> Thank you for comment.
>>>>>
>>>>> We examine watchdog of crmd, too.
>>>>> In addition, I comment after examination advanced.
>>>> Was thinking of doing a small test implementation going
>>>> a little in the direction Lars Ellenberg had been pointing out.
>>>>
>>>> a couple of thoughts I had so far:
>>>>
>>>> - add an API (via DBus or libqb - favoring libqb atm) to sbd
>>>>   an application can use to create a watchdog within sbd
>>> Why has it to be done within sbd?
>> Not necessarily, could be spawned out as well into an own project or
>> something already existent could be taken.
>> Remember to have added a dbus-interface to
>> https://sourceforge.net/projects/watchdog/ for a project once.
>> If you have a suggestion I'm open.
>> Going off sbd would have the advantage of a smooth start:
>>
>> - cluster/pacemaker-watcher are there already and can
>>   be replaced/moved over time
>> - the lifecycle of the daemon (when started/stopped) is
>>   already something that is in the code and in the people's minds
>>
>>>> - parameters for the first are a name and a timeout
>>>>
>>>> - first use-case would be crmd observation
>>>>
>>>> - later on we could think of removing pacemaker dependencies
>>>>   from sbd by moving the actual implementation of
>>>>   pacemaker-watcher and probably cluster-watcher as well
>>>>   into pacemaker - using the new API
>>>>
>>>> - this of course creates sbd dependency within pacemaker so
>>>>   that it would make sense to offer a simpler and self-contained
>>>>   implementation within pacemaker as an alternative
>>> I think the watchdog interface is so simple that you don't need a relay
>>> for it. The only limit I can imagine is the number of watchdogs available of
>>> some specific hardware.
>> That is the point ;-)
>>>>   thus it would be favorable to have the dependency
>>>>   within a non-compulsory pacemaker-rpm so that
>>>>   we can offer an alternative that doesn't use sbd
>>>>   at maybe the cost of being less reliable or one
>>>>   that owns a hardware-watchdog by itself for systems
>>>>   where this is still unused.
>>>>
>>>>   - e.g. via some kind of plugin (Andrew forgive me -
>>>>     no pils ;-) )
>>>>   - or via an additional daemon
>>>>
>>>> What did you have in mind?
>>>> Maybe it makes sense to synchronize...
>>>>
>>>> Regards,
>>>> Klaus
>>>>
>>>>> Best Regards,
>>>>> Hideo Yamauchi.
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> From: Ulrich Windl <[email protected]>
>>>>>> To: [email protected]; [email protected]
>>>>>> Cc:
>>>>>> Date: 2016/10/5, Wed 23:08
>>>>>> Subject: Antw: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen,
>>>>>> cluster decisions are delayed infinitely
>>>>>>
>>>>>> <[email protected]> wrote on 21.09.2016 at 11:52 in message
>>>>>> <[email protected]>:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Was the final conclusion given about this problem?
>>>>>>>
>>>>>>> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
>>>>>>
>>>>>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd
>>>>>> will reboot the node (unless the watchdog fails).
>>>>>>
>>>>>>> We are interested in this problem, too.
>>>>>>>
>>>>>>> Best Regards,
>>>>>>>
>>>>>>> Hideo Yamauchi.
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Users mailing list: [email protected]
>>>>>>> http://clusterlabs.org/mailman/listinfo/users
>>>>>>>
>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>> Bugs: http://bugs.clusterlabs.org
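
To make the API idea from the thread a bit more concrete (a watchdog created from a name and a timeout, with several of them relayed onto one hardware watchdog), here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class `WatchdogMux`, its methods `register`, `feed`, `unregister`, `expired` and `tick`, and the `feed_hardware` callback are hypothetical names, not an existing sbd, pacemaker or corosync interface, and the libqb/DBus transport discussed above is left out entirely.

```python
import time


class WatchdogMux:
    """Relay several named software watchdogs onto one hardware watchdog.

    Hypothetical illustration only: this is not the sbd or pacemaker API.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (timeout in seconds, time of last feed)

    def register(self, name, timeout):
        """Create a watchdog from the two parameters named in the thread."""
        if timeout <= 0:
            raise ValueError("timeout must be positive")
        self._entries[name] = (timeout, self._clock())

    def feed(self, name):
        """Called periodically by the observed client, e.g. crmd."""
        timeout, _ = self._entries[name]
        self._entries[name] = (timeout, self._clock())

    def unregister(self, name):
        """Clean shutdown of a client: stop observing it."""
        del self._entries[name]

    def expired(self):
        """Return the sorted names of clients that missed their timeout."""
        now = self._clock()
        return sorted(
            name
            for name, (timeout, last) in self._entries.items()
            if now - last > timeout
        )

    def tick(self, feed_hardware):
        """Feed the hardware watchdog only while every client is on time.

        Once a client is late (for example a crmd stopped by SIGSTOP),
        the hardware watchdog is no longer fed and will reset the node
        after its own timeout.
        """
        late = self.expired()
        if not late:
            feed_hardware()
        return late


if __name__ == "__main__":
    now = [0.0]  # fake clock so the demo needs no real waiting
    mux = WatchdogMux(clock=lambda: now[0])
    mux.register("crmd", timeout=5.0)

    def feed_hardware():
        # A real relay would write to the watchdog device here.
        print(f"t={now[0]}: hardware watchdog fed")

    mux.tick(feed_hardware)  # crmd is still on time
    now[0] = 6.0             # crmd frozen: no feed() since t=0
    print("late clients:", mux.tick(feed_hardware))
```

The relay never triggers a reset itself; it only stops feeding, so a hang of the relay process has the same effect as a hang of one of its clients. On Linux the `feed_hardware` callback would presumably write to the watchdog device (for example `/dev/watchdog`), which is also where the limit on the number of hardware watchdogs mentioned above comes in.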
