[ClusterLabs] Antw: Re: Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

Ulrich Windl Thu, 06 Oct 2016 23:18:09 -0700

>>> Klaus Wenninger <[email protected]> wrote on 06.10.2016 at 18:03 in
message <[email protected]>:
> On 10/05/2016 04:22 PM, [email protected] wrote:
>> Hi All,
>>
>>>> If a user uses sbd, can the cluster avoid the problem of crmd being stopped by SIGSTOP?
>>>
>>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping
>>> crmd will reboot the node (unless the watchdog fails).
>>
>> Thank you for the comment.
>>
>> We are looking into a watchdog for crmd, too.
>> I will comment again once our examination has advanced.
> 
> Was thinking of doing a small test implementation going
> a little in the direction Lars Ellenberg had been pointing out.
> 
> a couple of thoughts I had so far:
> 
> - add an API (via DBus or libqb - favoring libqb atm) to sbd
>   an application can use to create a watchdog within sbd


Why does it have to be done within sbd?

> 
> - parameters for the first are a name and a timeout
> 
> - first use-case would be crmd observation
> 
> - later on we could think of removing pacemaker dependencies
>   from sbd by moving the actual implementation of
>   pacemaker-watcher and probably cluster-watcher as well
>   into pacemaker - using the new API
> 
> - this of course creates sbd dependency within pacemaker so
>   that it would make sense to offer a simpler and self-contained
>   implementation within pacemaker as an alternative

I think the watchdog interface is so simple that you don't need a relay for it.
The only limit I can imagine is the number of watchdogs available on some
specific hardware.
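
To illustrate how small the interface is, here is a minimal sketch, assuming
the standard Linux watchdog character device, where any write counts as a
keepalive and a final 'V' requests a "magic close" (only honored if the driver
supports it and the kernel is not built with NOWAYOUT). The function name
feed_watchdog and its parameters are hypothetical and not part of crmd or sbd.

```python
import os
import time


def feed_watchdog(path="/dev/watchdog", beats=3, interval=1.0):
    """Feed a watchdog device 'beats' times, then try to disarm it.

    Hypothetical helper for illustration only. Opening a real watchdog
    device arms it: if the process is stopped (e.g. by SIGSTOP) before
    the next write, the hardware resets the node.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        for _ in range(beats):
            os.write(fd, b"\0")   # any write is treated as a keepalive
            time.sleep(interval)
        os.write(fd, b"V")        # magic close: ask the driver to disarm
    finally:
        os.close(fd)
```

A daemon such as crmd would call the write from its main loop rather than
from a fixed-count loop, so that a frozen main loop stops the feeding.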

> 
>   thus it would be favorable to have the dependency
>   within a non-compulsory pacemaker-rpm so that
>   we can offer an alternative that doesn't use sbd
>   at maybe the cost of being less reliable or one
>   that owns a hardware-watchdog by itself for systems
>   where this is still unused.
> 
>   - e.g. via some kind of plugin (Andrew forgive me - no pils ;-) )
>   - or via an additional daemon
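
The idea above of a watchdog created by name and timeout could be sketched
roughly as follows. This is only an in-process illustration under my own
assumptions: the class WatchdogRegistry and its methods create, feed and
expired are hypothetical names, not an existing sbd or pacemaker API, and a
real implementation would sit behind libqb IPC and trigger fencing instead of
returning a list.

```python
import time


class WatchdogRegistry:
    """Hypothetical stand-in for software watchdogs keyed by name."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (timeout, deadline)

    def create(self, name, timeout):
        """Register a watchdog; it must be fed within 'timeout' seconds."""
        self._entries[name] = (timeout, self._clock() + timeout)

    def feed(self, name):
        """Push the deadline of an existing watchdog forward."""
        timeout, _ = self._entries[name]
        self._entries[name] = (timeout, self._clock() + timeout)

    def expired(self):
        """Return the sorted names of watchdogs whose deadline has passed."""
        now = self._clock()
        return sorted(n for n, (_, d) in self._entries.items() if now >= d)
```

In this sketch the observer would poll expired() from its own loop and stop
feeding the hardware watchdog as soon as the list is non-empty, so a stopped
crmd would eventually lead to a node reset.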
> 
> What did you have in mind?
> Maybe it makes sense to synchronize...
> 
> Regards,
> Klaus
>  
>>
>>
>> Best Regards,
>> Hideo Yamauchi.
>>
>>
>>
>> ----- Original Message -----
>>> From: Ulrich Windl <[email protected]>
>>> To: [email protected]; [email protected] 
>>> Cc: 
>>> Date: 2016/10/5, Wed 23:08
>>> Subject: Antw: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely
>>>
>>>>>>  <[email protected]> wrote on 21.09.2016 at 11:52 in message
>>> <[email protected]>:
>>>>  Hi All,
>>>>
>>>>  Has a final conclusion been reached on this problem?
>>>>
>>>>  If a user uses sbd, can the cluster avoid the problem of crmd being stopped by SIGSTOP?
>>> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping
>>> crmd will reboot the node (unless the watchdog fails).
>>>
>>>>  We are interested in this problem, too.
>>>>
>>>>  Best Regards,
>>>>
>>>>  Hideo Yamauchi.
>>>>
>>>>
>>>>  _______________________________________________
>>>>  Users mailing list: [email protected] 
>>>>  http://clusterlabs.org/mailman/listinfo/users 
>>>>
>>>>  Project Home: http://www.clusterlabs.org 
>>>>  Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>>>>  Bugs: http://bugs.clusterlabs.org 
> 
> 
> 




