Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

Klaus Wenninger Thu, 08 Sep 2016 03:41:09 -0700

On 09/08/2016 11:51 AM, Shermal Fernando wrote:
> Hi Jehan-Guillaume,
>
> Sorry for disturbing you. It is really important for us to pass this test
> of Pacemaker's resiliency and robustness.
> To my understanding, it is pacemakerd that feeds the watchdog. If only the
> crmd is hung, fencing will not work. Am I correct here?


sbd is observing pacemaker (basically by interfacing with corosync and
reading the cib, both obviously not affected by your test scenario) and
feeds the watchdog if everything seems ok.
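
To make the "feed only if everything seems ok" idea concrete, here is a rough sketch. It is not sbd's actual code: the watchdog is simulated rather than a real watchdog device, and SimulatedWatchdog, monitor_step and the health-check callables are names made up for this illustration. The checks stand in for things like "corosync membership looks fine" and "the cib can be read".

```python
import time


class SimulatedWatchdog:
    """Hypothetical stand-in for a hardware watchdog timer.

    It counts as expired if nobody has fed it within `timeout` seconds.
    A real watchdog would reset the machine at that point.
    """

    def __init__(self, timeout, now=time.monotonic):
        self.timeout = timeout
        self._now = now
        self._last_fed = now()

    def feed(self):
        # Restart the countdown.
        self._last_fed = self._now()

    def expired(self):
        return self._now() - self._last_fed > self.timeout


def monitor_step(watchdog, health_checks):
    """Feed the watchdog only if every health check passes.

    `health_checks` is a list of zero-argument callables returning
    True or False (assumed placeholders for the real checks).
    Returns True if the watchdog was fed in this step.
    """
    if all(check() for check in health_checks):
        watchdog.feed()
        return True
    return False
```

The point of the sketch is that the watchdog only protects against failures that the checks can see: if a hung daemon does not make any check fail, the watchdog keeps getting fed.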

>
> Regards,
> Shermal Fernando
>
> -----Original Message-----
> From: Jehan-Guillaume de Rorthais [mailto:[email protected]] 
> Sent: Thursday, September 08, 2016 3:12 PM
> To: Shermal Fernando
> Cc: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, cluster 
> decisions are delayed infinitely
>
> On Thu, 8 Sep 2016 08:58:15 +0000
> Shermal Fernando <[email protected]> wrote:
>
>> Hi Jehan-Guillaume,
>>
>> Does this mean the watchdog will self-terminate the machine when the crm
>> daemon is frozen?
> This means that if the machine is under such a load that Pacemaker is not
> able to feed the watchdog, the watchdog will fence the machine itself.
>
>> -----Original Message-----
>> From: Jehan-Guillaume de Rorthais [mailto:[email protected]]
>> Sent: Thursday, September 08, 2016 12:52 PM
>> To: Digimer
>> Cc: Cluster Labs - All topics related to open-source clustering 
>> welcomed
>> Subject: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, 
>> cluster decisions are delayed infinitely
>>
>> On Thu, 8 Sep 2016 15:55:50 +0900
>> Digimer <[email protected]> wrote:
>>
>>> On 08/09/16 03:47 PM, Ulrich Windl wrote:
>>>>>>> Shermal Fernando <[email protected]> schrieb am
>>>>>>> 08.09.2016 um
>>>>>>> 06:41 in
>>>> Nachricht
>>>> <8ce6e8d87f896546b9c65ed80d30a4336578c...@lg-spmb-mbx02.lseg.stockex.local>:
>>>>> The whole cluster will fail if the DC (crm daemon) is frozen due 
>>>>> to CPU starvation or hanging while trying to perform an I/O operation.
>>>>> Please share some thoughts on this issue.
>>>> What do you mean by "the whole cluster will fail"? If the DC times out,
>>>> some recovery will take place.
>>> Yup. The starved node should be declared lost by corosync, the 
>>> remaining nodes reform and if they're still quorate, the hung node 
>>> should be fenced. Recovery occurs and life goes on.
>> +1
>>
>> And fencing might come either from outside or from the server itself,
>> using a watchdog.
>
> _______________________________________________
> Users mailing list: [email protected]
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org




