[ClusterLabs] Antw: Re: Corosync: 100% cpu (corosync 2.3.5, libqb 0.17.1, pacemaker 1.1.13)

Ulrich Windl Thu, 06 Aug 2015 23:03:37 -0700

I know that corosync runs at "moderate real-time priority". Leaving aside my
suspicion that this is a work-around for some bugs in corosync, have you
tried running DRBD with real-time priority as well? I have never tried
changing the priority of a kernel thread, though...
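
For anyone who wants to check what a process is actually running at, here is
a minimal sketch using the Linux-only scheduler calls in Python's os module.
The helper names describe_scheduling and make_realtime are made up for
illustration, the priority value 1 is an arbitrary assumption, and the PIDs
of corosync or of the DRBD kernel threads would have to be looked up by hand.
Whether changing the policy of a DRBD kernel thread is safe or helps at all
is untested here.

```python
import os

# Names for the Linux scheduling policies exposed by the os module.
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
}


def describe_scheduling(pid):
    """Return (policy_name, rt_priority) for pid; pid 0 is the caller."""
    policy = os.sched_getscheduler(pid)
    # The kernel may OR the reset-on-fork flag into the policy value.
    policy &= ~os.SCHED_RESET_ON_FORK
    priority = os.sched_getparam(pid).sched_priority
    return POLICY_NAMES.get(policy, "unknown(%d)" % policy), priority


def make_realtime(pid, priority=1):
    """Switch pid to SCHED_RR (hypothetical helper; needs root privileges)."""
    os.sched_setscheduler(pid, os.SCHED_RR, os.sched_param(priority))


if __name__ == "__main__":
    # Show the policy and real-time priority of this script itself.
    print(describe_scheduling(0))
```

Calling describe_scheduling with the corosync PID and with a DRBD thread PID
would show whether the two really run under different policies.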



>>> Pallai Roland <[email protected]> wrote on 06.08.2015 at 15:54 in message
<CALj=1whcfzhjf97dg+ykde7kgdj79ghgybj2nymygd8xyxf...@mail.gmail.com>:
> 2015-08-06 15:24 GMT+02:00 Pallai Roland <[email protected]>:
> 
>>>>   drbdtest1 corosync[4734]:   [MAIN  ] Corosync main process was not
>>>> scheduled for 2590.4512 ms (threshold is 2400.0000 ms). Consider token
>>>> timeout increase.
>>>>
>>>> and even drbd:
>>>>   drbdtest1 kernel: drbd p1: PingAck did not arrive in time.
>>>>
>>>
>>> Kernel module blocked by unrelated userspace app?
>>
>>
>> There is a chance that the nodes are blocking each other, as they are on
>> the same host, and that this is the reason for the DRBD timeout. But it's
>> also weird: how can one guest block another entirely when there are idle
>> cores on the host?
>>
>> All in all, the DRBD timeout was eliminated once a node got more than one
>> logical core.
>>
> 
> I have to correct myself:
> 
> The DRBD timeout is not fixed if only one node has more cores. In that case
> the other node will report PingAck timeouts periodically. I think the
> simplest explanation for this is that a spinning corosync can block even
> kernel threads.
> 
> The DRBD timeout is fixed if both nodes have more logical cores.





