Re: [Pacemaker] The larger cluster is tested.

Andrew Beekhof Wed, 06 Nov 2013 21:11:19 -0800

On 7 Nov 2013, at 12:43 pm, yusuke iida <yusk.i...@gmail.com> wrote:


> Hi, Andrew
> 
> 2013/11/7 Andrew Beekhof <and...@beekhof.net>:
>> 
>> On 6 Nov 2013, at 4:48 pm, yusuke iida <yusk.i...@gmail.com> wrote:
>> 
>>> Hi, Andrew
>>> 
>>> I tested with the following version:
>>> https://github.com/ClusterLabs/pacemaker/commit/3492fec7fe58a6fd94071632df27d3fd3fc3ffe3
>>> 
>>> I tried load-threshold at 60%, 40%, and 20%.
>>> 
>>> However, the problem was not solved.
>>> The behaviour did not change; the timeouts still occur.
>> 
>> That is extremely surprising.  I will have a look at your logs today.
>> How many cores do these machines have btw?
> 
> The machines I am using for the test are KVM virtual machines.
> There are four physical servers, and four virtual machines are started
> on each server.
> Each physical server has four cores, and I assign a separate core to
> each virtual machine.
> Each virtual machine currently has one CPU assigned.
> Each virtual machine is allocated 2048 MB of memory.

I think I understand what's happening...

The throttling code is designed to keep the cib's CPU usage from reaching 100% 
(i.e. 1 core completely busy).
In a single-core setup, that's already much too late, and with 16 nodes I can 
easily imagine that even 1 job per machine is going to be too much for an 
underpowered CPU.

I'm currently experimenting with: 

   http://paste.fedoraproject.org/52283/37994581

which may help on both fronts.

Essentially it is trying to dynamically infer a "good" value for batch-limit 
when the CIB is using too much CPU.
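
To illustrate the general idea only (this is a rough sketch, not the patch 
linked above and not Pacemaker code): one simple way to infer a batch limit 
from load is to back off quickly when the CIB's CPU usage is over a threshold 
and to probe upward slowly when there is headroom. The function name, the 0.8 
threshold, the bounds, and the halve/increment policy are all assumptions 
invented for this example.

```python
def next_batch_limit(current_limit, cib_cpu_load, load_threshold=0.8,
                     min_limit=1, max_limit=64):
    """Return an adjusted batch limit for the next transition.

    Hypothetical helper, for illustration only.
    cib_cpu_load is a fraction of one core (1.0 == one core fully busy).
    """
    if cib_cpu_load > load_threshold:
        # Overloaded: halve the limit, but never go below min_limit.
        return max(min_limit, current_limit // 2)
    if cib_cpu_load < load_threshold / 2:
        # Plenty of headroom: raise the limit by one, capped at max_limit.
        return min(max_limit, current_limit + 1)
    # In between: leave the limit unchanged.
    return current_limit


if __name__ == "__main__":
    limit = 30
    # Made-up load samples, not measurements from any real cluster.
    for load in (0.95, 0.95, 0.90, 0.60, 0.30, 0.30):
        limit = next_batch_limit(limit, load)
        print(f"load={load:.2f} -> batch-limit={limit}")
```

With a policy like this the limit drops fast under sustained load and recovers 
gradually, which is the kind of behaviour you would want on a single-core 
machine where the cib competes with everything else for the same CPU.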



_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
