On 7 Nov 2013, at 12:43 pm, yusuke iida <yusk.i...@gmail.com> wrote:
> Hi, Andrew
>
> 2013/11/7 Andrew Beekhof <and...@beekhof.net>:
>>
>> On 6 Nov 2013, at 4:48 pm, yusuke iida <yusk.i...@gmail.com> wrote:
>>
>>> Hi, Andrew
>>>
>>> I tested by the following versions.
>>> https://github.com/ClusterLabs/pacemaker/commit/3492fec7fe58a6fd94071632df27d3fd3fc3ffe3
>>>
>>> load-threshold was checked at 60%, 40%, and 20%.
>>>
>>> However, the problem was not solved.
>>> It will not change but timeout will occur.
>>
>> That is extremely surprising. I will have a look at your logs today.
>> How many cores do these machines have btw?
>
> The machine which I am using by the test is a virtual machine of KVM.
> There are four physical servers. Four virtual machines are started on
> each server.
> Has four core physical server, I am assigned a core of separate to the
> virtual machine.
> The number of CPUs currently assigned to the virtual machine is one piece.
> The memory is assigning 2048 MB per set.

I think I understand what's happening...

The throttling code is designed to keep the cib's CPU usage from reaching 100% (i.e. 1 core completely busy).
In a single-core setup, that's already much too late, and with 16 nodes I can easily imagine that even 1 job per machine is going to be too much for an underpowered CPU.

I'm currently experimenting with:

   http://paste.fedoraproject.org/52283/37994581

which may help on both fronts.

Essentially it is trying to dynamically infer a "good" value for batch-limit when the CIB is using too much CPU.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
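
For anyone following along, here is a rough sketch of what "dynamically infer a good value for batch-limit" could look like. This is not the code in the paste above and not Pacemaker's actual implementation; it only illustrates the general idea with a simple back-off rule. The function name `next_batch_limit`, its parameters, and the default values are all hypothetical, and CPU usage is assumed to be given as a fraction of one core.

```python
def next_batch_limit(current_limit, cib_cpu, load_threshold=0.8,
                     min_limit=1, max_limit=30):
    """Return an adjusted job limit from the CIB's CPU usage.

    cib_cpu is the fraction of one core the CIB is using (0.0 to 1.0).
    All names and defaults here are illustrative assumptions.
    """
    if cib_cpu < 0.0:
        raise ValueError("cib_cpu must not be negative")
    if cib_cpu > load_threshold:
        # Overloaded: back off quickly by halving, but never below min_limit.
        return max(min_limit, current_limit // 2)
    if cib_cpu < load_threshold / 2:
        # Plenty of headroom: probe upwards slowly, one job at a time.
        return min(max_limit, current_limit + 1)
    # In between: leave the limit where it is.
    return current_limit


# Feed in a few made-up CPU samples and watch the limit adapt.
limit = 16
for cpu in (0.95, 0.90, 0.50, 0.10):
    limit = next_batch_limit(limit, cpu)
    print(f"cib cpu {cpu:.0%} -> batch limit {limit}")
```

On a single-core guest like the ones described in the quoted text, one would presumably want a much lower `load_threshold` than on a multi-core host, since the CIB competes with everything else for the same core. The right value is a guess that would need testing.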