Re: [Pacemaker] The larger cluster is tested.

Andrew Beekhof Mon, 11 Nov 2013 15:08:54 -0800

On 11 Nov 2013, at 11:48 pm, yusuke iida <yusk.i...@gmail.com> wrote:


> I also checked the execution of the graph.
> From midway through, the number of pending actions is capped at 16, so
> batch-limit appears to be taking effect.
> Even with jobs restricted by batch-limit, though, two or more jobs are
> always fired every second.
> Each of those jobs returns a result, which in turn generates a CIB
> synchronization message.
> A node that keeps receiving synchronization messages processes them
> first and postpones internal IPC messages.
> I think that is what caused the timeout.

What load-threshold were you running this with?

I see this in the logs:
"Host vm10 supports a maximum of 4 jobs and throttle mode 0100.  New job limit is 1"

Have you set LRMD_MAX_CHILDREN=4 on these nodes?
I wouldn't recommend that for a single-core VM.  I'd let the default of
2*cores be used.
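
As a rough illustration of the "2*cores" default mentioned above, here is a minimal C sketch. It is not Pacemaker's code: the helper names default_max_children() and online_cores() are hypothetical, and reading the core count via sysconf(_SC_NPROCESSORS_ONLN) is an assumption that holds on Linux/glibc but is not guaranteed everywhere.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: the default job limit described above, 2 * cores. */
static int default_max_children(int cores)
{
    assert(cores > 0);
    return 2 * cores;
}

/* Hypothetical helper: number of online cores, falling back to 1 if the
 * query fails. _SC_NPROCESSORS_ONLN is a common extension, not strict POSIX. */
static int online_cores(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n > 0) ? (int)n : 1;
}
```

Under these assumptions, default_max_children(online_cores()) gives 2 on a single-core VM, which is half the limit of 4 reported in the log line above.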


Also, I'm not seeing "Extreme CIB load detected".  Are these still
single-core machines?
If so, it would suggest that something in:

        if(cores == 1) {
            cib_max_cpu = 0.4;
        }
        if(throttle_load_target > 0.0 && throttle_load_target < cib_max_cpu) {
            cib_max_cpu = throttle_load_target;
        }

        if(load > 1.5 * cib_max_cpu) {
            /* Can only happen on machines with a low number of cores */
            crm_notice("Extreme %s detected: %f", desc, load);
            mode |= throttle_extreme;

is wrong.
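
To make the arithmetic in that excerpt concrete, here is a minimal standalone sketch of the same threshold logic. It is an illustration only, not the Pacemaker source: the function names are hypothetical, and the starting value of cib_max_cpu for multi-core machines is passed in as base_max_cpu because the excerpt does not show it.

```c
#include <stdbool.h>

/* Load above which the quoted check would report "Extreme ... detected".
 * base_max_cpu is an assumed parameter: the excerpt does not show the
 * initial value of cib_max_cpu. load_target stands in for
 * throttle_load_target (the configured load-threshold). */
static double extreme_threshold(int cores, double base_max_cpu,
                                double load_target)
{
    double cib_max_cpu = base_max_cpu;

    if (cores == 1) {
        cib_max_cpu = 0.4;          /* single-core cap from the excerpt */
    }
    if (load_target > 0.0 && load_target < cib_max_cpu) {
        cib_max_cpu = load_target;  /* a lower load-threshold wins */
    }
    return 1.5 * cib_max_cpu;
}

/* True when the given load would set the "extreme" throttle flag. */
static bool is_extreme(double load, int cores, double base_max_cpu,
                       double load_target)
{
    return load > extreme_threshold(cores, base_max_cpu, load_target);
}
```

Read this way, a single-core machine with no load-threshold configured should trip the check once CIB load exceeds roughly 0.6, and a load-threshold of 0.2 would lower that to roughly 0.3, which is why the configured value matters here.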

What was load-threshold configured as?


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
