Re: [Pacemaker] (LRMD|PCMK)_MAX_CHILDREN?

Lars Marowsky-Bree Wed, 11 Sep 2013 23:50:36 -0700

On 2013-09-12T14:34:02, Andrew Beekhof <and...@beekhof.net> wrote:

> > Well, they're all doing something completely different.
> No, they're all crude approximations designed to stop the cluster as a whole 
> from using up so much cpu/network/etc that recovery introduces more failures 
> than it resolves.


OK. Though they do enforce the limit at very different levels, which
sort of makes sense: there are limitations at different levels, and
ideally we want to use them all.

> > The max_children prevent a given node from being overloaded by
> > concurrent operations.
> At the expense of introducing other failures... such as "I fired off
> an action N seconds ago with a timeout < N and still haven't heard
> back" which was possible if batch-limit and max children were too out
> of balance.

Yes. That was very rare, but could happen.
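
To illustrate the failure mode with numbers, here is a rough sketch. It
assumes (my simplification, not a description of the actual lrmd code)
that the timeout clock starts when the action is sent to the node, that
the node runs at most max_children actions at once, and that every
action takes the same time. The function names are made up for this
example.

```python
def queue_wait(position, max_children, op_duration):
    """Seconds the action at 0-based queue `position` waits for a free slot.

    Assumption: actions run in strict FIFO order, max_children at a time,
    and each one takes exactly op_duration seconds.
    """
    return (position // max_children) * op_duration


def times_out(position, max_children, op_duration, timeout):
    """True if the action is not finished within `timeout` seconds.

    Assumption: the timeout is measured from dispatch, so time spent
    waiting in the node's queue counts against it.
    """
    finished_at = queue_wait(position, max_children, op_duration) + op_duration
    return finished_at > timeout


# Hypothetical numbers: 30 actions sent to one node at once (batch limit),
# 4 allowed to run concurrently, 10s each, 60s timeout.
late = [p for p in range(30) if times_out(p, 4, 10, 60)]
print(late)
```

With these made-up numbers the last six actions in the queue exceed
their timeout without ever having failed on their own.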

> Which is why any limiting needs to happen centrally on the DC.

On the other hand, the DC cannot possibly limit concurrent monitor
operations (since it isn't involved). Arguably, for nodes hosting 100+
resources, there is some value in limiting parallelism on those. But I'd
be happy if they were smartly staggered.
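
One naive reading of "staggered", as a sketch only: spread the first
run of each recurring monitor evenly across the interval, so they do
not all fire in the same second. This assumes all monitors on the node
share one interval, and the function name is hypothetical; nothing like
this exists in Pacemaker as far as this mail is concerned.

```python
def staggered_offsets(n_resources, interval):
    """Start offsets (seconds) that spread n monitors evenly over one interval.

    Assumption: every monitor uses the same interval, so a fixed initial
    offset keeps them spread out on every later run as well.
    """
    if n_resources <= 0:
        return []
    step = interval / n_resources
    return [i * step for i in range(n_resources)]


# Hypothetical example: 4 resources, all monitored every 60 seconds.
print(staggered_offsets(4, 60))
```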

> As above, the rate limiting needs to happen on the DC which lends
> itself to being a property of the cib and/or transition graph rather
> than defined in sysconfig.

I'd be quite happy with that.

The most directly equivalent solution would be to limit the number of
per-node in-flight operations, similar to what migration-threshold does.
(I think we can safely continue to treat all resources as equal to start
with.)
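
Roughly what I have in mind, as a sketch and not a proposal for the
actual transition engine code: the DC counts in-flight operations per
node and holds back anything over the limit. The class name, the
cluster-wide default and the per-node override are all assumptions made
for illustration. All resources are treated as equal.

```python
from collections import defaultdict, deque


class NodeThrottle:
    """Caps the number of in-flight operations per node (hypothetical)."""

    def __init__(self, default_limit, per_node_limit=None):
        self.default_limit = default_limit               # cluster-wide default
        self.per_node_limit = dict(per_node_limit or {})  # optional overrides
        self.in_flight = defaultdict(int)
        self.pending = deque()

    def limit_for(self, node):
        # A per-node value, if set, wins over the cluster-wide default.
        return self.per_node_limit.get(node, self.default_limit)

    def submit(self, node, action):
        self.pending.append((node, action))

    def dispatch(self):
        """Return the actions allowed to start now, in submission order."""
        started, still_pending = [], deque()
        for node, action in self.pending:
            if self.in_flight[node] < self.limit_for(node):
                self.in_flight[node] += 1
                started.append((node, action))
            else:
                still_pending.append((node, action))
        self.pending = still_pending
        return started

    def complete(self, node):
        """Call when the node reports a result; frees one slot."""
        self.in_flight[node] -= 1


throttle = NodeThrottle(default_limit=2, per_node_limit={"node2": 1})
for node, action in [("node1", "a"), ("node1", "b"), ("node1", "c"),
                     ("node2", "d"), ("node2", "e")]:
    throttle.submit(node, action)

first = throttle.dispatch()   # fills node1 up to 2, node2 up to 1
throttle.complete("node1")    # one node1 operation reports back
second = throttle.dispatch()  # frees exactly one more node1 slot
print(first, second)
```

Since the DC only starts its timer for actions it has actually
dispatched, nothing sits in a remote queue eating into its timeout.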

Though the transition from an environment variable to a CIB node
attribute (inherited from a cluster-property, I assume) is going to suck
for the upgrade path :-/


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
