Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

Brian J. Murrell Tue, 03 Jul 2012 12:20:18 -0700

On 12-06-27 11:30 PM, Andrew Beekhof wrote:
> 
> The updates from you aren't the problem.  It's the number of resource
> operations (that need to be stored in the CIB) that result from your
> changes that might be causing the problem.


Just to follow this up for anyone currently following or anyone finding
this thread in the future...

It turns out that the problem is simply the size of the HA cluster that
I want to create.  The details are in the bug I filed at
http://bugs.clusterlabs.org/show_bug.cgi?id=5076 but the short story is
that I can add the number of resources and constraints I want to add
(i.e. 32-34 of each, as previously described in this thread),
concurrently even, so long as there are no more than 4 nodes per
corosync/pacemaker cluster.

Even adding 4 passive nodes -- nodes that do no CIB operations of their
own -- made pacemaker crumble.  (I only tried a total of 8 nodes, not
values between 4 and 8, so the tipping point might be somewhere in
between.)

So the summary seems to be that pacemaker cannot scale to more than a
handful of nodes, even when the nodes are big: 12-core Xeon nodes with
gobs of memory.

I can only guess that everybody is using pacemaker in "pair" (or not
much bigger) type configurations currently.  Is that accurate?

Perhaps there is some tuning that can be done to scale somewhat, but
realistically, I am looking for pacemaker clusters in the tens, if not
into the hundreds of nodes.  However, I really wonder if any amount of
tuning could be done to achieve clusters that large given the small
number of nodes supported with the default tuning values.

Thoughts?

b.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
