Hi,
Sorry for the delay on this thread; I was unavailable for a few weeks.
Just FYI, I wanted to share some results I got a few weeks ago:
I ran some tests on the configuration and start/stop of 500 Dummy
resources, and got these timings:

1/ Configuration with successive crm commands ("crm configure primitive
..."): about 1 hour, so not usable.

2/ A single crm command ("crm configure < File") with all Dummy
primitives in File: 7 s. That's OK.

3/ Adding just one location constraint per Dummy primitive with "crm
configure < File", with all constraints in File: 27 s. Strange, but
acceptable.

4/ Starting the 500 primitives with successive crm commands ("crm
resource start ..."): 7 min 28 s. That does not seem acceptable,
especially for Dummy resources.

5/ Starting the 500 primitives with parallel (background) crm commands
("crm resource start ... &"): not possible; many of the commands exit
with errors, and it takes a long time anyway.

6/ Starting the 500 primitives in parallel by setting all target-roles
to "Started" in Pacemaker (with "crm configure edit": s/Stopped/Started
on the 500 primitives): around 6 min for all primitives to be started.
That does not seem acceptable either, especially for Dummy resources,
and it suggests a failover would take roughly 3 min if the primitives
are located half on one node and half on the other.
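For reference, the fast path in item 2/ can be sketched as follows. This is
a hypothetical illustration (the file name, resource names, and monitor
interval are made up), not the exact script I used:

```shell
# Build one file containing all 500 Dummy primitives, then feed it to a
# single crm invocation instead of running 500 separate
# "crm configure primitive ..." commands (the ~1 hour case).
for i in $(seq 1 500); do
  printf 'primitive dummy%d ocf:pacemaker:Dummy op monitor interval=30s\n' "$i"
done > dummies.crm

# Load everything in one transaction (this was the 7 s case):
#   crm configure < dummies.crm

wc -l < dummies.crm   # one line per primitive, 500 in total
```

The difference comes from committing one CIB update instead of 500
round-trips, each of which replicates and re-evaluates the configuration.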
These results are with Dummy resources; with real resources it would
presumably take much longer, not to mention the periodic monitoring of
500 primitives.
So, based on these results, I think the practical limit on the number of
resources is far below 500.
But I wanted to share these results to keep the discussion going and
perhaps gather some ideas.
Thanks
Alain
On 05/09/2013 10:58, Lars Marowsky-Bree wrote:
On 2013-09-04T08:26:14, Ulrich Windl <[email protected]> wrote:
In my experience network traffic grows roughly linearly with the size
of the CIB. At some point you probably have to change communication
parameters to keep the cluster in a happy communication state.
Yes, I wish corosync would "auto-tune" to a higher degree. Apparently
though, that's a slightly harder problem.
We welcome any feedback on required tunables. Those we ship on SLE
HA worked for us (even for rather largeish configurations), but they
may not be appropriate everywhere.
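For context, the communication tunables being discussed live in the totem
section of corosync.conf. The values below are purely illustrative, not the
SLE HA shipped defaults or a recommendation:

```
totem {
    # All values illustrative; tune for your cluster size and network.
    token: 5000                              # token loss timeout, in ms
    token_retransmits_before_loss_const: 10  # retransmits before declaring loss
    consensus: 6000                          # must exceed token; default is 1.2 * token
    join: 60                                 # membership join message timeout, in ms
}
```

Larger CIBs and more nodes generally push these timeouts upward, which is
the kind of "auto-tuning" the thread wishes corosync would do itself.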
Apart from the cluster internals, there may be problems if a node comes
online and hundreds of resources are started in parallel, specifically
if those resources weren't designed for it. I suspect IP addresses,
MD-RAIDs, LVM stuff, DRBD, filesystems, exportfs, etc.
No, most of these resource scripts *are* supposed to be
concurrency-safe. If you find something that breaks, please share the
feedback.
It's true that the way concurrent load limitation is implemented in
Pacemaker/LRM isn't perfect yet. batch-limit is rather coarse. The
per-node LRM child limit is probably the best bet right now, but it
doesn't differentiate between starting many lightweight resources in
parallel (such as IPaddr) versus heavyweights (VMs running Oracle
databases).
(migration-threshold goes in the same direction.)
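The two knobs mentioned above can be sketched as follows. The values are
illustrative only, and the per-node child limit shown is the mechanism as I
understand it for this era of Pacemaker; check your distribution's
documentation for the exact setting:

```shell
# Cluster-wide cap on how many actions the transition engine runs in
# parallel (coarse: counts a Dummy start the same as a VM start).
# The value 30 is an example, not a recommendation.
crm configure property batch-limit=30

# Per-node limit on concurrent resource operations: historically the
# lrmd max-children setting, e.g. via the lrmd environment
# (illustrative; location and name vary by distribution/version):
#   LRMD_MAX_CHILDREN=4
```

Neither knob weighs operations by cost, which is exactly the limitation
described above.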
Historical context matters. Pacemaker comes from the HA world; we still
believe 3-7 node clusters are the largest anyone ought reasonably to
build, considering the failure/admin/security-domain issues with single
points of failure and the increasing likelihood of double failures, etc.
But there are several trends:

- Even those 3-7 nodes become increasingly powerful multi-core kick-ass
  boxes. 7 nodes might well host hundreds of resources nowadays (say,
  above 70 VMs with all their supporting resources).

- People build much larger clusters because there's no good way to
  "divide and conquer" yet - e.g., if you build several 3 or 5 node
  clusters, there's no support for managing those clusters-of-clusters.

- And people use Pacemaker for HPC-style deployments (e.g., private
  clouds with tons of VMs), because while our HPC support is suboptimal,
  it is better than the HA support in most of the Cloud offerings.
As a note: Just recently we had a failure in MD-RAID activation with no real
reason to be found in syslog, and the cluster got quite confused.
(I had reported this to my favourite supporter (SR 10851868591), but haven't
heard anything since then...)
I'll try to dig that out of the support system and give it a look.
Regards,
Lars
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems