On 2013-09-04T08:26:14, Ulrich Windl <[email protected]> wrote:

> In my experience network traffic grows roughly linearly with the size
> of the CIB. At some point you probably have to change communication
> parameters to keep the cluster in a happy communication state.

Yes, I wish corosync would "auto-tune" to a higher degree. Apparently
though, that's a slightly harder problem.

We welcome any feedback on required tunables. Those that we ship on SLE
HA worked for us (and even for rather largeish configurations), but they
may not be appropriate everywhere.
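For reference, the kind of totem tuning involved looks roughly like the
sketch below. The parameter names are standard corosync.conf options, but
the values here are illustrative assumptions only, not the exact numbers
we ship; validate them against your own network and cluster size:

```
totem {
    # Longer token timeout tolerates brief hiccups on busy clusters;
    # the trade-off is slower failure detection. (Example value.)
    token: 5000

    # Retransmits before a node is declared lost. (Example value.)
    token_retransmits_before_loss_const: 10

    # Membership protocol timeouts; consensus should exceed token.
    join: 60
    consensus: 6000

    # Cap messages per token rotation to throttle the bursts that
    # large CIB updates cause. (Example value.)
    max_messages: 20
}
```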

> Apart from the cluster internals, there may be problems if a node goes
> online and hundreds of resources are started in parallel, specifically
> if those resources weren't designed for it. I suspect IP addresses,
> MD-RAIDs, LVM stuff, drbd, filesystems, exportfs, etc.

No, most of these resource scripts *are* supposed to be
concurrency-safe. If you find something that breaks, please share the
feedback.

It's true that the way concurrent load limitation is implemented in
Pacemaker/LRM isn't perfect yet. batch-limit is rather coarse. The
per-node LRM child limit is probably the best bet right now, but it
doesn't differentiate between starting many light-weight resources in
parallel (such as IPaddr) and heavy-weight ones (VMs with Oracle
databases).

(migration-threshold goes in the same direction.)
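To make that concrete, the two knobs look roughly like this. batch-limit
is a regular cluster property; the per-node child limit is set in the
pacemaker sysconfig file on SLE HA. The variable name and both values
below are examples from memory, so double-check them against your
version's documentation before relying on them:

```shell
# Cluster-wide cap on how many actions the transition engine runs in
# parallel (the coarse knob mentioned above; 30 is just an example):
crm configure property batch-limit=30

# Per-node cap on concurrent resource agent processes, via the LRM
# daemon's sysconfig (variable name may differ between versions):
echo "LRMD_MAX_CHILDREN=4" >> /etc/sysconfig/pacemaker
```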

Historical context matters. Pacemaker comes from the HA world; we still
believe 3-7 node clusters are the largest anyone ought reasonably to
build, considering the failure/admin/security-domain issues with single
points of failure and the increasing likelihood of double failures, etc.

But there are several trends -

Even those 3-7 nodes become increasingly powerful multi-core kick-ass
boxes. 7 nodes might well host hundreds of resources nowadays (say,
above 70 VMs with all their supporting resources).

People build much larger clusters because there's no good way to "divide
and conquer" yet - e.g., if you build several 3- or 5-node clusters,
there's no support for managing those clusters-of-clusters.

And people use Pacemaker for HPC style deployments (e.g., private
clouds with tons of VMs) - because while our HPC support is suboptimal,
it is better than the HA support in most of the Cloud offerings.


> As a note: Just recently we had a failure in MD-RAID activation with no real
> reason to be found in syslog, and the cluster got quite confused.
> (I had reported this to my favourite supporter (SR 10851868591), but haven't
> heard anything since then...)

I'll try to dig that out of the support system and give it a look.


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 
21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
