When I joined my previous company, we were just decommissioning a 41-node scale-out HANA cluster (SLES 11) in favour of a 21-node one (SLES 12). The most common setup was the 2-node cluster, but we had a lot of issues with those. For me, a 2-node cluster with qnetd will remain the most popular configuration.
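For reference, a minimal qdevice setup with pcs could look like this (the arbiter hostname "qnetd-arbiter" is just a placeholder):

    # On the arbiter host (needs the corosync-qnetd package):
    pcs qdevice setup model net --enable --start

    # On one of the cluster nodes (needs corosync-qdevice installed):
    pcs quorum device add model net host=qnetd-arbiter algorithm=ffsplit

The ffsplit algorithm guarantees that exactly one partition keeps quorum when a 2-node cluster splits, which is usually what you want.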
Best Regards,
Strahil Nikolov

On July 31, 2020 at 8:57:29 GMT+03:00, Ulrich Windl <[email protected]> wrote:
>>>> Ken Gaillot <[email protected]> wrote on 2020-07-30 at 16:43 in message
>>>> <[email protected]>:
>> On Wed, 2020-07-29 at 23:12 +0000, Toby Haynes wrote:
>>> In Corosync 1.x there was a limit on the maximum number of active
>>> nodes in a corosync cluster - browsing the mailing list says 64
>>> hosts. The Pacemaker 1.1 documentation says scalability goes up to 16
>>> nodes. The Pacemaker 2.0 documentation says the same, although I
>>> can't find a maximum number of nodes in Corosync 3.
>>
>> My understanding is that there is no theoretical limit, only practical
>> limits, so giving a single number is somewhat arbitrary.
>>
>> There is a huge difference between full cluster nodes (running corosync
>> and all pacemaker daemons) and Pacemaker Remote nodes (running only
>> pacemaker-remoted).
>>
>> Corosync uses a ring model where a token has to be passed in a very
>> short amount of time, and also has message guarantees (i.e. every node
>> has to confirm receiving a message before it is made available), so
>> there is a low practical limit to full cluster nodes. The 16 or 32
>> number comes from what enterprise providers are willing to support, and
>> is a good ballpark for a real-world comfort zone. Even at 32 you need a
>
> What I'd like to see is some table with recommended parameters, depending
> on the number of nodes and the maximum acceptable network delay.
>
> The other thing I'd like to see is a world-wide histogram (x-axis: number
> of nodes, y-axis: number of installations) of pacemaker clusters.
> Here we have a configuration of two 2-node clusters and one 3-node
> cluster. Initially we had planned to make one 7-node cluster, but
> basically stability (common fencing) and configuration issues (becoming
> complex) prevented that.
>
>> dedicated fast network and likely some tuning tweaks. Going beyond that
>> is possible but depends on hardware and tuning, and becomes sensitive
>> to slight disturbances.
>>
>> Pacemaker Remote nodes on the other hand are lightweight. They
>> communicate with only a single cluster node, with relatively low
>> traffic. The upper bound is unknown; some people report getting strange
>> errors with as few as 40 remote nodes, while others run over 100 with
>> no problems. So it may well depend on network and hardware capabilities
>
> See the parameter table requested above.
>
>> at high numbers, and you can run far more in VMs or containers than on
>> bare metal, since traffic will (usually) be internal rather than over
>> the network.
>>
>> I would expect a cluster with 16-32 full nodes and several hundred
>> remotes (maybe even thousands in VMs or containers) to be feasible with
>> the right hardware and tuning.
>
> I wonder: Do such configurations have a lot of identical or similar
> resources, or do they do massive load balancing, or do they run many
> different resources?
>
>> Since remotes don't run all the daemons, they can't do things like
>> directly execute fence devices or contribute to cluster quorum, but
>> remotes on bare metal or VMs are not really in a hierarchy as far as
>> the services being clustered go. A resource can move between cluster
>> and remote nodes, and a remote's connection can move from one cluster
>> node to another without interrupting the services on the remote.
>
> Regards,
> Ulrich
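Regarding the requested parameter table: I am not aware of an official one, but the main knobs Ken alludes to live in the totem section of corosync.conf. A rough sketch with illustrative values only (cluster_name is a placeholder, and defaults vary by version; see corosync.conf(5)):

    totem {
        version: 2
        cluster_name: mycluster
        # Base token timeout in milliseconds; raise it for slower or
        # more heavily loaded networks
        token: 3000
        # Milliseconds added per node beyond the first two, so the
        # effective timeout grows with cluster size:
        #   real token timeout = token + (nodes - 2) * token_coefficient
        token_coefficient: 650
    }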

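To illustrate the Pacemaker Remote setup described above: with pcs, adding a remote node can be as simple as the following (the hostname "remote1" and the resource name are placeholders; the key in /etc/pacemaker/authkey must already be shared with the remote host):

    # Run on a full cluster node; remote1 must have pacemaker-remote
    # installed and the pacemaker_remote service running
    pcs cluster node add-remote remote1

    # Regular resources can then run on, or be moved to, the remote node
    pcs resource move my-resource remote1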