Thanks, Vladimir.

As you mentioned, FB had clusters with tens of thousands of nodes.

How did they orchestrate these nodes? Here are some options I have in mind:

1. All the nodes share a few centralized gmonds, and all of them belong to
a single cluster (the cluster concept in Ganglia).

2. All the nodes share a few centralized gmonds, each centralized gmond
belongs to a different cluster, and a single gmetad polls data from these
centralized gmonds.

3. There are multiple gmetads/grids, and these grids are orchestrated by a
centralized gmetad/grid.
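
As a rough illustration of option 2, the gmetad side might look like the sketch below. It only generates data_source lines for gmetad.conf. The cluster names and aggregator host names are hypothetical, 8649 is gmond's default port, and the 15-second polling interval is an assumed value, not something FB is known to have used.

```python
def data_source_lines(clusters, port=8649, interval=15):
    """Build gmetad.conf data_source lines, one per cluster.

    clusters maps a cluster name to the list of centralized gmond hosts
    that gmetad may poll for that cluster (all names here are hypothetical).
    """
    lines = []
    for name, hosts in clusters.items():
        endpoints = " ".join(f"{host}:{port}" for host in hosts)
        lines.append(f'data_source "{name}" {interval} {endpoints}')
    return lines


if __name__ == "__main__":
    # Hypothetical layout: 10 clusters, each with two aggregator gmonds.
    layout = {
        f"cluster-{i:02d}": [f"agg-{i:02d}-a", f"agg-{i:02d}-b"]
        for i in range(1, 11)
    }
    print("\n".join(data_source_lines(layout)))
```

Listing more than one host per data_source gives gmetad a fallback if the first aggregator is unreachable.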

Thanks & Best Regards,
Jason Guo

From: Vladimir Vuksan <vli...@veus.hr>
Date: Wednesday, March 29, 2017 at 20:09
To: "Guo, Jason" <ju...@ebay.com>, "ganglia-developers@lists.sourceforge.net" 
<ganglia-developers@lists.sourceforge.net>
Subject: Re: [Ganglia-developers] Does Ganglia work well for a large-scale 
cluster

Hi Jason,

It depends on the number of metrics and associated metadata in the cluster and
how busy gmetad is overall. It also depends on your hardware. At one point FB
had clusters with tens of thousands of nodes.

Try to keep your metrics lean, i.e., don't add metric descriptions if you
don't have to, so as to keep the XML payload small, and it should be fine.
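
To get a feel for how much descriptions add to the payload, one could compare the size of a gmond-style XML fragment with and without its DESC element. The sample below is hand-written and only approximates gmond's output (the attribute values and the description text are made up), so treat the numbers as illustrative.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment shaped like gmond XML; all values are illustrative.
SAMPLE = """<GANGLIA_XML SOURCE="gmond">
<CLUSTER NAME="demo">
<HOST NAME="node-0001" IP="10.0.0.1">
<METRIC NAME="custom_metric_00" VAL="42" TYPE="uint32" UNITS="count">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="custom"/>
<EXTRA_ELEMENT NAME="DESC" VAL="A long human-readable description of this metric"/>
</EXTRA_DATA>
</METRIC>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""


def payload_sizes(xml_text, drop=("DESC",)):
    """Return (full_bytes, lean_bytes) for the serialized XML.

    lean_bytes is the size after removing EXTRA_ELEMENT entries whose
    NAME is listed in drop.
    """
    root = ET.fromstring(xml_text)
    full = len(ET.tostring(root))
    for extra in root.iter("EXTRA_DATA"):
        for element in list(extra):
            if element.get("NAME") in drop:
                extra.remove(element)
    lean = len(ET.tostring(root))
    return full, lean


if __name__ == "__main__":
    full, lean = payload_sizes(SAMPLE)
    print(f"with DESC: {full} bytes, without DESC: {lean} bytes")
```

The saving per metric is small, but it is multiplied by every metric on every host that gmetad polls.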

Vladimir

On 3/28/2017 at 10:19 PM, Guo, Jason wrote:
Hi,


I’m writing this mail to discuss whether Ganglia works well for a large-scale 
cluster (more than 4000 nodes).


According to the Ganglia documentation, Ganglia can scale to handle clusters
with 2000 nodes, so many people have concerns about using Ganglia for a
4000-node production cluster. The documentation states:
"It has been used to link clusters across university campuses and around the
world and can scale to handle clusters with 2000 nodes."

If the cluster is larger than 2000 nodes, say 4000 nodes, can Ganglia handle it
properly?


To verify this, I created a 5000-node Ganglia cluster on top of a Docker
cluster (10 machines).
I put 500 nodes in each cluster, so there are 10 clusters, and these 10
clusters are in the same grid.
For each gmond, I use a script to generate 30 custom metrics (with
gmetric).
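
A generator along these lines could produce the 30 custom metrics. This is only a sketch and not the script used in the test: the metric names, values, type and units are made up, and it only builds the gmetric command lines (using gmetric's --name, --value, --type and --units options) rather than running them.

```python
import shlex


def gmetric_commands(count=30, prefix="custom_metric"):
    """Build one gmetric argument list per custom metric (names are hypothetical)."""
    commands = []
    for i in range(count):
        commands.append([
            "gmetric",
            f"--name={prefix}_{i:02d}",
            f"--value={i}",      # placeholder value
            "--type=uint32",
            "--units=count",
        ])
    return commands


def total_custom_metrics(nodes, metrics_per_node):
    """Number of custom metrics across the whole grid."""
    return nodes * metrics_per_node


if __name__ == "__main__":
    for command in gmetric_commands():
        print(shlex.join(command))
    # 5000 nodes x 30 metrics each
    print(total_custom_metrics(5000, 30))  # prints 150000
```

To actually submit the metrics, each argument list could be passed to subprocess.run on a host where gmetric and a gmond configuration are available.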

Currently it works fine in the Docker-based test environment.

So, my question is whether Ganglia is suitable for a 4000-node cluster.


Thanks & Best Regards,
Jason Guo






------------------------------------------------------------------------------

Check out the vibrant tech community on one of the world's most

engaging tech sites, Slashdot.org! http://sdm.link/slashdot




_______________________________________________

Ganglia-developers mailing list

Ganglia-developers@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/ganglia-developers

