Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster

Vladimir Vuksan Thu, 30 Mar 2017 06:24:41 -0700

Clusters are logical groupings of like hosts, e.g. per location (same data center), per app, or per function (DB, web, etc.). It really depends on how you view your environment. There is no right or wrong way to group them.

Vladimir

On 03/30/2017 at 04:30 AM, Guo, Jason wrote:

Thanks, Vladimir.

As you mentioned, FB had clusters with tens of thousands of nodes each.

How did they orchestrate these nodes? Here are some options I have in mind:

1.       All the nodes share a few centralized gmonds, and all of them belong to a single cluster (the cluster concept in Ganglia).

2.       All the nodes share a few centralized gmonds, each centralized gmond belongs to a different cluster, and a single gmetad polls data from these centralized gmonds.

3.       There are multiple gmetads/grids, and these grids are then orchestrated by a centralized gmetad/grid.
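
Option 2 corresponds to one `data_source` line per cluster in gmetad.conf, each listing that cluster's centralized gmonds. Below is a minimal sketch that renders such lines. The cluster names, collector host names, and the count of two collectors per cluster are hypothetical; port 8649 and the 15-second polling interval are assumed to be the usual defaults.

```python
def data_source_line(cluster, hosts, port=8649, poll_seconds=15):
    """Render one gmetad.conf data_source line for a cluster."""
    endpoints = " ".join(f"{host}:{port}" for host in hosts)
    return f'data_source "{cluster}" {poll_seconds} {endpoints}'


def render_gmetad_sources(cluster_count, collectors_per_cluster=2):
    """Render one data_source line per cluster (hypothetical host names)."""
    lines = []
    for c in range(cluster_count):
        hosts = [
            f"collector{c}-{i}.example.com"
            for i in range(collectors_per_cluster)
        ]
        lines.append(data_source_line(f"cluster{c}", hosts))
    return lines


if __name__ == "__main__":
    # Ten clusters polled by a single gmetad, as in option 2.
    print("\n".join(render_gmetad_sources(10)))
```

Listing more than one gmond per line gives gmetad a fallback host to poll if the first one is unreachable.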

Thanks & Best Regards,

Jason Guo

From: Vladimir Vuksan <vli...@veus.hr>
Date: Wednesday, March 29, 2017 at 20:09
To: "Guo, Jason" <ju...@ebay.com>, "ganglia-developers@lists.sourceforge.net" <ganglia-developers@lists.sourceforge.net>
Subject: Re: [Ganglia-developers] Does Ganglia work well for a large-scale cluster

Hi Jason,

It depends on the number of metrics and associated metadata in the cluster and on how busy gmetad is overall. It also depends on your hardware. At one point FB had clusters with tens of thousands of nodes each.

Try to keep your metrics lean, i.e. don't add metric descriptions if you don't have to, so that the XML payload stays small, and it should be fine.
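
To see why descriptions matter, here is a rough sketch that builds a cluster-shaped XML document with and without a per-metric description and compares the serialized sizes. The element and attribute names only loosely follow the layout of gmond's XML and are an assumption, as are the host names, metric names, and the description text. The point is that the description is repeated for every metric on every host.

```python
import xml.etree.ElementTree as ET


def metric_element(name, value, description=None):
    """Build one METRIC element, optionally carrying a description."""
    metric = ET.Element(
        "METRIC", NAME=name, VAL=str(value), TYPE="float", UNITS="count"
    )
    if description:
        extra = ET.SubElement(metric, "EXTRA_DATA")
        ET.SubElement(extra, "EXTRA_ELEMENT", NAME="DESC", VAL=description)
    return metric


def payload_bytes(hosts, metrics_per_host, description=None):
    """Return the serialized size in bytes of a synthetic cluster document."""
    cluster = ET.Element("CLUSTER", NAME="example")
    for h in range(hosts):
        host = ET.SubElement(cluster, "HOST", NAME=f"node{h}")
        for m in range(metrics_per_host):
            host.append(metric_element(f"metric_{m}", 0.0, description))
    return len(ET.tostring(cluster))


if __name__ == "__main__":
    lean = payload_bytes(500, 30)
    verbose = payload_bytes(500, 30, "Hypothetical long metric description")
    print(f"lean: {lean} bytes, with descriptions: {verbose} bytes")
```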

Vladimir

On 3/28/2017 at 10:19 PM, Guo, Jason wrote:

Hi,

I’m writing this mail to discuss whether Ganglia works well for a large-scale cluster (more than 4000 nodes).

According to the Ganglia documentation, Ganglia can scale to handle clusters with 2000 nodes, so many people have concerns about using it for a 4000-node production cluster. The documentation says:

"It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes."

If the cluster is larger than 2000 nodes, say 4000 nodes, can Ganglia handle it properly?

To verify this, I created a 5000-node Ganglia cluster on top of a Docker cluster (10 machines).

I put 500 nodes in each cluster, so there are 10 clusters, all in the same grid.

For each gmond, I use a script to generate 30 custom metrics (with gmetric).
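
The original script is not shown in the thread. A minimal sketch of what such a generator could look like follows, assuming `gmetric` is on the PATH and picks up the local gmond configuration. The `custom_metric_<i>` names, the random values, and the `float`/`count` type and units are hypothetical choices for illustration.

```python
import random
import shutil
import subprocess

METRIC_COUNT = 30  # matches the 30 custom metrics per gmond described above


def build_gmetric_command(name, value, metric_type="float", units="count"):
    """Return the argv list for one gmetric invocation."""
    return [
        "gmetric",
        f"--name={name}",
        f"--value={value}",
        f"--type={metric_type}",
        f"--units={units}",
    ]


def publish_test_metrics(count=METRIC_COUNT, dry_run=True):
    """Build (and optionally run) one gmetric command per synthetic metric."""
    commands = []
    for i in range(count):
        # "custom_metric_<i>" is a hypothetical naming scheme.
        value = round(random.uniform(0, 100), 2)
        cmd = build_gmetric_command(f"custom_metric_{i}", value)
        commands.append(cmd)
        # Only execute when explicitly requested and gmetric is installed.
        if not dry_run and shutil.which("gmetric"):
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands instead of sending metrics.
    for cmd in publish_test_metrics(dry_run=True):
        print(" ".join(cmd))
```

With 5000 nodes and 30 metrics each, a setup like this adds 150,000 custom metrics across the grid on top of the default ones.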

Currently it works fine in the Docker-based test environment.

So, my question is whether Ganglia is suitable for a 4000-node cluster.


_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
