On Fri, 2006-01-27 at 15:05 -0500, Rick Mohr wrote:
> It's not so much that I "fear" multicast, it's just that I see no need
> for it. Admittedly, my setup is not necessarily like others, and
> therefore my preferences don't necessarily apply to other setups. And I
> wouldn't claim that my setup is the "best" setup by any means. But if you
> are interested, I will try to explain my personal reasoning for not using
> multicast. (I have replied to parts of your previous email below.)
I didn't mean to make it sound like I think that everyone should be using
multicast. I am sure there are some places where it isn't necessary or
where unicast is better. It is just that it seems like every few months,
someone posts to the list asking how to set up ganglia without multicast
because they think it is too heavy for them or their network people don't
like it for some reason. The strength of multicast is its redundancy, and
we haven't noticed any negative impact in our environment.

> I agree which is why our monitoring node is not part of any cluster. It
> is a separate machine that provides other monitoring services, so it is
> expected to be up 24x7 anyway. Besides, I can easily get redundancy by
> sending unicast to two or three hosts. I don't need 400 copies of my
> cluster's data. And even if I did have that many copies, when I modify
> gmetad.conf, I certainly don't plan to enter 400 host names as
> alternative sources for gmond info.

We have 10 clusters here, and we don't want to pick a dedicated node from
each one and convert it into a ganglia node; for us, this would be a waste
of a compute node. If we didn't use 10 dedicated nodes, then we would have
to run 10 separate gmond or gmetad processes on a single node, which is
more complication than we want.

Also, I forgot to mention that we are using a modified gmetad that gets
the list of nodes in each cluster from our database, so we don't have to
create a huge config file; we only specify the appropriate database query.
It also refreshes this list periodically and sticks with a "good" source
node until it has problems, then tries another one.

> I don't worry about redundancy either, but this also fails to address
> the availability of the gmetad process. Ganglia isn't very useful if the
> gmetad server goes down and it is not recording the metrics to disk.
> Since I pretty much need to make sure that gmetad is running 24x7
> anyway, why not let it collect gmond info?
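The "stick with a good source until it fails" behavior described above can
be sketched roughly as follows. This is only an illustration of the idea,
not the actual modified gmetad (which is written in C); the function and
host names here are hypothetical.

```python
# Hypothetical sketch of sticky source selection: prefer the last
# known-good host, and fall back to the others only when it fails.

def pick_source(candidates, is_healthy, last_good=None):
    """Return a working source host, preferring the last known-good one."""
    # Try the previously good source first, then the rest in list order.
    ordered = ([last_good] if last_good in candidates else []) + \
              [h for h in candidates if h != last_good]
    for host in ordered:
        if is_healthy(host):
            return host
    return None  # no reachable source in this cluster

# Example: node02 stays the sticky choice as long as it responds.
up = {"node01": False, "node02": True, "node03": True}
print(pick_source(["node01", "node02", "node03"], up.get,
                  last_good="node02"))  # prints "node02"
```

A real implementation would refresh `candidates` from the database query
periodically, as described above, and remember the returned host as the
next call's `last_good`.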
> The server I run gmetad on is better equipped for availability than any
> of the compute nodes in the cluster.

Of course the gmetad server is critical, but it is better to have a single
critical server rather than the 10 that we would need. Because we want to
keep these 10 clusters separate, we would need to configure 10 separate
gmond or gmetad processes on our ganglia server if we wanted to stay with
just a single server.

> Thanks for supplying this data. It is a good baseline for people to use
> when planning their Ganglia deployments. However, I add a lot more
> metrics than the default. I estimate that I get about 500 packets/second,
> which translates to about 30,000 bytes/sec. Plus, this "background noise"
> scales linearly with the number of nodes in the cluster.

This is true, and I believe the default metrics are different for
different OSes.

> I should also mention that the 60 byte metric packet actually consumes
> more than 60 bytes in the linux socket receive buffer. On my system, I
> ran gmond and then sent it the STOP signal. I then used gmetric to
> submit a single metric by hand. By looking at the rx_queue column in
> /proc/net/udp, I saw that the one metric actually took up 304 bytes. The
> buffer fills up pretty quickly, and UDP packets start dropping when the
> number of metrics gets big.

I was only considering the ganglia multicast bandwidth usage on the wire,
which is something most people seem to be concerned with, not the extra
overhead in the kernel.

> Then you also have to consider the bandwidth used by gmetad when it
> contacts gmond. I tested having gmetad contact one of my cluster nodes
> to get the cluster info. My test showed that it took about 0.7 secs for
> it to retrieve about 2.5 MB of data. That's 28.5 Mbps of the node's
> bandwidth used every 15 seconds. (In a previous post I also mentioned
> how I saw increased UDP drop rates whenever the gmetad contacted a
> gmond.)
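The two bandwidth figures quoted above are easy to sanity-check with a
little arithmetic (taking the ~60-byte on-wire packet size and the
measured transfer numbers at face value):

```python
# Sanity-check the two bandwidth estimates quoted above.

# ~500 metric packets/sec at ~60 bytes each on the wire:
background = 500 * 60           # bytes/sec of multicast "background noise"
print(background)               # prints 30000, matching the 30,000 bytes/sec estimate

# gmetad poll: ~2.5 MB transferred in ~0.7 sec, repeated every 15 seconds:
burst_mbps = 2.5 * 8 / 0.7      # megabits/sec during each transfer burst
print(round(burst_mbps, 1))     # prints 28.6, close to the quoted 28.5 Mbps
```

Note the 28.5 Mbps is the rate *during* the 0.7-second burst; averaged
over the 15-second polling interval it is far lower, but the burst is what
competes with application traffic on the node's link.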
This is one reason why we have increased our polling interval to 60
seconds instead of the default 15.

I'm glad that you brought this up, because it reminds me of something I
remember seeing in an early ganglia beta: optional support for compressed
xml tcp streams. Does anyone know what happened to this feature? Any
chance of resurrecting the code?

~Jason

-- 
/------------------------------------------------------------------\
| Jason A. Smith                          Email: [EMAIL PROTECTED] |
| Atlas Computing Facility, Bldg. 510M    Phone: (631)344-4226      |
| Brookhaven National Lab, P.O. Box 5000  Fax:   (631)344-7616      |
| Upton, NY 11973-5000                                              |
\------------------------------------------------------------------/

