Re: [Ganglia-general] Gaps in graphs

Ofer Inbar Fri, 04 Dec 2009 12:08:32 -0800

Cassandra Pugh <[email protected]> wrote:
> I am monitoring 1630 nodes.
> So, I thought that perhaps it was getting overburdened at times from all 
> the traffic.  However I am using a very beefy machine, and do not see 
> high cpu or memory usage even during these gaps.  Also the network folks 
> don't see any abnormally large traffic on the network.


gmetad is not very cpu-intensive or memory-intensive in my experience,
but if you're storing all those RRD files on disk, you could easily be
overwhelming the disk.  Did you look for I/O wait?
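
One way to check for I/O wait on a Linux gmetad host is to sample the
aggregate "cpu" line of /proc/stat twice and see what share of the elapsed
CPU time went to the iowait counter (the fifth value on that line).  The
sketch below is only an illustration: the function names and the 5-second
interval are my own choices, not anything Ganglia ships.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    if len(deltas) < 5:
        raise ValueError("kernel does not report an iowait counter")
    # Field order: user nice system idle iowait irq softirq steal [guest ...].
    # Guest time is already counted inside user/nice, so sum only the first 8.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return iowait_percent(before, after)
```

Calling sample_iowait() while a gap is forming, and again when the graphs
look fine, gives a rough comparison.  A consistently high figure during the
gaps would point at the disk rather than at CPU or memory.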

I had to migrate my Ganglia installations to store their RRDs on tmpfs
(RAM) at around 30 nodes, though I had a *lot* of custom metrics and
each one gets stored per node.  I also hear rumors that in the past
year Ganglia has moved to a newer version of RRDtool that caches
writes, so it should be able to handle staying on disk better than
it did when I was running it.  Still, with 1630 nodes, especially if
you have some custom metrics, you could easily be overwhelming your
disk.  If you see a lot of I/O wait, you could move to tmpfs like I
did.  I had a cron job to rsync to real disk every 10 minutes, and
I changed the init script to rsync when you start and stop Ganglia.
You can search this list's archives for my post which includes my
diffs to the init script, and all the steps for what I did.
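
For anyone who wants the general shape before digging through the archives,
here is a minimal sketch of the save/restore step.  It is not the script from
my earlier post.  Both paths are assumptions (check rrd_rootdir in your
gmetad.conf, and pick your own on-disk location), and the function names are
made up for illustration.

```python
import subprocess

# Assumed locations; adjust to match your installation.
TMPFS_RRDS = "/var/lib/ganglia/rrds"       # assumed tmpfs mount with the live RRDs
DISK_RRDS = "/var/lib/ganglia/rrds.disk"   # hypothetical persistent copy on real disk


def build_rsync_command(src, dst):
    """Build an rsync command that copies the contents of src into dst."""
    # The trailing slash on src makes rsync copy the directory's contents
    # rather than the directory itself.  --delete is deliberately left out so
    # that an empty tmpfs after a reboot cannot wipe the on-disk copy.
    return ["rsync", "-a", src.rstrip("/") + "/", dst.rstrip("/") + "/"]


def save_to_disk():
    """Copy live RRDs from tmpfs to disk (cron job, and before stopping gmetad)."""
    subprocess.run(build_rsync_command(TMPFS_RRDS, DISK_RRDS), check=True)


def restore_from_disk():
    """Copy the saved RRDs back into tmpfs (before starting gmetad)."""
    subprocess.run(build_rsync_command(DISK_RRDS, TMPFS_RRDS), check=True)
```

A crontab schedule of "*/10 * * * *" calling save_to_disk() would match the
10-minute interval above.  The init script would call restore_from_disk()
before starting gmetad and save_to_disk() after stopping it.  Anything
written since the last save is lost if the machine goes down uncleanly.
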
  -- Cos

