On Fri, Jan 27, 2006 at 01:41:28AM -0800, Joel Krauska wrote:
> I've seen similar scaling questions asked, but not a lot of answers.
> 
> I hope this query falls on some ears with experience in the brains 
> behind them.
> 
> I'm looking to deploy ganglia on a largish cluster.
> (1000+ nodes)
> 
> Here were some of my thoughts on how I could help scale the system.
> 
> Any opinions or suggestions are greatly appreciated.
> 
> - Put gmetad rrd files on a ramdisk.
> This should decrease the frequency of disk writes
> during normal runs.
> If I rsync to a local disk every hour or so, I can get away
> with limited disk writes, and still have a reasonable backup of data.

I use a ramdisk for my rrds, but rather than using rsync I store them in
a tar.gz file.  The time difference between the two is night and day.
When you use rsync you waste a tremendous amount of time opening and
reading the backup files.  By just writing a tarball you only have to
open the files on the ramdisk and then by compressing the data you
reduce the number of writes.  The data is so small that using bzip is a
significant slowdown on the 2.4GHz Xeon so I use gzip instead because
it doesn't need to read a large chunk of data before it can start
writing.  I've integrated this into the FreeBSD port start and stop
scripts, which automatically take a snapshot at shutdown, restore from
it at startup, and can be called from cron to take periodic snapshots.  Even
if you lose a snapshot file, the system will recover from one of the
temporary files.

The script I use is part of the FreeBSD port:

http://www.freebsd.org/cgi/cvsweb.cgi/ports/sysutils/ganglia-monitor-core/files/gmetasnap.sh?rev=1.2&content-type=text/x-cvsweb-markup
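
For readers who just want the general shape of the tarball approach,
here is a minimal sketch in Python.  It is not the gmetasnap.sh script
from the port; the function names and any paths you pass in are
hypothetical, and writing to a temporary file before renaming it over
the previous snapshot is my own assumption about a reasonable way to
avoid leaving a half-written tarball behind.

```python
import os
import tarfile
import tempfile


def take_snapshot(rrd_dir, snapshot_path):
    """Pack everything under rrd_dir into a gzip-compressed tarball.

    Only the files on the ramdisk are read; the old snapshot is never
    opened, it is simply replaced once the new one is complete.
    """
    snap_dir = os.path.dirname(os.path.abspath(snapshot_path))
    # Temporary file in the same directory so the rename stays on one
    # filesystem (assumption: snap_dir is on persistent disk).
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=snap_dir)
    os.close(fd)
    try:
        with tarfile.open(tmp_path, "w:gz") as tar:
            tar.add(rrd_dir, arcname=".")
        os.replace(tmp_path, snapshot_path)
    except BaseException:
        # Do not leave a partial tarball behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def restore_snapshot(snapshot_path, rrd_dir):
    """Unpack a snapshot tarball into rrd_dir (e.g. a fresh ramdisk)."""
    os.makedirs(rrd_dir, exist_ok=True)
    with tarfile.open(snapshot_path, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Pythons: use the safe extraction filter.
            tar.extractall(rrd_dir, filter="data")
        else:
            tar.extractall(rrd_dir)
```

The idea would be to call take_snapshot() from cron and at shutdown,
and restore_snapshot() at startup before gmetad is started.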

> - Use TCP polling queries instead of UDP or Multicast push.
> (disable UDP/multicast pushing)
> I'd prefer to let gmetad poll instead of having 1000 UDP messages flying 
> around on odd intervals.  A good practice?
> 
> 
> - Alter timers for lighter network load?
> examples? ideas?
> Was going to just go to 30 or 60s timers in gmetad.conf cluster 
> definition to start.
> 
> 
> - Consider "federating"?
> Create groups of 100 gmond hosts managed by single gmetas, all linking
> up to a core gmetad.

As to the rest, IMO ganglia's traffic is below the noise floor and I
don't bother to look at it.

-- Brooks

-- 
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4
