On Wed, Jul 11, 2007 at 03:34:41PM -0400, Ofer Inbar wrote:
> gmetad is very write-intensive, because it updates hundreds of RRD
> files about every minute or two.  Has anyone tried running it with
> the rrd directory on a RAM disk (tmpfs) ?

I'll toss in my $.02 here as well, though many people have already said
the same thing.

I created a ramdisk when my cluster grew beyond ~50 nodes (I report a
lot of extra statistics).  I use an actual ramdisk instead of tmpfs,
though I chose it out of ignorance when I first set it up.  Wikipedia[*]
says that tmpfs might swap to disk, whereas ramfs is just straight up in
memory, nothing fancy.

Instead of reconfiguring ganglia to keep the repositories in
/mnt/ram0/rrds or mounting the ramdisk in /var/lib/ganglia/rrds, I
mounted the ramdisk in /mnt/ram0 and made /var/lib/ganglia/rrds a
symlink to /mnt/ram0/rrds.  Just my preference...
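
For anyone who wants to script the symlink arrangement, here is a
minimal sketch in Python.  The function name link_rrd_dir is made up
for illustration, and it refuses to touch an existing real directory
rather than guess what to do with it.

```python
import os

def link_rrd_dir(ramdisk_rrds, ganglia_rrds):
    """Make ganglia_rrds a symlink pointing at ramdisk_rrds."""
    os.makedirs(ramdisk_rrds, exist_ok=True)
    if os.path.islink(ganglia_rrds):
        if os.readlink(ganglia_rrds) == ramdisk_rrds:
            return  # already set up, nothing to do
        os.unlink(ganglia_rrds)  # stale link pointing somewhere else
    elif os.path.exists(ganglia_rrds):
        # A real directory is in the way; move it aside by hand first.
        raise FileExistsError(ganglia_rrds + " exists and is not a symlink")
    os.symlink(ramdisk_rrds, ganglia_rrds)

# Example call with the paths from this message (needs root):
#   link_rrd_dir("/mnt/ram0/rrds", "/var/lib/ganglia/rrds")
```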

I wrote a new script to drop in /etc/init.d/ called, inventively enough,
setup_gmetad_ramdisk, which starts before gmetad and stops after it.  It
creates the ramdisk, formats it, and copies over the backed-up rrds.
When stopped, it backs up the rrds.  Theoretically, this should make
system bootup and shutdown work the same as though it were on disk.
Unfortunately, I am missing some part of installing the stop script
correctly (in the right runlevel or something) so it doesn't actually
work on shutdown.  :(  I imagine the fix is pretty simple, but I haven't
bothered yet.
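
The copy-in and copy-out halves of such a start/stop script could look
roughly like the sketch below.  This is not the script from the website,
just an illustration in Python (3.8 or later, for dirs_exist_ok).  The
function names and the ".new" staging directory are assumptions, and
creating and formatting the ramdisk itself is left out because it needs
root and depends on the system.

```python
import os
import shutil

def restore_rrds(backup_dir, ramdisk_dir):
    """'start' action: copy the last on-disk backup into the fresh ramdisk."""
    if os.path.isdir(backup_dir):
        shutil.copytree(backup_dir, ramdisk_dir, dirs_exist_ok=True)
    else:
        # First run, no backup yet: just make sure the directory exists.
        os.makedirs(ramdisk_dir, exist_ok=True)

def backup_rrds(ramdisk_dir, backup_dir):
    """'stop' action: copy the live rrds to disk, then swap them into place."""
    staging = backup_dir + ".new"  # hypothetical staging name
    if os.path.exists(staging):
        shutil.rmtree(staging)
    shutil.copytree(ramdisk_dir, staging)
    # Only drop the old backup once the new copy has finished.
    if os.path.exists(backup_dir):
        shutil.rmtree(backup_dir)
    os.rename(staging, backup_dir)
```

Copying into a staging directory first means an interrupted shutdown
leaves the previous backup in place instead of a half-written one.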

I had to edit grub.conf to adjust the size of the ramdisk.  By default
they're 64MB, but with an argument to the kernel start line, you can set
it to whatever size you need.  I chose 4x the current RRD directory, to
accommodate new hosts and more metrics.  It is unfortunate that a reboot
is required to change the size of the ramdisk.
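
The "4x the current RRD directory" rule is easy to compute.  The sketch
below adds up file sizes and prints a figure in KiB.  The function names
are made up, and the assumption that the kernel argument is
ramdisk_size= (which takes KiB) is mine, since the message does not name
it; check your kernel's documentation before pasting a number into
grub.conf.

```python
import os

def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total

def suggested_ramdisk_kib(rrd_dir, factor=4):
    """Round the directory size up to whole KiB, then apply the headroom factor."""
    kib = -(-dir_size_bytes(rrd_dir) // 1024)  # ceiling division
    return kib * factor

# Example (path from this message; ramdisk_size= is an assumed option name):
#   print("ramdisk_size=%d" % suggested_ramdisk_kib("/var/lib/ganglia/rrds"))
```

Filesystem overhead on the formatted ramdisk is not counted here, which
is one more reason to keep a generous factor.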

I also set up a cronjob to back up the rrds themselves every hour, but
unlike the folks so far, instead of rsyncing or keeping just one copy, I
keep 8 days' worth of hourly snapshots, so that if something goes wrong,
I can get back to a healthy snapshot.  (Note - I have never actually
used any snapshot further back than the most recent...  ;)  (Note2 - the
first version of this used 'find' to get anything >8d old, and it
started really tearing up the disk as the number of hosts/metrics grew.
Now I use perl to create the timestamp from 8 days ago and just rm the
directory.  This will leave a stale snapshot behind if the host is down
for more than an hour, but that's OK by me.)
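
The timestamp trick from Note2 can be sketched as follows.  This is in
Python rather than perl and is not the script from the website.  The
directory naming pattern, the function names, and the keep_days
parameter are all assumptions made for illustration.

```python
import os
import shutil
from datetime import datetime, timedelta

STAMP = "%Y%m%d-%H"  # hypothetical naming: one directory per hour

def snapshot_name(when):
    return when.strftime(STAMP)

def hourly_snapshot(rrd_dir, snap_root, now=None, keep_days=8):
    """Copy rrd_dir into this hour's snapshot and remove the one that just expired."""
    now = now or datetime.now()
    dest = os.path.join(snap_root, snapshot_name(now))
    shutil.copytree(rrd_dir, dest, dirs_exist_ok=True)
    # Compute the expired directory name directly instead of scanning with find.
    expired = os.path.join(
        snap_root, snapshot_name(now - timedelta(days=keep_days)))
    if os.path.isdir(expired):
        shutil.rmtree(expired)
    return dest
```

Because only one directory name is ever computed, a run that is skipped
never removes its counterpart from 8 days earlier, which is the
limitation mentioned above.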


The backup cronjob and new ramdisk start script are all available off my
website http://ben.hartshorne.net/ganglia/

-ben

[*] http://en.wikipedia.org/wiki/TMPFS

-- 
Ben Hartshorne
email: [EMAIL PROTECTED]
http://ben.hartshorne.net

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general
