Jason, disk I/O in gmetad is a big problem. It's one of the problems being addressed in ganglia3 (as you mention). Here are some of the ways we are going to reduce disk I/O.
gmetad2 currently polls all the data from each data source every n seconds. It creates round-robin databases that assume data will be coming in every n seconds. That's ugly because it means that even metrics that are only updated remotely every hour (for example) are being written to disk locally every n seconds. (To explain why I wrote it that way: I didn't have a library that could quickly save XML data from multiple sources into a common hierarchical data structure for easy manipulation... in a sense, a true XMLdb. The DOM libraries out there were not portable or fast enough, and still aren't. To get around that limitation I immediately parsed all XML, saved the data to RRDs, and then saved the *raw* XML for output on port 8651. I know... excuses, excuses. :))

I have written a basic XMLdb that can quickly collect/merge XML data from many data sources (and allow for quick summarizing and filtering). This will let us make gmetad3 a lot more efficient and intelligent.

gmetad3 will poll data from multiple data sources just like gmetad2, but with some huge differences. gmetad3 will immediately write the incoming data to its hierarchical data structure, which will allow us to create a custom RRD format for each individual metric: if a metric is only updated every hour, then its RRD will be set up to expect data once an hour.

gmetad3 will also be dealing with interactive data sources. Currently gmond2/gmetad2 are non-interactive and simply spit out all or nothing. Interactive data sources mean that gmetad3 will be able to request only "new" data: say gmetad3 polls every minute, it would pass 60 to the remote source, which would return all data that is <=60 seconds old. Another possibility is persistent connections.

gmetad3 will also use a delegation model instead of an aggregation model, which will allow the RRDs to be distributed instead of so centralized.

A few rough sketches of these ideas follow below.
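To make the XMLdb idea concrete, here is a minimal sketch of the merge in Python (just for illustration... the real code is C). The CLUSTER/HOST/METRIC elements and their NAME/VAL attributes follow the XML gmond already emits; the function names are made up:

import xml.etree.ElementTree as ET

def merge_source(db, xml_text):
    """Fold one gmond XML dump into a shared
    {cluster: {host: {metric: attributes}}} tree."""
    root = ET.fromstring(xml_text)
    for cluster in root.iter("CLUSTER"):
        hosts = db.setdefault(cluster.get("NAME"), {})
        for host in cluster.iter("HOST"):
            metrics = hosts.setdefault(host.get("NAME"), {})
            for metric in host.iter("METRIC"):
                metrics[metric.get("NAME")] = dict(metric.attrib)
    return db

def average(db, metric_name):
    """The kind of quick summarizing/filtering a true XMLdb makes cheap."""
    vals = [float(metrics[metric_name]["VAL"])
            for hosts in db.values()
            for metrics in hosts.values()
            if metric_name in metrics]
    return sum(vals) / len(vals) if vals else None

Once every source has been merged into one structure like this, summarizing and filtering become simple tree walks instead of re-parsing raw XML on every request.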
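The custom per-metric RRDs would then be driven by each metric's own collection interval. Sketched with the rrdtool Python bindings (the DS/RRA layout below is only an example, not a final schema):

import rrdtool

def create_rrd_for_metric(path, step_secs):
    """Create an RRD whose step matches the metric's own update
    interval, so an hourly metric is written hourly, not every n secs."""
    rrdtool.create(
        path,
        "--step", str(step_secs),
        # tolerate one missed update before the value goes unknown
        "DS:sum:GAUGE:%d:U:U" % (2 * step_secs),
        "RRA:AVERAGE:0.5:1:244",
        "RRA:AVERAGE:0.5:24:244",
    )

# a metric that only changes hourly gets an hourly step
create_rrd_for_metric("some_hourly_metric.rrd", 3600)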
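On the wire, the interactive polling could look something like this from gmetad3's side (the request format here, a bare maximum age in seconds, is hypothetical; none of this protocol is final):

import socket

def poll_fresh(host, port, max_age_secs):
    """Ask an interactive source for only the metrics updated within
    the last max_age_secs seconds, instead of an all-or-nothing dump."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(("%d\n" % max_age_secs).encode())  # hypothetical request
        chunks = []
        while True:
            data = sock.recv(8192)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()

# polling every minute, so ask only for data <= 60 seconds old
xml = poll_fresh("some.data.source", 8651, 60)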
If you have any questions or suggestions, or would like to be a part of the coding process, please let us know.

-- matt

Today, Jason A. Smith wrote forth saying...

> Since we have been slowly increasing the number of clusters and hosts
> that we are monitoring with ganglia, I have been watching closely how
> the gmetad host is handling the increased load, and experimenting with
> a few alternative locations to store the rrds.
>
> At first they were just on a partition made up of a pair of raided
> disks, which obviously didn't scale very far. Then I tried moving to a
> filesystem image mounted via the loopback device. I think this helped
> to aggregate the thousands of disk accesses into updates to a single
> file, but as the size grew to a few hundred megabytes total (several
> hundred nodes), the disk I/O started to cause too much of a load on
> the gmetad node. One advantage of this method is that it is fairly
> easy to set up, and the rrds are still being written to a physical
> disk, so you don't lose any data on a reboot.
>
> Then I tried experimenting with ramfs, since it is much easier to set
> up than a ramdisk, but I had a few problems with it that I suspect
> might be bugs in the Linux kernel. Since the databases were now stored
> only in RAM, I had a cronjob that would run every hour to back up the
> rrds directory. Occasionally a process (either gmetad or tar) would go
> into an uninterruptible sleep state, which would lock up the ramfs
> partition; I would be forced to reboot in order to continue collecting
> data. Also, it appeared that there was an almost 50% overhead in using
> ramfs (my 225MB rrds directory would consume about 337MB of RAM).
>
> Then I finally settled on a ramdisk, although it takes a little more
> effort to set up and use. The performance appears to be about the same
> as ramfs, without the lockups and the 50% overhead, which really helps
> to alleviate the disk I/O load on the gmetad node.
>
> So, how do other people handle their large database directories? Is
> everyone using a ramdisk, or has anyone used ramfs successfully? How
> different will things be in the upcoming ganglia3 release? Will the
> rrds be basically the same as they are now, or will there be major
> changes in that part of ganglia also?
>
> ~Jason
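PS: for anyone who wants to replicate the hourly backup cronjob Jason describes, a rough sketch in Python (the paths are made up; adjust for your own layout):

import os
import tarfile
import time

RRD_DIR = "/mnt/ramdisk/rrds"            # RAM-backed rrds directory
BACKUP_DIR = "/var/lib/ganglia/backups"  # on physical disk

def backup_rrds():
    """Snapshot the RAM-resident rrds to disk, so a reboot costs at
    most one backup interval of data."""
    os.makedirs(BACKUP_DIR, exist_ok=True)
    name = time.strftime("rrds-%Y%m%d-%H%M.tar.gz")
    with tarfile.open(os.path.join(BACKUP_DIR, name), "w:gz") as tar:
        tar.add(RRD_DIR, arcname="rrds")

if __name__ == "__main__":
    backup_rrds()  # run hourly from cron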

