On Wed, Aug 16, 2006 at 08:10:09AM +0200, Henrik Stoerner wrote:
> I am using a network/systems monitoring tool - Hobbit - which uses
> lots of RRD files for tracking all sorts of data. This works really
> well - kudos to Tobi.
>
> However, my main system for this currently has about 20.000 RRD files,
> all of which are updated every 5 minutes. So that's about 70 updates
> per second, and I can see that the amount of disk I/O happening on
> this server will become a performance problem soon, as more systems
> are added and hence more RRD files need updating.
I've been in a similar situation myself, doing 20-30 sec updates on 50k+ RRD files. The bottom line is that rrdtool is just not designed to do that, and it will go kicking and screaming into the night when you try to make it. The "typical user" is calling the rrdtool binary from a perl script, graphing a few dozen or at worst a few hundred items, and doesn't have a care in the world about the internal architecture. The situation I was trying to solve involved a constant stream of high resolution data across a large set of records, and relatively infrequent viewing of that data. It sounds like you're trying to do something similar.

Honestly, if all you care about is databasing, it would probably be easier to ditch RRD and use something else, or write your own db which is more efficient. But at the end of the day (for me anyways :P) rrdtool does the best job of producing pretty pictures that don't look like they came off of gnuplot or my EKG, and I'm in no mood to become a graphics person and re-invent the wheel.

So, probably your biggest issue is indeed thrashing the hell out of the disk if you just naively fire off a pile of forks and hope it all works out for the best. In my application I implemented a data write queue and a single thread per disk for dispatching rrd updates, which helps quite a bit. It really depends on your polling application as to how easy this is though.

Obviously a syscall to exec a shell to run the rrdtool binary every time scales to about nothing, and the API (if you can even call it that, I don't think (argc, argv) counts :P) to the rrdtool functions in C really and truly bites. If your application is in C and you can link directly to librrd, that's a quick and dirty fix for at least some of the evils. What really should happen is for that entire section of code to be gutted with a vengeance: split the text parsing code out of it, send it in the direction of the cli frontend, and develop an actual API for passing in data in a sensible format for other users who want to link to a C lib. This really isn't that difficult to do either.
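To make the "write queue plus a thread per disk, linked against librrd" idea a bit more concrete, here's a rough sketch. Treat it as a sketch under assumptions: it assumes a librrd new enough to ship the thread-safe rrd_update_r() entry point (link with -lrrd), and the queue/struct/function names are all invented for the example.

/* Rough sketch: one update queue per disk, one worker thread draining each
 * queue, updates going through librrd instead of fork/exec'ing the rrdtool
 * binary.  Assumes a librrd with the thread-safe rrd_update_r(); the names
 * (upd_queue, disk_worker, ...) are invented for this example. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <rrd.h>

struct upd_item {
    char file[256];                 /* path to the .rrd on this disk   */
    char args[128];                 /* update string, e.g. "N:42"      */
    struct upd_item *next;
};

struct upd_queue {                  /* one of these per physical disk  */
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    struct upd_item *head, *tail;
};

void queue_push(struct upd_queue *q, const char *file, const char *args)
{
    struct upd_item *it = calloc(1, sizeof(*it));
    snprintf(it->file, sizeof(it->file), "%s", file);
    snprintf(it->args, sizeof(it->args), "%s", args);
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = it; else q->head = it;
    q->tail = it;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* One of these threads per disk, so a given spindle only ever sees one rrd
 * update at a time and the backlog sits in memory instead of in the I/O queue. */
void *disk_worker(void *arg)
{
    struct upd_queue *q = arg;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (!q->head)
            pthread_cond_wait(&q->nonempty, &q->lock);
        struct upd_item *it = q->head;
        q->head = it->next;
        if (!q->head) q->tail = NULL;
        pthread_mutex_unlock(&q->lock);

        const char *upd_argv[] = { it->args };
        rrd_clear_error();
        if (rrd_update_r(it->file, NULL, 1, upd_argv) != 0)
            fprintf(stderr, "update %s failed: %s\n", it->file, rrd_get_error());
        free(it);
    }
    return NULL;
}

The pollers just call queue_push() with the filename and the update string, and you pthread_create() one disk_worker per spindle. If your librrd is too old for the _r functions you can fall back to the plain rrd_update(argc, argv) call, but then you probably have to serialize the calls yourself, since the non-_r entry points aren't thread-safe.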
The big daddy of performance suck is then going to be opening, closing, and seeking to the right spot in the files every time. Again, perfectly straightforward for very light scripty use, but using .rrd files as an indexing method for large datasets scales horribly. One thing you could do if you really wanted to scale this db format (since the updates are relatively simple compared to the graphing) is to write your own code to keep open handles on the files and do your own direct db access (roughly along the lines of the handle cache sketched at the end of this mail). This would be fairly effective up to a point; obviously there is a limit to the number of files you can keep open on your OS, but by the time you reach it you've probably crossed the threshold where looking at a different solution to replace rrd completely is worth your time again.

Of course, also make sure that your polling app isn't completely braindead, because you can do plenty of intelligent aggregation of datasources inside a single .rrd file.

One option I explored for doing 10 sec updates was to keep my .rrd files in a ram disk, and periodically sync to disk at whatever interval you want to save long term data (say for example 5 minutes, so you only lose 5 mins of data in the event of a failure). Of course the problem I ran into is that in addition to doing very high resolution short term data collection (it makes for really nice graphs of realtime data, honest :P), I'm storing a fair amount of long term data too.

This means that it is perfectly reasonable for a .rrd file to be large (say 500KB-1MB), but for only a few KB of the data per file to actually be touched on any given update interval. What you'd really be looking for out of a ram disk there is file/disk-backed storage and a really slow periodic flush of dirty blocks to disk, which is again probably more work than you should put into a hack around rrdtool.

Of course, if you can afford the ram in the first place to make all your data fit, you can just dd a raw image at the block level and get much less disk thrashing than accessing tens of thousands of small files. Or hell, you could always just throw more spindles at it, or throw a few more $500 linux PCs at it, what do I care. :)
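To make the "keep open handles" idea above a little more concrete, here's a very rough sketch of a handle cache. Only the caching shell is shown: what you'd actually write, and where, depends on the rrd header/RRA layout, so that part is a stub, and the names (the cache array, cache_get_fd, MAX_HANDLES) are all invented for the example.

/* Rough sketch of a cache of open file descriptors so updates don't pay the
 * open/close cost every interval.  The format-specific "write the new value
 * into the right RRA slot" step is left as a stub; names are invented. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_HANDLES 1024            /* keep this under the per-process fd limit */

struct handle {
    char path[256];
    int  fd;                        /* <= 0 means unused (static zero-init)     */
};

static struct handle cache[MAX_HANDLES];

/* Return an open fd for this .rrd, opening it on first use. */
int cache_get_fd(const char *path)
{
    int free_slot = -1;

    for (int i = 0; i < MAX_HANDLES; i++) {
        if (cache[i].fd > 0 && strcmp(cache[i].path, path) == 0)
            return cache[i].fd;     /* already open, no syscall churn */
        if (cache[i].fd <= 0 && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;                  /* cache full: evict something or give up */

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    snprintf(cache[free_slot].path, sizeof(cache[free_slot].path), "%s", path);
    cache[free_slot].fd = fd;
    return fd;
}

/* Sketch of an update that never reopens the file.  slot_offset stands in for
 * "wherever the RRD format says this data point lives" -- working that out is
 * the format-specific part that isn't shown here. */
int direct_update(const char *path, double value, off_t slot_offset)
{
    int fd = cache_get_fd(path);
    if (fd < 0)
        return -1;
    return pwrite(fd, &value, sizeof(value), slot_offset) == (ssize_t)sizeof(value) ? 0 : -1;
}

Obviously a linear scan over 50k entries per update isn't what you'd ship (hash it on the path), and none of the header/consolidation logic that rrd_update normally does for you is here -- this is just the "stop reopening the file every few seconds" part.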
-- 
Richard A Steenbergen <[EMAIL PROTECTED]>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F  CC1C 53AF 4C41 5ECA F8B1 2CBC)