Hi Tobi,

On Thu, Sep 04, 2008 at 09:57:02PM +0200, Tobias Oetiker wrote:
> Hi Scott,
>
> > In our experience on linux centos 4 and 5, rrdtool performance
> > problems seem primarily related to the rate at which the os flushes
> > the rrd updated pages in the filesystem cache to disk. Not so much to
> > the actual rate that the rrd_update function is called.
> >
> > There is a tip on the rrdtool web site under VM optimizations,
> >
> > By setting dirty_expire_centisecs to a high value (several steps),
> > while all rrd data fits into the cache, will cause your system to
> > bundle up several rounds of updates before writing the dirty buffers
> > back to disk.
> >
> > http://oss.oetiker.ch/rrdtool-trac/wiki/TuningRRD
> >
> > Wondering if any thought has been given to addressing the update
> > bottleneck problem at a lower level?
>
> in 1.3 we did address the problem of cache pollution by giving
> advice about the planned usage patterns to the OS. We also file
> access over to mmapedio. Apart from this I think the OS is the
> right place to optimize disk access.
>
> So as long as the rrdtool database format remains column oriented I
> don't see what optimization is left on that front ...

Maybe it would be worthwhile to consider an approach where rrd_update
calculates the rra values but caches them in application memory instead
of immediately dirtying pages in the os filesystem cache. If rrd_fetch
were able to read rra values out of this same application cache, it
would go a long way toward decoupling rrdtool from filesystem
performance.

This architecture differs from the current rrdcache work, which relies
on delaying calls to rrd_update. The latter is definitely simpler to
implement, but the former is more along the lines of the application
level caching that is used successfully by other database engines.

Thanks,
Scott B

_______________________________________________
rrd-developers mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-developers
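
To make the proposal above a bit more concrete, here is a minimal
sketch of the idea in Python. It is only an illustration under stated
assumptions, not rrdtool code: the class WriteBackRRACache, its methods,
the dict standing in for the on-disk rrd file, and the 300 second step
are all hypothetical. Updates are consolidated in application memory,
fetches are served from that same memory, and the backing store is only
touched when flush() is called.

```python
from collections import defaultdict


class WriteBackRRACache:
    """Hypothetical application-level write-back cache (not an rrdtool API)."""

    def __init__(self, store, step=300):
        self.store = store                # stand-in for the rrd file: {(name, slot): value}
        self.step = step                  # assumed seconds per consolidated slot
        self.pending = defaultdict(list)  # samples held in memory, not yet written
        self.disk_writes = 0              # counts writes to the backing store

    def _slot(self, ts):
        # Align a timestamp to the start of its consolidation slot.
        return ts - ts % self.step

    def update(self, name, ts, value):
        # Record the sample in application memory only; the store is untouched.
        self.pending[(name, self._slot(ts))].append(value)

    def fetch(self, name, ts):
        # Serve from the in-memory cache first, then fall back to the store.
        key = (name, self._slot(ts))
        if key in self.pending:
            samples = self.pending[key]
            return sum(samples) / len(samples)  # AVERAGE-style consolidation
        return self.store.get(key)

    def flush(self):
        # Write every pending slot in one batch.
        # Simplification: a slot is assumed to be complete when it is flushed,
        # so a later flush of the same slot would overwrite the stored value.
        for key, samples in self.pending.items():
            self.store[key] = sum(samples) / len(samples)
            self.disk_writes += 1
        self.pending.clear()


store = {}
cache = WriteBackRRACache(store, step=300)
cache.update("cpu", 1000, 10.0)
cache.update("cpu", 1100, 20.0)
cache.update("cpu", 1250, 30.0)
value_before_flush = cache.fetch("cpu", 1000)
writes_before_flush = cache.disk_writes
cache.flush()
```

In this sketch the three updates cost no writes to the store until
flush() runs, while fetch() can already return the consolidated value.
A real implementation would also have to deal with crash safety and
with slots that receive samples after they have been flushed.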
