Old thread I know, but I was wondering about a way to make compaction more fluid:
On Dec 21, 2009, at 23:20, Damien Katz wrote:

> I saw recently some issues people were having with compaction, and I
> thought I'd get some thoughts down about ways to improve the compaction
> code/experience.
>
> 1. Multi-process pipeline processing. Similar to the enhancements to the
> view indexing, there are opportunities for pipelining operations instead of
> the current read/write batch operations it does. This can reduce memory
> usage and make compaction faster.
>
> 2. Multiple disks/mount points. CouchDB could easily have 2 or more
> database dirs, and each time it compacts, it copies the new database file
> to another dir/disk/mountpoint. For servers with multiple disks this will
> greatly smooth the copying as the disk heads won't need to seek between
> reads and writes.
>
> 3. Better compaction algorithms. There are all sorts of clever things that
> could be done to make the compaction faster. Right now it rebuilds the
> database in a similar manner as if clients were bulk updating it. This was
> the simplest way to do it, but certainly not the fastest. There are a lot
> of ways to make this much more efficient, they just take more work.
>
> 4. Tracking wasted space. This can be used to determine a threshold for
> compaction. We don't need to track with 100% accuracy how much disk space
> is being wasted, but it would be a big improvement to at least know how
> much disk space the raw docs take, and maybe calculate an estimate of the
> indexes necessary to support them in a freshly compacted database.
>
> 5. Better low-level file driver support. Because we are using the Erlang
> built-in file system drivers, we don't have access to a lot of flags. If we
> had our own drivers, one option we'd like to use is to not OS-cache the
> reads and writes during compaction; it's unnecessary for compaction and it
> could completely consume the cache with rarely accessed data, evicting lots
> of recently used live data, greatly hurting performance of other databases.
>
> Anyway, just getting these thoughts out. More ideas and especially code
> welcome.

How about:

6. Store the databases in multiple files. Instead of one really big file, use
several big chunk-files of fixed maximum length. One chunk-file is "active"
and receives writes. Once that chunk-file grows past a certain size, for
example 25MB, start a new file. Then, at compaction time, you can do the
compaction one chunk-file at a time.

Possible optimization: if a certain chunk-file has no outdated documents (or
only a small %), leave it alone.

I'm armchair-programming here (rough sketches below), I have only a vague
idea of what the on-disk format looks like, but this could allow continuous
compaction, by only compacting (slowly) the completed chunk-files.

Furthermore, it would allow spreading the database across multiple disks
(since there are now multiple files per db), although one disk would still be
receiving all the writes. A smart write scheduler could make sure different
databases have different active disks. Possibly, multiple chunk-files could
be active at the same time, providing all sorts of interesting failure
scenarios ;-)

Thoughts?

Wout.
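P.S. To make the armchair-programming a bit more concrete, here is a very
rough Erlang sketch of the "active chunk-file" part: writes go to the current
chunk, and once it grows past the size limit a new chunk-file is started. The
module, function and record names and the 25MB constant are all invented for
illustration; I'm not claiming this is how couch_file works today.

%% Armchair sketch only: append writes to the active chunk-file and roll
%% over to a fresh chunk once it passes a size limit.
-module(chunk_writer).
-export([open/1, append/2]).

-define(CHUNK_LIMIT, 25 * 1024 * 1024).  %% hypothetical 25MB rollover size

-record(db, {dir, active_seq, active_fd}).

%% Open (or create) chunk 0 as the initial active chunk-file.
open(Dir) ->
    {ok, Fd} = file:open(chunk_name(Dir, 0), [append, raw, binary]),
    {ok, #db{dir = Dir, active_seq = 0, active_fd = Fd}}.

%% Append a blob to the active chunk; start a new chunk-file once the
%% current one has grown past the limit.
append(#db{active_fd = Fd} = Db, Bin) ->
    ok = file:write(Fd, Bin),
    {ok, Pos} = file:position(Fd, cur),
    case Pos >= ?CHUNK_LIMIT of
        true  -> roll_over(Db);
        false -> {ok, Db}
    end.

roll_over(#db{dir = Dir, active_seq = Seq, active_fd = OldFd}) ->
    ok = file:close(OldFd),
    NewSeq = Seq + 1,
    {ok, NewFd} = file:open(chunk_name(Dir, NewSeq), [append, raw, binary]),
    {ok, #db{dir = Dir, active_seq = NewSeq, active_fd = NewFd}}.

chunk_name(Dir, Seq) ->
    filename:join(Dir,
                  lists:flatten(io_lib:format("chunk_~6..0B.couch", [Seq]))).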
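And the "leave clean chunk-files alone" optimization could boil down to a
filter over per-chunk statistics, something like the sketch below. The stats
record and the 10% garbage threshold are made up, and actually getting those
numbers is more or less your point 4 (tracking wasted space).

%% Armchair sketch: pick only the completed chunk-files whose fraction of
%% outdated/garbage bytes makes rewriting worthwhile; skip the active chunk
%% and mostly-live chunks.
-module(chunk_compactor).
-export([chunks_to_compact/1]).

-define(GARBAGE_THRESHOLD, 0.10).  %% hypothetical: ignore chunks with <10% garbage

-record(chunk_stats, {path,              %% chunk-file on disk
                      total_bytes = 0,   %% size of the chunk-file
                      live_bytes = 0,    %% bytes used by current doc revisions
                      active = false}).  %% currently receiving writes?

%% Return the paths of the chunk-files worth compacting.
chunks_to_compact(Chunks) ->
    [C#chunk_stats.path || C <- Chunks,
                           not C#chunk_stats.active,
                           garbage_fraction(C) >= ?GARBAGE_THRESHOLD].

garbage_fraction(#chunk_stats{total_bytes = Total}) when Total =< 0 ->
    0.0;
garbage_fraction(#chunk_stats{total_bytes = Total, live_bytes = Live}) ->
    (Total - Live) / Total.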
