ways to improve compaction

Damien Katz Mon, 21 Dec 2009 14:20:57 -0800

I recently saw some issues people were having with compaction, and I thought 
I'd get some thoughts down about ways to improve the compaction code/experience.


1. Multi-process pipeline processing. As with the enhancements to view 
indexing, there are opportunities to pipeline operations instead of using the 
current read/write batch approach. This can reduce memory usage and make 
compaction faster.
2. Multiple disks/mount points. CouchDB could easily have 2 or more database 
dirs, and each time it compacts, it writes the new database file to another 
dir/disk/mountpoint. For servers with multiple disks this will greatly smooth 
the copying, as the disk heads won't need to seek between reads and writes.
3. Better compaction algorithms. There are all sorts of clever things that 
could be done to make compaction faster. Right now it rebuilds the database 
much as it would if clients were bulk updating it. This was the simplest way 
to do it, but certainly not the fastest. There are a lot of ways to make this 
much more efficient; they just take more work.
4. Tracking wasted space. This can be used to determine a threshold for 
compaction. We don't need to track with 100% accuracy how much disk space is 
being wasted, but it would be a big improvement to at least know how much disk 
space the raw docs take, and maybe calculate an estimate of the indexes 
necessary to support them in a freshly compacted database.
5. Better low-level file driver support. Because we are using the Erlang 
built-in file system drivers, we don't have access to a lot of flags. If we had 
our own drivers, one option we'd like to use is to bypass the OS cache for 
reads and writes during compaction. Caching is unnecessary for compaction, and 
it could completely consume the cache with rarely accessed data, evicting lots 
of recently used live data and greatly hurting performance of other databases.
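
To make point 1 a bit more concrete, here is a rough sketch in Python rather 
than Erlang. It is not CouchDB code: it uses two threads in place of Erlang 
processes, and the names pipelined_copy, read_docs, write_doc and 
max_in_flight are made up for illustration. The idea is only that a bounded 
queue lets reading and writing overlap while capping how many docs are held 
in memory at once.

```python
import queue
import threading

_DONE = object()  # sentinel telling the writer that the reader has finished


def pipelined_copy(read_docs, write_doc, max_in_flight=64):
    """Copy docs from read_docs() to write_doc() with reads and writes overlapped.

    read_docs:     hypothetical callable returning an iterable of live docs
    write_doc:     hypothetical callable that appends one doc to the new file
    max_in_flight: upper bound on docs buffered between reader and writer
    """
    buf = queue.Queue(maxsize=max_in_flight)
    errors = []

    def reader():
        try:
            for doc in read_docs():
                buf.put(doc)  # blocks when the writer falls behind
        except Exception as exc:
            errors.append(exc)
        finally:
            buf.put(_DONE)

    # daemon=True so a failing writer cannot leave the process hanging
    # on a reader that is blocked on a full queue.
    t = threading.Thread(target=reader, daemon=True)
    t.start()

    copied = 0
    while True:
        item = buf.get()
        if item is _DONE:
            break
        write_doc(item)
        copied += 1

    t.join()
    if errors:
        raise errors[0]
    return copied
```

Memory use is bounded by max_in_flight instead of by the batch size, and the 
reader can keep fetching while the writer is busy.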

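For point 4, the threshold check could be as simple as the following Python 
sketch. Again this is not CouchDB code. The function names, the 25% index 
overhead and the 50% threshold are all placeholder assumptions; the only 
input taken from the idea above is the number of bytes the raw docs occupy.

```python
def wasted_fraction(file_size, live_doc_bytes, index_overhead=0.25):
    """Estimate the fraction of the file that compaction could reclaim.

    file_size:      current size of the database file in bytes
    live_doc_bytes: bytes taken by the raw live docs
    index_overhead: assumed index size as a fraction of live_doc_bytes
                    (a guess, not a measured CouchDB figure)
    """
    if file_size <= 0:
        return 0.0
    estimated_compacted = live_doc_bytes * (1.0 + index_overhead)
    wasted = max(0.0, file_size - estimated_compacted)
    return wasted / file_size


def should_compact(file_size, live_doc_bytes, threshold=0.5,
                   index_overhead=0.25):
    """Return True when the estimated wasted fraction reaches the threshold."""
    return wasted_fraction(file_size, live_doc_bytes,
                           index_overhead) >= threshold
```

The estimate only has to be good enough to decide when compaction is worth 
running, so a rough index overhead figure is fine here.
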
Anyway, just getting these thoughts out. More ideas and especially code welcome.

-Damien
