On Dec 22, 2009, at 11:56 AM, Chris Anderson wrote:

> On Mon, Dec 21, 2009 at 2:20 PM, Damien Katz <[email protected]> wrote:
>> I saw recently some issues people were having with compaction, and I
>> thought I'd get some thoughts down about ways to improve the compaction
>> code/experience.
>>
>> 1. Multi-process pipeline processing. Similar to the enhancements to the
>> view indexing, there are opportunities to pipeline operations instead of
>> the current read/write batch operations. This can reduce memory usage and
>> make compaction faster.
>> 2. Multiple disks/mount points. CouchDB could easily have 2 or more
>> database dirs, and each time it compacts, it copies the new database file
>> to another dir/disk/mount point. For servers with multiple disks this
>> will greatly smooth the copying, as the disk heads won't need to seek
>> between reads and writes.
>> 3. Better compaction algorithms. There are all sorts of clever things
>> that could be done to make compaction faster. Right now it rebuilds the
>> database much as if clients were bulk updating it. This was the simplest
>> way to do it, but certainly not the fastest. There are a lot of ways to
>> make this much more efficient; they just take more work.
>> 4. Tracking wasted space. This can be used to determine the threshold for
>> compaction. We don't need to track with 100% accuracy how much disk space
>> is being wasted, but it would be a big improvement to at least know how
>> much disk space the raw docs take, and maybe calculate an estimate of the
>> indexes necessary to support them in a freshly compacted database.
>> 5. Better low-level file driver support. Because we are using the Erlang
>> built-in file system drivers, we don't have access to a lot of flags. If
>> we had our own drivers, one option we'd like to use is to not OS-cache
>> the reads and writes during compaction; caching is unnecessary for
>> compaction, and it could completely consume the cache with rarely
>> accessed data, evicting lots of recently used live data and greatly
>> hurting the performance of other databases.
>>
>> Anyway, just getting these thoughts out. More ideas and especially code
>> welcome.
>>
>> -Damien
>
> Another thing worth considering is that if we get block alignment
> right, then our copy-to-a-new-file compaction could end up working as
> compact-in-place on content-addressable filesystems. Most of the
> blocks won't change content, so the FS can just write new pointers to
> existing blocks, and then garbage collect unneeded blocks later. If we
> get the block alignment right...
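The pipelining in Damien's point 1 can be sketched as follows (in Python rather than CouchDB's Erlang; `read_doc_ids`, `read_doc`, and `write_doc` are hypothetical stand-ins, not CouchDB's actual API). Instead of reading a full batch and then writing it, a reader stage feeds a bounded queue that the writer drains concurrently, so reads and writes overlap and memory stays bounded:

```python
# Sketch only: a two-stage compaction pipeline (reader -> writer) using a
# bounded queue, replacing a batch read-then-write loop. All names here
# are hypothetical illustrations.
import queue
import threading

def pipeline_copy(read_doc_ids, read_doc, write_doc, depth=64):
    """Copy docs through a bounded queue so reads and writes overlap
    and at most `depth` docs are held in memory at once."""
    q = queue.Queue(maxsize=depth)
    DONE = object()  # sentinel marking the end of the stream

    def reader():
        for doc_id in read_doc_ids:
            q.put(read_doc(doc_id))  # blocks when the writer falls behind
        q.put(DONE)

    t = threading.Thread(target=reader)
    t.start()
    written = 0
    while True:
        doc = q.get()
        if doc is DONE:
            break
        write_doc(doc)
        written += 1
    t.join()
    return written
```

With in-memory stand-ins for the callbacks, the queue depth caps how many docs are buffered at any moment, which is the memory-usage win the point describes.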
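The wasted-space tracking in point 4 could be simple bookkeeping maintained on each update. A minimal sketch, assuming append-only updates and an arbitrary guessed index-overhead multiplier (none of these numbers or field names reflect CouchDB internals):

```python
# Sketch of waste tracking to drive a compaction threshold. The 1.2
# index-overhead factor is an illustrative guess, not a measured figure.
class WasteTracker:
    INDEX_OVERHEAD = 1.2  # guessed multiplier for btree/index space

    def __init__(self):
        self.file_size = 0       # physical bytes appended to the file
        self.live_doc_bytes = 0  # bytes of the current doc revisions

    def record_update(self, old_size, new_size, appended):
        # An append-only update writes `appended` bytes to the file and
        # replaces a doc body of old_size bytes with one of new_size bytes.
        self.file_size += appended
        self.live_doc_bytes += new_size - old_size

    def estimated_compacted_size(self):
        return self.live_doc_bytes * self.INDEX_OVERHEAD

    def should_compact(self, waste_threshold=0.5):
        if self.file_size == 0:
            return False
        waste = 1 - self.estimated_compacted_size() / self.file_size
        return waste >= waste_threshold
```

As the point says, this is only an estimate; it never has to be exact to be useful as a trigger.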
I think that requires rearranging the blocks, which means working below the filesystem level (essentially implementing our own file system), or using a file system that exposes raw file block management (does such a thing exist?). Would be cool though.

> Chris
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
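The block-alignment idea above can be illustrated concretely. A sketch, assuming a 4 KiB block size (nothing here reflects CouchDB's actual file format): if each record is padded to start on a block boundary, an unchanged doc occupies byte-identical blocks before and after a copy-style compaction, which is exactly what would let a content-addressable filesystem keep the old blocks and only rewrite pointers:

```python
# Sketch: write records padded to 4 KiB boundaries so identical records
# produce identical blocks regardless of what precedes them in the file.
BLOCK = 4096

def pack_aligned(records):
    """Serialize records, padding each to a whole number of blocks."""
    out = bytearray()
    for rec in records:
        out += rec
        out += b"\x00" * ((-len(rec)) % BLOCK)  # pad to next boundary
    return bytes(out)

def blocks(data):
    """The set of distinct 4 KiB blocks making up a file image."""
    return {data[i:i + BLOCK] for i in range(0, len(data), BLOCK)}

# Compacting away a dead record and rewriting the rest leaves every
# surviving record's blocks byte-identical, so a content-addressable FS
# could dedupe them. Without the per-record padding, the surviving
# records would shift and nearly every block would change.
```

The cost is the padding overhead, which is the trade-off behind "if we get the block alignment right."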
