On Dec 22, 2009, at 11:56 AM, Chris Anderson wrote:

> On Mon, Dec 21, 2009 at 2:20 PM, Damien Katz <[email protected]> wrote:
>> I saw recently some issues people where having with compaction, and I 
>> thought I'd get some thoughts down about ways to improve the compaction 
>> code/experience.
>> 
>> 1. Multi-process pipeline processing. Similar to the enhancements to the 
>> view indexing, there is opportunities for pipelining operations instead of 
>> the current read/write batch operations it does. This can reduce memory 
>> usage and make compaction faster.
>> 2. Multiple disks/mount points. CouchDB could easily have 2 or more database 
>> dirs, and each time it compacts, it copies the new database file to another 
>> dir/disk/mountpoint. For servers with multiple disks this will greatly 
>> smooth the copying as the disk heads won't need to seek between reads and 
>> writes.
>> 3. Better compaction algorithms. There are all sorts of clever things that 
>> could be done to make the compaction faster. Right now it rebuilds the 
>> database in a similar manner as if it would if it clients were bulk updating 
>> it. This was the simplest way to do it, but certainly not the fastest. There 
>> are a lot of ways to make this much more efficient, they just take more work.
>> 4. Tracking wasted space. This can be used to determine threshold for 
>> compaction. We don't  need to track with 100% accuracy how much disk space 
>> is being wasted, but it would be a big improvement to at least know how much 
>> disk space the raw docs take, and maybe calculate an estimate of the indexes 
>> necessary to support them in a freshly compacted database.
>> 5. Better Low level file driver support. Because we are using the Erlang 
>> built-in file system drivers, we don't have access to a lot of flags. If we 
>> had our own drivers, one option we'd like to use is to not OS cache the 
>> reads and write during the compaction, it's unnecessary for compaction and 
>> it could completely consume the cache with rarely accessed data, evicting 
>> lots of recently used live data, greatly hurting performance of other 
>> databases.
>> 
>> Anyway, just getting these thoughts out. More ideas and especially code 
>> welcome.
>> 
>> -Damien
> 
> Another thing worth considering, is that if we get block alignment
> right, then our copy-to-a-new-file compaction could end up working as
> compact-in-place on content-addressable filesystems. Most of the
> blocks won't change content, so the FS can just write new pointers to
> existing blocks, and then garbage collect unneeded blocks later. If we
> get the block alignment right...

I think that requires rearrangeing the blocks which means working below the FS 
system level (using what is essentially our own file system), or using a file 
system that exposes the raw file block mgmt (does such a thing exist?). Would 
be cool though.


> 
> Chris
> 
> -- 
> Chris Anderson
> http://jchrisa.net
> http://couch.io

Reply via email to