Hi,

I've noticed that compacting a large DB pretty much wipes out the
benefit of filesystem caching for CouchDB. I believe the problem is
that the OS (Linux, 2.6.21 kernel in my case) caches blocks from the
.compact file as they are written, even though those blocks won't be
read again until compaction has finished. In the meantime, the portion
of the cache holding the old DB file shrinks and performance really
suffers.

I think a better mode of operation would be to advise the OS not to
keep any portion of the .compact file in cache until we're ready to
replace the main DB. On Linux, passing the POSIX_FADV_DONTNEED advice
to posix_fadvise() seems like the way to go:

http://linux.die.net/man/2/posix_fadvise

This link has a little more detail and a usage example:

http://insights.oetiker.ch/linux/fadvise.html
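
To make the idea concrete, here is a minimal sketch of the call in
Python (chosen only for brevity; it is not a proposal for how CouchDB
should do it). The function name drop_file_cache is hypothetical. The
fsync() before the advice is my assumption about what is needed: as
far as I understand, the kernel will not drop dirty pages, so they
have to be written out first. The advice is only a hint, and the OS is
free to ignore it.

```python
import os


def drop_file_cache(path):
    """Advise the kernel to drop cached pages for `path` (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Dirty pages are not dropped, so flush them to disk first.
        # (fsync on a read-only descriptor works on Linux.)
        os.fsync(fd)
        # offset=0 and len=0 together mean "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The file contents are untouched; only the cached copies of its pages
may be released.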

Of course, posix_fadvise() isn't directly available from inside the
Erlang VM. Perhaps the simplest approach would be a helper process
that we spawn during compaction and that calls that function (or its
equivalent on a non-Linux OS) periodically? I'm not really sure, but I
wanted to get this out on the list for discussion.

Best,
Adam
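
P.S. A rough sketch of what the periodic helper could look like, again
in Python just for illustration. Everything here is an assumption on
my part: the function name, the interval, the should_stop callback,
and the idea that the helper can exit once the .compact file no longer
exists under that name. Note that the fsync() on every pass adds I/O
of its own, so the interval would need tuning.

```python
import os
import time


def advise_dontneed_periodically(path, interval_s=5.0, should_stop=lambda: False):
    """Repeatedly advise the kernel to drop cached pages of `path`.

    Hypothetical helper loop; returns the number of advice calls made.
    """
    calls = 0
    while not should_stop():
        try:
            fd = os.open(path, os.O_RDONLY)
        except FileNotFoundError:
            # Assumption: the file is gone or renamed, so compaction is over.
            break
        try:
            os.fsync(fd)  # dirty pages are not dropped, so flush first
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        finally:
            os.close(fd)
        calls += 1
        time.sleep(interval_s)
    return calls
```

The Erlang side would only need to spawn this with the path of the
.compact file and stop it when compaction completes.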