Oh yeah, one more option that is kind of crazy is to spawn a small
external child process for file I/O. It would be a very small, simple
process that opens a file and responds to read/write commands from the
Erlang server. Then we can implement exactly the low-level APIs and
caching behavior desired. The cost is extra IPC, but that should be
small compared to the cost of a blown file cache.
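A rough sketch of what such a helper might look like, written in Python
only to keep it short. Everything here is an assumption for
illustration: the names (serve, handle, read_frame, write_frame), the
one-byte R/W opcodes, and the K/E reply tags are made up. The 4-byte
big-endian length prefix is chosen to line up with what an Erlang port
opened with {packet, 4} would send and expect.

```python
import os
import struct

def read_frame(inp):
    """Read one length-prefixed frame; return None on EOF or a short read."""
    header = inp.read(4)
    if len(header) < 4:
        return None  # the parent closed the pipe
    (size,) = struct.unpack(">I", header)
    payload = inp.read(size)
    return payload if len(payload) == size else None

def write_frame(out, payload):
    """Write one frame with a 4-byte big-endian length prefix."""
    out.write(struct.pack(">I", len(payload)) + payload)
    out.flush()

def handle(fd, payload):
    """Run one hypothetical command against the open file descriptor."""
    op, body = payload[:1], payload[1:]
    if op == b"R":  # R <offset:u64> <length:u32>
        offset, length = struct.unpack(">QI", body)
        return b"K" + os.pread(fd, length, offset)
    if op == b"W":  # W <offset:u64> <data...>
        (offset,) = struct.unpack(">Q", body[:8])
        data = body[8:]
        done = 0
        while done < len(data):  # pwrite may write fewer bytes than asked
            done += os.pwrite(fd, data[done:], offset + done)
        return b"K" + struct.pack(">I", done)
    return b"E" + b"unknown command"

def serve(path, inp, out):
    """Open path once, then answer framed commands until inp hits EOF."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        while (payload := read_frame(inp)) is not None:
            try:
                reply = handle(fd, payload)
            except (OSError, struct.error) as exc:
                reply = b"E" + str(exc).encode()
            write_frame(out, reply)
    finally:
        os.close(fd)
```

A standalone helper would call serve(sys.argv[1], sys.stdin.buffer,
sys.stdout.buffer). Since the helper owns the file descriptor, this is
also where any OS-specific open flags or cache advice would go.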
-Damien
On Feb 27, 2009, at 1:23 PM, Damien Katz wrote:
The problem is we don't get access to the low-level APIs or flags
passed in to the OS unless Erlang chooses to expose them. We have
similar problems with compaction on Windows because we need special
flags to give us Unix file semantics.
To fix this, we'll either need the Erlang VM changed or use our own
Erlang file driver interface.
-Damien
On Feb 26, 2009, at 8:25 PM, Adam Kocoloski wrote:
Hi, I've noticed that compacting large DBs pretty much kills any
filesystem caching benefits for CouchDB. I believe the problem is
that the OS (Linux 2.6.21 kernel in my case) is caching blocks from
the .compact file, even though those blocks won't be read again
until compaction has finished. In the meantime, the portion of the
cache dedicated to the old DB file shrinks and performance really
suffers.
I think a better mode of operation would be to advise/instruct the
OS not to cache any portion of the .compact file until we're ready
to replace the main DB. On Linux, specifying the
POSIX_FADV_DONTNEED option to posix_fadvise() seems like the way to
go:
http://linux.die.net/man/2/posix_fadvise
This link has a little more detail and a usage example:
http://insights.oetiker.ch/linux/fadvise.html
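For reference, a minimal sketch of the call pattern, in Python because
its os module wraps posix_fadvise() directly on Unix. The function name
append_without_caching is made up for this example, and this is not how
CouchDB writes the .compact file. The fsync before the advice matters:
POSIX_FADV_DONTNEED only drops pages that are already clean, so dirty
pages have to be flushed first.

```python
import os

def append_without_caching(path, chunks):
    """Append chunks to path, advising the kernel to drop cached pages."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    written = 0
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while len(view) > 0:  # os.write may write fewer bytes than asked
                n = os.write(fd, view)
                view = view[n:]
                written += n
            # Flush dirty pages; DONTNEED leaves unwritten pages in the cache.
            os.fsync(fd)
            # Offset 0 with length 0 means "the whole file". The guard skips
            # platforms where Python does not expose posix_fadvise.
            if hasattr(os, "posix_fadvise"):
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return written
```

Syncing after every chunk is expensive, so in practice the advice would
probably be issued every N megabytes rather than on each write.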
Of course, POSIX_FADV_DONTNEED isn't really available from inside
the Erlang VM. Perhaps the simplest approach would be to have a
helper process that we can spawn which calls that function (or its
equivalent on a non-Linux OS) periodically during compaction? I'm
not really sure, but I wanted to get this out on the list for
discussion. Best,
Adam