On Oct 22, 2009, at 11:28 AM, Fabio Forno wrote:

On Thu, Oct 22, 2009 at 5:12 PM, Paul Davis <[email protected] > wrote:
Fabio,

There are about four things that will slow view generation down from
the _bulk_docs rate:

1. JSON conversion (twice) when passing data to the view process
2. Collation of keys on tree insertion
3. I/O (Disk and stdio)
4. Memory thresholds

Things like native views will give noticeable speed improvements
because they avoid JSON serialization and transfer over stdio. The
other (theoretically) tunable parameter is the memory threshold that
triggers flushes to disk. It's not currently configurable by the client
(it requires a rebuild of CouchDB), and as such I haven't seen anyone
attempt to tune it.
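
To make point 1 concrete, here is a rough Python sketch of the round trip a document takes when the map function runs in an external process, next to a "native" path that skips it. This is only an illustration, not CouchDB's actual code: the function names, the sample `type` field, and the exact message shapes are assumptions made for the example.

```python
import json

def map_fun(doc):
    """Hypothetical map function: emit (type, 1) for each document."""
    return [[doc.get("type"), 1]]

def index_via_json(docs):
    """Simulate an external view process: every document is encoded,
    decoded, mapped, and the result is encoded and decoded again."""
    rows = []
    for doc in docs:
        line = json.dumps(["map_doc", doc])      # conversion 1: database side encodes
        _cmd, received = json.loads(line)        # view process decodes
        reply = json.dumps([map_fun(received)])  # conversion 2: view process encodes
        rows.extend(json.loads(reply)[0])        # database side decodes
    return rows

def index_native(docs):
    """Simulate a native view: the map function sees the document directly."""
    rows = []
    for doc in docs:
        rows.extend(map_fun(doc))
    return rows
```

Both functions return the same rows; the first simply does two extra encode/decode passes per document. Timing them with `timeit` on your own documents should give a rough feel for how much of the indexing cost is serialization alone.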

Thanks for the answer. So I see that there is considerable margin
for improvement, because ideally index regeneration should be
bound by disk speed once all possible optimizations have kicked in
(except in some pathological situations, such as an application I have
which stores chunks of XML in document strings, requiring double
parsing in order to process them ;))
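
The double parsing mentioned above can be sketched roughly as follows in Python. The field name `payload`, the `<item>` element, and the map function itself are hypothetical and only stand in for whatever the real application stores.

```python
import json
import xml.etree.ElementTree as ET

def map_fun(doc):
    """Hypothetical map function: emit the text of each <item> in the embedded XML."""
    root = ET.fromstring(doc["payload"])  # second parse: XML held in a JSON string
    return [[item.text, doc["_id"]] for item in root.iter("item")]

# A document as it would arrive in serialized form (sample data for illustration).
raw = json.dumps({
    "_id": "doc1",
    "payload": "<list><item>a</item><item>b</item></list>",
})

doc = json.loads(raw)  # first parse: the JSON document itself
rows = map_fun(doc)    # the XML parse happens inside the map function
```

Every document pays for a JSON parse and then an XML parse before a single key is emitted, so this kind of workload stays CPU-bound regardless of how fast the disk is.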

bye

There are optimizations in trunk that get CouchDB closer to achieving this goal. Re-indexing does lots of random I/O, so you won't be seeing 30MB/s on spinning platters, but it's many times better than what we had in 0.9.

Best,

Adam
