Fundamentally, the issue is that updating a view is processing an
incoming, ordered list of changes, there's not much parallelism to be
had there. The reason sharding works is that you then have two smaller
lists which can be processed in parallel (but each still processed
serially). And, yes, my thought was that BigCouch will allow multiple
shards of the database on a single node, allowing parallel view builds
within that node.

Some care must be taken if we pursue optimizing a single view build.
In the context of a production system, with more than one database and
more than one view, all the cores get plenty of exercise. If every
task could run on all cores then I think you might hit other issues.

B.

On 11 May 2012 13:35, Dirkjan Ochtman <dirk...@ochtman.nl> wrote:
> On Fri, May 11, 2012 at 2:29 PM, Robert Newson <rnew...@apache.org> wrote:
>> Making a single view indexing process faster keeps coming up. For one
>> thing, it's not that easy, otherwise it would have been done by now.
>> For another, this problem vanishes when you shard (and the BigCouch
>> merge will bring this to CouchDB).
>
> What does sharding mean in this context? Running CouchDB/BigCouch on
> multiple servers, or just running multiple processes on a single box?
> If the latter, why can't we run multiple threads/Erlang processes
> within a single shard/OS process? If the former, that's kind of silly,
> in the sense that building indexes (at least for me) is CPU-bound but
> leaves many of the cores in my server idle.
>
> Cheers,
>
> Dirkjan

Reply via email to