Re: Lucene index creation using Hadoop

Ted Dunning Thu, 09 Jul 2009 09:58:32 -0700

Exactly as we do.

Also, I find that once a collection is large enough for speed to matter, we
have many more shards than reducers, so parallelism in indexing is nearly
perfect.
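
A rough way to see why the shard-to-reducer ratio matters is sketched below. It is only a toy model, not anything taken from our jobs: it assumes "reducers" means the reduce slots that can run at the same time, that each shard has a known build cost, and that a freed slot simply picks up the next shard. The names makespan and efficiency are hypothetical.

```python
import heapq


def makespan(shard_costs, num_slots):
    """Wall-clock time to build all shards when each shard is handed
    to whichever slot becomes free first (assumed scheduling model)."""
    slots = [0.0] * num_slots  # time at which each slot becomes free
    heapq.heapify(slots)
    for cost in shard_costs:
        free_at = heapq.heappop(slots)          # earliest free slot
        heapq.heappush(slots, free_at + cost)   # slot is busy until then
    return max(slots)


def efficiency(shard_costs, num_slots):
    """Fraction of total slot time spent doing useful work (1.0 is perfect)."""
    return sum(shard_costs) / (num_slots * makespan(shard_costs, num_slots))


if __name__ == "__main__":
    # Equal-cost shards on 4 slots; the shard counts are illustrative only.
    for n in (4, 5, 40, 41):
        print(n, round(efficiency([10] * n, 4), 3))
```

Under these assumptions, five equal shards on four slots leave most slots idle during a second wave (efficiency 0.625), while 41 shards on the same four slots come out at about 0.93. The leftover work in the last wave shrinks relative to the total as the shard count grows.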


On Thu, Jul 9, 2009 at 9:13 AM, Ken Krugler <[email protected]> wrote:

>
> We wind up with one index (shard) per reducer, so by controlling the number
> of reducers we can vary the shard count, down to a minimum count == the
> number of slaves in the processing cluster.
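
The one-shard-per-reducer arrangement quoted above can be pictured with a small sketch. This is a plain-Python illustration, not Hadoop code and not taken from either setup: shard_for and build_shards are hypothetical names, and CRC32 stands in for whatever hash the real partitioner uses. The idea is similar in spirit to hash-modulo partitioning, where a key's hash modulo the number of reduce tasks picks the reducer.

```python
import zlib
from collections import defaultdict


def shard_for(doc_key, num_reducers):
    """Pick the reducer (and therefore the shard) for a document key.
    CRC32 is an assumed stand-in for the real partitioner's hash."""
    return zlib.crc32(doc_key.encode("utf-8")) % num_reducers


def build_shards(doc_keys, num_reducers):
    """Group document keys by reducer; each group would become one index."""
    shards = defaultdict(list)
    for key in doc_keys:
        shards[shard_for(key, num_reducers)].append(key)
    return dict(shards)


if __name__ == "__main__":
    keys = ["doc-%d" % i for i in range(1000)]  # made-up document keys
    for num_reducers in (4, 16):
        shards = build_shards(keys, num_reducers)
        print(num_reducers, "reducers ->", len(shards), "shards")
```

Changing num_reducers is the only knob here, which matches the point in the quoted text that the shard count follows directly from the reducer count.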




-- 
Ted Dunning, CTO
DeepDyve
