[ 
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800947#comment-13800947
 ] 

Shikhar Bhushan commented on LUCENE-5299:
-----------------------------------------

bq. Could you describe a bit about the high level design changes?

There is an overview in this email under 'Idea': 
http://mail-archives.apache.org/mod_mbox/lucene-dev/201310.mbox/%3CCAE_Gd_dt6LY5T9r6ty%2B1j2xEbdr84OCPkU5swsQn10cbDt81Ew%40mail.gmail.com%3E

bq. In the benchmarks, is "par vs par" the before/after test? Ie baseline = 
current trunk, passed an ES to IndexSearcher, and then comp = with this patch, 
also passing ES to IndexSearcher?

Exactly, sorry that wasn't made clear.

bq. In general, I suspect fine grained parallelism is trickier / most costly 
then the "merge in the end" parallelism we have now. Typically collection is 
not a very costly part of the search ... and merging the results in the end 
should be a minor cost, that shrinks as the index gets larger.

"Typically collection is not a very costly part of the search" - I'm not sure 
that's true. Are you referring only to the work that happens inside a 
Collector, or to a broader definition of collection that includes scoring and 
potentially some degree of I/O? This change aims to parallelize the latter. 
Doing so requires refactoring the Collector API to cleanly separate the 
AtomicReader-level state from the composite state, in cases where the two 
differ.
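
To make the per-segment vs. composite split concrete, here is a minimal 
sketch. It is not the API in the patch: SegmentCollector, CountingCollector 
and ParallelSearchSketch are hypothetical names, and plain int[] arrays stand 
in for the docs matching in each AtomicReader. Each segment gets its own 
unsynchronized state, and only the merge step touches the composite state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical types for illustration only; not the interfaces from the patch.

/** Per-segment state: touched by exactly one task, so no synchronization. */
interface SegmentCollector {
    void collect(int doc);
    int result();
}

/** Composite state: hands out per-segment collectors and merges their results. */
final class CountingCollector {
    private int total = 0;

    SegmentCollector newSegmentCollector() {
        return new SegmentCollector() {
            private int count = 0;
            @Override public void collect(int doc) { count++; }
            @Override public int result() { return count; }
        };
    }

    /** Called once per finished segment, from the coordinating thread only. */
    void merge(SegmentCollector done) { total += done.result(); }

    int total() { return total; }
}

final class ParallelSearchSketch {
    /** Each int[] stands in for the doc IDs matching within one segment. */
    static int countHits(List<int[]> segments, ExecutorService pool) {
        CountingCollector composite = new CountingCollector();
        List<Future<SegmentCollector>> pending = new ArrayList<>();
        for (int[] docs : segments) {
            SegmentCollector sc = composite.newSegmentCollector();
            Callable<SegmentCollector> task = () -> {
                for (int doc : docs) sc.collect(doc);
                return sc;
            };
            pending.add(pool.submit(task));
        }
        for (Future<SegmentCollector> f : pending) {
            try {
                // Future.get() makes the task's writes visible to this thread.
                composite.merge(f.get());
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e); // simplified for the sketch
            }
        }
        return composite.total();
    }
}
```

With a two-thread pool and segments holding 3, 1 and 0 matching docs, 
countHits returns 4. The per-document collect() path takes no locks; the 
only coordination is waiting on each Future before merging.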

> Refactor Collector API for parallelism
> --------------------------------------
>
>                 Key: LUCENE-5299
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5299
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Shikhar Bhushan
>         Attachments: benchmarks.txt, LUCENE-5299.patch
>
>
> h2. Motivation
> We should be able to scale up better with Solr/Lucene by utilizing multiple 
> CPU cores, and not have to resort to scaling out by sharding (with all the 
> associated distributed system pitfalls) when the index size does not warrant 
> it.
> Presently, IndexSearcher has an optional constructor arg for an 
> ExecutorService, which gets used for searching in parallel for call paths 
> where one of the TopDocsCollector implementations is created internally. The 
> per-atomic-reader search happens in parallel and then the 
> TopDocs/TopFieldDocs results are merged with locking around the merge bit.
> However there are some problems with this approach:
> * If arbitrary Collector args come into play, we can't parallelize. Note that 
> even if ultimately results are going to a TopDocsCollector it may be wrapped 
> inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
> * The special-casing with parallelism baked on top does not scale: there are 
> many Collectors that could potentially lend themselves to parallelism, and 
> special-casing means the parallelization has to be re-implemented if a 
> different permutation of collectors is to be used.
> h2. Proposal
> A refactoring of collectors that allows for parallelization at the level of 
> the collection protocol. 
> Some requirements that should guide the implementation:
> * easy migration path for collectors that need to remain serial
> * the parallelization should be composable (when collectors wrap other 
> collectors)
> * allow collectors to pick the optimal solution (e.g. there might be memory 
> tradeoffs to be made) by advising the collector about whether a search will 
> be parallelized, so that the serial use-case is not penalized.
> * encourage use of non-blocking constructs and lock-free parallelism; 
> blocking is not advisable for the hot-spot of a search, and it also wastes 
> pooled threads.
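
As a rough illustration of the last two requirements in the proposal above 
(advising the collector whether the search will be parallelized, and 
preferring non-blocking constructs), here is a small sketch. HitCounter and 
HitCounters are hypothetical names, not part of Lucene or the patch; only 
java.util.concurrent.atomic.LongAdder is a real JDK class.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch; neither type below exists in Lucene.
interface HitCounter {
    void collect(int doc);
    long count();
}

final class HitCounters {
    private HitCounters() {}

    /** The caller advises up front whether collect() may run concurrently. */
    static HitCounter create(boolean parallel) {
        return parallel ? new ConcurrentCounter() : new SerialCounter();
    }

    /** Serial search: a plain field, so the serial case pays no extra cost. */
    private static final class SerialCounter implements HitCounter {
        private long count;
        @Override public void collect(int doc) { count++; }
        @Override public long count() { return count; }
    }

    /**
     * Parallel search: LongAdder spreads updates across internal cells to
     * reduce contention, at the cost of extra memory, and collect() never
     * blocks on a lock.
     */
    private static final class ConcurrentCounter implements HitCounter {
        private final LongAdder count = new LongAdder();
        @Override public void collect(int doc) { count.increment(); }
        @Override public long count() { return count.sum(); }
    }
}
```

A serial caller would use HitCounters.create(false) and a parallel one 
HitCounters.create(true); the calling code is otherwise identical, which is 
the kind of memory-vs-contention tradeoff the third bullet has in mind.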



--
This message was sent by Atlassian JIRA
(v6.1#6144)