[ 
https://issues.apache.org/jira/browse/LUCENE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575251#comment-17575251
 ] 

Adrien Grand commented on LUCENE-8675:
--------------------------------------

I wonder if we could avoid paying the cost of Scorer/BulkScorer initialization 
multiple times by implementing Cloneable on these classes, similarly to how we 
use cloning on IndexInputs to consume them from multiple threads. It would 
require implementing Cloneable on a few other classes, e.g. PostingsEnum, and 
maybe we'd need to set some restrictions to keep this feature reasonable, e.g. 
it's only legal to clone when the current doc ID is -1. But this could help 
parallelize collecting a single segment by assigning each clone its own range 
of doc IDs.

A downside of this approach is that it wouldn't help parallelize the 
initialization of Scorers, but I don't know if there is a way around it.
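
The cloning idea above could be sketched roughly as follows. This is only a 
toy illustration under stated assumptions: ToyIterator, countRange and 
countInParallel are hypothetical names, not Lucene classes or methods, and a 
sorted int array stands in for the state that is costly to initialize. The 
iterator is built once, cloning is only legal while the current doc ID is -1, 
and each clone collects its own range of doc IDs on a separate thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these types are real Lucene classes.
class CloneRangeSketch {

  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Toy stand-in for a Scorer/PostingsEnum over a sorted array of doc IDs. */
  static final class ToyIterator implements Cloneable {
    private final int[] docs; // shared and read-only, so clones may share it
    private int idx = -1;
    private int doc = -1;

    ToyIterator(int[] sortedDocs) {
      this.docs = sortedDocs;
    }

    int docID() {
      return doc;
    }

    int nextDoc() {
      idx++;
      doc = idx < docs.length ? docs[idx] : NO_MORE_DOCS;
      return doc;
    }

    /** Moves to the first doc ID that is greater than or equal to target. */
    int advance(int target) {
      int d = doc;
      while (d < target) {
        d = nextDoc();
      }
      return d;
    }

    /** Assumed restriction: cloning is only legal while the doc ID is -1. */
    @Override
    public ToyIterator clone() {
      if (doc != -1) {
        throw new IllegalStateException("clone is only legal before iteration starts");
      }
      try {
        return (ToyIterator) super.clone();
      } catch (CloneNotSupportedException e) {
        throw new AssertionError(e);
      }
    }
  }

  /** Counts matches in [minDoc, maxDoc) using the given iterator. */
  static int countRange(ToyIterator it, int minDoc, int maxDoc) {
    int count = 0;
    for (int d = it.advance(minDoc); d < maxDoc; d = it.nextDoc()) {
      count++;
    }
    return count;
  }

  /** Initializes one iterator, then gives each clone its own doc ID range. */
  static int countInParallel(int[] sortedDocs, int maxDoc, int slices) {
    ToyIterator prototype = new ToyIterator(sortedDocs); // initialized once, on this thread
    ExecutorService pool = Executors.newFixedThreadPool(slices);
    try {
      List<Future<Integer>> parts = new ArrayList<>();
      int step = (maxDoc + slices - 1) / slices;
      for (int s = 0; s < slices; s++) {
        int from = Math.min(maxDoc, s * step);
        int to = Math.min(maxDoc, from + step);
        ToyIterator clone = prototype.clone();
        Callable<Integer> task = () -> countRange(clone, from, to);
        parts.add(pool.submit(task));
      }
      int total = 0;
      for (Future<Integer> part : parts) {
        total += part.get();
      }
      return total;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    int[] postings = {0, 3, 4, 9, 15, 16, 22, 31};
    System.out.println(countInParallel(postings, 32, 4)); // prints 8
  }
}
```

In this sketch the prototype is still constructed on a single thread, so it 
has the same downside noted above: initialization itself is not parallelized, 
only the collection of each doc ID range.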

> Divide Segment Search Amongst Multiple Threads
> ----------------------------------------------
>
>                 Key: LUCENE-8675
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8675
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Atri Sharma
>            Priority: Major
>         Attachments: PhraseHighFreqP50.png, PhraseHighFreqP90.png, 
> TermHighFreqP50.png, TermHighFreqP90.png
>
>
> Segment search is a single-threaded operation today, which can be a 
> bottleneck for large analytical workloads that index a lot of data and run 
> complex queries touching multiple segments (imagine a composite query with a 
> range query and filters on top). This ticket is for discussing the idea of 
> splitting the search of a single segment across multiple threads based on 
> mutually exclusive document ID ranges.
> This will be a two-phase effort, the first phase targeting queries returning 
> all matching documents (collectors not terminating early). The second-phase 
> patch will introduce staged execution and will build on top of this patch.



