[ 
https://issues.apache.org/jira/browse/LUCENE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757614#comment-16757614
 ] 

Atri Sharma commented on LUCENE-8675:
-------------------------------------

Thanks for the comments.

A multi-shard approach makes sense, but a search is still bottlenecked by the 
largest segment it needs to scan. If there are many segments of that size, 
that can become a problem.

While I agree that range queries might not benefit directly from parallel 
scans, other queries (such as TermQuery) might benefit from a segment-parallel 
scan. In a typical Elasticsearch interactive query, we see spikes when a large 
segment is hit. Such cases can be optimized with parallel scans.

We need a way to decide whether a scan should be parallelized, and then let 
the execution operator get a set of nodes to execute. That is probably outside 
the scope of this JIRA, but I wanted to open this thread to get the 
conversation going.
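
To make the idea concrete, here is a minimal sketch of how a segment's doc ID 
space could be cut into mutually exclusive ranges, together with a simple rule 
for deciding whether to parallelize at all. Everything in it is an assumption 
for illustration: DocIdRangePartitioner, Range, sliceCount, partition and the 
MIN_DOCS_PER_SLICE threshold are hypothetical names and values, not existing 
Lucene APIs, and a real cut-off would have to come from benchmarking.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Lucene): slices one segment's doc ID space. */
final class DocIdRangePartitioner {

    /** A half-open doc ID range [start, end). */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Assumed cut-off below which a parallel scan is not considered worthwhile.
    static final int MIN_DOCS_PER_SLICE = 250_000;

    /** Returns how many slices to use; 1 means "scan the segment sequentially". */
    static int sliceCount(int maxDoc, int availableThreads) {
        if (maxDoc <= 0 || availableThreads <= 1) {
            return 1;
        }
        int bySize = maxDoc / MIN_DOCS_PER_SLICE;
        return Math.max(1, Math.min(availableThreads, bySize));
    }

    /** Splits [0, maxDoc) into contiguous, non-overlapping ranges of near-equal size. */
    static List<Range> partition(int maxDoc, int slices) {
        List<Range> ranges = new ArrayList<>();
        if (maxDoc <= 0) {
            return ranges;
        }
        int n = Math.max(1, Math.min(slices, maxDoc));
        int base = maxDoc / n;
        int remainder = maxDoc % n;
        int start = 0;
        for (int i = 0; i < n; i++) {
            // The first `remainder` slices take one extra document each.
            int size = base + (i < remainder ? 1 : 0);
            ranges.add(new Range(start, start + size));
            start += size;
        }
        return ranges;
    }
}
```

Each range could then be handed to its own task, with the per-range results 
merged afterwards. How that merge interacts with collectors is exactly the 
part that still needs discussion.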

> Divide Segment Search Amongst Multiple Threads
> ----------------------------------------------
>
>                 Key: LUCENE-8675
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8675
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Atri Sharma
>            Priority: Major
>
> Segment search is a single threaded operation today, which can be a 
> bottleneck for large analytical queries which index a lot of data and have 
> complex queries which touch multiple segments (imagine a composite query with 
> range query and filters on top). This ticket is for discussing the idea of 
> splitting a single segment into multiple threads based on mutually exclusive 
> document ID ranges.
> This will be a two phase effort, the first phase targeting queries returning 
> all matching documents (collectors not terminating early). The second phase 
> patch will introduce staged execution and will build on top of this patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]