[ 
https://issues.apache.org/jira/browse/LUCENE-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832382#comment-16832382
 ] 

Atri Sharma commented on LUCENE-8757:
-------------------------------------

[~simonw] Attached is an updated patch.

My two cents are that partitioning segments so that document counts stay 
balanced across slices is a more complex operation than what the slices API 
does today (and in this patch). Fair partitioning is a known hard problem 
(integer partitioning, for example).
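
To make the cost of "fair" concrete, here is a minimal sketch of a greedy 
heuristic (largest segment first, always into the currently lightest slice). 
It is not what the patch does and it is not guaranteed to be optimal; the 
class GreedySlicer, the Slice holder and the partition method are hypothetical 
names used only for illustration, and segments are reduced to plain document 
counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: not part of Lucene and not the algorithm in the patch.
public class GreedySlicer {

    /** Hypothetical holder: the doc counts assigned to one slice and their sum. */
    static final class Slice {
        final List<Integer> segmentDocCounts = new ArrayList<>();
        long totalDocs = 0;
    }

    /** Assigns each segment (largest first) to the slice with the fewest docs so far. */
    static List<Slice> partition(int[] docCounts, int numSlices) {
        if (numSlices < 1) {
            throw new IllegalArgumentException("numSlices must be >= 1");
        }
        // Sort segment sizes in descending order.
        Integer[] sorted = Arrays.stream(docCounts).boxed().toArray(Integer[]::new);
        Arrays.sort(sorted, Comparator.reverseOrder());

        // Min-heap keyed on the running doc total of each slice.
        PriorityQueue<Slice> heap =
            new PriorityQueue<>(Comparator.comparingLong((Slice s) -> s.totalDocs));
        List<Slice> slices = new ArrayList<>();
        for (int i = 0; i < numSlices; i++) {
            Slice s = new Slice();
            slices.add(s);
            heap.add(s);
        }

        for (int count : sorted) {
            Slice lightest = heap.poll();   // slice with the fewest docs so far
            lightest.segmentDocCounts.add(count);
            lightest.totalDocs += count;
            heap.add(lightest);             // re-insert with its updated total
        }
        return slices;
    }
}
```

Even this simple heuristic needs a sort plus one heap operation per segment, 
and it only approximates a balanced split.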

We should also consider how much bootstrap latency a more complex algorithm 
would add. Given that a user has the option of overriding IndexSearcher to 
supply their own slicing, I feel our default algorithm should do well on the 
common use case, but no more than that.
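
As a rough sketch of what a cheap "good enough" default could look like, the 
following groups segments, largest first, until a slice reaches a document 
budget or a segment cap. The class ThresholdSlicer and the parameters 
maxDocsPerSlice and maxSegmentsPerSlice are assumptions made for illustration, 
not values or names taken from the patch, and segments are again reduced to 
plain document counts. A user who wants something else could put their own 
logic in an IndexSearcher subclass instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: not part of Lucene and not the algorithm in the patch.
public class ThresholdSlicer {

    /**
     * Groups segment doc counts into slices. A slice is closed once adding the
     * next segment would exceed maxDocsPerSlice, or once it already holds
     * maxSegmentsPerSlice segments. A single oversized segment gets its own slice.
     */
    static List<List<Integer>> slices(int[] docCounts, int maxDocsPerSlice,
                                      int maxSegmentsPerSlice) {
        // Largest segments first, so big segments are isolated early.
        Integer[] sorted = Arrays.stream(docCounts).boxed().toArray(Integer[]::new);
        Arrays.sort(sorted, Comparator.reverseOrder());

        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentDocs = 0;

        for (int count : sorted) {
            boolean overBudget = currentDocs + count > maxDocsPerSlice;
            boolean full = current.size() >= maxSegmentsPerSlice;
            if (!current.isEmpty() && (overBudget || full)) {
                result.add(current);            // close the current slice
                current = new ArrayList<>();
                currentDocs = 0;
            }
            current.add(count);
            currentDocs += count;
        }
        if (!current.isEmpty()) {
            result.add(current);                // flush the last slice
        }
        return result;
    }
}
```

With this scheme small segments share a slice rather than each getting a 
dedicated thread, at the cost of one sort and one linear pass.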

Happy to discuss the alternatives.

> Better Segment To Thread Mapping Algorithm
> ------------------------------------------
>
>                 Key: LUCENE-8757
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8757
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
>         Attachments: LUCENE-8757.patch
>
>
> The current segment-to-thread allocation algorithm always allocates one 
> thread per segment. This is detrimental to performance when segment sizes 
> are skewed, since small segments also get their own dedicated thread. This 
> can lead to performance degradation due to context-switching overhead.
>  
> A better algorithm that is cognizant of size skew would perform better in 
> realistic scenarios.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)