[ 
https://issues.apache.org/jira/browse/LUCENE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791557#comment-16791557
 ] 

Adrien Grand commented on LUCENE-8688:
--------------------------------------

Thanks for iterating on this. I think there are still some issues with respect 
to not running the final merge when there are ongoing merges:
 - It feels wrong that the code block under the "This is the special case of 
merging down to one segment" comment runs _before_ we check whether the merge 
is a final merge. (Do we need this special case at all?)
 - If there are fewer than maxMergeAtOnceExplicit segments in the index, we are 
doing the right thing. But if there are e.g. maxMergeAtOnceExplicit + 3 segments 
in the index, maxSegmentCount is 2, and a merge is ongoing, then we will run 
one merge of maxMergeAtOnceExplicit segments and another one of 3 segments, 
which feels wrong: if there is an ongoing merge, we should only run merges of 
the maximum size, i.e. merges that either combine maxMergeAtOnceExplicit 
segments or create a segment that is close to the maximum segment size. 
(This is why the check for the final merge was done after the loop in prior 
versions of TieredMergePolicy.)
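
To make the second point concrete, here is a minimal sketch of the behavior I 
would expect. It is not the TieredMergePolicy code: the class and method names 
(ForcedMergeSketch, planMerges) are made up for illustration, segments are 
treated as interchangeable so only counts matter, and the "close to the 
maximum segment size" criterion is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, count-only model of forced-merge selection; not Lucene code.
final class ForcedMergeSketch {

    /**
     * Returns the number of segments in each merge that would be started now.
     * Merges picked in one round are disjoint: a segment is claimed at most once.
     */
    public static List<Integer> planMerges(int segmentCount, int maxSegmentCount,
                                           int maxMergeAtOnceExplicit, boolean mergeOngoing) {
        List<Integer> merges = new ArrayList<>();
        int available = segmentCount; // segments not yet claimed by a merge in this round
        int resulting = segmentCount; // segment count once the planned merges complete
        while (resulting > maxSegmentCount && available >= 2) {
            // Merging n segments into one lowers the count by n - 1.
            int needed = resulting - maxSegmentCount + 1;
            int size = Math.min(needed, Math.min(maxMergeAtOnceExplicit, available));
            if (size < 2) {
                break;
            }
            boolean fullSize = size == maxMergeAtOnceExplicit;
            if (mergeOngoing && !fullSize) {
                // Smaller, final merge: defer it until ongoing merges have finished.
                break;
            }
            merges.add(size);
            available -= size;
            resulting -= size - 1;
        }
        return merges;
    }

    public static void main(String[] args) {
        // maxMergeAtOnceExplicit = 30, so 33 segments is "maxMergeAtOnceExplicit + 3".
        System.out.println(planMerges(33, 2, 30, true));  // prints [30]
        System.out.println(planMerges(33, 2, 30, false)); // prints [30, 3]
    }
}
```

With an ongoing merge, only the 30-segment merge is started and the 3-segment 
merge waits for a later round. With fewer than maxMergeAtOnceExplicit segments 
and an ongoing merge, nothing is started, which matches the case we already 
handle correctly.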

> Forced merges merge more than necessary
> ---------------------------------------
>
>                 Key: LUCENE-8688
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8688
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-8688.patch, LUCENE-8688.patch, LUCENE-8688.patch
>
>
> A user reported some surprise after the upgrade to Lucene 7.5 due to changes 
> to how forced merges are selected when maxSegmentCount is greater than 1.
> Before 7.5 forceMerge used to pick up the least amount of merging that would 
> result in an index that has maxSegmentCount segments at most. Now that we 
> share the same logic as regular merges, we are almost sure to pick a 
> maxMergeAtOnceExplicit-segments merge (30 segments) given that merges that 
> have more segments usually score better. This is due to the fact that natural 
> merges assume that merges that run now save work for later, so the more 
> segments get merged, the better. This assumption doesn't hold for forced 
> merges that should run on read-only indices, so there won't be any future 
> merging.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
