[
https://issues.apache.org/jira/browse/LUCENE-8688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16776543#comment-16776543
]
Armin Braun commented on LUCENE-8688:
-------------------------------------
Gave this a shot in the attached patch:
* Basically brought back the old logic (pre LUCENE-7976) of simply collecting
as many of the smallest segments as possible ("possible" now including the max
segment size check).
** Made the trade-off of merging the smallest remaining segments to get to the
requested segment count.
** Technically, a smarter bin-packing algorithm could do better than this
trade-off in some cases, but the comment above described merging large segments
close to bin size with tiny segments as wasteful, so I didn't try that.
* Added a new rough test that checks that we arrive at the exact max segment
count and don't significantly exceed the max segment size.
** Size-wise, it's much stricter than the existing test.
[^LUCENE-8688.patch]
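
To make the selection idea above concrete, here is a minimal sketch of
"collect the smallest segments first, subject to a size cap". It is not the code
from the attached patch. The class name, method names, and parameters are
hypothetical, segments are modelled as plain byte sizes, and two behaviours are
assumptions of the sketch: the cap is exceeded only when a merge would otherwise
contain a single segment, and a merged result goes back into the pool so it can
be picked again in a later round.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative only: hypothetical names, not code from the attached patch. */
class SmallestFirstForceMerge {

    /**
     * Plans merges (each a list of segment sizes in bytes) until at most
     * maxSegmentCount segments remain, always picking the smallest segments first.
     */
    static List<List<Long>> planMerges(long[] segmentBytes, int maxSegmentCount, long maxMergedBytes) {
        if (maxSegmentCount < 1) {
            throw new IllegalArgumentException("maxSegmentCount must be >= 1");
        }
        // Min-heap: poll() always returns the smallest remaining segment.
        PriorityQueue<Long> smallestFirst = new PriorityQueue<>();
        for (long size : segmentBytes) {
            smallestFirst.add(size);
        }
        List<List<Long>> merges = new ArrayList<>();
        while (smallestFirst.size() > maxSegmentCount) {
            List<Long> merge = new ArrayList<>();
            long total = smallestFirst.poll();
            merge.add(total);
            // heap size + 1 is the segment count if this merge stopped growing now.
            while (smallestFirst.size() + 1 > maxSegmentCount) {
                long next = smallestFirst.peek();
                boolean fits = total + next <= maxMergedBytes;
                // Assumed trade-off: a merge needs at least two segments, so the
                // cap is ignored only for the second segment of a merge.
                if (!fits && merge.size() > 1) {
                    break;
                }
                smallestFirst.poll();
                merge.add(next);
                total += next;
            }
            // The merged result re-enters the pool (models successive merge rounds).
            smallestFirst.add(total);
            merges.add(merge);
        }
        return merges;
    }

    /** Segment count after applying the planned merges to an index with 'before' segments. */
    static int segmentCountAfter(int before, List<List<Long>> merges) {
        int count = before;
        for (List<Long> merge : merges) {
            count -= merge.size() - 1;
        }
        return count;
    }
}
```

For sizes {1, 2, 3, 40, 50}, a target of 3 segments, and a cap of 10, the sketch
plans a single merge of the three smallest segments and leaves the two large
ones untouched, which is the "no more merging than necessary" behaviour the
issue asks for.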
> Forced merges merge more than necessary
> ---------------------------------------
>
> Key: LUCENE-8688
> URL: https://issues.apache.org/jira/browse/LUCENE-8688
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-8688.patch
>
>
> A user reported some surprise after the upgrade to Lucene 7.5 due to changes
> to how forced merges are selected when maxSegmentCount is greater than 1.
> Before 7.5, forceMerge picked the smallest amount of merging that would
> leave the index with at most maxSegmentCount segments. Now that we
> share the same logic as regular merges, we are almost sure to pick a
> maxMergeAtOnceExplicit-segments merge (30 segments) given that merges that
> have more segments usually score better. This is due to the fact that natural
> merges assume that merges that run now save work for later, so the more
> segments get merged, the better. This assumption doesn't hold for forced
> merges, which should run on read-only indices where no future merging will
> happen.