[ 
https://issues.apache.org/jira/browse/LUCENE-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615379#comment-15615379
 ] 

Michael McCandless commented on LUCENE-7523:
--------------------------------------------

UIMP was really designed for one-off usage via the {{IndexUpgrader}} tool, but 
I agree it's interesting to consider making it a merge policy that also passes 
ordinary merging through to its wrapped merge policy.

It's a somewhat complex problem, though: if the merge policy is presented with 
an index that has N old segments and M new ones, and the index needs merging, 
how does it pick?  Would only {{forceMerge}} explicitly target old segments 
first?  Or would there just be an added bias to favor old ones, similar to how 
{{TieredMergePolicy}} (TMP) biases toward segments that have more deletions?
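
A rough sketch of what such a bias might look like, purely for illustration. 
The {{Seg}} class, the {{oldBias}} factor and the size-based score are all 
hypothetical; none of this is existing Lucene code, and a real policy would 
score whole candidate merges rather than single segments.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model for illustration only; these are not Lucene classes.
final class OldSegmentBiasSketch {

    static final class Seg {
        final String name;
        final double sizeMB;
        final boolean oldFormat; // true if written by an older index format

        Seg(String name, double sizeMB, boolean oldFormat) {
            this.name = name;
            this.sizeMB = sizeMB;
            this.oldFormat = oldFormat;
        }
    }

    // Lower score = more attractive to merge. Old-format segments have their
    // size multiplied by an assumed factor (oldBias < 1 favors them), so they
    // sort ahead of new segments of similar size.
    static double score(Seg s, double oldBias) {
        return s.oldFormat ? s.sizeMB * oldBias : s.sizeMB;
    }

    // Pick up to maxCount segment names for one merge, lowest score first.
    static List<String> pick(List<Seg> segs, int maxCount, double oldBias) {
        List<Seg> sorted = new ArrayList<>(segs);
        sorted.sort(Comparator.comparingDouble((Seg s) -> score(s, oldBias)));
        List<String> names = new ArrayList<>();
        for (Seg s : sorted.subList(0, Math.min(maxCount, sorted.size()))) {
            names.add(s.name);
        }
        return names;
    }
}
```

With {{oldBias}} set to 1.0 this degenerates to plain smallest-first 
selection, which is one way to see that the bias is an add-on rather than a 
separate code path.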

Maybe we just fold this behavior into TMP and remove UIMP?

bq.  That extra new segment could be quite a large 'monster' segment.

Maybe UIMP could have a {{maxMergedSegmentMB}} setting, like 
{{TieredMergePolicy}}?  Then UIMP would send to the wrapped merge policy only 
those segments whose total size is less than that limit, maybe?
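
Something along these lines, as a sketch only. The class and method names are 
made up, and the smallest-first greedy selection is just one assumed way of 
staying under the cap; segments left out would still need upgrading some other 
way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
final class SizeCappedUpgradeSketch {

    // Returns the names of segments to hand to the wrapped merge policy.
    // Segments are taken smallest first, and selection stops before the
    // running total would exceed maxMergedSegmentMB.
    static List<String> withinCap(Map<String, Double> sizesMB,
                                  double maxMergedSegmentMB) {
        List<Map.Entry<String, Double>> entries =
            new ArrayList<>(sizesMB.entrySet());
        entries.sort(Map.Entry.comparingByValue());

        List<String> chosen = new ArrayList<>();
        double totalMB = 0;
        for (Map.Entry<String, Double> e : entries) {
            if (totalMB + e.getValue() > maxMergedSegmentMB) {
                break; // adding this segment would exceed the cap
            }
            totalMB += e.getValue();
            chosen.add(e.getKey());
        }
        return chosen;
    }
}
```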

bq. UIMP.findMerges does not pass the mergeTrigger to the inner/delegate merge 
policy.

That seems like a bug to me.
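
The fix would presumably be plain pass-through delegation. Below is a minimal 
sketch of that pattern using stand-in types: {{Trigger}}, {{Policy}} and 
{{Wrapper}} are hypothetical simplifications, not the real Lucene signatures.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; not the Lucene API.
final class TriggerPassThroughSketch {

    enum Trigger { FLUSH, EXPLICIT, MERGE_FINISHED }

    interface Policy {
        List<String> findMerges(Trigger trigger, List<String> segments);
    }

    // A wrapping policy that forwards the caller's trigger unchanged, so the
    // delegate can tell why merges are being requested.
    static final class Wrapper implements Policy {
        private final Policy delegate;

        Wrapper(Policy delegate) {
            this.delegate = delegate;
        }

        @Override
        public List<String> findMerges(Trigger trigger, List<String> segments) {
            return delegate.findMerges(trigger, segments);
        }
    }
}
```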

> UpgradeIndexMergePolicy: beyond one-off use, monster segment avoidance
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-7523
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7523
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Christine Poerschke
>            Priority: Minor
>         Attachments: LUCENE-7523-outline.patch
>
>
> (Was looking at UpgradeIndexMergePolicy as part of SOLR-9648 and came up with 
> these possibilities here, what do people think?)
> Currently one probably would not configure use of the 
> {{UpgradeIndexMergePolicy}} (UIMP) permanently since 
> [findForcedMerges|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java#L74]
>  becomes a no-op after all segments have been upgraded.
> * How about adding an optional {{fallbackToInnerAfterUpgrade}} flag? That way 
> UIMP.findForcedMerges could fall back to its inner/delegate merge policy's 
> findForcedMerges call after all segments have been upgraded.
> Currently UIMP.findForcedMerges identifies all the segments to be upgraded 
> and then asks its inner/delegate merge policy to come up with a 
> MergeSpecification for those segments. If the inner/delegate merge policy 
> does not supply a merge for all the segments to be upgraded then UIMP merges 
> the remaining segments into _one_ new segment. That extra new segment could 
> be quite a large 'monster' segment.
> * How about adding an optional {{upgradeUnmergedSegmentsIndividually}} flag? 
> That way UIMP.findForcedMerges could upgrade (but not merge) the remaining 
> segments.
> * Or indeed should 'upgradeUnmergedSegmentsIndividually' be the default 
> behaviour?
> Noticed whilst looking at the code:
> * 
> [UIMP.findMerges|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/UpgradeIndexMergePolicy.java#L69]
>  does not pass the mergeTrigger to the inner/delegate merge policy.
> ** If we can figure out why that is, let's add a comment to say why that is.
> ** Understanding why that is would also be needed before proceeding with 
> beyond one-off use of UIMP.



