[ 
https://issues.apache.org/jira/browse/LUCENE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355352#comment-16355352
 ] 

Michael McCandless commented on LUCENE-8162:
--------------------------------------------

The class looks like a fork of TMP (TieredMergePolicy). Could it instead be 
done as a subclass, i.e. calling super.findMerges and returning null when it 
wants to throttle?  That would make it easier to see what logic it's 
changing.
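
To make the subclassing suggestion concrete, here is a minimal, Lucene-free sketch of the pattern. Everything in it is a hypothetical stand-in: BaseMergePolicySketch only mimics the shape of TieredMergePolicy, segments are plain strings, and the throttle condition is passed in as a BooleanSupplier. The real findMerges signature differs and varies between Lucene versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for TieredMergePolicy (NOT the Lucene class).
class BaseMergePolicySketch {
    // Returns the merges to run, or null when there is nothing to merge.
    List<String> findMerges(List<String> segments) {
        if (segments.size() < 2) {
            return null;
        }
        List<String> merges = new ArrayList<>();
        merges.add(String.join("+", segments)); // toy rule: merge everything
        return merges;
    }
}

// The subclass adds only the throttling decision; selection stays in the parent.
class ThrottlingMergePolicySketch extends BaseMergePolicySketch {
    private final BooleanSupplier shouldThrottle; // assumed hook, e.g. a rate check

    ThrottlingMergePolicySketch(BooleanSupplier shouldThrottle) {
        this.shouldThrottle = shouldThrottle;
    }

    @Override
    List<String> findMerges(List<String> segments) {
        if (shouldThrottle.getAsBoolean()) {
            return null; // suppress merging for now
        }
        return super.findMerges(segments); // unchanged parent logic
    }
}
```

With this shape the diff against the parent is only the throttle check, which is the readability benefit the comment is asking for.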

It seems to use docs/sec, not commit rate, right?  So if I index at a high rate 
but don't commit, the throttling logic can still kick in?

I think the logic is dangerous for general usage: it seems to throttle merges 
when the indexing rate is high?  This may work well for Oak usage, as long as 
the indexing rate sometimes drops, but many users index at full speed for a 
long time, and suppressing merges in that case is dangerous.

> Make it possible to throttle (Tiered)MergePolicy when commit rate is high
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-8162
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8162
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Tommaso Teofili
>            Priority: Major
>             Fix For: trunk
>
>
> As discussed in a recent mailing list thread [1] and observed in a project 
> using Lucene (see OAK-5192 and OAK-6710), it is sometimes helpful to throttle 
> the aggressiveness of (Tiered)MergePolicy when commit rate is high.
> In the case of Apache Jackrabbit Oak a dedicated {{MergePolicy}} was 
> implemented [2].
> That MP skips merging when the number of segments is below a certain 
> threshold (e.g. 30) and the commit rate (docs per sec and MB per sec) is high 
> (e.g. above 1000 docs/sec, 5 MB/sec).
> In that implementation, the commit rate thresholds adapt to the average 
> commit rate by means of single exponential smoothing.
> The results in that specific case looked encouraging, as it brought a 5% perf 
> improvement in querying and ~10% reduced IO. However, Oak has some specifics 
> which might not fit other scenarios. Anyway, it could be interesting to see 
> how this behaves in a plain Lucene scenario.
> [1] : [http://markmail.org/message/re3ifmq2664bqfjk]
> [2] : 
> [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/writer/CommitMitigatingTieredMergePolicy.java]
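
The throttling rule described in the quoted issue text can be sketched roughly as follows. This is not the Oak implementation: the class name, the method names, the smoothing factor alpha, and the choice to treat the rate as "high" when either the docs/sec or the MB/sec threshold is exceeded are all assumptions made for illustration. The only parts taken from the description are the segment-count cutoff, the two rate thresholds, and single exponential smoothing (new = alpha * observed + (1 - alpha) * old).

```java
// Hypothetical sketch of an adaptive commit-rate throttle (not Oak's code).
final class CommitRateThrottleSketch {
    private final int segmentCutoff;  // e.g. 30: at or above this, always allow merges
    private final double alpha;       // smoothing factor in (0, 1], an assumed parameter
    private double docsPerSecThreshold;
    private double mbPerSecThreshold;

    CommitRateThrottleSketch(int segmentCutoff, double initialDocsPerSec,
                             double initialMbPerSec, double alpha) {
        this.segmentCutoff = segmentCutoff;
        this.docsPerSecThreshold = initialDocsPerSec;
        this.mbPerSecThreshold = initialMbPerSec;
        this.alpha = alpha;
    }

    // Returns true when merging should be skipped for this observation.
    boolean observe(int segmentCount, double docsPerSec, double mbPerSec) {
        // Assumption: exceeding either threshold counts as a high rate.
        boolean rateIsHigh = docsPerSec > docsPerSecThreshold
                || mbPerSec > mbPerSecThreshold;
        boolean throttle = segmentCount < segmentCutoff && rateIsHigh;

        // Single exponential smoothing moves each threshold toward the observed rate.
        docsPerSecThreshold = alpha * docsPerSec + (1 - alpha) * docsPerSecThreshold;
        mbPerSecThreshold = alpha * mbPerSec + (1 - alpha) * mbPerSecThreshold;
        return throttle;
    }

    double docsPerSecThreshold() {
        return docsPerSecThreshold;
    }
}
```

For example, starting from thresholds of 1000 docs/sec and 5 MB/sec with alpha = 0.5, an observation of 2000 docs/sec at 10 segments is throttled and raises the docs/sec threshold to 1500, so a following observation of 1200 docs/sec is no longer treated as high.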



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

