[ 
https://issues.apache.org/jira/browse/LUCENE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492469#comment-16492469
 ] 

Tommaso Teofili commented on LUCENE-8162:
-----------------------------------------

{quote}but many users index at full speed for a long time and suppressing 
merges in that case is dangerous
{quote}
Yes, that might degrade search performance. To mitigate that, the proposed MP 
caps the number of segments up to which throttling is allowed. For example, if 
throttling pushes the number of segments beyond a configurable threshold 
(e.g. 20), throttling does not kick in for subsequent merges until the number 
of segments drops back below the threshold.
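
To make the cap concrete, here is a minimal sketch of that check. It is not 
the code from the attached patch: the class {{ThrottleGate}}, its fields and 
the method {{shouldSuppressMerges}} are hypothetical names, and a single 
docs-per-second threshold stands in for the commit rate checks.

```java
/** Hypothetical sketch of the segment-count cap; not the code from the patch. */
final class ThrottleGate {

    // Assumed configurable cap (e.g. 20): above it, throttling is switched off.
    private final int maxThrottledSegments;

    // Assumed commit rate threshold, in docs per second.
    private final double docsPerSecThreshold;

    ThrottleGate(int maxThrottledSegments, double docsPerSecThreshold) {
        this.maxThrottledSegments = maxThrottledSegments;
        this.docsPerSecThreshold = docsPerSecThreshold;
    }

    /** Returns true if merges should be suppressed for the current round. */
    boolean shouldSuppressMerges(int segmentCount, double docsPerSec) {
        if (segmentCount > maxThrottledSegments) {
            // Too many segments: never throttle, so regular merging can
            // bring the count back down before throttling resumes.
            return false;
        }
        // Otherwise throttle only while the commit rate is high.
        return docsPerSec > docsPerSecThreshold;
    }
}
```

Because the check depends only on the current segment count, throttling 
resumes by itself once merges have brought the count back under the cap.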

I have been trying to use [https://github.com/mikemccand/luceneutil] to run 
some benchmarks. However, it seems the tool only creates one index per 
benchmark. 

> Make it possible to throttle (Tiered)MergePolicy when commit rate is high
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-8162
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8162
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Tommaso Teofili
>            Priority: Major
>             Fix For: trunk
>
>         Attachments: LUCENE-8162.0.patch
>
>
> As discussed in a recent mailing list thread [1] and observed in a project 
> using Lucene (see OAK-5192 and OAK-6710), it is sometimes helpful to throttle 
> the aggressiveness of (Tiered)MergePolicy when commit rate is high.
> In the case of Apache Jackrabbit Oak a dedicated {{MergePolicy}} was 
> implemented [2].
> That MP doesn't merge when the number of segments is below a certain 
> threshold (e.g. 30) and the commit rate (docs per sec and MB per sec) is high 
> (e.g. above 1000 docs/sec, 5 MB/sec).
> In that implementation, the commit rate thresholds adapt to the average 
> commit rate by means of single exponential smoothing.
> The results in that specific case looked encouraging, as it brought a 5% perf 
> improvement in querying and ~10% reduced IO. However, Oak has some specifics 
> which might not fit other scenarios. Anyway, it could be interesting to see 
> how this behaves in a plain Lucene scenario.
> [1] : [http://markmail.org/message/re3ifmq2664bqfjk]
> [2] : 
> [https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/writer/CommitMitigatingTieredMergePolicy.java]
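
The single exponential smoothing mentioned in the issue description can be 
sketched as follows. This is an illustration only, not the Oak implementation 
linked in [2]: the class {{SmoothedRate}} and the smoothing factor {{alpha}} 
are assumed names, and seeding the average with the first observation is one 
common choice among several.

```java
/** Hypothetical sketch of single exponential smoothing of a commit rate. */
final class SmoothedRate {

    private final double alpha;   // smoothing factor in (0, 1]; assumed parameter
    private double smoothed;      // current smoothed average
    private boolean initialized;  // false until the first observation arrives

    SmoothedRate(double alpha) {
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Feeds one observed rate (e.g. docs per sec) and returns the new average. */
    double update(double observed) {
        if (!initialized) {
            // Seed the average with the first observation.
            smoothed = observed;
            initialized = true;
        } else {
            // s_t = alpha * x_t + (1 - alpha) * s_(t-1)
            smoothed = alpha * observed + (1.0 - alpha) * smoothed;
        }
        return smoothed;
    }
}
```

A larger {{alpha}} makes the average follow recent commits more closely, while 
a smaller one reacts more slowly to bursts.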


