[ 
https://issues.apache.org/jira/browse/LUCENE-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489726#comment-16489726
 ] 

David Smiley commented on LUCENE-8331:
--------------------------------------

CC [~mikemccand] [~simonw] [~erickerickson]

I used this utility (with some other edits not in this patch) to evaluate a 
custom merge policy that had a notion of "cheap" merges.  It turned out to be 
very successful; I may open other issues about ways TieredMergePolicy and/or 
the MergeScheduler can be improved.

The main features of this simulator are:
* doesn't require actual indexing and is thus super-fast
* calculates useful stats like the average number of segments and the average 
write amplification factor.
* provides a random sequence of flushed segment sizes that can be controlled in 
a couple of ways to make it more/less realistic depending on your environment
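
To make the idea concrete, here is a minimal, self-contained sketch of this 
kind of simulation.  It is not the code in the patch and it does not use any 
Lucene classes: the class MergeSimSketch, the 1-5 MB flush size range, and the 
naive "merge a tier once it holds mergeFactor segments" rule are all 
assumptions made for illustration.  Write amplification is taken here to mean 
(bytes flushed + bytes rewritten by merges) / bytes flushed, which is one 
common definition and may differ from what the patch computes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy merge simulation that tracks segment sizes only; nothing is indexed. */
public class MergeSimSketch {

  /** Summary statistics of one simulation run. */
  public static final class Stats {
    public final double avgSegmentCount;
    public final double writeAmplification;

    Stats(double avgSegmentCount, double writeAmplification) {
      this.avgSegmentCount = avgSegmentCount;
      this.writeAmplification = writeAmplification;
    }
  }

  /**
   * Simulates {@code flushes} flushed segments of random size and merges them
   * with a naive tiered rule (a hypothetical stand-in for a real MergePolicy).
   */
  public static Stats simulate(int flushes, int mergeFactor, long seed) {
    if (flushes < 1 || mergeFactor < 2) {
      throw new IllegalArgumentException("need flushes >= 1 and mergeFactor >= 2");
    }
    Random random = new Random(seed);
    List<List<Long>> tiers = new ArrayList<>();
    tiers.add(new ArrayList<>());
    long flushedBytes = 0;
    long mergedBytes = 0;
    long segmentCountSum = 0;

    for (int i = 0; i < flushes; i++) {
      // Assumed flush size: uniformly random between 1 MB and 5 MB.
      long size = 1_000_000L + random.nextInt(4_000_000);
      tiers.get(0).add(size);
      flushedBytes += size;

      // Cascade upward: a full tier is merged into one segment of the next tier.
      for (int t = 0; t < tiers.size() && tiers.get(t).size() >= mergeFactor; t++) {
        long merged = 0;
        for (long s : tiers.get(t)) {
          merged += s;
        }
        tiers.get(t).clear();
        if (t + 1 == tiers.size()) {
          tiers.add(new ArrayList<>());
        }
        tiers.get(t + 1).add(merged);
        mergedBytes += merged; // every merged byte is written again
      }

      // Sample the live segment count after each flush to compute the average.
      for (List<Long> tier : tiers) {
        segmentCountSum += tier.size();
      }
    }

    double avgSegments = (double) segmentCountSum / flushes;
    double writeAmp = (double) (flushedBytes + mergedBytes) / flushedBytes;
    return new Stats(avgSegments, writeAmp);
  }
}
```

Calling MergeSimSketch.simulate(1000, 10, 42L) returns both stats for one 
run; since only long values are shuffled around, even large runs finish 
almost instantly.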

Some not-so-great parts:
* does not yet handle deletes!
* tweaking the configuration of the merge policy under test and varying the 
inputs are manual affairs, done by editing main() and/or makeMergePolicy().  I 
added some System property overrides though, and some basic args parsing.  
It's probably not realistic to expect much better given that this is meant 
for experimentation.
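
For illustration only, a setting that can come from a command-line argument, 
a System property, or a hard-coded default could be resolved roughly as 
below.  The class SimConfigSketch, the helper intSetting(), and the property 
name "sim.flushes" are hypothetical and are not taken from the patch; 
Integer.getInteger is the standard JDK call for reading an int-valued System 
property with a default.

```java
/** Hypothetical helper for resolving simulator settings; not from the patch. */
public class SimConfigSketch {

  /**
   * Returns the positional argument at argIndex if present, else the value of
   * the System property propertyName, else defaultValue.
   */
  public static int intSetting(String[] args, int argIndex,
                               String propertyName, int defaultValue) {
    if (argIndex < args.length) {
      return Integer.parseInt(args[argIndex]);
    }
    // Falls back to defaultValue when the property is unset or not a number.
    return Integer.getInteger(propertyName, defaultValue);
  }
}
```

With this, a run started with -Dsim.flushes=5000 and no arguments would use 
5000, while passing a first argument would take precedence over the property.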

What do you think, guys?

> MergePolicy simulator utility
> -----------------------------
>
>                 Key: LUCENE-8331
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8331
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: LUCENE-8331.patch
>
>
> This issue introduces a MergePolicy simulator utility to help evaluate the 
> effectiveness of a MergePolicy.  The simulator does not result in the actual 
> indexing and merging of segments; instead it provides some dummy constructs 
> to MergePolicy to evaluate its decisions.  Therefore you can do simulation 
> runs in little time.
> I'm not sure where it would live.  Perhaps dev-tools, or in tests, or in 
> benchmark?
> I mentioned this recently here:
> https://issues.apache.org/jira/browse/LUCENE-7976?focusedCommentId=16446985&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16446985
>  



