[jira] [Commented] (LUCENE-8331) MergePolicy simulator utility

2018-05-29 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493480#comment-16493480
 ] 

David Smiley commented on LUCENE-8331:
--

bq. Can it just be a utility class that I call from a test or so? I mean, I am 
not sure how user-friendly it is to specify classpaths etc. I'd just run it from 
a test.

Ooh, ok.  FWIW what I do is simply right-click the main method and tell my IDE 
to run it.  It fails the first go-round because it needs args so then I update 
the args.  Since it's on the test classpath and run from my IDE, there's no 
issue.  I expect others can just run it similarly?  Documentation could spell 
this out!  Why would a test call this?  To assert that the stats are "good"?

{quote}I think it should support deletes and should not use IW; then I am OK with it.
{quote}
Sure thing – now made possible with LUCENE-8330.  I'll work on this.

> MergePolicy simulator utility
> -
>
> Key: LUCENE-8331
> URL: https://issues.apache.org/jira/browse/LUCENE-8331
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Major
> Attachments: LUCENE-8331.patch
>
>
> This issue introduces a MergePolicy simulator utility to help evaluate the 
> effectiveness of a MergePolicy.  The simulator does not result in the actual 
> indexing and merging of segments; instead it provides some dummy constructs 
> to MergePolicy to evaluate its decisions.  Therefore you can do simulation 
> runs in little time.
> I'm not sure where it would live.  Perhaps dev-tools, or in tests, or in 
> benchmark?
> I mentioned this recently here:
> https://issues.apache.org/jira/browse/LUCENE-7976?focusedCommentId=16446985=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16446985
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8331) MergePolicy simulator utility

2018-05-29 Thread Tommaso Teofili (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493290#comment-16493290
 ] 

Tommaso Teofili commented on LUCENE-8331:
-

bq. I think it should support deletes and should not use IW; then I am OK with it.
 
+1




[jira] [Commented] (LUCENE-8331) MergePolicy simulator utility

2018-05-29 Thread Simon Willnauer (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493222#comment-16493222
 ] 

Simon Willnauer commented on LUCENE-8331:
-

{quote}How else would something like this be executed? Maybe I don't understand 
your subsequent recommendation...{quote}

Can it just be a utility class that I call from a test or so? I mean, I am not 
sure how user-friendly it is to specify classpaths etc. I'd just run it from a 
test. I also think it's way more flexible to have a Java API to call rather 
than some command-line args you need to parse.
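
As a rough sketch of what "a Java API rather than cmd args" could look like: the merge decision is passed in as a small interface, so a test can plug in a policy-like lambda and inspect the result directly. All names here (SimulationApiSketch, MergeChooser, simulate) are hypothetical and are not taken from the patch or from Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical API shape for a simulator that is called from code instead of a command line. */
class SimulationApiSketch {

  /** Hypothetical plug-in point: given the current segment sizes, return the indexes to merge. */
  interface MergeChooser {
    List<Integer> choose(List<Long> segmentSizes);
  }

  /**
   * Replays the flushed segment sizes in order. After each flush the chooser may
   * pick segments to merge into one; fewer than two picks means "no merge".
   * Returns the segment sizes left at the end of the run.
   */
  static List<Long> simulate(long[] flushSizes, MergeChooser chooser) {
    List<Long> segments = new ArrayList<>();
    for (long size : flushSizes) {
      segments.add(size);
      // Hand the chooser a copy so it cannot modify the simulator's state.
      List<Integer> picked = chooser.choose(new ArrayList<>(segments));
      if (picked.size() < 2) {
        continue;
      }
      long merged = 0;
      List<Long> remaining = new ArrayList<>();
      for (int i = 0; i < segments.size(); i++) {
        if (picked.contains(i)) {
          merged += segments.get(i);
        } else {
          remaining.add(segments.get(i));
        }
      }
      remaining.add(merged);
      segments = remaining;
    }
    return segments;
  }

  public static void main(String[] args) {
    // A toy rule: once three segments exist, merge the first three.
    MergeChooser mergeFirstThree =
        sizes -> sizes.size() >= 3
            ? Arrays.asList(0, 1, 2)
            : Collections.<Integer>emptyList();
    System.out.println(simulate(new long[] {5, 3, 2, 4}, mergeFirstThree));
  }
}
```

A test could call simulate(...) with whatever chooser it wants and assert on the returned list, with no classpath or argument handling involved.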

 

{quote}Are you basically fine with me committing this?{quote}

I think it should support deletes and should not use IW; then I am OK with it.




[jira] [Commented] (LUCENE-8331) MergePolicy simulator utility

2018-05-28 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492919#comment-16492919
 ] 

David Smiley commented on LUCENE-8331:
--

Thanks for your input, Simon.

bq.  I am not sure it needs to be a commandline util.

How else would something like this be executed?  Maybe I don't understand your 
subsequent recommendation...

bq. I would rather build the individual tools to plug stuff together as an API 
and put most of the utils like creating the simulated segments into the base 
tests class.

I may not be getting your point but I think you're saying you'd like Lucene's 
test infrastructure to have _some_ of the elements of what this test does.  
Sounds good to me.  Nevertheless the outcome of that would be less code in this 
simulator... but somewhere there needs to be a main() to literally run the 
simulation and set up whatever the simulated environment is, and code to track 
some stats of interest.  Right?

Are you basically fine with me committing this?




[jira] [Commented] (LUCENE-8331) MergePolicy simulator utility

2018-05-24 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490270#comment-16490270
 ] 

Simon Willnauer commented on LUCENE-8331:
-

I think something like this can be helpful if you are working on an MP and/or 
trying to debug an issue. I am not sure it needs to be a commandline util. I 
would rather build the individual tools to plug stuff together as an API and 
put most of the utils like creating the simulated segments into the base tests 
class. I was going to do something similar to make testing simpler. I like the 
idea. LUCENE-8330 will help with this as well.




[jira] [Commented] (LUCENE-8331) MergePolicy simulator utility

2018-05-24 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489726#comment-16489726
 ] 

David Smiley commented on LUCENE-8331:
--

CC [~mikemccand] [~simonw] [~erickerickson]

I used this utility (with some other edits not in this patch) to evaluate a 
custom merge policy that had a notion of "cheap" merges.  It turned out to be 
very successful; I may open other issues about ways TieredMergePolicy and/or 
the MergeScheduler can be improved.

The main features of this simulator are:
* doesn't require actual indexing and is thus super-fast
* calculates useful stats like the average number of segments and the average 
write amplification factor
* provides a random sequence of flushed segment sizes that can be controlled in 
a couple of ways to make it more/less realistic depending on your environment
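
For illustration, here is a minimal, self-contained sketch of how stats like these can be tracked without any real indexing. It is not the code from LUCENE-8331.patch. The class ToyMergeSimulation, the merge rule (merge the smallest segments once a segment cap is exceeded), and the definition of write amplification as (flushed bytes + bytes written by merges) / flushed bytes are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Toy merge simulation with no Lucene classes and no real indexing; all names are hypothetical. */
class ToyMergeSimulation {

  /** Stats of interest collected over one run. */
  static final class Stats {
    final double avgSegmentCount;
    final double writeAmplification;

    Stats(double avgSegmentCount, double writeAmplification) {
      this.avgSegmentCount = avgSegmentCount;
      this.writeAmplification = writeAmplification;
    }
  }

  /**
   * Replays the given flushed segment sizes. After each flush, while more than
   * maxSegments segments exist, the mergeFactor smallest segments are merged into one.
   */
  static Stats run(long[] flushSizes, int maxSegments, int mergeFactor) {
    if (flushSizes.length == 0) {
      throw new IllegalArgumentException("need at least one flush");
    }
    if (mergeFactor < 2 || mergeFactor > maxSegments + 1) {
      throw new IllegalArgumentException("mergeFactor must be in [2, maxSegments + 1]");
    }
    List<Long> segments = new ArrayList<>();
    long flushedBytes = 0;
    long mergedBytes = 0;
    long segmentCountSum = 0;
    for (long size : flushSizes) {
      segments.add(size);
      flushedBytes += size;
      while (segments.size() > maxSegments) {
        Collections.sort(segments); // smallest segments first
        long merged = 0;
        for (int i = 0; i < mergeFactor; i++) {
          merged += segments.remove(0);
        }
        segments.add(merged);
        mergedBytes += merged; // a merge rewrites every byte it touches
      }
      segmentCountSum += segments.size(); // sampled once per flush, after merging
    }
    double avgSegments = (double) segmentCountSum / flushSizes.length;
    double writeAmp = (double) (flushedBytes + mergedBytes) / flushedBytes;
    return new Stats(avgSegments, writeAmp);
  }

  public static void main(String[] args) {
    // Random flush sizes from a fixed seed so that runs are repeatable.
    Random random = new Random(42L);
    long[] flushSizes = new long[1000];
    for (int i = 0; i < flushSizes.length; i++) {
      flushSizes[i] = 1 + random.nextInt(100);
    }
    Stats stats = run(flushSizes, 10, 5);
    System.out.println("avg segments: " + stats.avgSegmentCount);
    System.out.println("write amplification: " + stats.writeAmplification);
  }
}
```

Because nothing is written to disk, a run over thousands of simulated flushes finishes almost instantly, which is the point of the first bullet above.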

Some not so great parts:
* does not yet handle deletes!
* configuration tweaking of the merge policy to be tested and varying the 
inputs is a manual affair, editing main() and/or makeMergePolicy().  I added 
some System property overrides though, and some basic args parsing.  It's 
probably not realistic to expect much better given the use of this for 
experimentation.
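
A minimal sketch of the "System property overrides plus basic args parsing" idea, assuming hypothetical property names (sim.numFlushes, sim.seed) and a hypothetical --name=value flag format; the patch may well use different names and a different format.

```java
/** Hypothetical settings holder; the property names and flags are invented for this sketch. */
class SimulatorSettings {
  final int numFlushes;
  final long seed;

  SimulatorSettings(int numFlushes, long seed) {
    this.numFlushes = numFlushes;
    this.seed = seed;
  }

  /**
   * Starts from hard-coded defaults, lets -Dsim.* system properties override them,
   * and lets --name=value arguments override the properties in turn.
   */
  static SimulatorSettings parse(String[] args) {
    int numFlushes = Integer.getInteger("sim.numFlushes", 1000);
    long seed = Long.getLong("sim.seed", 42L);
    for (String arg : args) {
      if (arg.startsWith("--numFlushes=")) {
        numFlushes = Integer.parseInt(arg.substring("--numFlushes=".length()));
      } else if (arg.startsWith("--seed=")) {
        seed = Long.parseLong(arg.substring("--seed=".length()));
      } else {
        throw new IllegalArgumentException("Unknown argument: " + arg);
      }
    }
    return new SimulatorSettings(numFlushes, seed);
  }

  public static void main(String[] args) {
    SimulatorSettings settings = parse(args);
    System.out.println("numFlushes=" + settings.numFlushes + " seed=" + settings.seed);
  }
}
```

System properties are convenient when running from an IDE run configuration, while the explicit arguments take precedence when both are given.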

What do you think, guys?
