[ 
https://issues.apache.org/jira/browse/LUCENE-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146988#comment-13146988
 ] 

Robert Muir commented on LUCENE-3569:
-------------------------------------

{quote}
If the MP's name is clear enough that this is a devastating thing to do 
'regularly' then isn't it the user's decision whether to set it or not?
{quote}

No. If we do this, I will add the instanceof + IllegalArgumentException check
to IndexWriterConfig myself. I think this is a step backwards from
renaming optimize to sound cool: now it's starting to imply that it's
regularly scheduled, required maintenance, and we are making it
possible to automate the O(n^2) indexing that some people are doing today...
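
To make the objection concrete, here is a minimal sketch of the kind of guard
described above, together with one reading of the O(n^2) remark. Every class
name in it (PolicyMarker, GuardedWriterConfig, FullMergeCost) is a hypothetical
stand-in rather than Lucene's API; LightMergePolicy and HeavyMergePolicy are
only the placeholder names from the issue description. The cost model is an
assumption as well: it counts every document indexed so far as rewritten each
time a full merge follows a flush.

```java
/** Hypothetical marker type standing in for the policy interface. */
interface PolicyMarker {}

/** Placeholder name from the issue text: the policy for ongoing merges. */
final class LightMergePolicy implements PolicyMarker {}

/** Placeholder name from the issue text: the optimize()-style policy. */
final class HeavyMergePolicy implements PolicyMarker {}

/** Hypothetical config that refuses a heavy policy as its default. */
final class GuardedWriterConfig {
    private PolicyMarker policy = new LightMergePolicy();

    GuardedWriterConfig setPolicy(PolicyMarker p) {
        if (p == null) {
            throw new IllegalArgumentException("policy must not be null");
        }
        // The instanceof + IllegalArgumentException check from the comment.
        if (p instanceof HeavyMergePolicy) {
            throw new IllegalArgumentException(
                "a full-merge policy must not run on every flush");
        }
        this.policy = p;
        return this;
    }

    PolicyMarker getPolicy() {
        return policy;
    }
}

/** Assumed cost model: a full merge rewrites every document indexed so far. */
final class FullMergeCost {
    static long docsRewritten(int batches, int batchSize) {
        long total = 0;
        for (int i = 1; i <= batches; i++) {
            total += (long) i * batchSize; // the merge after the i-th flush
        }
        return total;
    }
}
```

Under that model, indexing 100 batches of 1,000 documents with a full merge
after each flush rewrites 5,050,000 documents in order to index 100,000, which
is the quadratic growth the comment warns about.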
                
> Consolidate IndexWriter's optimize, maybeMerge and expungeDeletes under one 
> merge(MP) method
> --------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3569
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3569
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shai Erera
>
> Today, IndexWriter exposes 3 methods for 'cleaning up' / 'compacting' / 
> 'optimizing' your index:
> * optimize() -- merges as many segments as possible (down to 1 segment), and 
> is discouraged in many cases because of its performance implications.
> * maybeMerge() -- runs 'subtle' merges. Attempts to balance the index by not 
> leaving too many segments, yet not merging large segments if unneeded.
> * expungeDeletes() -- cleans up deleted documents from segments and merges 
> those segments along the way.
> * a default MP that can be set on IndexWriterConfig, for ongoing merges IW 
> performs (i.e. as a result of flushing a new segment).
> These methods are confusing on several levels:
> * Their names are misleading, see LUCENE-3454.
> * Why does expungeDeletes need to merge segments?
> * Ultimately, they do whatever the MergePolicy decides should be done. 
> I.e., one could write an MP that always merges all segments, and 
> therefore calling maybeMerge would not be so subtle anymore. On the other 
> hand, one could write an MP that never merges large segments (we in fact have 
> several of those), and therefore calling optimize(1) would not end up with 
> one segment.
> So the proposal is to replace all these methods with a single one, 
> merge(MergePolicy) (more on the names later). MergePolicy will have only one 
> method, findSegmentsForMerge, and the caller will be responsible for 
> configuring it to perform the needed merges. We will provide ready-to-use MPs:
> * LightMergePolicy -- for setting on IWC and doing the ongoing merges IW 
> executes. This one will pick segments respecting various parameters such as 
> mergeFactor, segmentSizes etc.
> * HeavyMergePolicy -- for doing the optimize()-style merges.
> * ExpungeDeletesMergePolicy -- for expunging deletes (my proposal is to drop 
> segment merging from it, by default).
> Now about the names:
> * I think that it will be good, API-backcompat wise and in general, if we 
> name that method doMaintenance (as expungeDeletes does not have to merge 
> anything).
> * Instead of MergePolicy, we call it MaintenancePolicy, and similarly name 
> its single method findSegmentsForMaintenance or getMaintenanceSpecification.
> * I called the MPs Light and Heavy just for the text. I think better names 
> should be found, but nothing comes to mind now.
> It will allow us to use this on 3.x, by deprecating MP and all related 
> methods.
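
The proposal in the description above can be sketched roughly as follows. This
is a minimal illustration under stated assumptions, not Lucene code: Seg and
SketchWriter are hypothetical stand-ins for a segment and for IndexWriter, the
method signature of findSegmentsForMaintenance is assumed (only its name comes
from the issue text), and MergeAllPolicy plays the role the description calls
HeavyMergePolicy. ExpungeDeletesPolicy follows the suggestion to drop segment
merging by default and rewrites each segment with deletions on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical segment: a name plus live and deleted document counts. */
final class Seg {
    final String name;
    final int liveDocs;
    final int deletedDocs;

    Seg(String name, int liveDocs, int deletedDocs) {
        this.name = name;
        this.liveDocs = liveDocs;
        this.deletedDocs = deletedDocs;
    }
}

/** Single-method policy; the signature is an assumption. */
interface MaintenancePolicy {
    /** Each inner list is one group to rewrite into a single new segment. */
    List<List<Seg>> findSegmentsForMaintenance(List<Seg> segments);
}

/** optimize()-style: one group holding every segment. */
final class MergeAllPolicy implements MaintenancePolicy {
    public List<List<Seg>> findSegmentsForMaintenance(List<Seg> segments) {
        List<List<Seg>> groups = new ArrayList<>();
        if (segments.size() > 1) {
            groups.add(new ArrayList<>(segments));
        }
        return groups;
    }
}

/** Expunges deletes without merging: one group per segment with deletions. */
final class ExpungeDeletesPolicy implements MaintenancePolicy {
    public List<List<Seg>> findSegmentsForMaintenance(List<Seg> segments) {
        List<List<Seg>> groups = new ArrayList<>();
        for (Seg s : segments) {
            if (s.deletedDocs > 0) {
                List<Seg> single = new ArrayList<>();
                single.add(s);
                groups.add(single);
            }
        }
        return groups;
    }
}

/** Hypothetical writer exposing the single doMaintenance entry point. */
final class SketchWriter {
    private final List<Seg> segments = new ArrayList<>();
    private int counter = 0;

    void addSegment(Seg s) {
        segments.add(s);
    }

    List<Seg> segments() {
        return new ArrayList<>(segments);
    }

    /** Rewrites each group the policy selects into one deletion-free segment. */
    void doMaintenance(MaintenancePolicy policy) {
        List<Seg> snapshot = new ArrayList<>(segments);
        for (List<Seg> group : policy.findSegmentsForMaintenance(snapshot)) {
            int live = 0;
            for (Seg s : group) {
                live += s.liveDocs;
            }
            segments.removeAll(group);
            segments.add(new Seg("merged" + (counter++), live, 0));
        }
    }
}
```

In this sketch the caller decides how aggressive the maintenance is by choosing
the policy it passes in, for example writer.doMaintenance(new
ExpungeDeletesPolicy()) to reclaim deletes without changing the segment count.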

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
