[ 
https://issues.apache.org/jira/browse/LUCENE-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131356#comment-14131356
 ] 

Shai Erera commented on LUCENE-5941:
------------------------------------

bq. Did you disable the virus scanner in the test?

Oh good point. I'll do that anyway.

bq. Not being able to delete files in windows because you have open readers 
against them is probably realistic though.

True, but it's unrelated to the comment in forceMerge. That's a general 
statement about deleting index files while virus scanning holds them open. The 
point is how much space *Lucene* requires to do a forceMerge.

And BTW I don't think this is specific to Windows. Even on Unix, if a process 
keeps the index files open, the fact that a file appears to be deleted has 
nothing to do with the space it still occupies: that space is not reclaimed 
until the process releases its file handle, right?
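
To make the Unix point concrete, here is a minimal sketch, assuming POSIX unlink semantics. The function name unlink_while_open is made up for illustration and is not part of Lucene. It deletes a temporary file while a handle is still open and reports what that handle can still see.

```python
import os
import tempfile


def unlink_while_open(num_bytes):
    """Delete a file while a handle is open and report what the handle sees.

    Assumes POSIX semantics. On Windows the unlink call itself typically
    fails while the file is still open.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * num_bytes)
        f.flush()
        os.unlink(path)                  # the directory entry is removed here
        info = os.fstat(f.fileno())      # the open handle still reaches the data
        return {
            "visible_in_directory": os.path.exists(path),
            "link_count": info.st_nlink,
            "size_via_handle": info.st_size,
        }
    # leaving the with-block closes the handle; only then can the
    # filesystem reclaim the space


if __name__ == "__main__":
    print(unlink_while_open(4096))
```

The name is gone from the directory, yet the size reported through the handle is unchanged, which is why a lingering reader (or scanner) can make disk usage look higher than what Lucene itself requires.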

> IndexWriter.forceMerge documentation error
> ------------------------------------------
>
>                 Key: LUCENE-5941
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5941
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-5941.patch
>
>
> IndexWriter.forceMerge documents that it requires up to 3X *FREE* space in 
> order to run successfully. We even go further with it and test it in 
> TestIWForceMerge.testForceMergeTempSpaceUsage(). But I think that's wrong. I 
> cannot think of a situation where we consume 3X *additional* space during 
> merge:
> * 1X - that's the source segments to be merged
> * 2X - that's the result non-CFS merged segment
> * 3X - that's the CFS creation
> At no point do we publish the non-CFS merged segment, therefore the merge, as 
> I understand it, only consumes up to 2X additional space during that merge.
> And anyway, we only require 2X additional space relative to the *largest* 
> merge (or the total batch of running merges, depending on your 
> MergeScheduler), not the whole index size. This is an important observation, 
> since if you e.g. have a 500GB 
> index, users shouldn't think they need to reserve an additional 1TB for 
> merging, since most of their big segments won't be merged by default anyway 
> (TieredMP defaults to 5GB largest segment).
> I'll post a patch which fixes the documentation and the test. If anyone can 
> think of a scenario where we consume up to 3X *additional* space, please 
> chime in, and I'll only modify IW.forceMerge documentation to explain that.
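
The space accounting in the quoted description can be sketched as a small calculation. This is only an illustration of the argument, not Lucene code: the helper peak_extra_bytes and the 5GB merge size are hypothetical, and it assumes a merged segment is never larger than the sum of its inputs.

```python
GB = 1024 ** 3


def peak_extra_bytes(running_merge_input_bytes, compound_file=True):
    """Upper bound on additional disk space while the given merges run.

    running_merge_input_bytes holds the total input size of each merge that
    runs concurrently, one entry per merge. Assumes the merged segment is no
    larger than the sum of its source segments.
    """
    total_inputs = sum(running_merge_input_bytes)
    # One copy for the non-CFS merged segment, one more while the CFS is built.
    copies = 2 if compound_file else 1
    return copies * total_inputs


if __name__ == "__main__":
    index_size = 500 * GB                      # index size from the description
    one_merge = [5 * GB]                       # hypothetical merge input size
    print(peak_extra_bytes(one_merge) // GB)   # extra GB needed by that merge
    print(2 * index_size // GB)                # extra GB if sized by whole index
```

Sizing by the running merges rather than by the whole index is the difference between the two printed figures.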



