[ 
https://issues.apache.org/jira/browse/LUCENE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634094#action_12634094
 ] 

Michael McCandless commented on LUCENE-1401:
--------------------------------------------


That (cfx/cfs file creation) is actually "normal" behavior for
Lucene.

With autoCommit=false, in a single session of IndexWriter, Lucene
will share the doc store files (stored fields, term vectors) across
multiple segments.  This saves a lot of merge time because those files
don't need to be merged if we are merging segments that all share the
same doc store files.  The savings are largest when building up a
large index anew.

A cfx file is the compound-file format of the doc store files.

However, when segments spanning multiple doc stores are merged, the
doc store files are in fact merged, written privately for that one
merged segment, and then folded into that segment's cfs file.  Once
all segments that reference a given doc store segment have been merged
away, that doc store segment is deleted.
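
To make the rule above concrete, here is a minimal sketch of the
decision as described in this comment.  It is not Lucene's actual
code: DocStoreMergeCheck and mustMergeDocStores are hypothetical
names, each segment is reduced to the name of the doc store segment
it references, and the real check inside IndexWriter may involve
further conditions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene.
final class DocStoreMergeCheck {

    // docStoreOfEachSegment holds, for every segment in the merge, the
    // name of the doc store segment it references (an assumed model).
    // Returns true when the segments span more than one doc store, in
    // which case the doc store files have to be merged as well.
    static boolean mustMergeDocStores(List<String> docStoreOfEachSegment) {
        Set<String> distinct = new HashSet<>(docStoreOfEachSegment);
        return distinct.size() > 1;
    }
}
```

Under this model, a merge of segments that all reference the same doc
store returns false and leaves the shared files alone, while a merge
that spans two doc stores returns true.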

So currently only the "level 0" segments may share a cfx file.  As a
future optimization we could consider extending Lucene's index format
so that a single segment could reference multiple doc stores.  This
would require logic in FieldsReader and TermVectorsReader to do a
binary search when locating which doc store segment holds a given
document, but it would let merges of non-level-0 segments skip merging
the doc store.  This would be an invasive change.
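
The binary search mentioned above could look roughly like the sketch
below.  This is only an illustration of the idea, not existing Lucene
code: DocStoreLocator, locate and the docStarts array are hypothetical,
and it assumes the segment keeps a strictly ascending array with the
first document number covered by each doc store, starting at 0.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Lucene.
final class DocStoreLocator {

    // docStarts[i] is the first document number held by the i-th doc
    // store (assumed strictly ascending, with docStarts[0] == 0).
    // Returns the index of the doc store that holds docId.
    static int locate(int[] docStarts, int docId) {
        if (docId < 0) {
            throw new IllegalArgumentException("docId must be >= 0");
        }
        int pos = Arrays.binarySearch(docStarts, docId);
        // Exact hit: docId is the first document of store pos.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the
        // owning store is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }
}
```

For example, with docStarts = {0, 1000, 2500}, document 1700 falls in
the doc store at index 1.  The lookup costs O(log n) in the number of
doc stores the segment references.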

So you can't separately control when Lucene uses a cfx file; it's the
merge policy that indirectly controls this.

> Deprecation of autoCommit in 2.4 leads to compile problems, when autoCommit 
> should be false
> -------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1401
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1401
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.4, 2.9
>            Reporter: Uwe Schindler
>            Assignee: Michael McCandless
>            Priority: Trivial
>             Fix For: 2.4, 2.9
>
>         Attachments: LUCENE-1401.patch
>
>
> I am currently changing my code to be most compatible with 2.4. I switched on 
> deprecation warnings and got a warning about the autoCommit parameter in 
> IndexWriter constructors.
> My code *should* use autoCommit=false, so I want to use the new semantics. 
> The default of IndexWriter is still autoCommit=true. My problem now: How to 
> disable autoCommit without deprecation warnings?
> Maybe, the "old" constructors, that are deprecated should use 
> autoCommit=true. But there are new constructors with this 
> "IndexWriter.MaxFieldLength mfl" in it, that appear new in 2.4 but are 
> deprecated:
> IndexWriter(Directory d, boolean autoCommit, Analyzer a, boolean create, 
> IndexDeletionPolicy deletionPolicy, IndexWriter.MaxFieldLength mfl) 
>           Deprecated. This will be removed in 3.0, when autoCommit will be 
> hardwired to false. Use 
> IndexWriter(Directory,Analyzer,boolean,IndexDeletionPolicy,MaxFieldLength) 
> instead, and call commit() when needed.
> What the hell is meant by this, a new constructor that is deprecated? And the 
> hint is wrong. If I use the other constructor in the warning, I get 
> autoCommit=true.
> There is something completely wrong.
> It should be clear, which constructors set autoCommit=true, which set it per 
> default to false (perhaps new ones), and the Deprecated text is wrong, if 
> autoCommit does not default to false.


