[
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048283#comment-13048283
]
Michael McCandless commented on LUCENE-2793:
--------------------------------------------
Patch is looking good!
* I thinkt the Context enum values should be ALL_CAPS
* FieldsWriter's ctor should take an IOContext? Generally low level
places shouldn't create a new IOContext; they should be passed
one (though there are exceptions... eg I think it's fine that
SegmentInfos.run uses DEFAULT).
* Same for values/IntsImpl, values/Bytes, values/Floats,
SegmentReader.get, TermVectorsTermsReader, TermVectorsWriter,
DefaultSegmentInfosWriter, DefaultSegmentInfosReader,
FixedGapTermsIndexReader, VariableGapTermsIndexReader,
NormsWriter, BitVector (read & write)
* I think MP.OneMerge should not extend the new MergeInfo; I'm
worried about cases where classes hang onto the IOContext (eg,
various places save the IOContxt away as a member) because this
could then risk holding refs to the SegmentReaders created for
merging. I think it's less risky to fully decouple the MergeInfo
from OneMerge. Then, maybe MergeInfo should be a static inner
class inside IOContext?
* IndexWriter.ReaderPool.get should pass the IOCtx down to
SegmentReader.get
* IndexWriter.prepareFlushedSegment should be FLUSH not DEFAULT
context.
* Can you add assert inside IOContext: if the ctx is MERGE then the
MergeInfo must not be null (ie in the ctor that takes only a
Context).
* MergeInfo needs a few more fields (eg optimize, isExternal)
* When IndexWriter.addIndexes makes the MERGE context it should pass
a MergeInfo with isExternal true
* In IndexWriter.mergeMiddle, you should make a single IOCtx(MERGE)
up front and pass to all readerPool.get calls
* At the end of IndexWriter.mergeMiddle, when we open the
mergedReader... we should not pass MERGE context here, somehow,
because this open is very different than when we open the
to-be-merged segments. IE, we are opening the merged segment.
Hmm, maybe, we can add a new flag to the MergeInfo,
"isMergedSegment"? Alternatively... we use Context.READ, but then
I think we need something else that states what "READ" this really
is -- eg NRT reader, merged segment reader, "normal" (IR.open)
reader?
* TermVectorsTermsWriter should be a Context.FLUSH
* SegmentWriteState.context can be final
* Somehow, PerFieldCodecWrapper.java shows as deleted...
* MergeInfo.java needs copygirht header & javadoc.
* Need whitespace around '=', eg "this.context=context;" should be
"this.context = context;"
> Directory createOutput and openInput should take an IOContext
> -------------------------------------------------------------
>
> Key: LUCENE-2793
> URL: https://issues.apache.org/jira/browse/LUCENE-2793
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/store
> Reporter: Michael McCandless
> Assignee: Varun Thacker
> Labels: gsoc2011, lucene-gsoc-11, mentor
> Attachments: LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch,
> LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch,
> LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold
> the buffer size, but then could hold other flags like DIRECT (bypass OS's
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for
> merging is not then used for searching, and vice/versa. Really, it's only
> all the open file handles that need to be different -- we could in theory
> share del docs, norms, etc, if that were somehow possible.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]