[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048283#comment-13048283 ]
Michael McCandless commented on LUCENE-2793: -------------------------------------------- Patch is looking good! * I thinkt the Context enum values should be ALL_CAPS * FieldsWriter's ctor should take an IOContext? Generally low level places shouldn't create a new IOContext; they should be passed one (though there are exceptions... eg I think it's fine that SegmentInfos.run uses DEFAULT). * Same for values/IntsImpl, values/Bytes, values/Floats, SegmentReader.get, TermVectorsTermsReader, TermVectorsWriter, DefaultSegmentInfosWriter, DefaultSegmentInfosReader, FixedGapTermsIndexReader, VariableGapTermsIndexReader, NormsWriter, BitVector (read & write) * I think MP.OneMerge should not extend the new MergeInfo; I'm worried about cases where classes hang onto the IOContext (eg, various places save the IOContxt away as a member) because this could then risk holding refs to the SegmentReaders created for merging. I think it's less risky to fully decouple the MergeInfo from OneMerge. Then, maybe MergeInfo should be a static inner class inside IOContext? * IndexWriter.ReaderPool.get should pass the IOCtx down to SegmentReader.get * IndexWriter.prepareFlushedSegment should be FLUSH not DEFAULT context. * Can you add assert inside IOContext: if the ctx is MERGE then the MergeInfo must not be null (ie in the ctor that takes only a Context). * MergeInfo needs a few more fields (eg optimize, isExternal) * When IndexWriter.addIndexes makes the MERGE context it should pass a MergeInfo with isExternal true * In IndexWriter.mergeMiddle, you should make a single IOCtx(MERGE) up front and pass to all readerPool.get calls * At the end of IndexWriter.mergeMiddle, when we open the mergedReader... we should not pass MERGE context here, somehow, because this open is very different than when we open the to-be-merged segments. IE, we are opening the merged segment. Hmm, maybe, we can add a new flag to the MergeInfo, "isMergedSegment"? Alternatively... we use Context.READ, but then I think we need something else that states what "READ" this really is -- eg NRT reader, merged segment reader, "normal" (IR.open) reader? * TermVectorsTermsWriter should be a Context.FLUSH * SegmentWriteState.context can be final * Somehow, PerFieldCodecWrapper.java shows as deleted... * MergeInfo.java needs copygirht header & javadoc. * Need whitespace around '=', eg "this.context=context;" should be "this.context = context;" > Directory createOutput and openInput should take an IOContext > ------------------------------------------------------------- > > Key: LUCENE-2793 > URL: https://issues.apache.org/jira/browse/LUCENE-2793 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store > Reporter: Michael McCandless > Assignee: Varun Thacker > Labels: gsoc2011, lucene-gsoc-11, mentor > Attachments: LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, > LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, > LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch > > > Today for merging we pass down a larger readBufferSize than for searching > because we get better performance. > I think we should generalize this to a class (IOContext), which would hold > the buffer size, but then could hold other flags like DIRECT (bypass OS's > buffer cache), SEQUENTIAL, etc. > Then, we can make the DirectIOLinuxDirectory fully usable because we would > only use DIRECT/SEQUENTIAL during merging. > This will require fixing how IW pools readers, so that a reader opened for > merging is not then used for searching, and vice/versa. Really, it's only > all the open file handles that need to be different -- we could in theory > share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org