[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext

Michael McCandless (JIRA) Sun, 12 Jun 2011 02:58:52 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048283#comment-13048283
 ]


Michael McCandless commented on LUCENE-2793:
--------------------------------------------


Patch is looking good!

  * I thinkt the Context enum values should be ALL_CAPS

  * FieldsWriter's ctor should take an IOContext?  Generally low level
    places shouldn't create a new IOContext; they should be passed
    one (though there are exceptions... eg I think it's fine that
    SegmentInfos.run uses DEFAULT).

  * Same for values/IntsImpl, values/Bytes, values/Floats,
    SegmentReader.get, TermVectorsTermsReader, TermVectorsWriter,
    DefaultSegmentInfosWriter, DefaultSegmentInfosReader,
    FixedGapTermsIndexReader, VariableGapTermsIndexReader,
    NormsWriter, BitVector (read & write)

  * I think MP.OneMerge should not extend the new MergeInfo; I'm
    worried about cases where classes hang onto the IOContext (eg,
    various places save the IOContxt away as a member) because this
    could then risk holding refs to the SegmentReaders created for
    merging.  I think it's less risky to fully decouple the MergeInfo
    from OneMerge.  Then, maybe MergeInfo should be a static inner
    class inside IOContext?

  * IndexWriter.ReaderPool.get should pass the IOCtx down to
    SegmentReader.get

  * IndexWriter.prepareFlushedSegment should be FLUSH not DEFAULT
    context.

  * Can you add assert inside IOContext: if the ctx is MERGE then the
    MergeInfo must not be null (ie in the ctor that takes only a
    Context).

  * MergeInfo needs a few more fields (eg optimize, isExternal)

  * When IndexWriter.addIndexes makes the MERGE context it should pass
    a MergeInfo with isExternal true

  * In IndexWriter.mergeMiddle, you should make a single IOCtx(MERGE)
    up front and pass to all readerPool.get calls

  * At the end of IndexWriter.mergeMiddle, when we open the
    mergedReader... we should not pass MERGE context here, somehow,
    because this open is very different than when we open the
    to-be-merged segments.  IE, we are opening the merged segment.
    Hmm, maybe, we can add a new flag to the MergeInfo,
    "isMergedSegment"?  Alternatively... we use Context.READ, but then
    I think we need something else that states what "READ" this really
    is -- eg NRT reader, merged segment reader, "normal" (IR.open)
    reader?

  * TermVectorsTermsWriter should be a Context.FLUSH

  * SegmentWriteState.context can be final

  * Somehow, PerFieldCodecWrapper.java shows as deleted...

  * MergeInfo.java needs copygirht header & javadoc.

  * Need whitespace around '=', eg "this.context=context;" should be
    "this.context = context;"


> Directory createOutput and openInput should take an IOContext
> -------------------------------------------------------------
>
>                 Key: LUCENE-2793
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2793
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
> LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
> LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice/versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext

Reply via email to