[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext

Simon Willnauer (JIRA) Thu, 09 Jun 2011 01:50:51 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046403#comment-13046403
 ]


Simon Willnauer commented on LUCENE-2793:
-----------------------------------------

Hey Varun,

here are some more comments for the latest complete patch:

* We should have a static instance for IOContext with Context.Other which you 
can use in BitVector / CheckIndex, for instance. Maybe IOContext#DEFAULT_CONTEXT?
* It seems that we don't need to provide IOContext to FieldInfos and 
SegmentInfo since we are reading them into memory anyway. I think you can just 
use a default context here without changing the constructors. Same is true for 
SegmentInfos 
* This is unrelated to your patch but in PreFlexFields we should use 
IndexFileNames.segmentFileName(info.name, "", PreFlexCodec.FREQ_EXTENSION) and 
IndexFileNames.segmentFileName(info.name, "", PreFlexCodec.PROX_EXTENSION) 
instead of info.name + ".frq"  and info.name + ".prx"
* It seems that we should communicate the IOContext to the codec somehow. I 
suggest we put IOContext into SegmentWriteState and SegmentReadState; that way 
we don't need to change the Codec interface and clutter it with internals. This 
would also fix Mike's comment for FieldsConsumer etc.
* TermVectorsWriter is only used in merges, so maybe it should also get a 
Context.Merge for consistency?
* I really don't like OneMerge :) I think we should add an abstract class 
(maybe MergeInfo) that exposes estimatedMergeBytes and totalDocCount for now.
* Small typo in RAMDirectory: there is a space missing after the second file 
here: dir.copy(this, file, file,context);
* SegmentReader should also use the static default IOContext - make sure it's 
used where needed :)
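
To make the default-context and SegmentWriteState points above concrete, here 
is a minimal sketch. It is not taken from the patch: the class bodies, the 
upper-case enum constants and the *Sketch names are assumptions for 
illustration only. The idea is that callers without special needs share one 
static instance, and codec components read the context from the state object 
they already receive, so the Codec interface does not change.

```java
// Hypothetical sketch of the suggestion; not the code from the patch.
final class IOContext {
    /** Why a file is opened. Only the two values mentioned in this comment. */
    enum Context { MERGE, OTHER }

    /** Shared instance for callers with no special I/O needs (e.g. BitVector, CheckIndex). */
    static final IOContext DEFAULT_CONTEXT = new IOContext(Context.OTHER);

    final Context context;

    IOContext(Context context) {
        if (context == null) {
            throw new IllegalArgumentException("context must not be null");
        }
        this.context = context;
    }
}

/** Stand-in for SegmentWriteState: it carries the context to the codec. */
final class SegmentWriteStateSketch {
    final String segmentName;
    final IOContext context;

    SegmentWriteStateSketch(String segmentName, IOContext context) {
        this.segmentName = segmentName;
        this.context = context;
    }
}

/** Stand-in for a codec component: it takes the context from the state it is given. */
final class FieldsConsumerSketch {
    final IOContext context;

    FieldsConsumerSketch(SegmentWriteStateSketch state) {
        this.context = state.context;
    }
}
```

A merge would build its state with new IOContext(IOContext.Context.MERGE), 
while everything else can pass IOContext.DEFAULT_CONTEXT.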

Regarding the IOContext class, I think we should design for what we have right 
now, and since SegmentInfo is not used anywhere (as far as I can see) we should 
add it once we need it. OneMerge should not go in there, but rather the 
interface / abstract class I talked about above. 
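
Roughly, the abstract class could look like the sketch below. Only the name 
MergeInfo and the two exposed values come from the comments above; the method 
signatures, MergeContextSketch and FixedMergeInfo are assumptions made up for 
illustration, not part of the patch.

```java
/** Hypothetical narrow view of a merge: only the two values named above. */
abstract class MergeInfo {
    abstract long estimatedMergeBytes();

    abstract int totalDocCount();
}

/** Hypothetical context that holds a MergeInfo instead of a OneMerge. */
final class MergeContextSketch {
    final MergeInfo mergeInfo;

    MergeContextSketch(MergeInfo mergeInfo) {
        if (mergeInfo == null) {
            throw new IllegalArgumentException("mergeInfo must not be null");
        }
        this.mergeInfo = mergeInfo;
    }
}

/** Toy implementation with fixed values, standing in for whatever the merge would report. */
final class FixedMergeInfo extends MergeInfo {
    private final long estimatedMergeBytes;
    private final int totalDocCount;

    FixedMergeInfo(long estimatedMergeBytes, int totalDocCount) {
        this.estimatedMergeBytes = estimatedMergeBytes;
        this.totalDocCount = totalDocCount;
    }

    @Override
    long estimatedMergeBytes() {
        return estimatedMergeBytes;
    }

    @Override
    int totalDocCount() {
        return totalDocCount;
    }
}
```

This way the store layer only sees the two numbers it needs and never has to 
know about OneMerge itself.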

> Directory createOutput and openInput should take an IOContext
> -------------------------------------------------------------
>
>                 Key: LUCENE-2793
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2793
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Varun Thacker
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, 
> LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching 
> because we get better performance.
> I think we should generalize this to a class (IOContext), which would hold 
> the buffer size, but then could hold other flags like DIRECT (bypass OS's 
> buffer cache), SEQUENTIAL, etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would 
> only use DIRECT/SEQUENTIAL during merging.
> This will require fixing how IW pools readers, so that a reader opened for 
> merging is not then used for searching, and vice versa.  Really, it's only 
> all the open file handles that need to be different -- we could in theory 
> share del docs, norms, etc, if that were somehow possible.
