Change behavior of ParallelReader.document(int)
-----------------------------------------------

         Key: LUCENE-606
         URL: http://issues.apache.org/jira/browse/LUCENE-606
     Project: Lucene - Java
        Type: Improvement

  Components: Index  
    Versions: 2.0.0    
    Reporter: Christian Kohlschuetter


Currently, for each field, the returned document contains the stored data from 
every enclosed IndexReader that contains that field.
That is, a call to ParallelReader.document(doc).getFields(fieldName) may return 
an array of several Field objects. Since null entries are not allowed in that 
array, there is no way to determine which IndexReader a given Field came from.

On the other hand, a search for a term in that field yields results only if 
the term occurs in the *first* enclosed IndexReader that contains the field.
Thus, when the ParallelReader contents are merged into another IndexWriter, the 
indexed data does not correspond to the stored data.

I am not sure whether this can be considered a bug (in some cases, this may be 
exactly what is required). However, I would like to see an option to change 
this behavior.

I suggest a parameter for ParallelReader which specifies whether 
ParallelReader.document(int) returns stored data from all IndexReaders, or only 
from the one which is responsible for the field's indexed data.
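
To make the two modes concrete, here is a minimal sketch in plain Java. It is 
only a toy model of the behavior described above and does not use any Lucene 
classes: ToyReader, ToyParallelReader, the onlyOwnerStoredFields flag and 
storedValues() are all hypothetical names chosen for illustration, and the 
"owner" of a field is assumed to be the first added reader that contains it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: no Lucene classes are used; every name here is hypothetical. */
public class ParallelDocSketch {

    /** Stands in for one enclosed reader: doc number -> (field name -> stored value). */
    public static final class ToyReader {
        public final List<Map<String, String>> docs = new ArrayList<>();
    }

    /** Stands in for the parallel reader, with the proposed switch as a constructor flag. */
    public static final class ToyParallelReader {
        private final List<ToyReader> readers = new ArrayList<>();
        // Assumption: the first added reader that contains a field "owns" it,
        // i.e. it is the one whose indexed data a search on that field would see.
        private final Map<String, ToyReader> fieldOwner = new HashMap<>();
        private final boolean onlyOwnerStoredFields;

        public ToyParallelReader(boolean onlyOwnerStoredFields) {
            this.onlyOwnerStoredFields = onlyOwnerStoredFields;
        }

        public void add(ToyReader reader) {
            readers.add(reader);
            for (Map<String, String> doc : reader.docs) {
                for (String field : doc.keySet()) {
                    fieldOwner.putIfAbsent(field, reader); // first reader wins
                }
            }
        }

        /** Stored values of one field for one document, in reader order. */
        public List<String> storedValues(int doc, String field) {
            List<String> values = new ArrayList<>();
            for (ToyReader reader : readers) {
                if (onlyOwnerStoredFields && fieldOwner.get(field) != reader) {
                    continue; // proposed mode: skip readers that do not own the field
                }
                String value = reader.docs.get(doc).get(field);
                if (value != null) {
                    values.add(value);
                }
            }
            return values;
        }
    }

    public static void main(String[] args) {
        ToyReader a = new ToyReader();
        a.docs.add(Map.of("title", "from A"));
        ToyReader b = new ToyReader();
        b.docs.add(Map.of("title", "from B", "body", "text"));

        ToyParallelReader all = new ToyParallelReader(false);
        all.add(a);
        all.add(b);
        ToyParallelReader ownerOnly = new ToyParallelReader(true);
        ownerOnly.add(a);
        ownerOnly.add(b);

        System.out.println(all.storedValues(0, "title"));       // [from A, from B]
        System.out.println(ownerOnly.storedValues(0, "title")); // [from A]
        System.out.println(ownerOnly.storedValues(0, "body"));  // [text]
    }
}
```

With the flag off, the caller gets both "title" values and cannot tell which 
reader each came from. With the flag on, only the value from the owning reader 
is returned, so stored and indexed data stay consistent in this model.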

Please find my proposed implementation attached, as well as a JUnit test case.



