Track FieldInfo per segment instead of per-IW-session
-----------------------------------------------------

                 Key: LUCENE-2881
                 URL: https://issues.apache.org/jira/browse/LUCENE-2881
             Project: Lucene - Java
          Issue Type: Improvement
    Affects Versions: Realtime Branch, CSF branch, 4.0
            Reporter: Simon Willnauer
             Fix For: Realtime Branch, CSF branch, 4.0


Currently FieldInfo is tracked per IW session to guarantee consistent global 
field naming / ordering. IW carries FI instances over from previous segments, 
which also carries over field properties such as isIndexed. While consistent 
field ordering per IW session appears to be important (for bulk merging of 
stored fields, for example), carrying over the other properties might become 
problematic with Lucene's Codec support. Codecs that rely on FI properties 
being accurate for the segment they write will fail if those properties are 
carried over from earlier segments.

The DocValuesCodec (DocValuesBranch), for instance, writes files per segment and 
field (using the field id within the file name). Yet if a particular segment has 
no DocValues indexed for a field, but a previous segment in the same IW session 
did, FieldInfo#docValues will still be true, since those values are reused from 
previous segments.
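
The difference between the two tracking schemes can be sketched in a few 
lines of Java. This is only an illustration under stated assumptions: the 
classes Info, SessionInfos, FieldNumbers and PerSegmentInfos are hypothetical 
stand-ins, not Lucene's actual FieldInfo / FieldInfos API, and the field name 
"price" is made up. The sketch keeps field numbers global (so ordering stays 
consistent for bulk merging) while the docValues property is tracked either 
per session or per segment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; not Lucene's real classes.
final class FieldInfoSketch {

    /** Simplified stand-in for a FieldInfo: a stable number plus one property. */
    static final class Info {
        final String name;
        final int number;
        boolean docValues;

        Info(String name, int number, boolean docValues) {
            this.name = name;
            this.number = number;
            this.docValues = docValues;
        }
    }

    /** Session-wide tracking: one Info per field, reused for every segment. */
    static final class SessionInfos {
        private final Map<String, Info> byName = new HashMap<>();

        Info add(String name, boolean docValues) {
            Info fi = byName.get(name);
            if (fi == null) {
                fi = new Info(name, byName.size(), docValues);
                byName.put(name, fi);
            } else {
                fi.docValues |= docValues; // sticky: once true, true for later segments
            }
            return fi;
        }
    }

    /** Only the name-to-number mapping is shared across the session. */
    static final class FieldNumbers {
        private final Map<String, Integer> numbers = new HashMap<>();

        int numberOf(String name) {
            Integer n = numbers.get(name);
            if (n == null) {
                n = numbers.size();
                numbers.put(name, n);
            }
            return n;
        }
    }

    /** Per-segment tracking: properties start fresh, numbers stay global. */
    static final class PerSegmentInfos {
        private final FieldNumbers numbers;
        private final Map<String, Info> byName = new HashMap<>();

        PerSegmentInfos(FieldNumbers numbers) {
            this.numbers = numbers;
        }

        Info add(String name, boolean docValues) {
            Info fi = byName.get(name);
            if (fi == null) {
                fi = new Info(name, numbers.numberOf(name), docValues);
                byName.put(name, fi);
            } else {
                fi.docValues |= docValues;
            }
            return fi;
        }
    }

    /** Segment 1 indexes DocValues for "price", segment 2 does not. */
    static Info sessionTracked() {
        SessionInfos session = new SessionInfos();
        session.add("price", true);         // segment 1
        return session.add("price", false); // segment 2 sees the stale flag
    }

    static Info segmentTracked() {
        FieldNumbers numbers = new FieldNumbers();
        new PerSegmentInfos(numbers).add("price", true);         // segment 1
        return new PerSegmentInfos(numbers).add("price", false); // segment 2
    }

    public static void main(String[] args) {
        System.out.println(sessionTracked().docValues); // true (carried over)
        System.out.println(segmentTracked().docValues); // false (valid for segment 2)
        System.out.println(segmentTracked().number);    // 0 (numbering still global)
    }
}
```

In the session-tracked case a per-segment, per-field codec would be told that 
segment 2 has DocValues for the field even though nothing was indexed for it; 
in the per-segment case the flag reflects what the segment actually contains.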

We already work around this "limitation" in SegmentInfo with properties like 
hasVectors or hasProx, which are really something we should manage per Codec and 
Segment. Ideally FieldInfo would be managed per Segment and Codec so that its 
properties are valid per segment. It also seems necessary to bind FieldInfos to 
SegmentInfo logically, since it's really just per-segment metadata.
 


