[ https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001214#comment-13001214 ]
Michael Busch commented on LUCENE-2881:
---------------------------------------

bq. maybe, we should store the FieldInfos inside the segments file?

Hmmm.... I had the same thought while adding the ref to FieldInfos to SegmentInfo. Actually this is probably the right thing to do. At the same time we could switch to a human-readable format :)

bq. I fear we may not necessarily ever "stabilize" on a fixed global name/number bimap, because we re-compute this map on every IW init?

We could also store the global map on disk? addIndexes() would have to ignore the global map from the external index(es).


> Track FieldInfo per segment instead of per-IW-session
> -----------------------------------------------------
>
>                 Key: LUCENE-2881
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2881
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: Realtime Branch, CSF branch, 4.0
>            Reporter: Simon Willnauer
>            Assignee: Michael Busch
>             Fix For: Realtime Branch, CSF branch, 4.0
>
>         Attachments: LUCENE-2881.patch, lucene-2881.patch, lucene-2881.patch,
> lucene-2881.patch, lucene-2881.patch, lucene-2881.patch
>
>
> Currently FieldInfo is tracked per IW session to guarantee consistent global
> field-naming / ordering. IW carries FI instances over from previous segments
> which also carries over field properties like isIndexed etc. While having
> consistent field ordering per IW session appears to be important due to bulk
> merging stored fields etc. carrying over other properties might become
> problematic with Lucene's Codec support. Codecs that rely on consistent
> properties in FI will fail if FI properties are carried over.
> The DocValuesCodec (DocValuesBranch) for instance writes files per segment
> and field (using the field id within the file name).
> Yet, if no DocValues are indexed in a particular segment but a previous
> segment in the same IW session had DocValues, FieldInfo#docValues will be
> true since those values are reused from previous segments.
> We already work around this "limitation" in SegmentInfo with properties like
> hasVectors or hasProx which is really something we should manage per Codec &
> Segment. Ideally FieldInfo would be managed per Segment and Codec such that
> its properties are valid per segment. It also seems to be necessary to bind
> FieldInfos to SegmentInfo logically since it's really just per-segment
> metadata.
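
To make the global name/number bimap idea from the comment concrete, here is a minimal sketch in Java. It is not Lucene's actual API: the class `FieldNumberBiMap` and its methods `addOrGet`, `nameOf`, and `remapExternal` are hypothetical names chosen for illustration. It assumes field numbers are handed out in first-seen order, and that segments arriving through addIndexes() contribute only their field names, with their own numbering ignored and remapped. Persisting the map to disk is not shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a global field name/number bimap; not Lucene code. */
final class FieldNumberBiMap {
    private final Map<String, Integer> nameToNumber = new HashMap<>();
    private final Map<Integer, String> numberToName = new HashMap<>();
    private int nextNumber = 0;

    /** Returns the number already bound to the name, or binds the next free one. */
    synchronized int addOrGet(String name) {
        Integer existing = nameToNumber.get(name);
        if (existing != null) {
            return existing;
        }
        int number = nextNumber++;
        nameToNumber.put(name, number);
        numberToName.put(number, name);
        return number;
    }

    /** Returns the field name bound to the number, or null if unassigned. */
    synchronized String nameOf(int number) {
        return numberToName.get(number);
    }

    /**
     * Assumed addIndexes() behaviour: the external index's numbering is not
     * trusted. Only its field names are registered here, and the result maps
     * each external number to the corresponding global number.
     */
    synchronized Map<Integer, Integer> remapExternal(Map<String, Integer> externalNameToNumber) {
        Map<Integer, Integer> externalToGlobal = new HashMap<>();
        for (Map.Entry<String, Integer> e : externalNameToNumber.entrySet()) {
            externalToGlobal.put(e.getValue(), addOrGet(e.getKey()));
        }
        return externalToGlobal;
    }
}
```

In this sketch only the name/number binding is global. Per-field properties such as isIndexed or docValues are deliberately left out, since the issue argues they belong to each segment's own FieldInfos.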