[
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982931#comment-13982931
]
Shai Erera commented on LUCENE-5618:
------------------------------------
If we separate each DV update into its own file, I think we will need to track
another gen in SegmentCommitInfo, so it would hold three: deletes, fieldInfos
and dvUpdates. Though each FI writes its dvGen in the FIS file, we still need
to know which gen to increment from for the next update. This isn't a big
deal, it just adds complexity to SCI (4 methods and an index format change).
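
To make the extra bookkeeping concrete, here is a minimal sketch of a holder
that tracks the three gens independently. The class {{SegmentGens}} and its
method names are hypothetical and are not the actual SegmentCommitInfo API; the
convention that -1 means "no gen yet" and that the first written gen is 1 is an
assumption of this sketch.

```java
// Hypothetical sketch, NOT Lucene's SegmentCommitInfo.
// Assumption: -1 means "nothing written yet" and the first written gen is 1.
final class SegmentGens {
    private long delGen = -1;          // generation of the live-docs (deletes) file
    private long fieldInfosGen = -1;   // generation of the field infos file
    private long docValuesGen = -1;    // the additional gen for DV update files

    // Shared rule for computing the generation the next write should use.
    private static long next(long gen) {
        return gen == -1 ? 1 : gen + 1;
    }

    long getDelGen() { return delGen; }
    long getFieldInfosGen() { return fieldInfosGen; }
    long getDocValuesGen() { return docValuesGen; }

    long getNextDocValuesGen() { return next(docValuesGen); }

    // Each gen advances on its own; a DV update does not touch the delete gen.
    void advanceDelGen() { delGen = next(delGen); }
    void advanceFieldInfosGen() { fieldInfosGen = next(fieldInfosGen); }
    void advanceDocValuesGen() { docValuesGen = next(docValuesGen); }
}
```

The point of the sketch is only that the holder, not the per-field dvGen, is
what tells the writer which gen the next update file should use.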
But why do you think that it's wrong to write 2 fields and then at read time
ask to provide only 1 field? I.e. what if the Codecs API was more "lazy", or a
Codec wants to implement lazy loading of even just the metadata?
Passing all the fields a Codec wrote, e.g. in the {{gen=-1}} case, even though
none of them is going to be used because they were all updated in later
gens, seems awkward to me as well.
What sort of index corruption does this check detect? As I see it, the Codec
gets a subset of the fields that it already wrote. It's worse if it gets a
superset of those fields, because you don't know e.g. if there are perhaps
missing fields that disappeared from the file system.
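
As a rough illustration of the subset case, the sketch below accepts a request
for fewer fields than were written, rejects a request for a field that was
never written to this gen, and loads values only on first access. The class
{{LazyGenProducer}} is hypothetical and is not part of the Codecs API; the empty
array stands in for whatever a real producer would read from disk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not a Lucene class.
final class LazyGenProducer {
    private final Set<String> requested;                         // fields the reader asked for
    private final Map<String, long[]> loaded = new HashMap<>();  // fields loaded so far

    LazyGenProducer(Set<String> written, Set<String> requested) {
        // A subset of the written fields is fine; anything else is suspicious.
        if (!written.containsAll(requested)) {
            throw new IllegalArgumentException(
                "requested fields were not all written in this gen: " + requested);
        }
        this.requested = requested;
    }

    long[] get(String field) {
        if (!requested.contains(field)) {
            throw new IllegalArgumentException("field was not requested: " + field);
        }
        // Stand-in for reading the field's data from disk on first use.
        return loaded.computeIfAbsent(field, f -> new long[0]);
    }

    int loadedCount() {
        return loaded.size();
    }
}
```

With something like this, writing 2 fields and later asking for only 1 costs
nothing for the field that is never touched.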
> DocValues updates send wrong fieldinfos to codec producers
> ----------------------------------------------------------
>
> Key: LUCENE-5618
> URL: https://issues.apache.org/jira/browse/LUCENE-5618
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
>
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write
> "batches" of fields in updates but just have only one field per gen?
> This removes many-many relationships and would make things easy to understand.