[ 
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982931#comment-13982931
 ] 

Shai Erera commented on LUCENE-5618:
------------------------------------

If we separate each DV update into its own file, I think we will need to track 
another gen in SegmentCommitInfo, so that it holds three: deletes, fieldInfos 
and dvUpdates. Though each FI writes its dvGen in the FIS file, we still need 
to know which gen to increment from for the next update. This isn't a big 
deal, it just adds complexity to SCI (4 methods and an index format change).
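
To make the "4 methods" concrete, here is a rough sketch of what tracking the 
extra gen could look like. This is not the actual SegmentCommitInfo code: the 
class name, the field name and the method names are all hypothetical, and I am 
assuming the same convention as elsewhere that {{-1}} means "no update written 
yet" and that the first update gets gen 1.

```java
// Hypothetical sketch only; not Lucene's SegmentCommitInfo.
final class SegmentGensSketch {
    // Assumption: -1 means "nothing written yet", matching the gen=-1 case.
    private long docValuesGen = -1;

    /** True once at least one DV update generation has been written. */
    boolean hasDocValuesUpdates() {
        return docValuesGen != -1;
    }

    /** The generation of the most recently written DV update, or -1. */
    long getDocValuesGen() {
        return docValuesGen;
    }

    /** The generation the next DV update should be written under. */
    long getNextDocValuesGen() {
        return docValuesGen == -1 ? 1 : docValuesGen + 1;
    }

    /** Called after a DV update has been written successfully. */
    void advanceDocValuesGen() {
        docValuesGen = getNextDocValuesGen();
    }
}
```

The gen would also have to be persisted with the commit, which is where the 
index format change comes from.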

But why do you think that it's wrong to write 2 fields and then at read time 
ask to provide only 1 field? I.e. what if the Codecs API was more "lazy", or a 
Codec wants to implement lazy loading of even just the metadata?

Passing all the fields a Codec wrote, e.g. in the {{gen=-1}} case, even though 
none of them is going to be used because they were all updated in later 
gens, seems awkward to me as well.

What sort of index corruption does this check detect? As I see it, the Codec 
gets a subset of the fields that it already wrote. It would be worse if it got 
a superset of those fields, because then you can't tell whether, e.g., some 
fields are missing because they disappeared from the file system.
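
To illustrate the subset vs. superset distinction, here is a minimal sketch of 
the only check that I think would be meaningful. It is not taken from any 
existing Codec: the class and method names are hypothetical, and fields are 
represented by plain names rather than FieldInfo instances.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch only; not an existing Lucene check.
final class FieldSetCheckSketch {
    /**
     * Returns the requested fields that were never written to the file.
     * An empty result means the request is a subset of what was written,
     * which is harmless; a non-empty result is the suspicious superset case.
     */
    static Set<String> fieldsNotWritten(Set<String> written, Set<String> requested) {
        Set<String> missing = new TreeSet<>(requested);
        missing.removeAll(written);
        return missing;
    }
}
```

With written = {a, b}, a request for {a} yields an empty set, while a request 
for {a, c} yields {c}.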

> DocValues updates send wrong fieldinfos to codec producers
> ----------------------------------------------------------
>
>                 Key: LUCENE-5618
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5618
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't 
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write 
> "batches" of fields in updates but just have only one field per gen? 
> This removes many-many relationships and would make things easy to understand.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

