[
https://issues.apache.org/jira/browse/LUCENE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982853#comment-13982853
]
Shai Erera commented on LUCENE-5618:
------------------------------------
I modified the code to pass all the FieldInfos (FIs) to the codec, regardless
of gen, and tests fail with FileNotFoundException. The reason is that
PerFieldDVF tries to open DVPs (e.g. for {{gen=1}}) for all fields, whether or
not they were written in that gen, which leads to the FNFE. I am not sure that
we can pass all FIs to the Codec that way ... so our options are:
* Pass all the fields that were written in a gen (whether we need them or not)
-- this does not make sense to me, as we'll need to track it somewhere, and it
seems a waste
* Add leniency in the form of "here are the fields you should care about" --
this makes the codec partially update-aware, but I don't think it's a bad idea
* Write each updated field in its own gen -- if you update many fields, many
times, this will create many files in the index directory. Technically it's not
"wrong", it just looks weird
* Stay with the current code, whose corruption detection triggers if the read
fieldNumber < 0
Anyway, I think the issue's title is wrong -- DocValues updates *do* pass the
correct fieldInfos to the producers. They pass only the infos that the producer
should care about, and we see that passing too many is wrong (PerFieldDVF).
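To make the failure concrete, here is a minimal, self-contained Java sketch of
the situation. It does not use Lucene's classes: Field, fileName, openProducer,
canOpen and fieldsForGen are hypothetical stand-ins, and the one-file-per-field-
per-gen layout is an assumption made purely for illustration. It shows a
per-field producer failing when handed every field, and opening cleanly when
handed only the fields written in the requested gen.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GenFilterSketch {

    /** Hypothetical stand-in for a field's metadata (not Lucene's FieldInfo). */
    static final class Field {
        final String name;
        final long dvGen; // gen in which this field's doc values were last written

        Field(String name, long dvGen) {
            this.name = name;
            this.dvGen = dvGen;
        }
    }

    /** Assumed layout for illustration: one file per (field, gen) actually written. */
    static String fileName(String field, long gen) {
        return field + "_" + gen + ".dv";
    }

    /** Mimics a per-field producer: opens one file for every field it is handed. */
    static void openProducer(Set<String> dir, List<Field> infos, long gen)
            throws FileNotFoundException {
        for (Field f : infos) {
            String name = fileName(f.name, gen);
            if (!dir.contains(name)) {
                throw new FileNotFoundException(name);
            }
        }
    }

    /** Returns true if the producer opens without a FileNotFoundException. */
    static boolean canOpen(Set<String> dir, List<Field> infos, long gen) {
        try {
            openProducer(dir, infos, gen);
            return true;
        } catch (FileNotFoundException e) {
            return false;
        }
    }

    /** Keeps only the fields whose doc values were written in the given gen. */
    static List<Field> fieldsForGen(List<Field> all, long gen) {
        List<Field> out = new ArrayList<>();
        for (Field f : all) {
            if (f.dvGen == gen) {
                out.add(f);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy "directory": "price" was updated in gen 1, "rank" in gen 2.
        Set<String> dir = new HashSet<>();
        dir.add(fileName("price", 1));
        dir.add(fileName("rank", 2));
        List<Field> all = Arrays.asList(new Field("price", 1), new Field("rank", 2));

        System.out.println("all fields, gen=1: " + canOpen(dir, all, 1));
        System.out.println("filtered,   gen=1: " + canOpen(dir, fieldsForGen(all, 1), 1));
    }
}
```

In this toy setup, passing all fields for gen 1 fails because no gen-1 file
exists for "rank", while passing only the gen-1 fields opens fine. That is the
same shape as the FNFE seen in the tests, though the real file naming and
producer logic differ.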
I will think about it more. If you see other alternatives, feel free to propose
them.
> DocValues updates send wrong fieldinfos to codec producers
> ----------------------------------------------------------
>
> Key: LUCENE-5618
> URL: https://issues.apache.org/jira/browse/LUCENE-5618
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
>
> Spinoff from LUCENE-5616.
> See the example there, docvalues readers get a fieldinfos, but it doesn't
> contain the correct ones, so they have invalid field numbers at read time.
> This should really be fixed. Maybe a simple solution is to not write
> "batches" of fields in updates but just have only one field per gen?
> This removes many-many relationships and would make things easy to understand.