DiskDVFormat does not have index back compatibility between minor
releases; maybe that's what you are seeing?  So, after upgrading, you
must fully re-index if any field uses DiskDVFormat ...

Only the default formats support index back compatibility between releases.
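
To make the rule above concrete, here is a minimal sketch in plain Java (no
Lucene dependency). It is not part of the Lucene API: the class
ReindexCheck, the method fieldsForcingReindex, the field names, and the
format labels are all hypothetical and only illustrate the check "did the
release change, and does any field use a non-default format?".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.Collections;
import java.util.TreeSet;

// Hypothetical helper, not a Lucene class: it only models the rule that
// non-default formats have no index back compatibility across releases.
final class ReindexCheck {

    // Returns the fields whose format forces a full re-index when the
    // release that wrote the index differs from the release reading it.
    static Set<String> fieldsForcingReindex(Map<String, String> formatByField,
                                            Set<String> defaultFormats,
                                            String writtenWith,
                                            String readingWith) {
        Set<String> result = new TreeSet<>();
        if (writtenWith.equals(readingWith)) {
            return result; // same release: nothing to do
        }
        for (Map.Entry<String, String> e : formatByField.entrySet()) {
            if (!defaultFormats.contains(e.getValue())) {
                result.add(e.getKey()); // non-default format, no back compat
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Field names and format labels below are illustrative assumptions.
        Map<String, String> formats = new HashMap<>();
        formats.put("nodeId", "Disk");
        formats.put("title", "Lucene42");
        Set<String> defaults = Collections.singleton("Lucene42");
        System.out.println(fieldsForcingReindex(formats, defaults, "4.3", "4.4"));
    }
}
```

If the returned set is non-empty, the whole index should be rebuilt with
the new release rather than opened as-is.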


Mike McCandless

http://blog.mikemccandless.com


On Tue, Aug 13, 2013 at 4:54 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> Hi experts,
>
> I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored
> fields for performance reasons. Because the index size is unknown (it
> depends on the customer), I will use DiskDocValuesFormat, especially for
> some binary fields. So I wrote my own customized Codec:
>
>       final Codec codec = new Lucene42Codec() {
>
>         private final Lucene42DocValuesFormat memoryDVFormat = new Lucene42DocValuesFormat();
>         private final DiskDocValuesFormat diskDVFormat = new DiskDocValuesFormat();
>
>         @Override
>         public DocValuesFormat getDocValuesFormatForField(String field) {
>           // The two binary fields and the numeric node ID use the disk-based format.
>           if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
>               || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
>               || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
>             return diskDVFormat;
>           } else {
>             return memoryDVFormat;
>           }
>         }
>       };
>       iwc.setCodec(codec);
>
> Here the field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is a numeric
> field of type long. The others are binary.
>
> Then I read the DocValues as in the pseudo-code below:
>     nodeIDDocValuesSource =
>             MultiDocValues.getNumericValues(searcher.getIndexReader(),
>                 LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
>
>    ......
>    long nodeId= nodeIDDocValuesSource.get(scoreDoc.doc);
>
> I'm sure that I then get a wrong nodeId: the upper-layer logic verifies it
> and treats it as data corruption.
>
>
> But if I change to memoryDVFormat for the long type field, then everything
> is OK.
>
> Also, to support upgrading legacy data, I keep two index formats, DV or
> stored field, controlled by version. If I use stored fields, everything is OK.
> So I guess there is a bug in DiskDocValuesFormat with the numeric data type.
> Is it related to byte-aligned numeric compression?
> Or am I using DiskDocValuesFormat incorrectly? It doesn't seem to take any
> other parameters.
>
> Sorry that I have no pure Lucene test case yet. I hope someone can shed
> some light on this.
>
>
>
>
> Best regards,
> Duke
> If not now, when? If not me, who?

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
