Hi Mike,

Thanks for your quick response.
All data was newly indexed, so index compatibility is not the culprit.
Could it be a multi-threading issue? I share IndexReaders between
different IndexSearchers. I have no evidence for this guess, though: I have
many multi-threaded test cases and they all pass, while the case that fails
is not a multi-threaded scenario for the index.

Best regards,
Duke
If not now, when? If not me, who?


On Tue, Aug 13, 2013 at 7:34 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> DiskDVFormat does not have index back compatibility between minor
> releases; maybe that's what you are seeing? So, you must fully
> re-index any DiskDVFormat field after upgrading ...
>
> Only the default formats support index back compatibility between releases.
>
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Aug 13, 2013 at 4:54 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> > Hi experts,
> >
> > I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored
> > fields for performance reasons. Because the size of the index is unknown
> > (it depends on the customer), I will use DiskDocValuesFormat, especially
> > for some binary fields. So I wrote my customized Codec:
> >
> > final Codec codec = new Lucene42Codec() {
> >
> >     private final Lucene42DocValuesFormat memoryDVFormat =
> >             new Lucene42DocValuesFormat();
> >     private final DiskDocValuesFormat diskDVFormat =
> >             new DiskDocValuesFormat();
> >
> >     @Override
> >     public DocValuesFormat getDocValuesFormatForField(String field) {
> >         // Keep these three fields on disk; everything else stays in memory.
> >         if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
> >                 || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
> >                 || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
> >             return diskDVFormat;
> >         } else {
> >             return memoryDVFormat;
> >         }
> >     }
> > };
> > iwc.setCodec(codec);
> >
> > Here the field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is a numeric
> > field of type long. The others are binary.
> >
> > Then I consume the DV as in the pseudo-code below:
> > nodeIDDocValuesSource =
> >     MultiDocValues.getNumericValues(searcher.getIndexReader(),
> >         LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
> >
> > ......
> > long nodeId = nodeIDDocValuesSource.get(scoreDoc.doc);
> >
> > I'm sure the nodeId I get is wrong; the upper-level logic verifies it
> > and treats it as data corruption.
> >
> > But if I change to memoryDVFormat for the long field, everything is OK.
> >
> > Also, for upgrading legacy data, I keep two index formats, DV or stored
> > field, controlled by version. If I use the stored field, everything is OK.
> > So I guess there is a bug in DiskDocValuesFormat with the numeric data
> > type. Is it related to byte-aligned numeric compression?
> > Or did I not use DiskDocValuesFormat correctly? It seems to have no other
> > parameters.
> >
> > Sorry that I have no pure Lucene test case yet. I hope someone can shed
> > some light on this.
> >
> > Best regards,
> > Duke
> > If not now, when? If not me, who?