What is the recommended way to use DiskDocValuesFormat in production if we can't reindex when we upgrade?
Will the 4.4 version of DDVF be backwards compatible, or should we make
our own copy of DDVF and give it a different codec name to protect
ourselves against incompatible changes?

Thanks,
Sean

On Tue, Aug 13, 2013 at 4:34 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> DiskDVFormat does not have index back compatibility between minor
> releases; maybe that's what you are seeing? So, you must fully
> re-index any DiskDVFormat field after upgrading ...
>
> Only the default formats support index back compatibility between releases.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Aug 13, 2013 at 4:54 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> > Hi experts,
> >
> > I'm upgrading to Lucene 4.4 and trying to use DocValues instead of
> > stored fields for performance reasons. Because the index size is
> > unknown (it depends on the customer), I will use DiskDocValuesFormat,
> > especially for some binary fields. So I wrote my customized Codec:
> >
> > final Codec codec = new Lucene42Codec() {
> >
> >     private final Lucene42DocValuesFormat memoryDVFormat =
> >             new Lucene42DocValuesFormat();
> >     private final DiskDocValuesFormat diskDVFormat =
> >             new DiskDocValuesFormat();
> >
> >     @Override
> >     public DocValuesFormat getDocValuesFormatForField(String field) {
> >         if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
> >                 || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
> >                 || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
> >             return diskDVFormat;
> >         } else {
> >             return memoryDVFormat;
> >         }
> >     }
> > };
> > iwc.setCodec(codec);
> >
> > Here the field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is a
> > numeric field of type long. The others are binary.
> >
> > Then I consume the DocValues as in the pseudo-code below:
> >
> > nodeIDDocValuesSource =
> >         MultiDocValues.getNumericValues(searcher.getIndexReader(),
> >                 LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
> >
> > ......
> > long nodeId = nodeIDDocValuesSource.get(scoreDoc.doc);
> >
> > I'm sure I then get a wrong nodeId; the upper-level logic verifies it
> > and treats it as data corruption.
> >
> > But if I change to memoryDVFormat for the long type field, then
> > everything is OK.
> >
> > Also, for upgrading legacy data, I keep two index formats, DV or
> > stored field, controlled by version. If I use stored fields,
> > everything is OK.
> > So I guess there is a bug with DiskDocValuesFormat and the numeric
> > data type. Does it relate to byte-aligned numeric compression?
> > Or didn't I use DiskDocValuesFormat correctly? It seems to take no
> > other parameters.
> >
> > Sorry that I have no pure Lucene test case yet. Hope someone can shed
> > some light on this.
> >
> > Best regards,
> > Duke
> > If not now, when? If not me, who?
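
For anyone reading this thread later: the per-field choice in the quoted codec comes down to a set-membership check on the field name. Below is a minimal sketch of just that routing decision, kept free of Lucene types so it can be unit-tested on its own. The class name DocValuesRouting, the Target enum, and the three field-name strings are hypothetical; the original post uses LucenePluginConstants values whose actual strings are not shown.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Routing decision only; no Lucene types are involved here. */
final class DocValuesRouting {

    /** Which kind of doc-values format a field should be written with. */
    enum Target { DISK, MEMORY }

    // Hypothetical field names standing in for the LucenePluginConstants
    // values from the original post (their real strings are not shown).
    static final Set<String> DISK_FIELDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("storedReturnable", "payload", "nodeId")));

    private DocValuesRouting() {}

    /** Returns DISK for the listed fields and MEMORY for everything else. */
    static Target targetFor(String field) {
        return DISK_FIELDS.contains(field) ? Target.DISK : Target.MEMORY;
    }
}
```

Inside the codec, getDocValuesFormatForField(field) would then return diskDVFormat when targetFor(field) is DISK and memoryDVFormat otherwise. This only tidies up the selection logic; it does not change the point made above that DiskDVFormat fields need a full re-index after an upgrade.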