Hi Mike,

Thanks for your quick response.

All data was newly indexed, so compatibility is not the culprit.

Could this be a multi-threading issue? I share IndexReaders between
different IndexSearchers. I have no evidence for this guess, though: I have
many multi-threaded test cases and they all pass, and the one that fails
is not a multi-threaded scenario for the index.


Best regards,
Duke
If not now, when? If not me, who?


On Tue, Aug 13, 2013 at 7:34 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> DiskDVFormat does not have index back compatibility between minor
> releases; maybe that's what you are seeing?  So, you must fully
> re-index any DiskDVFormat field after upgrading ...
>
> Only the default formats support index back compatibility between releases.
>
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Aug 13, 2013 at 4:54 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> > Hi experts,
> >
> > I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored
> > fields for performance reasons. Because the index size is unknown (it
> > depends on the customer), I will use DiskDocValuesFormat, especially for
> > some binary fields. So I wrote a custom Codec:
> >
> >       final Codec codec = new Lucene42Codec() {
> >
> >         private final Lucene42DocValuesFormat memoryDVFormat =
> >             new Lucene42DocValuesFormat();
> >         private final DiskDocValuesFormat diskDVFormat =
> >             new DiskDocValuesFormat();
> >
> >         // Route the two binary fields and the numeric node-ID field to
> >         // disk-backed doc values; all other fields use the in-memory format.
> >         @Override
> >         public DocValuesFormat getDocValuesFormatForField(String field) {
> >           if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
> >               || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
> >               || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
> >             return diskDVFormat;
> >           } else {
> >             return memoryDVFormat;
> >           }
> >         }
> >       };
> >       iwc.setCodec(codec);
> >
> > Here the field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is a numeric
> > field of type long. The others are binary.
> >
> > Then I read the doc values roughly like this (pseudo-code):
> >     nodeIDDocValuesSource =
> >             MultiDocValues.getNumericValues(searcher.getIndexReader(),
> >                 LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
> >
> >    ......
> >    long nodeId= nodeIDDocValuesSource.get(scoreDoc.doc);
> >
> > I am sure the nodeId I get back is wrong: the upper-layer logic verifies
> > it and treats it as data corruption.
> >
> >
> > But if I switch the long field to memoryDVFormat, everything is OK.
> >
> > Also, to support upgrading legacy data, I keep two index formats, DV or
> > stored fields, selected by version. If I use stored fields, everything is OK.
> > So I guess there is a bug in DiskDocValuesFormat for the numeric data type.
> > Is it related to byte-aligned numeric compression?
> > Or am I using DiskDocValuesFormat incorrectly? It does not seem to take
> > any other parameters.
> >
> > Sorry that I don't have a pure Lucene test case yet. I hope someone can
> > shed some light on this.
> >
> >
> >
> >
> > Best regards,
> > Duke
> > If not now, when? If not me, who?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
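
The lookup quoted above passes a top-level doc ID (scoreDoc.doc) to a
doc-values source obtained from the top-level reader. For anyone who wants to
cross-check that value per segment instead, the top-level ID first has to be
split into a segment index and a segment-local ID. The following is a minimal
plain-Java sketch of that arithmetic only. It does not use Lucene, it does not
diagnose the wrong nodeId reported in this thread, and the class and method
names (SegmentDocIdResolver, segmentOf, localDocId) are hypothetical. It
assumes every segment holds at least one document.

```java
import java.util.Arrays;

/**
 * Plain-Java model (no Lucene dependency) of splitting a top-level doc ID
 * into a segment index and a segment-local doc ID.
 * All names are hypothetical and exist only for illustration.
 */
final class SegmentDocIdResolver {

    // docBases[i] is the top-level doc ID of the first document in segment i.
    private final int[] docBases;
    // Total number of documents across all segments.
    private final int maxDoc;

    /** Assumes every entry of segmentSizes is at least 1. */
    SegmentDocIdResolver(int[] segmentSizes) {
        docBases = new int[segmentSizes.length];
        int base = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            docBases[i] = base;
            base += segmentSizes[i];
        }
        maxDoc = base;
    }

    /** Index of the segment that contains the given top-level doc ID. */
    int segmentOf(int globalDocId) {
        if (globalDocId < 0 || globalDocId >= maxDoc) {
            throw new IllegalArgumentException("doc ID out of range: " + globalDocId);
        }
        int pos = Arrays.binarySearch(docBases, globalDocId);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the owning
        // segment is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Doc ID relative to the start of its own segment. */
    int localDocId(int globalDocId) {
        return globalDocId - docBases[segmentOf(globalDocId)];
    }

    public static void main(String[] args) {
        // Three segments holding 3, 5 and 2 documents: doc bases are 0, 3, 8.
        SegmentDocIdResolver resolver = new SegmentDocIdResolver(new int[] {3, 5, 2});
        int doc = 7;
        // prints "segment=1 local=4"
        System.out.println("segment=" + resolver.segmentOf(doc)
            + " local=" + resolver.localDocId(doc));
    }
}
```

If a value read through the segment-local ID matches the one read through the
top-level ID, the doc ID mapping can be ruled out as the source of the
mismatch.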
