Can you describe what problem you are actually hitting? The purpose of docValuesLocal is to hold a per-thread instance of each field's doc values, and to re-use it when that thread comes back asking for the same doc values.
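To make the pattern concrete, here is a minimal sketch of a per-thread cache exercised from a pooled thread. It is not the SegmentCoreReaders code: the class name, the long[] placeholder and the load counter are made up for illustration, and only java.util classes are used. It shows that the same pooled thread gets its cached instance back on a later query; it does not try to reproduce the wrong values reported below.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-thread doc values cache; not Lucene's code. */
public class PerThreadDocValuesCache {
    // Each thread sees its own map of field name -> loaded instance.
    private final ThreadLocal<Map<String, long[]>> docValuesLocal =
            ThreadLocal.withInitial(HashMap::new);

    // Counts how often a field was actually loaded (illustration only).
    private final AtomicInteger loads = new AtomicInteger();

    /** Returns the calling thread's instance for the field, loading it on first use. */
    public long[] get(String field) {
        return docValuesLocal.get().computeIfAbsent(field, f -> {
            loads.incrementAndGet();
            return new long[] {1L, 2L, 3L}; // placeholder for the real decoding work
        });
    }

    public int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) throws Exception {
        PerThreadDocValuesCache cache = new PerThreadDocValuesCache();
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Callable<long[]> query = () -> cache.get("nodeId");
        // Two separate "queries" run on the same pooled thread.
        long[] first = pool.submit(query).get();
        long[] second = pool.submit(query).get();
        pool.shutdown();
        System.out.println(first == second);   // true: the pooled thread reuses its instance
        System.out.println(cache.loadCount()); // 1: the field was loaded only once
    }
}
```

In this sketch a pooled thread keeps its instance across queries, so whether that is safe depends on whether the cached object carries state from one query to the next.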

Mike McCandless

http://blog.mikemccandless.com

On Mon, Oct 21, 2013 at 6:28 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> Hi guys,
>
> It seems I have the same problem with Lucene45DocValuesFormat, but no
> problem with MemoryDocValuesFormat. The problem I encountered with
> Lucene 4.4 was with DiskDocValuesFormat, not with Lucene42DocValuesFormat.
>
> I dug in a little and found the superficial cause. In SegmentCoreReaders
> there is a ThreadLocal variable, docValuesLocal. Its purpose is to avoid
> the query thread building the data structure repeatedly. But what if the
> query thread comes from a thread pool and is reused for different queries?
> I removed docValuesLocal and built a lucene-core.jar, and it works with my
> multi-threaded (thread pool) test cases.
>
> Do you have any idea about this? Is this enough information?
>
>
> Thanks,
> Duke
>
>
> Best regards,
> Duke
> If not now, when? If not me, who?
>
>
> On Tue, Aug 13, 2013 at 4:54 PM, Duke DAI <duke.dai....@gmail.com> wrote:
>
>> Hi experts,
>>
>> I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored
>> fields for performance reasons. Because the size of the index is unknown
>> (it depends on the customer), I will use DiskDocValuesFormat, especially
>> for some binary fields. So I wrote my customized Codec:
>>
>> final Codec codec = new Lucene42Codec() {
>>
>>     private final Lucene42DocValuesFormat memoryDVFormat = new Lucene42DocValuesFormat();
>>     private final DiskDocValuesFormat diskDVFormat = new DiskDocValuesFormat();
>>
>>     @Override
>>     public DocValuesFormat getDocValuesFormatForField(String field) {
>>         // These three fields go to the disk-based format.
>>         if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
>>                 || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
>>                 || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
>>             return diskDVFormat;
>>         } else {
>>             // Every other field uses Lucene42DocValuesFormat.
>>             return memoryDVFormat;
>>         }
>>     }
>> };
>> iwc.setCodec(codec);
>>
>> Here the field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is a numeric
>> field of long type. The others are binary.
>>
>> Then I consume the doc values as in the pseudo-code below:
>>
>> nodeIDDocValuesSource =
>>     MultiDocValues.getNumericValues(searcher.getIndexReader(),
>>         LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
>>
>> ......
>> long nodeId = nodeIDDocValuesSource.get(scoreDoc.doc);
>>
>> I'm sure I then get a wrong nodeId, which is verified by upper-level logic
>> and treated as data corruption.
>>
>>
>> But if I change to memoryDVFormat for the long type field, then everything
>> is OK.
>>
>> Also, for upgrading legacy data, I keep two index formats, DV or stored
>> field, controlled by version. If I use stored fields, everything is OK.
>> So I guess there is a bug in DiskDocValuesFormat with the numeric data
>> type. Does it relate to byte-aligned numeric compression?
>> Or did I not use DiskDocValuesFormat correctly? There seem to be no other
>> parameters for it.
>>
>> Sorry that I have no pure Lucene test case yet. I hope someone can shed
>> some light on this.
>>
>>
>> Best regards,
>> Duke
>> If not now, when? If not me, who?
>>