Thanks, Mike. I finally figured out the root cause. I use threads from Thread-Pool-1 to probe the indexes of multiple collections in parallel, but the documents are consumed by threads from Thread-Pool-2, and I was holding on to the same DocValues object reference across both pools to get the values. Once I took that thread switch into account, so that the DocValues object is no longer carried from one pool's thread to the other's, the problem was resolved.
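For anyone who runs into the same symptom, here is a rough, Lucene-free sketch of the mistake and of the fix. Everything in it is made up for illustration: ValuesHandle is a hypothetical stand-in for a per-thread doc values instance, and the two single-thread executors stand in for my two pools. Unlike the real disk-backed reader, which silently gave me wrong long values, the stand-in throws when it is used from a thread other than the one that created it, only to make the misuse visible.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PerThreadValuesSketch {

    /** Hypothetical stand-in for a per-thread doc values instance (not a Lucene class). */
    static final class ValuesHandle {
        // Remember which thread created this handle.
        private final Thread owner = Thread.currentThread();

        long get(int docId) {
            // Assumption for the sketch: fail loudly instead of returning garbage.
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("handle used outside its owning thread");
            }
            return docId * 10L;
        }
    }

    // One handle per thread, created lazily and reused when the same thread comes back.
    private static final ThreadLocal<ValuesHandle> LOCAL =
            ThreadLocal.withInitial(ValuesHandle::new);

    static ValuesHandle values() {
        return LOCAL.get();
    }

    /** The mistake: obtain the handle on the probe pool, use it on the consume pool. */
    static boolean sharedHandleFails(ExecutorService probePool, ExecutorService consumePool)
            throws Exception {
        Callable<ValuesHandle> probe = PerThreadValuesSketch::values;
        final ValuesHandle handle = probePool.submit(probe).get();
        Callable<Long> consume = () -> handle.get(7);
        try {
            consumePool.submit(consume).get();
            return false;
        } catch (ExecutionException e) {
            return e.getCause() instanceof IllegalStateException;
        }
    }

    /** The fix: the consuming thread asks for its own handle right before using it. */
    static long perThreadLookup(ExecutorService consumePool, int docId) throws Exception {
        Callable<Long> lookup = () -> values().get(docId);
        return consumePool.submit(lookup).get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService probePool = Executors.newSingleThreadExecutor();
        ExecutorService consumePool = Executors.newSingleThreadExecutor();
        try {
            System.out.println("shared handle rejected: "
                    + sharedHandleFails(probePool, consumePool));
            System.out.println("per-thread lookup: " + perThreadLookup(consumePool, 7));
        } finally {
            probePool.shutdown();
            consumePool.shutdown();
        }
    }
}
```

In a real application the equivalent change would presumably be to fetch the doc values on the thread that actually consumes the documents, instead of handing the object over from the probing thread. Reusing pooled threads across queries is still fine, as Mike said.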
Thank you all for building this feature into lucene-core.jar; it dispels my worry about compatibility problems from using lucene-codecs.jar.

Best regards,
Duke
If not now, when? If not me, who?


On Tue, Oct 22, 2013 at 12:48 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> It's perfectly fine, and recommended, to reuse a thread across
> different queries (ie, use a thread pool in your app, up above
> Lucene).
>
> The ThreadLocals used in SegmentCoreReaders should not interfere or
> cause problems with that: they can easily be re-used across queries.
>
> Maybe you can boil down the issue you are seeing into a small test case?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Mon, Oct 21, 2013 at 10:35 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> > Hi Mike,
> >
> > My scenario, query thread from a ThreadPool will be used to execute query.
> > So thread must have to be reused to handle various queries. Now that
> > SegmentCoreReaders uses ThreadLocal to hold per-thread instance, I think
> > some private variables must belong to the given thread(file offset? I
> > didn't find any other thread-dependent status), otherwise object-level
> > instance is enough.
> > And ThreadPool is very common to facilitate heavy load queries, does the
> > ThreadLocal mechanism support thread reuse for different queries? You know,
> > either thread creation is heavy or ThreadLocal cleanup from outside is
> > complicated.
> > My test shows NumericDocValues will return wrong value, but sure that it's
> > a long value, upper logic can verify whether the value is valid or not.
> >
> > As I described in earlier mail, in Lucene4.4
> > Lucene42DocValuesFormat(in-memory) has no problem,
> > DiskDocValuesFormat(in-disk) has problem. Now in Lucene4.5,
> > MemoryDocValuesFormat(in-memory) has no problem, but
> > Lucene45DocValuesFormat(in-disk) has problem.
> > Coincidence? My test is far more complex than I described, two ThreadPool,
> > one is used to handle main query, one is used to query sub collections
> > in parallel with proper RejectedExecutionHandler(now one sub rejected,
> > cancel and fail all subs).
> >
> > For simple, what's the private status of per-thread NumericDocValues
> > instance? The private status can be re-used for different queries?
> >
> >
> > Best regards,
> > Duke
> > If not now, when? If not me, who?
> >
> >
> > On Mon, Oct 21, 2013 at 7:26 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
> >
> >> Can you describe what problem you are actually hitting?
> >>
> >> The purpose of docValuesLocal is to hold the per-Thread instance of
> >> each doc values, and re-use it when that thread comes back again
> >> asking for the same doc values.
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >>
> >> On Mon, Oct 21, 2013 at 6:28 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> >> > Hi guys,
> >> >
> >> > Seems I have the same problem with Lucene45DocValuesFormat, no problem
> >> > with MemoryDocValuesFormat. The problem I encountered with Lucene4.4 is
> >> > with DiskDocValuesFormat, not with Lucene42DocValuesFormat.
> >> >
> >> > I dig into a little and found the superficial cause. In
> >> > SegmentCoreReaders, there is a ThreadLocal variable, docValuesLocal. Its
> >> > purpose is avoid building data structure repeatedly by query thread. But
> >> > how about the query thread is from thread pool, and reused for different
> >> > query?
> >> > I removed docValuesLocal and built a lucene-core.jar, it works with my
> >> > multi-threads(thread pool) test cases.
> >> >
> >> > Do you have any idea about this? Information is enough?
> >> >
> >> >
> >> > Thanks,
> >> > Duke
> >> >
> >> >
> >> > Best regards,
> >> > Duke
> >> > If not now, when? If not me, who?
> >> >
> >> >
> >> > On Tue, Aug 13, 2013 at 4:54 PM, Duke DAI <duke.dai....@gmail.com> wrote:
> >> >
> >> >> Hi experts,
> >> >>
> >> >> I'm upgrading Lucene 4.4 and trying to use DocValues instead of store
> >> >> field for performance reason. But due to unknown size of index(depends
> >> >> on customer), so I will use DiskDocValuesFormat, especially for some
> >> >> binary field. Then I wrote my customized Codec:
> >> >>
> >> >>     final Codec codec = new Lucene42Codec() {
> >> >>
> >> >>         private final Lucene42DocValuesFormat memoryDVFormat = new Lucene42DocValuesFormat();
> >> >>         private final DiskDocValuesFormat diskDVFormat = new DiskDocValuesFormat();
> >> >>
> >> >>         @Override
> >> >>         public DocValuesFormat getDocValuesFormatForField(String field) {
> >> >>             if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
> >> >>                     || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
> >> >>                     || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
> >> >>                 return diskDVFormat;
> >> >>             } else {
> >> >>                 return memoryDVFormat;
> >> >>             }
> >> >>         }
> >> >>     };
> >> >>     iwc.setCodec(codec);
> >> >>
> >> >> Here field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is numeric
> >> >> field, long type. And others are binary.
> >> >>
> >> >> Then I consume DV like below pseudo-code:
> >> >>
> >> >>     nodeIDDocValuesSource =
> >> >>         MultiDocValues.getNumericValues(searcher.getIndexReader(),
> >> >>             LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
> >> >>
> >> >>     ......
> >> >>     long nodeId = nodeIDDocValuesSource.get(scoreDoc.doc);
> >> >>
> >> >> Then I'm sure I get a wrong nodeId, which will be verified by upper
> >> >> logic and treated as data corruption.
> >> >>
> >> >>
> >> >> But if I change to memoryDVFormat for the long type field, then
> >> >> everything is OK.
> >> >>
> >> >> Also for upgrading legacy data, I keep two index format, DV or stored
> >> >> field, controlled by version. If I use stored field, everything is OK.
> >> >> So I guess there is a bug with DiskDocValuesFormat, numeric data type,
> >> >> does it relate to byte-aligned numeric compression?
> >> >> Or I didn't use DiskDocValuesFormat correctly? Seems no other parameters
> >> >> for it.
> >> >>
> >> >> Sorry that I have no pure Lucene test case yet. Hope someone shed some
> >> >> light on this.
> >> >>
> >> >>
> >> >> Best regards,
> >> >> Duke
> >> >> If not now, when? If not me, who?
> >> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org