It's perfectly fine, and recommended, to reuse a thread across different queries (i.e., use a thread pool in your app, above Lucene).
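To illustrate the pattern in plain Java, here is a minimal sketch. It does not use Lucene's actual classes: `PerThreadReader`, `runQueries`, and the sample values are made up for illustration, and the `ThreadLocal` only stands in for a per-thread cache like the one discussed in this thread. It shows pooled threads handling many queries while each thread keeps reusing its own cached instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class PooledThreadLocalDemo {

    /** Hypothetical stand-in for a per-thread doc values instance; not a Lucene class. */
    static final class PerThreadReader {
        private final long[] values;
        PerThreadReader(long[] values) { this.values = values; }
        long get(int docId) { return values[docId]; }
    }

    // Made-up sample data: the value stored for doc i is VALUES[i].
    static final long[] VALUES = {10L, 20L, 30L, 40L};

    /**
     * Runs queryCount lookups on a fixed-size pool.
     * Returns {number of wrong answers, number of per-thread instances created}.
     */
    static int[] runQueries(int poolSize, int queryCount) {
        AtomicInteger created = new AtomicInteger();
        // Each pooled thread lazily creates one instance and keeps it across queries.
        ThreadLocal<PerThreadReader> local = ThreadLocal.withInitial(() -> {
            created.incrementAndGet();
            return new PerThreadReader(VALUES);
        });
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < queryCount; i++) {
                final int docId = i % VALUES.length;
                Callable<Long> query = () -> local.get().get(docId);
                results.add(pool.submit(query));
            }
            int wrong = 0;
            for (int i = 0; i < queryCount; i++) {
                long expected = VALUES[i % VALUES.length];
                if (results.get(i).get() != expected) wrong++;
            }
            return new int[] {wrong, created.get()};
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = runQueries(4, 1000);
        System.out.println("wrong answers: " + r[0] + ", instances created: " + r[1]);
    }
}
```

In this sketch the number of instances created never exceeds the pool size, however many queries run. A small test case for the real issue could follow the same shape, with the lookup replaced by the actual doc values read.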
The ThreadLocals used in SegmentCoreReaders should not interfere or
cause problems with that: they can easily be re-used across queries.

Maybe you can boil down the issue you are seeing into a small test case?

Mike McCandless

http://blog.mikemccandless.com


On Mon, Oct 21, 2013 at 10:35 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> Hi Mike,
>
> In my scenario, a query thread from a ThreadPool is used to execute each
> query, so threads have to be reused to handle various queries. Since
> SegmentCoreReaders uses a ThreadLocal to hold a per-thread instance, I think
> some private variables must belong to the given thread (file offset? I
> didn't find any other thread-dependent state); otherwise an object-level
> instance would be enough.
> A ThreadPool is a very common way to handle heavy query load. Does the
> ThreadLocal mechanism support thread reuse for different queries? You know,
> the alternatives are costly: thread creation is heavy, and ThreadLocal
> cleanup from outside is complicated.
> My test shows NumericDocValues returning a wrong value. It is certainly a
> long value, and upper-level logic can verify whether the value is valid.
>
> As I described in my earlier mail, in Lucene 4.4
> Lucene42DocValuesFormat (in-memory) has no problem, while
> DiskDocValuesFormat (on-disk) has the problem. Now in Lucene 4.5,
> MemoryDocValuesFormat (in-memory) has no problem, but
> Lucene45DocValuesFormat (on-disk) has the problem. Coincidence? My test is
> far more complex than I described: two ThreadPools, one used to handle the
> main query and one used to query sub-collections in parallel with a proper
> RejectedExecutionHandler (currently, if one sub-query is rejected, all
> sub-queries are cancelled and failed).
>
> Put simply, what is the private state of a per-thread NumericDocValues
> instance? Can that private state be re-used for different queries?
>
>
> Best regards,
> Duke
> If not now, when? If not me, who?
>
>
> On Mon, Oct 21, 2013 at 7:26 PM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Can you describe what problem you are actually hitting?
>>
>> The purpose of docValuesLocal is to hold the per-Thread instance of
>> each doc values, and re-use it when that thread comes back again
>> asking for the same doc values.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Mon, Oct 21, 2013 at 6:28 AM, Duke DAI <duke.dai....@gmail.com> wrote:
>> > Hi guys,
>> >
>> > It seems I have the same problem with Lucene45DocValuesFormat, and no
>> > problem with MemoryDocValuesFormat. The problem I encountered with
>> > Lucene 4.4 is with DiskDocValuesFormat, not with Lucene42DocValuesFormat.
>> >
>> > I dug in a little and found the superficial cause. In SegmentCoreReaders
>> > there is a ThreadLocal variable, docValuesLocal. Its purpose is to avoid
>> > building the data structure repeatedly for each query thread. But what if
>> > the query thread comes from a thread pool and is reused for different
>> > queries?
>> > I removed docValuesLocal and built a lucene-core.jar, and it works with my
>> > multi-threaded (thread pool) test cases.
>> >
>> > Do you have any idea about this? Is this enough information?
>> >
>> >
>> > Thanks,
>> > Duke
>> >
>> >
>> > Best regards,
>> > Duke
>> > If not now, when? If not me, who?
>> >
>> >
>> > On Tue, Aug 13, 2013 at 4:54 PM, Duke DAI <duke.dai....@gmail.com>
>> wrote:
>> >
>> >> Hi experts,
>> >>
>> >> I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored
>> >> fields for performance reasons. Because the size of the index is unknown
>> >> (it depends on the customer), I will use DiskDocValuesFormat, especially
>> >> for some binary fields.
>> >> Then I wrote my customized Codec:
>> >>
>> >> final Codec codec = new Lucene42Codec() {
>> >>
>> >>     private final Lucene42DocValuesFormat memoryDVFormat =
>> >>         new Lucene42DocValuesFormat();
>> >>     private final DiskDocValuesFormat diskDVFormat =
>> >>         new DiskDocValuesFormat();
>> >>
>> >>     @Override
>> >>     public DocValuesFormat getDocValuesFormatForField(String field) {
>> >>         // These three fields go to the on-disk format; all others stay in memory.
>> >>         if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
>> >>                 || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
>> >>                 || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
>> >>             return diskDVFormat;
>> >>         } else {
>> >>             return memoryDVFormat;
>> >>         }
>> >>     }
>> >> };
>> >> iwc.setCodec(codec);
>> >>
>> >> Here the field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is a numeric
>> >> field of long type. The others are binary.
>> >>
>> >> Then I consume the DocValues as in the pseudo-code below:
>> >>
>> >> NumericDocValues nodeIDDocValuesSource =
>> >>     MultiDocValues.getNumericValues(searcher.getIndexReader(),
>> >>         LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
>> >>
>> >> ......
>> >> long nodeId = nodeIDDocValuesSource.get(scoreDoc.doc);
>> >>
>> >> I'm sure I then get a wrong nodeId, which is checked by upper-level logic
>> >> and treated as data corruption.
>> >>
>> >>
>> >> But if I change to memoryDVFormat for the long type field, then
>> >> everything is OK.
>> >>
>> >> Also, for upgrading legacy data, I keep two index formats, DV or stored
>> >> field, controlled by version. If I use stored fields, everything is OK.
>> >> So I guess there is a bug with DiskDocValuesFormat for the numeric data
>> >> type. Does it relate to byte-aligned numeric compression?
>> >> Or am I not using DiskDocValuesFormat correctly? There seem to be no
>> >> other parameters for it.
>> >>
>> >> Sorry that I have no pure Lucene test case yet. I hope someone can shed
>> >> some light on this.
>> >>
>> >>
>> >>
>> >>
>> >> Best regards,
>> >> Duke
>> >> If not now, when? If not me, who?
>> >>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org