It's perfectly fine, and recommended, to reuse a thread across
different queries (i.e., use a thread pool in your app, above
Lucene).

The ThreadLocals used in SegmentCoreReaders should not interfere or
cause problems with that: they can easily be re-used across queries.
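
To illustrate the general pattern, here is a minimal plain-JDK sketch,
not Lucene code. Cursor, VALUES and runQueries are made-up names, and
the sketch assumes that every lookup positions the per-thread object
before reading, so no state leaks from one query to the next. Under
that assumption a pooled thread can serve any number of queries with
the same ThreadLocal instance:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalReuseDemo {

    // Shared, read-only data standing in for stored per-document values (hypothetical).
    static final long[] VALUES = new long[1000];
    static {
        for (int i = 0; i < VALUES.length; i++) {
            VALUES[i] = i * 31L;
        }
    }

    // Counts how many per-thread cursors are created in total.
    static final AtomicInteger CURSORS_CREATED = new AtomicInteger();

    // Hypothetical stateful reader: it has a mutable position, so a single
    // instance must not be shared between threads.
    static final class Cursor {
        private int position;

        Cursor() {
            CURSORS_CREATED.incrementAndGet();
        }

        long get(int docId) {
            position = docId; // every call repositions before reading
            return VALUES[position];
        }
    }

    // One Cursor per thread, created lazily on first use by that thread.
    static final ThreadLocal<Cursor> CURSOR_LOCAL = ThreadLocal.withInitial(Cursor::new);

    /** Runs queryCount lookups on poolSize threads; returns {wrongValues, cursorsCreated}. */
    static int[] runQueries(int poolSize, int queryCount) {
        CURSORS_CREATED.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int q = 0; q < queryCount; q++) {
                final int docId = (q * 37) % VALUES.length;
                // Each "query" reuses whatever Cursor its pool thread already holds.
                results.add(pool.submit(() -> CURSOR_LOCAL.get().get(docId) == VALUES[docId]));
            }
            int wrong = 0;
            for (Future<Boolean> f : results) {
                if (!f.get()) {
                    wrong++;
                }
            }
            return new int[] { wrong, CURSORS_CREATED.get() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = runQueries(2, 100);
        System.out.println("wrong values: " + r[0] + ", cursors created: " + r[1]);
    }
}
```

The number of cursors created is bounded by the pool size rather than
by the number of queries, which is the point of caching per thread.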

Maybe you can boil down the issue you are seeing into a small test case?

Mike McCandless

http://blog.mikemccandless.com


On Mon, Oct 21, 2013 at 10:35 AM, Duke DAI <duke.dai....@gmail.com> wrote:
> Hi Mike,
>
> My scenario: a query thread from a ThreadPool is used to execute each query,
> so threads have to be reused to handle various queries. Since
> SegmentCoreReaders uses a ThreadLocal to hold a per-thread instance, I think
> some private state must belong to the given thread (file offset? I didn't
> find any other thread-dependent state); otherwise an object-level instance
> would be enough.
> A ThreadPool is a very common way to handle heavy query load. Does the
> ThreadLocal mechanism support thread reuse across different queries? The
> alternatives are creating a thread per query, which is expensive, or
> cleaning up the ThreadLocal from outside, which is complicated.
> My test shows NumericDocValues returning a wrong value. It is certainly a
> long value, and the upper-layer logic can verify whether it is valid or not.
>
> As I described in my earlier mail, in Lucene 4.4 Lucene42DocValuesFormat
> (in-memory) has no problem, while DiskDocValuesFormat (on-disk) does. Now in
> Lucene 4.5, MemoryDocValuesFormat (in-memory) has no problem, but
> Lucene45DocValuesFormat (on-disk) does. Coincidence? My test is far more
> complex than I described: two ThreadPools, one used to handle the main query
> and one used to query sub-collections in parallel, with a proper
> RejectedExecutionHandler (currently, if one sub-query is rejected, all
> sub-queries are cancelled and failed).
>
> To put it simply, what is the private state of a per-thread NumericDocValues
> instance? Can that private state be reused across different queries?
>
>
> Best regards,
> Duke
> If not now, when? If not me, who?
>
>
> On Mon, Oct 21, 2013 at 7:26 PM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Can you describe what problem you are actually hitting?
>>
>> The purpose of docValuesLocal is to hold the per-Thread instance of
>> each doc values, and re-use it when that thread comes back again
>> asking for the same doc values.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Mon, Oct 21, 2013 at 6:28 AM, Duke DAI <duke.dai....@gmail.com> wrote:
>> > Hi guys,
>> >
>> > It seems I have the same problem with Lucene45DocValuesFormat, but no
>> > problem with MemoryDocValuesFormat. The problem I encountered with
>> > Lucene 4.4 is with DiskDocValuesFormat, not with Lucene42DocValuesFormat.
>> >
>> > I dug into it a little and found the superficial cause. In
>> > SegmentCoreReaders there is a ThreadLocal variable, docValuesLocal. Its
>> > purpose is to avoid building data structures repeatedly per query thread.
>> > But what if the query thread comes from a thread pool and is reused for
>> > different queries?
>> > I removed docValuesLocal and built a lucene-core.jar, and it works with my
>> > multi-threaded (thread pool) test cases.
>> >
>> > Do you have any idea about this? Is this enough information?
>> >
>> >
>> > Thanks,
>> > Duke
>> >
>> >
>> > Best regards,
>> > Duke
>> > If not now, when? If not me, who?
>> >
>> >
>> > On Tue, Aug 13, 2013 at 4:54 PM, Duke DAI <duke.dai....@gmail.com> wrote:
>> >
>> >> Hi experts,
>> >>
>> >> I'm upgrading to Lucene 4.4 and trying to use DocValues instead of stored
>> >> fields for performance reasons. Because the size of the index is unknown
>> >> (it depends on the customer), I will use DiskDocValuesFormat, especially
>> >> for some binary fields. So I wrote my customized Codec:
>> >>
>> >>       final Codec codec = new Lucene42Codec() {
>> >>
>> >>         private final Lucene42DocValuesFormat memoryDVFormat =
>> >>             new Lucene42DocValuesFormat();
>> >>         private final DiskDocValuesFormat diskDVFormat =
>> >>             new DiskDocValuesFormat();
>> >>
>> >>         @Override
>> >>         public DocValuesFormat getDocValuesFormatForField(String field) {
>> >>           if (LucenePluginConstants.INDEX_STORED_RETURNABLE_FIELD.equals(field)
>> >>               || LucenePluginConstants.PAYLOAD_FIELD_NAME.equals(field)
>> >>               || LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE.equals(field)) {
>> >>             return diskDVFormat;
>> >>           } else {
>> >>             return memoryDVFormat;
>> >>           }
>> >>         }
>> >>       };
>> >>       iwc.setCodec(codec);
>> >>
>> >> Here the field LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE is a numeric
>> >> field of long type. The others are binary.
>> >>
>> >> Then I consume the DocValues as in the pseudo-code below:
>> >>     nodeIDDocValuesSource =
>> >>             MultiDocValues.getNumericValues(searcher.getIndexReader(),
>> >>                 LucenePluginConstants.INDEX_NODE_ID_DOC_VALUE);
>> >>
>> >>    ......
>> >>    long nodeId = nodeIDDocValuesSource.get(scoreDoc.doc);
>> >>
>> >> Then I am sure I get a wrong nodeId, which is verified by the upper-layer
>> >> logic and treated as data corruption.
>> >>
>> >>
>> >> But if I change to memoryDVFormat for the long-type field, everything
>> >> is OK.
>> >>
>> >> Also, for upgrading legacy data, I keep two index formats, DV or stored
>> >> field, controlled by version. If I use stored fields, everything is OK.
>> >> So I guess there is a bug in DiskDocValuesFormat with the numeric data
>> >> type. Is it related to byte-aligned numeric compression?
>> >> Or am I not using DiskDocValuesFormat correctly? It seems to have no other
>> >> parameters.
>> >>
>> >> Sorry that I have no pure Lucene test case yet. I hope someone can shed
>> >> some light on this.
>> >>
>> >>
>> >>
>> >>
>> >> Best regards,
>> >> Duke
>> >> If not now, when? If not me, who?
>> >>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
