Hi Steve,

I'd appreciate knowing your results since I have a similar problem.

Thanks
Kyle



On 10/30/13 8:44 PM, "Stephen GRAY" <stephen.g...@immi.gov.au> wrote:

>UNOFFICIAL
>
>Hi Mike,
>
>Thanks for the helpful response. I'll try them both and see if any
>performance improvement I get from the more complicated method is worth
>the extra complexity.
>
>Thanks,
>Steve
>
>-----Original Message-----
>From: Michael McCandless [mailto:luc...@mikemccandless.com]
>Sent: Wednesday, 30 October 2013 9:57 PM
>To: Lucene Users
>Subject: Re: splitting docIds from a search by segment [SEC=UNOFFICIAL]
>
>You should try MultiDocValues first; it's trivial to use and may not be
>horribly slow.
>
>It must do a binary-search for every docID lookup.
>
>And then if this is too slow, assuming you traverse the docIDs in order,
>you can use IndexReader.leaves() to get the sub-readers.  The docIDs are
>just "appended" from these sub-readers, so you'd walk your docIDs and
>also walk your sub-readers, moving to the next sub-reader once you have a
>docID that's beyond its end.  Each sub-reader spans
>AtomicReaderContext.docBase (inclusive) to docBase +
>AtomicReaderContext.reader().maxDoc() (exclusive).
>
>Mike McCandless
>
>http://blog.mikemccandless.com
>
>On Wed, Oct 30, 2013 at 2:21 AM, Stephen GRAY <stephen.g...@immi.gov.au>
>wrote:
>> UNOFFICIAL
>> Hi everyone,
>>
>> I am trying to write an application that loops through 500,000 -
>>1,000,000 documents returned by a search and calculates some statistics
>>using the value in a stored field. Obviously this needs to be as fast as
>>possible so I am using a NumericDocValues field to store the value.
>>
>> What I don't know is how to get the NumericDocValues value for each
>>docId returned by the search. What I've been told to do in a previous
>>thread was:
>>
>> 1.       Split the docIds according to the segment they belong to
>>
>> 2.       Get a per-segment NumericDocValues instance and use this to
>>extract the values
>>
>> Can someone tell me how to do 1 and 2? I don't know how to discover
>>what segment a given docId is in, or how to convert a segment into a
>>NumericDocValues array.
>>
>> By the way it's also been suggested that I just use
>>MultiDocValues.getNumericValues, but I gather that this will be much
>>slower.
>>
>> I'd appreciate any help,
>>
>> Thanks,
>> Steve
>>
>> UNOFFICIAL
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
>UNOFFICIAL
>
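For anyone comparing the two approaches Mike describes, here is a rough
sketch of the docID arithmetic only. It does not touch Lucene at all: the
docBases and maxDocs arrays are stand-ins for what you would read from
IndexReader.leaves() (docBase and reader().maxDoc() of each
AtomicReaderContext), and the names SegmentWalk, segmentOf and
splitBySegment are made up for illustration. It assumes every segment holds
at least one document, that all docIDs are valid for the index, and (for the
walk) that the docIDs arrive in increasing order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch only: segments are modelled with plain arrays, not Lucene readers.
 * docBases[i] stands in for leaves.get(i).docBase and maxDocs[i] for
 * leaves.get(i).reader().maxDoc() (hypothetical mapping, see lead-in).
 */
final class SegmentWalk {

    /** Per-lookup approach: binary search for the segment owning a docID. */
    static int segmentOf(int docId, int[] docBases) {
        int idx = Arrays.binarySearch(docBases, docId);
        // Exact hit: docId is the first doc of segment idx.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the owning
        // segment is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    /**
     * In-order approach: walk docIDs and segments together, advancing to the
     * next segment once a docID is past the current segment's end.
     * Returns, per segment, the segment-local docIDs (docId - docBase).
     */
    static List<List<Integer>> splitBySegment(int[] sortedDocIds,
                                              int[] docBases,
                                              int[] maxDocs) {
        List<List<Integer>> perSegment = new ArrayList<>();
        for (int i = 0; i < docBases.length; i++) {
            perSegment.add(new ArrayList<Integer>());
        }
        int seg = 0;
        for (int docId : sortedDocIds) {
            // Segment seg spans [docBase, docBase + maxDoc).
            while (docId >= docBases[seg] + maxDocs[seg]) {
                seg++;
            }
            perSegment.get(seg).add(docId - docBases[seg]);
        }
        return perSegment;
    }

    public static void main(String[] args) {
        int[] docBases = {0, 5, 12};   // three segments
        int[] maxDocs  = {5, 7, 3};    // holding 5, 7 and 3 docs
        int[] hits     = {1, 4, 5, 11, 12, 14};

        System.out.println(splitBySegment(hits, docBases, maxDocs));
        System.out.println(segmentOf(11, docBases));
    }
}
```

The segment-local IDs are what matter in practice: with the 4.x API
discussed in this thread, you would fetch one NumericDocValues per
sub-reader and look up each value with the local ID rather than the
top-level one.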


