Hi Steve, I'd appreciate knowing your results since I have a similar problem.
Thanks
Kyle

On 10/30/13 8:44 PM, "Stephen GRAY" <stephen.g...@immi.gov.au> wrote:

>UNOFFICIAL
>
>Hi Mike,
>
>Thanks for the helpful response. I'll try them both and see if any
>performance improvement I get from the more complicated method is worth
>the extra complexity.
>
>Thanks,
>Steve
>
>-----Original Message-----
>From: Michael McCandless [mailto:luc...@mikemccandless.com]
>Sent: Wednesday, 30 October 2013 9:57 PM
>To: Lucene Users
>Subject: Re: splitting docIds from a search by segment [SEC=UNOFFICIAL]
>
>You should try MultiDocValues first; it's trivial to use and may not be
>horribly slow.
>
>It must do a binary-search for every docID lookup.
>
>And then if this is too slow, assuming you traverse the docIDs in order,
>you can use IndexReader.leaves() to get the sub-readers. The docIDs are
>just "appended" from these sub-readers, so you'd walk your docIDs and
>also walk your sub-readers, moving to the next sub-reader once you have a
>docID that's beyond its end. Each sub-reader spans
>AtomicReaderContext.docBase to docBase +
>AtomicReaderContext.reader().maxDoc().
>
>Mike McCandless
>
>http://blog.mikemccandless.com
>
>On Wed, Oct 30, 2013 at 2:21 AM, Stephen GRAY <stephen.g...@immi.gov.au>
>wrote:
>> UNOFFICIAL
>> Hi everyone,
>>
>> I am trying to write an application that loops through 500,000 -
>> 1,000,000 documents returned by a search and calculates some statistics
>> using the value in a stored field. Obviously this needs to be as fast as
>> possible so I am using a NumericDocValues field to store the value.
>>
>> What I don't know is how to get the NumericDocValues value for each
>> docId returned by the search. What I've been told to do in a previous
>> thread was:
>>
>> 1. Split the docIds according to the segment they belong to
>>
>> 2. Get a per-segment NumericDocValues instance and use this to
>>    extract the values
>>
>> Can someone tell me how to do 1 and 2? I don't know how to discover
>> what segment a given docId is in, or how to convert a segment into a
>> NumericDocValues array.
>>
>> By the way it's also been suggested that I just use
>> MultiDocValues.getNumericValues, but I gather that this will be much
>> slower.
>>
>> I'd appreciate any help,
>>
>> Thanks,
>> Steve

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
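
For anyone following the thread, here is a rough sketch of the two approaches Mike describes: the in-order walk over docIDs and sub-readers, and the per-lookup binary search. It is plain Java with no Lucene on the classpath, so everything in it is a stand-in of my own: SegmentWalk, Leaf, buildLeaves, sumInOrder and leafIndexFor are hypothetical names, and the long[] in each Leaf only plays the role of a per-segment NumericDocValues. In real code the docBase and maxDoc() would come from the contexts returned by IndexReader.leaves().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the thread's idea; none of these names are Lucene API.
final class SegmentWalk {

    /** Stand-in for one sub-reader: its docBase plus its per-segment values. */
    static final class Leaf {
        final int docBase;   // global docID of this segment's first document
        final long[] values; // values[localDocId]; stands in for NumericDocValues

        Leaf(int docBase, long[] values) {
            this.docBase = docBase;
            this.values = values;
        }

        int maxDoc() {
            return values.length;
        }
    }

    /** Builds leaves whose docIDs are "appended" one segment after another. */
    static List<Leaf> buildLeaves(long[]... perSegmentValues) {
        List<Leaf> leaves = new ArrayList<>();
        int docBase = 0;
        for (long[] values : perSegmentValues) {
            leaves.add(new Leaf(docBase, values));
            docBase += values.length;
        }
        return leaves;
    }

    /**
     * In-order walk: assumes sortedDocIds is ascending and every docID is
     * within the index. Moves to the next leaf once a docID is past its end.
     */
    static long sumInOrder(List<Leaf> leaves, int[] sortedDocIds) {
        long sum = 0;
        int leafIdx = 0;
        for (int docId : sortedDocIds) {
            while (docId >= leaves.get(leafIdx).docBase + leaves.get(leafIdx).maxDoc()) {
                leafIdx++;
            }
            Leaf leaf = leaves.get(leafIdx);
            sum += leaf.values[docId - leaf.docBase]; // global docID -> local docID
        }
        return sum;
    }

    /** Binary search for the last leaf whose docBase is <= docId (one search per lookup). */
    static int leafIndexFor(List<Leaf> leaves, int docId) {
        int lo = 0;
        int hi = leaves.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (leaves.get(mid).docBase <= docId) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Two segments: docIDs 0..2 and 3..4.
        List<Leaf> leaves = buildLeaves(new long[] {10, 20, 30}, new long[] {40, 50});
        System.out.println(sumInOrder(leaves, new int[] {1, 3, 4})); // prints 110
        System.out.println(leafIndexFor(leaves, 3));                 // prints 1
    }
}
```

The walk only ever moves forward through the leaves, so it costs one pass over the docIDs plus one pass over the segments, whereas the binary search is repeated for every docID. Whether that difference matters for 500,000 - 1,000,000 hits is something only a measurement on the real index can tell.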