RE: Possible to "quickly" fetch count of other terms based on a query

Lars-Erik Aabech Fri, 22 Feb 2013 03:24:05 -0800

Thanks again. I'll look into this at a later time. :)
(Have to read the entire book too..)

-----Original Message-----
From: Michael McCandless [mailto:[email protected]] 
Sent: 22. februar 2013 12:21
To: [email protected]
Subject: Re: Possible to "quickly" fetch count of other terms based on a query

On Fri, Feb 22, 2013 at 6:08 AM, Lars-Erik Aabech <[email protected]> 
wrote:
> Thanks.
>
> ANDing was what I meant with "combined" queries.
> I think I'll go with that one for now and see how it performs. Not too 
> many docs/terms in the index. (~1500/30)
>
> Bit sets sound appealing, but I've got no idea how to go about it. :) 
> In "Lucene in Action", I only find a short mention of DocIdBitSet.
> Any hints?

You can just create a FixedBitSet of size maxDoc(), and then call
.or(DocsEnum) which you got for each term, to get the bitset for each term.

For a Query, it's a bit trickier: you need to pull its Weight, and then pull a 
Scorer from that, and then create a FixedBitSet and call
.or(Scorer) to set all bits.

Then you can .and these bitsets together and call .cardinality to get total 
bits set.
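
As a rough illustration of the and/cardinality step only, here is a minimal
sketch that uses java.util.BitSet as a stand-in for Lucene's FixedBitSet, so
it runs without Lucene on the classpath. The class name TermOverlap, the
helper bitsFor, and the hard-coded doc ids are hypothetical; in real code the
bits would come from .or(DocsEnum) or .or(Scorer) as described above.

```java
import java.util.BitSet;

// Hypothetical stand-in: java.util.BitSet instead of Lucene's FixedBitSet.
final class TermOverlap {

    /** Builds a bit set sized for maxDoc docs; bit i is set if doc i matches. */
    static BitSet bitsFor(int maxDoc, int... matchingDocIds) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : matchingDocIds) {
            bits.set(docId);
        }
        return bits;
    }

    /** Counts the docs that match both the query and the term. */
    static int countBoth(BitSet queryBits, BitSet termBits) {
        // and() mutates its receiver, so work on a copy and keep the
        // query bits reusable for the next term.
        BitSet both = (BitSet) queryBits.clone();
        both.and(termBits);
        return both.cardinality();
    }
}
```

With one bit set for the query and one per term, calling countBoth once per
term gives the per-term counts; the query bits are left untouched between
calls.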

To get best perf, you should do this per-segment (i.e., iterate over IR.leaves(), 
and do the code above per-segment), but for easiest-to-write code, you can 
operate on the top-level reader by wrapping your IR in 
SlowCompositeReaderWrapper.
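
The per-segment variant can be pictured as follows. This is only a sketch
under assumptions: java.util.BitSet again stands in for FixedBitSet, each
array element plays the role of one segment with its own doc id space, and
the names PerSegmentCount, of, and countBoth are hypothetical. It shows why
working per segment still yields the total: the per-segment counts simply
add up.

```java
import java.util.BitSet;

// Hypothetical stand-in: one java.util.BitSet per segment, where each
// segment numbers its docs from 0 independently of the others.
final class PerSegmentCount {

    /** Convenience factory: a bit set with the given segment-local doc ids set. */
    static BitSet of(int... docIds) {
        BitSet bits = new BitSet();
        for (int docId : docIds) {
            bits.set(docId);
        }
        return bits;
    }

    /** Sums, over all segments, the docs matching both the query and the term. */
    static int countBoth(BitSet[] queryBitsPerSegment, BitSet[] termBitsPerSegment) {
        if (queryBitsPerSegment.length != termBitsPerSegment.length) {
            throw new IllegalArgumentException("segment counts differ");
        }
        int total = 0;
        for (int seg = 0; seg < queryBitsPerSegment.length; seg++) {
            // Intersect within one segment only; bits from different
            // segments are never compared with each other.
            BitSet both = (BitSet) queryBitsPerSegment[seg].clone();
            both.and(termBitsPerSegment[seg]);
            total += both.cardinality();
        }
        return total;
    }
}
```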

Mike McCandless

http://blog.mikemccandless.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
