I guess it performs alright :P

Overall Elapsed: 00:00:00.0290029 (29ms)

Lars-Erik

-----Original Message-----
From: Lars-Erik Aabech [mailto:l...@markedspartner.no]
Sent: 22 February 2013 12:09
To: java-user@lucene.apache.org
Subject: RE: Possible to "quickly" fetch count of other terms based on a query

Thanks. ANDing was what I meant by "combined" queries. I think I'll go with that one for now and see how it performs. Not too many docs/terms in the index. (~1500/30)

Bit sets sound appealing, but I've got no idea how to go about it. :) In "Lucene in Action", I only find a short mention of DocIdBitSet. Any hints?

Lars-Erik

-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: 22 February 2013 11:27
To: java-user@lucene.apache.org
Subject: Re: Possible to "quickly" fetch count of other terms based on a query

For terms that are in your query, you could use the Scorer.getChildScorers API up front to hold onto each Scorer and then in a custom collector check if that Scorer matched this particular hit.

For terms that are not in your query:

You could use term vectors and count up the terms yourself as you go (in a custom collector), but that'd be insanely slow.

You could create a bit set of all matching docs, and then a bit set for each of the terms of interest, and intersect them and count the set bits.

You could pull the DocsEnum for each term of interest up front, and then in a custom collector call .advance on each, for each collected docID, and increment counts if that term matches that doc.

Or you could just do a separate query for each of the terms of interest AND'd with your original query.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Feb 22, 2013 at 4:14 AM, Lars-Erik Aabech <l...@markedspartner.no> wrote:
> Hi!
>
> I'm sorry I didn't do any hard research on this, it's so quick to ask. ;)
>
> Is it possible to somehow find the count of each term in a set for each
> document returned by a query?
>
> For instance, if I use the query +(foo:bar foo:morebar) +(bar:foo),
> could I, without fetching all the documents from this query, find the count of
> occurrences of the terms [barette, fooish, bar, morebar, foo]?
> The result I'm after is something like
> barette: 10,
> fooish: 0,
> bar: 5,
> morebar: 8
> foo: 3
>
> Hope the question is clear enough.
> Any suggestion is welcome.
> I'd prefer not having to build a second index, though..
>
> (I guess I could do a new "combined" query for each term in the set,
> but if there's any other way, it'd be nice)
>
> Best regards,
> Lars-Erik Aabech
> Technical Lead, Development
> MarkedsPartner AS
> Mobile: +47 920 30 537
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
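
On the bit set suggestion in Mike's reply (one bit set for all docs matching the query, one per term of interest, intersect and count the set bits), here is a minimal sketch of the idea. It is only an illustration under stated assumptions: it uses plain java.util.BitSet rather than any Lucene class, the doc IDs are made up, and the names TermCountSketch, bits and countPerTerm are hypothetical. In a real index the bit sets would have to be filled from the query hits and from each term's postings. Note that this counts matching documents per term, not occurrences within a document.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class TermCountSketch {

    /** Hypothetical helper: builds a bit set with the given doc IDs set. */
    public static BitSet bits(int... docIds) {
        BitSet b = new BitSet();
        for (int id : docIds) {
            b.set(id);
        }
        return b;
    }

    /**
     * For each term, counts how many docs are both a query hit and
     * contain the term: intersect the two bit sets, then count set bits.
     */
    public static Map<String, Integer> countPerTerm(BitSet queryHits,
                                                    Map<String, BitSet> termDocs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : termDocs.entrySet()) {
            // Clone first: BitSet.and() modifies the receiver in place.
            BitSet both = (BitSet) queryHits.clone();
            both.and(e.getValue());
            counts.put(e.getKey(), both.cardinality());
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up data: docs 1, 2, 4 and 7 match the original query.
        BitSet hits = bits(1, 2, 4, 7);

        // Made-up postings: which docs contain each term of interest.
        Map<String, BitSet> termDocs = new LinkedHashMap<>();
        termDocs.put("barette", bits(1, 4, 9));
        termDocs.put("fooish", bits(3, 5));
        termDocs.put("bar", bits(1, 2, 4, 7, 8));

        // Prints {barette=2, fooish=0, bar=4}
        System.out.println(countPerTerm(hits, termDocs));
    }
}
```

With ~1500 docs and ~30 terms, each intersection touches only a few dozen machine words, so this should stay cheap; whether it beats one ANDed query per term on a real index would need measuring.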