Our problem is that we need to count hits per sub-category, and there
are over 550,000 categories. I am assuming we can't do this with
BitSets? Is there a good way to get this kind of functionality?
Dennis
Andrzej Bialecki wrote:
Dennis Kubes wrote:
We are running into the same issue. Remember that a hit only gives you
the doc id; getting the hit details does another read. So looping
through the hits to access every document will do one read per
document. With a small number of hits that is no big deal, but the
more hits you access, the more time it takes. In our situation
limiting the query doesn't work, because we need to know information
about the hit itself (i.e. a certain field, so we can do a count based
on that field). We implemented it using HitCollector modifications in
Lucene. This works but is not ideal in terms of speed, so we are
looking at modifying the IndexReader itself so that when it gets the
Hits it also gets our field. Understand, though, that doing something
like this changes core Lucene functionality. I am not necessarily
recommending doing it this way; we just couldn't find another way.
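For illustration only, here is a minimal Java sketch of counting by
field value once per hit without a read per document. It assumes the
field has already been loaded into an in-memory array indexed by doc
id. The class CategoryCountingCollector, the docToCategory array and
the integer category ordinals are all hypothetical and not part of
Lucene; the collect method only mirrors the shape of a HitCollector
callback.

```java
/** Hypothetical per-hit counter; plain Java, not a Lucene class. */
class CategoryCountingCollector {
    // Assumption: docToCategory[docId] holds the category ordinal of that
    // document, or -1 if the document has no category.
    private final int[] docToCategory;
    // counts[ordinal] = number of collected hits in that category.
    private final int[] counts;

    CategoryCountingCollector(int[] docToCategory, int numCategories) {
        this.docToCategory = docToCategory;
        this.counts = new int[numCategories];
    }

    /** Called once per matching doc id; does an array lookup, no disk read. */
    void collect(int doc) {
        int category = docToCategory[doc];
        if (category >= 0) {
            counts[category]++;
        }
    }

    /** Number of collected hits that fall into the given category ordinal. */
    int count(int category) {
        return counts[category];
    }
}
```

The cost of this approach is one int per document plus one int per
category, so it should stay manageable even with several hundred
thousand categories, provided each document maps to a single category.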
Well, it all depends on what kind of details you need to get from each
hit. Have you tried using FieldCache instead? Or pre-populated BitSets,
which you would then intersect with the result BitSet to get counts of
matching docs?
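The BitSet intersection suggested above could look roughly like the
following sketch. It uses only java.util.BitSet; the class
BitSetCategoryCounts and the method countMatches are hypothetical
names, and it assumes one pre-populated BitSet per category with a bit
set for every doc id in that category.

```java
import java.util.BitSet;

/** Hypothetical helper; plain Java, not a Lucene class. */
class BitSetCategoryCounts {
    /**
     * Counts the documents that are both in the category and in the result.
     * Works on a copy so the pre-populated category BitSet is not modified.
     */
    static int countMatches(BitSet categoryDocs, BitSet resultDocs) {
        BitSet both = (BitSet) categoryDocs.clone();
        both.and(resultDocs);       // keep only doc ids present in both sets
        return both.cardinality();  // number of bits still set
    }
}
```

Note the memory cost: at one bit per document per category, a
hypothetical index of 1,000,000 documents needs 125,000 bytes per
category, or roughly 69 GB for 550,000 categories. One BitSet per
category is therefore probably only practical for a small subset of
the categories.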