The bitset thing is just an example of a trivial operation in a HitCollector. You'll want to do something like use TermDocs/TermEnum to look up which category each document is in and increment a per-category count, rather than just adding the document to a bitset. Or see the idea at the end of this mail.
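As a rough illustration of counting inside the collect callback, here is a minimal plain-Java sketch. It deliberately does not call Lucene: CategoryCountingCollector only mimics the collect(docnum, score) shape, and the categoryOfDoc table is a hypothetical docnum-to-category lookup that is assumed to have been filled beforehand (for example by walking TermEnum/TermDocs over the category field, which is not shown).

```java
import java.util.Map;
import java.util.TreeMap;

// Stand-in for a collector callback. This is NOT Lucene's HitCollector class,
// only the same collect(docnum, score) shape so the sketch runs on its own.
class CategoryCountingCollector {
    // Hypothetical docnum -> category table, assumed to be built in advance.
    private final String[] categoryOfDoc;
    private final Map<String, Integer> counts = new TreeMap<>();

    CategoryCountingCollector(String[] categoryOfDoc) {
        this.categoryOfDoc = categoryOfDoc;
    }

    // Called once per matching document: bump the count for its category.
    public void collect(int docnum, float score) {
        counts.merge(categoryOfDoc[docnum], 1, Integer::sum);
    }

    Map<String, Integer> counts() {
        return counts;
    }

    public static void main(String[] args) {
        String[] categoryOfDoc = {"books", "music", "books", "films"};
        CategoryCountingCollector c = new CategoryCountingCollector(categoryOfDoc);
        // Pretend the search matched documents 0, 2 and 3.
        c.collect(0, 1.0f);
        c.collect(2, 0.5f);
        c.collect(3, 0.2f);
        System.out.println(c.counts()); // prints {books=2, films=1}
    }
}
```

The point of the array lookup is that nothing in collect loads a stored document, so the per-hit cost stays small.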
That said, I wouldn't do any sorting here. Just use a Sort object in your original search. You can even do primary and secondary sorts on different fields, say category first and relevance second.

Also, look at TopDocs. Hits objects are optimized for getting the top 100 or so documents. Every time you cross a boundary, the query is re-executed for the next chunk. So, for instance, you'd re-execute the query when you asked for doc 101, 201, 301, 401 etc. (although I think there's been some work on the chunk size recently). So, if you never expect very many documents, or if your searches are *very* cheap, go ahead and use a Hits object. Otherwise you'll have to do some extra work for efficiency if speed issues arise.

Be aware that loading the document in a HitCollector is expensive, but you can do the lazy loading trick and/or just go directly to your indexed Category data via TermDocs/TermEnum.

And here's a way to just use the bitset thing, depending on your situation. You could create a filter (or a bitset) for each category over *all* your documents and cache them, i.e. a b1 for category1, a b2 for category2, etc. You could build these using the TermDocs/TermEnum classes. Then, to count how many of your current search hits are in each category, AND the bitset from the HitCollector you included in your e-mail with each of your category bitsets and ask for the cardinality of the result.

Best
Erick

On Jan 22, 2008 12:37 PM, Cam Bazz <[EMAIL PROTECTED]> wrote:

> Hello,
>
> Could someone show me a concrete example of how to use HitCollector?
> I have documents which have a field category. When I run a query, I need to
> sort results by category as well as count how many hits there are for a
> given category.
>
> I understand:
>
> searcher.search(query, new HitCollector() {
>     public void collect(int docnum, float score) {
>         bitSet.set(docnum);
>     }
> });
>
> So we now have a bitset that contains docnums.
>
> How do we do sorting and filtering over this, and why is it more efficient
> to do it from hits?
>
> Best Regards,
> -C.B.
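
The cached-bitset idea from the reply above could look roughly like the following. This is only a sketch using java.util.BitSet, with no Lucene calls: the class CategoryCounter, the method countByCategory, and the category names are made up for illustration, and the per-category bitsets are assumed to have been built once over the whole index and cached.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch: all names here are hypothetical, not part of any library.
class CategoryCounter {

    // Counts how many hits fall into each category by ANDing the hit bitset
    // with each cached category bitset and taking the cardinality.
    static Map<String, Integer> countByCategory(BitSet hits, Map<String, BitSet> categoryBits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : categoryBits.entrySet()) {
            BitSet intersection = (BitSet) hits.clone(); // copy so 'hits' is not modified
            intersection.and(e.getValue());
            counts.put(e.getKey(), intersection.cardinality());
        }
        return counts;
    }

    public static void main(String[] args) {
        // Cached bitsets: one per category, covering all documents in the index.
        BitSet b1 = new BitSet();
        b1.set(0); b1.set(2); b1.set(4);
        BitSet b2 = new BitSet();
        b2.set(1); b2.set(3); b2.set(5);
        Map<String, BitSet> cached = new LinkedHashMap<>();
        cached.put("category1", b1);
        cached.put("category2", b2);

        // Bitset filled by the collector for the current search: docs 0, 1, 2.
        BitSet hits = new BitSet();
        hits.set(0); hits.set(1); hits.set(2);

        System.out.println(countByCategory(hits, cached)); // prints {category1=2, category2=1}
    }
}
```

Cloning the hit bitset before each AND matters because BitSet.and modifies its receiver in place; without the copy, the second category would be intersected with an already-reduced set.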