If you mean the number of documents containing a particular term, that is available from IndexReader.docFreq(Term t); IndexReader.termDocs(Term t) lets you enumerate those documents, along with the term's frequency in each one. For anything more complex than a single term (a phrase or some other query), I think the only way is to actually run a search on the query and take the length of the Hits object returned.

Slightly more efficient, but a bit more work, is to write a HitCollector that uses a BitVector (see org.apache.lucene.util.BitVector) to mark off the documents the searcher finds. Afterwards you can get the count from the bit vector. This skips the sorting that is done in the standard HitCollector. You cannot simply count the number of times collect() is called on your collector, because some queries may select the same document more than once and you would end up double-counting. (Can anyone confirm that this is the case?)
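
A rough sketch of the bit-vector idea, in case it helps. To keep it compilable without Lucene on the classpath, CountingCollector is a hypothetical stand-alone class rather than a real HitCollector subclass, and java.util.BitSet stands in for org.apache.lucene.util.BitVector. The collect(int doc, float score) method is only meant to mirror the shape of the HitCollector callback.

```java
import java.util.BitSet;

// Hypothetical stand-in for a HitCollector subclass. It deliberately does not
// extend any Lucene class, so it compiles on its own.
class CountingCollector {
    // One bit per document number. java.util.BitSet is used here in place of
    // org.apache.lucene.util.BitVector (an assumption made for this sketch).
    private final BitSet seen;

    CountingCollector(int maxDoc) {
        this.seen = new BitSet(maxDoc);
    }

    // Same shape as the collect(int doc, float score) callback. The score is
    // ignored; setting a bit twice is harmless, so repeats are not double-counted.
    void collect(int doc, float score) {
        seen.set(doc);
    }

    // Number of distinct documents marked so far.
    int count() {
        return seen.cardinality();
    }
}
```

In real use I would expect the class to extend HitCollector, be sized from IndexReader.maxDoc(), and be passed to the searcher's search(Query, HitCollector) method, but I have not tested that wiring here. Note that this counts matching documents, not occurrences of the phrase within them.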

Nioche, Julien wrote:

>Hello All,
>
>I'm trying to get a word count information for exact phrases, i-e to know
>how many times a given form occur in the index. Does anyone know how I can
>do this in a clean way?
>
>Does it recquire modifying the score() methods of the different Scorers? Or
>is this information already computed somewhere else?
>
>Thanks a lot for your help
>
>Julien Nioche
>