Ahh! You are right, we did expose this before 4.0. But yes it has the same requirement -- it only works on a SegmentReader.
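To make the per-segment limitation concrete, here is a minimal sketch in plain Java with no Lucene calls. It assumes each segment's terms are already available as a sorted list of strings. The quoted messages below note that a count across segments needs a merge, because the same term can occur in several segments. The class `UniqueTermCounter` and its methods are hypothetical names used only for illustration; they are not part of Lucene.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration only; not a Lucene class. */
final class UniqueTermCounter {

    /** One cursor per segment: the current term plus the rest of that segment's sorted terms. */
    private static final class Cursor implements Comparable<Cursor> {
        String current;
        final Iterator<String> rest;

        Cursor(Iterator<String> terms) {
            this.rest = terms;
            this.current = terms.next();
        }

        /** Moves to the next term; returns false once the segment is exhausted. */
        boolean advance() {
            if (!rest.hasNext()) {
                return false;
            }
            current = rest.next();
            return true;
        }

        @Override
        public int compareTo(Cursor other) {
            return current.compareTo(other.current);
        }
    }

    /** Counts distinct terms across all segments; each inner list must be sorted. */
    static long countUniqueTerms(List<List<String>> segments) {
        PriorityQueue<Cursor> queue = new PriorityQueue<>();
        for (List<String> segment : segments) {
            if (!segment.isEmpty()) {
                queue.add(new Cursor(segment.iterator()));
            }
        }
        long unique = 0;
        String last = null;
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();           // smallest term still pending
            if (!top.current.equals(last)) {     // first time this term is seen
                unique++;
                last = top.current;
            }
            if (top.advance()) {
                queue.add(top);                  // re-queue at its next term
            }
        }
        return unique;
    }

    /** Naive total: adds up per-segment counts, so shared terms are counted repeatedly. */
    static long sumOfSegmentCounts(List<List<String>> segments) {
        long sum = 0;
        for (List<String> segment : segments) {
            sum += segment.size();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("apple", "lucene", "term"),
                Arrays.asList("index", "lucene"),
                Arrays.asList("apple", "segment", "term"));
        System.out.println("sum of per-segment counts: " + sumOfSegmentCounts(segments));
        System.out.println("unique terms after merge: " + countUniqueTerms(segments));
    }
}
```

For the three toy segments in `main`, the per-segment counts add up to 8, while the merge finds 5 distinct terms. That gap is why summing a per-segment count does not give the index-wide number.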
Mike

On Sat, Oct 16, 2010 at 5:52 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> Hi Mike,
>
> As far as I know, 3.0 also has this method:
> http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/index/IndexReader.html#getUniqueTermCount()
>
> But it only works at the segment level, too! So you have to use
> getSequentialSubReaders/ReaderUtil.gatherSubReaders() and do it per segment.
> But to get the unique count for the whole index, there is no way around
> iterating every term, as duplicates must be removed (which TermEnum does).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -----Original Message-----
>> From: Michael McCandless [mailto:luc...@mikemccandless.com]
>> Sent: Saturday, October 16, 2010 11:17 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: API that return the amount of terms indexed
>>
>> 4.0 will have an API to get the number of unique terms for a given field,
>> or across all fields, but only at the segment level. (Getting the count
>> across segments requires a merge sort.)
>>
>> 3.x and before don't have such an API, though the information is tracked
>> under the hood. If you open the _X.tis file, skip the first int, then call
>> readLong(), that should be the number of unique terms in that segment.
>>
>> You can always simply fall back to getting the term enum and stepping
>> through it, counting how many .next() calls there are until exhaustion...
>>
>> Mike
>>
>> On Fri, Oct 15, 2010 at 7:51 PM, APOLO_11 <barhen....@gmail.com> wrote:
>> >
>> > hey - is there an API that returns the number of terms indexed?
>> >
>> > I found the API that returns the number of documents indexed
>> > (IndexWriter.docCount) but can't find an API for the number of terms in
>> > the index.
>> >
>> > any idea?
>> >
>> > thanks,d.
>> > --
>> > View this message in context:
>> > http://lucene.472066.n3.nabble.com/API-that-return-the-amount-of-terms-indexed-tp1712290p1712290.html
>> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org