In a word, no. You'd need to customize the Lucene source to accomplish this.
On Wed, Nov 10, 2010 at 1:02 PM, Burton-West, Tom <tburt...@umich.edu> wrote: > Hello all, > > We have an extremely large number of terms in our indexes. I want to be able > to extract a sample of the terms, say something like every 128th term. If I > use code based on org.apache.lucene.misc.HighFreqTerms or > org.apache.lucene.index.CheckIndex I would get a TermsEnum, call > termEnum.next() 128 times, grab the term and then call next another 128 times. > termEnum = reader.terms(); > while (termEnum.next() > { } > > Since the tii file contains every 128th (or IndexInterval ) term and it is > loaded into memory, is there some programmatic way (in the public API) to > read that data structure in memory rather than having to force Lucene to > actually read the entire tis file by using termEnum.next() ? > > > Tom Burton-West > http://www.hathitrust.org/blogs/large-scale-search > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org