In a word, no.  You'd need to customize the Lucene source to accomplish this.
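
For reference, here is a minimal sketch of the brute-force sampling loop described in the quoted message below. It is only an illustration: a plain java.util.Iterator stands in for the term enumeration, and the class and method names (TermSampler, sampleEvery) are made up here and are not part of the Lucene API. With Lucene you would presumably drive the same counter from the enumeration returned by reader.terms().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class TermSampler {

    // Returns every interval-th element of the input (the interval-th,
    // the 2*interval-th, and so on). The iterator is an assumed stand-in
    // for a sorted term enumeration; it is NOT a Lucene type.
    static <T> List<T> sampleEvery(Iterator<T> terms, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        List<T> sample = new ArrayList<>();
        long seen = 0;
        while (terms.hasNext()) {
            T term = terms.next();
            seen++;
            // Keep only the terms whose 1-based position is a multiple of interval.
            if (seen % interval == 0) {
                sample.add(term);
            }
        }
        return sample;
    }

    public static void main(String[] args) {
        // Fake, already-sorted "terms" purely for demonstration.
        List<String> terms = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            terms.add(String.format("term%04d", i));
        }
        System.out.println(sampleEvery(terms.iterator(), 128));
    }
}
```

Note that this still visits every term, so it does nothing to avoid reading the whole tis file; it only shows the sampling logic.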

On Wed, Nov 10, 2010 at 1:02 PM, Burton-West, Tom <tburt...@umich.edu> wrote:
> Hello all,
>
> We have an extremely large number of terms in our indexes.  I want to be able 
> to extract a sample of the terms, say something like every 128th term.   If I 
> use code based on org.apache.lucene.misc.HighFreqTerms or 
> org.apache.lucene.index.CheckIndex I would get a TermsEnum, call 
> termEnum.next() 128 times, grab the term and then call next another 128 times.
> termEnum = reader.terms();
> while (termEnum.next())
> { }
>
> Since the tii file contains every 128th (or IndexInterval ) term and it is 
> loaded into memory, is there some programmatic way (in the public API) to 
> read that data structure in memory rather than having to force Lucene to 
> actually read the entire tis file by using termEnum.next() ?
>
>
> Tom Burton-West
> http://www.hathitrust.org/blogs/large-scale-search
>
>
