Michael,

I'm doing something similar to autocomplete on a textbox. As a word is
typed, I want a popup to display the number of hits for that particular
word. A stretch goal would be to list similar words as well, ranked by
number of hits. This is a desktop app, so I think I can load the list of
terms at app startup and cache it in memory for quick retrieval.
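
A rough sketch of the caching idea, written in Java rather than C# purely
for illustration. It assumes the term/count pairs have already been read
from the index once at startup; the class name TermHitCache and its methods
are hypothetical and not part of Lucene. A sorted map makes the prefix
lookup a simple range scan:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory cache of term -> hit count, filled once at startup.
class TermHitCache {
    // A sorted map keeps all terms sharing a prefix in one contiguous range.
    private final TreeMap<String, Integer> hits = new TreeMap<>();

    // Record the hit count for a term (e.g. while walking the index terms).
    void put(String term, int count) {
        hits.put(term, count);
    }

    // Hit count for an exact term, or 0 if the term is not in the cache.
    int hitsFor(String term) {
        return hits.getOrDefault(term, 0);
    }

    // Up to `limit` terms starting with `prefix`, highest hit count first.
    // Ties keep alphabetical order because the sort is stable.
    List<String> suggest(String prefix, int limit) {
        List<String> out = new ArrayList<>();
        if (prefix.isEmpty()) {
            return out; // nothing typed yet, so nothing to suggest
        }
        // Range [prefix, prefix + U+FFFF) covers the terms beginning with
        // prefix, assuming no term contains U+FFFF right after the prefix.
        SortedMap<String, Integer> range =
                hits.subMap(prefix, prefix + Character.MAX_VALUE);
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<>(range.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        for (Map.Entry<String, Integer> e : entries) {
            if (out.size() == limit) {
                break;
            }
            out.add(e.getKey());
        }
        return out;
    }
}
```

With something like this, the popup would call hitsFor on the typed word and
suggest on the typed prefix. Whether the full term list fits comfortably in
memory depends on the size of the index.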

Andy

On Wed, Sep 2, 2009 at 10:38 AM, Michael Garski <[email protected]> wrote:

> Andy,
>
> Enumerating over all of the terms in an index to retrieve the number of
> instances of each is not going to be a fast operation.  What is it that
> you are trying to accomplish with that data?
>
> Michael
>
> -----Original Message-----
> From: Andrew Schuler [mailto:[email protected]]
> Sent: Wednesday, September 02, 2009 7:34 AM
> To: [email protected]
> Subject: Re: enumerating all terms in index
>
> Michael,
>
> I was looking for all the terms in the index and the number of instances
> of each. I ended up using IndexReader.Terms and TermEnum, but from some
> of the discussions I saw in my Google search it seemed like that might
> not be the best (fastest) way to accomplish this. Is this still the
> accepted best practice?
>
>
> On Mon, Aug 31, 2009 at 11:38 AM, Michael Garski
> <[email protected]> wrote:
>
> > Andy,
> >
> > Are you looking for the number of documents that contain a term, or the
> > total number of term instances?
> >
> > To enumerate over all of the terms in an index, use IndexReader.Terms to
> > get a TermEnum to walk through the terms.  From there you can use
> > IndexReader.DocFreq to get the number of documents that contain a term.
> > To find the total number of occurrences of a term, use
> > IndexReader.TermDocs to retrieve the frequency of the term within each
> > document, and sum those frequencies.
> >
> > Hope that gets you in the right direction.
> >
> > Michael
> >
> > -----Original Message-----
> > From: Andrew Schuler [mailto:[email protected]]
> > Sent: Friday, August 28, 2009 6:38 PM
> > To: [email protected]
> > Subject: enumerating all terms in index
> >
> > This seems pretty straightforward but Google is failing me today.
> > What is the generally accepted best (fastest) way to enumerate all the
> > terms in an index with the number of times they occur? TIA.
> >
> > -andy
> >
> >
>
>
