Re: Indexing documents through a set of (word, weight) pairs

Otis Gospodnetic Tue, 06 Jul 2004 08:45:27 -0700

Hi Igor,

For the first use case, you are really looking for what's called a
forward index, i.e. a mapping from each document to the terms it
contains.  If my memory serves me well, there was a project at MIT
called Haystack that used Lucene, and its author developed code that
worked with Lucene and created forward indices.  I never actually tried
it.
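
As a rough illustration of the forward-index idea only: the sketch below
is plain Java, not Lucene or Haystack code, and the class and method
names (ForwardIndexSketch, add, termsOf) are made up for this example.
It assumes documents are identified by an int and that each term carries
a float weight, as in the question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class, not part of Lucene: maps a document id to its
// (word, weight) pairs, i.e. a minimal in-memory forward index.
class ForwardIndexSketch {
    // docId -> (word -> weight); LinkedHashMap keeps insertion order per document.
    private final Map<Integer, Map<String, Float>> termsByDoc = new HashMap<>();

    // Record one (word, weight) pair for a document; a repeated word overwrites its weight.
    void add(int docId, String word, float weight) {
        termsByDoc.computeIfAbsent(docId, id -> new LinkedHashMap<>()).put(word, weight);
    }

    // "Given a document, what are the terms?" Returns an empty map for unknown documents.
    Map<String, Float> termsOf(int docId) {
        Map<String, Float> terms = termsByDoc.get(docId);
        if (terms == null) {
            return Collections.<String, Float>emptyMap();
        }
        return Collections.unmodifiableMap(terms);
    }

    public static void main(String[] args) {
        ForwardIndexSketch index = new ForwardIndexSketch();
        index.add(1, "lucene", 0.8f);
        index.add(1, "index", 0.5f);
        System.out.println(index.termsOf(1));
    }
}
```

A real setup would persist this structure alongside the Lucene index
rather than hold it in memory; the map is only meant to show the shape
of the lookup.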


If that doesn't do it for you, see this:
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexReader.html#getTermFreqVectors(int)

For the second use case - well, that is precisely what Lucene does. :)
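
A toy version of that term-to-documents lookup, to show the idea only.
Lucene's real inverted index and scoring are far more involved; the
class name, the OR semantics, and the summed-weight score below are all
assumptions made for this sketch, standing in for whatever customized
scoring is actually wanted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class, not part of Lucene: a minimal in-memory inverted index
// over (word, weight) pairs.
class InvertedIndexSketch {
    // word -> (docId -> weight)
    private final Map<String, Map<Integer, Float>> postings = new HashMap<>();

    // Record that docId contains word with the given weight.
    void add(int docId, String word, float weight) {
        postings.computeIfAbsent(word, w -> new HashMap<>()).put(docId, weight);
    }

    // "Given some terms, what are the documents?"
    // Returns docId -> score, where the score is the sum of the weights of the
    // query words found in that document (a document matches if it has any word).
    Map<Integer, Float> search(String... words) {
        Map<Integer, Float> scores = new HashMap<>();
        for (String word : words) {
            Map<Integer, Float> docs = postings.get(word);
            if (docs == null) {
                continue; // word not indexed anywhere
            }
            for (Map.Entry<Integer, Float> e : docs.entrySet()) {
                scores.merge(e.getKey(), e.getValue(), Float::sum);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "lucene", 0.5f);
        index.add(1, "index", 0.25f);
        index.add(2, "index", 0.75f);
        System.out.println(index.search("lucene", "index"));
    }
}
```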

It sounds like you are going in the right direction.  I suggest reading
a few Lucene articles (links on the Lucene Wiki) to get started.

Otis



--- Igor Perisic <[EMAIL PROTECTED]> wrote:
> Hi Lucene experts:
> 
>    We are trying to build a simple document index on top of Lucene. 
> 
>    We have: 
>       Given a document, there is a list of terms (e.g. word and weight
> pairs). 
>       The queries we want to be able to handle are:
>               * Given a document, what are the terms?
>               * Given some terms, what are the documents? 
>               Note here that the above weights are used for our own customized
> scoring.
> 
> We want to use Lucene as much as possible (not wanting to reinvent
> the wheel), what are our options?
> 
> We can reuse some of the classes such as TermInfosWriter/Reader to
> store our lexicon, but is there more stuff in Lucene we can take
> advantage of? Are we going in the right direction?
> 
> 
> 
> Cheers,
> 
>               Igor
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 


