Well, the plan is to take the term weights (assume they are in a matrix) and then use 
an adaptive learning system to transform them into new weights, in such a way 
that the index formed from them is optimized. It's just a test to see whether this 
hypothesis holds.


--- On Thu, 4/9/09, Grant Ingersoll <[email protected]> wrote:

From: Grant Ingersoll <[email protected]>
Subject: Re: Vector space implementation
To: [email protected]
Date: Thursday, April 9, 2009, 6:29 PM

Assuming you want to handle the vectors yourself, as opposed to relying on the 
fact that Lucene itself implements the VSM, you should index your documents 
with TermVector.YES.  That will give you the term frequencies on a per-document 
basis, but you will have to use the TermEnum to get the doc freq.  All in all, this is 
not going to be very efficient for you, but you should be able to build up a 
matrix from it.
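A minimal sketch in plain Java of the assembly step described above, assuming the per-document term frequencies have already been read out of the index (via TermVector.YES and IndexReader.getTermFreqVector) and the document frequencies via the TermEnum / IndexReader.docFreq. The class name, array layout, and the raw tf * log(N/df) weighting here are illustrative choices, not Lucene's own Similarity formula (which uses sqrt(tf) and a smoothed idf):

```java
import java.util.Arrays;

public class TfIdfMatrix {

    /**
     * Build a tf-idf weight matrix from raw term counts.
     *
     * @param termFreqs termFreqs[doc][term] = raw frequency of term in doc
     *                  (with Lucene, from TermFreqVector.getTermFrequencies())
     * @param numDocs   total number of documents in the index
     * @return weights[doc][term] = tf * log(numDocs / docFreq)
     */
    public static double[][] tfIdf(int[][] termFreqs, int numDocs) {
        int numTerms = termFreqs[0].length;

        // Document frequency per term: in how many docs does it occur?
        // (With Lucene this would instead come from IndexReader.docFreq(term).)
        int[] docFreqs = new int[numTerms];
        for (int[] doc : termFreqs)
            for (int t = 0; t < numTerms; t++)
                if (doc[t] > 0) docFreqs[t]++;

        double[][] weights = new double[termFreqs.length][numTerms];
        for (int d = 0; d < termFreqs.length; d++)
            for (int t = 0; t < numTerms; t++)
                if (termFreqs[d][t] > 0)
                    weights[d][t] = termFreqs[d][t]
                            * Math.log((double) numDocs / docFreqs[t]);
        return weights;
    }

    public static void main(String[] args) {
        // Two toy documents over a three-term vocabulary.
        int[][] tf = { {2, 1, 0}, {0, 1, 3} };
        System.out.println(Arrays.deepToString(tfIdf(tf, 2)));
    }
}
```

A term appearing in every document gets weight 0 under this formulation (log(N/N) = 0), which is usually the behavior you want for a VSM experiment.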

What is the problem you are trying to solve?



On Apr 9, 2009, at 2:33 AM, Andy wrote:

> Hello all,
> 
> I'm trying to implement a vector space model using Lucene. I need to have a 
> file (or an in-memory structure) with the TF/IDF weight of each term in each 
> document (in fact, a matrix with documents represented as vectors, where the 
> elements of each vector are the TF weights ...).
> 
> Please help me with this; contact me via [email protected] if you 
> need any further info.
> Many thanks
> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using 
Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]




      
