Hi Grant,

On Mon, 21 Jun 2004 14:11:37 -0400, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> 
> Space will vary based on the content (number of unique terms), obviously, but I did 
> submit some rough numbers that I saw for my implementation.  Here they are (from my 
> original patch submission):
> 
> I also tested by indexing 12,598 documents (88,362 terms) using both term vectors 
> and no term vectors.
> Index size w/o term vectors: 42 MB
> Index size w/ term vectors: 71.3 MB

It would also be interesting to know the size of an index that stores
the whole document, but I think I can test this on my own.

> Time for the first test was 5 minutes 30 seconds, time for the second test was 6 
> minutes 2 seconds.
> 
> The term vector you get back is a list of strings, containing the term and the term 
> frequency for the given document.  I also submitted a Term Vector representation for 
> the Query (see QueryTermVector), so I suppose you could loop over the two vectors 
> and compare.

Very interesting!
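
To make sure I understand the "loop over the two vectors and compare"
idea, here is a minimal sketch of one possible comparison. It assumes
each vector can be read as two parallel arrays (the terms and their
frequencies) and that a term appears at most once per vector. The names
TermVectorSimilarity and cosine are hypothetical and not part of Lucene,
and cosine similarity is only one of several measures that could be used
here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene. Each vector is given as two
// parallel arrays: terms[i] occurs freqs[i] times.
final class TermVectorSimilarity {

    // Cosine similarity between a document vector and a query vector.
    // Returns 0.0 if either vector is empty or has only zero frequencies.
    static double cosine(String[] docTerms, int[] docFreqs,
                         String[] queryTerms, int[] queryFreqs) {
        // Index the document vector by term and accumulate its squared norm.
        Map<String, Integer> doc = new HashMap<>();
        double docNorm = 0.0;
        for (int i = 0; i < docTerms.length; i++) {
            doc.put(docTerms[i], docFreqs[i]);
            docNorm += (double) docFreqs[i] * docFreqs[i];
        }

        // Loop over the query vector: accumulate its squared norm and the
        // dot product over the terms that both vectors share.
        double dot = 0.0;
        double queryNorm = 0.0;
        for (int i = 0; i < queryTerms.length; i++) {
            queryNorm += (double) queryFreqs[i] * queryFreqs[i];
            Integer docFreq = doc.get(queryTerms[i]);
            if (docFreq != null) {
                dot += (double) docFreq * queryFreqs[i];
            }
        }

        if (docNorm == 0.0 || queryNorm == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(docNorm) * Math.sqrt(queryNorm));
    }
}
```

Calling cosine with the arrays from a document's term vector and from
the query's vector would give a score between 0 (no shared terms) and 1
(same relative term frequencies), which could then be used to rank
documents against the query.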

> Don't know if that solves your problem, but I hope it helps.

I will try these suggestions over the next few days.

Thank you very much.

Regards,

Giulio Cesare Solaroli
