Hi Grant,

On Mon, 21 Jun 2004 14:11:37 -0400, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> Space will vary based on the content (number of unique terms), obviously, but I did
> submit some rough numbers that I saw for my implementation. Here they are (from my
> original patch submission):
>
> I also tested by indexing 12,598 documents (88,362 terms) using both term vectors
> and no term vectors.
> Index size w/o term vectors: 42 MB
> Index size w/ term vectors: 71.3 MB

It would also be interesting to know the size of the index when storing the whole
document, but I think I can test this on my own.

> Time for the first test was 5 minutes 30 seconds, time for the second test was 6
> minutes 2 seconds.
>
> The term vector you get back is a list of strings, containing the term and the term
> frequency for the given document. I also submitted a Term Vector representation for
> the Query (see QueryTermVector), so I suppose you could loop over the two vectors
> and compare.

Very interesting!

> Don't know if that solves your problem, but I hope it helps.

I will try these suggestions in the next few days.

Thank you very much.

Regards,

Giulio Cesare Solaroli
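
P.S. To make the "loop over the two vectors and compare" idea concrete for myself, here is a minimal sketch in plain Python. It does not use Lucene at all. I am assuming each vector can be read out as two parallel lists (terms and their frequencies), and the helper names (to_freq_map, shared_terms, cosine_similarity) and the sample data are made up for illustration. Cosine similarity is only one possible way to compare the vectors; nothing in the thread says it is the intended one.

```python
import math


def to_freq_map(terms, freqs):
    """Pair each term with its frequency (parallel lists are an assumed layout)."""
    if len(terms) != len(freqs):
        raise ValueError("terms and freqs must have the same length")
    return dict(zip(terms, freqs))


def shared_terms(doc_vec, query_vec):
    """Return the terms present in both vectors, sorted for stable output."""
    return sorted(set(doc_vec) & set(query_vec))


def cosine_similarity(doc_vec, query_vec):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms that occur in both vectors contribute to the dot product.
    dot = sum(freq * query_vec[term]
              for term, freq in doc_vec.items()
              if term in query_vec)
    doc_norm = math.sqrt(sum(f * f for f in doc_vec.values()))
    query_norm = math.sqrt(sum(f * f for f in query_vec.values()))
    if doc_norm == 0 or query_norm == 0:
        return 0.0  # an empty vector is treated as having no similarity
    return dot / (doc_norm * query_norm)


# Made-up example data, not taken from any real index.
doc = to_freq_map(["index", "lucene", "term", "vector"], [2, 1, 3, 3])
query = to_freq_map(["term", "vector"], [1, 1])

print(shared_terms(doc, query))
print(cosine_similarity(doc, query))
```

With real data, the two maps would be filled from the document's term vector and from the QueryTermVector instead of the hard-coded lists.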
