If you don't want to store term vectors, you could also run the document's fields through the analyzer yourself and collect the Tokens (you should still have the fields you just indexed, so there is no need to retrieve them again).
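
A rough sketch of that idea in plain Java, in case it helps. It makes several assumptions: the regex tokenizer is only a stand-in for whatever Lucene Analyzer you index with (in practice you would use the same analyzer so the terms match the index), the class and method names are made up for illustration, and raw term frequency is just one possible weight. The output uses the `field:term^boost` form of the Lucene query parser syntax, so it could be handed to a query parser or translated into boosted term queries inside a Boolean query.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Lucene.
public class SimilarityQuerySketch {

    // Stand-in for a real Analyzer: lowercase the text, split on anything
    // that is not a letter or digit, and count each resulting token.
    static Map<String, Integer> termFrequencies(String fieldText) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (String token : fieldText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    // Builds a space-separated list of optional clauses, each boosted by
    // its raw term frequency (assumed weighting; swap in your own).
    static String boostedQuery(String field, Map<String, Integer> freqs) {
        StringJoiner query = new StringJoiner(" ");
        for (Map.Entry<String, Integer> e : freqs.entrySet()) {
            query.add(field + ":" + e.getKey() + "^" + e.getValue());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // The field text you just indexed is still in memory at this point.
        Map<String, Integer> freqs =
                termFrequencies("Lucene term vectors: term weights from term counts");
        System.out.println(boostedQuery("body", freqs));
    }
}
```

For long documents you would probably keep only the highest-weighted terms, since a Boolean query has a limit on the number of clauses.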
-Yonik

On 1/20/06, Klaus <[EMAIL PROTECTED]> wrote:
> >In my case, I need to filter similar documents in search results and
> >therefore determine document similarity during the indexing process using
> >term vectors. Obviously, I can't compare the document currently being
> >indexed with all documents in my collection.
>
> Yes you can. Right after indexing the new document, fetch the term vector for
> this document from the index. Compute some kind of weight for each term,
> and construct a Boolean query from all terms. You can use the term weights to
> boost the term queries. The hits will be scored; this score is a measure of
> the similarity between the documents.
>
> peace