>In my case, I need to filter similar documents in search results and
>therefore determine document similarity during the indexing process using
>term vectors. Obviously, I can't compare the document currently being
>indexed with all documents in my collection.

Yes you can. Right after indexing the new document, fetch its term vector
from the index. Compute some kind of weight for each term, and construct a
Boolean query from all the terms. You can use the term weights to boost the
term queries. The hits will be scored, and this score is a measure of
the similarity between the documents.
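
To make the steps above concrete, here is a rough sketch in plain Java. It
only covers the weighting and the query construction; reading the term
vector and the document frequencies from the index is left out because
those calls depend on your Lucene version. The class name, the method
names, the tf*idf style weight and the maxTerms cut-off are all my own
assumptions, not anything Lucene prescribes. The output is a string in the
classic query parser syntax (term^boost joined by OR), so terms containing
special characters would still need escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene.
class SimilarityQuerySketch {

    // Assumed weighting: term frequency times a smoothed inverse document
    // frequency. Any other positive weight would work just as well.
    static double weight(int termFreq, int docFreq, int numDocs) {
        double idf = Math.log((double) numDocs / (docFreq + 1)) + 1.0;
        return termFreq * idf;
    }

    // Builds "term^boost OR term^boost ..." from the maxTerms
    // highest-weighted terms, heaviest first.
    static String buildQuery(Map<String, Double> weights, int maxTerms) {
        List<Map.Entry<String, Double>> entries =
                new ArrayList<>(weights.entrySet());
        // Sort by descending weight; the sort is stable, so ties keep
        // the iteration order of the input map.
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));

        StringBuilder sb = new StringBuilder();
        int limit = Math.min(maxTerms, entries.size());
        for (int i = 0; i < limit; i++) {
            if (i > 0) {
                sb.append(" OR ");
            }
            sb.append(entries.get(i).getKey())
              .append('^')
              .append(String.format(Locale.ROOT, "%.2f",
                      entries.get(i).getValue()));
        }
        return sb.toString();
    }
}
```

Limiting the query to the heaviest terms keeps it small and drops terms
that say little about the document. You would run the resulting query
against the index and skip the hit that is the new document itself.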

peace 

