Hello Lucene people! First of all, I would like to thank all community participants (developers, users, and Erik and Otis for the "Lucene in Action" book) for their great work.
As far as I understand it, there are two popular approaches to document similarity:

1. "cosine metrics" using term vectors
2. constructing a MoreLikeThis query from a document

In my case, I need to filter similar documents in search results, so I want to determine document similarity during the indexing process, using term vectors. Obviously, I can't compare the document currently being indexed with every document in my collection. Should I narrow down the candidate documents by constructing some kind of "LikeThis" query first? What are the best/common practices for this kind of thing?

Thanks in advance,
Alex Serba
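
P.S. To make approach 1 concrete, here is a minimal sketch of what I mean by a cosine metric over term vectors. It is plain Java and does not use any Lucene API: the class name CosineSketch, the helper termFrequencies, and the naive tokenization are my own assumptions for illustration. In a real index the term frequencies would come from the stored term vectors instead.

```java
import java.util.HashMap;
import java.util.Map;

class CosineSketch {

    // Hypothetical helper: builds a term -> frequency map with naive
    // lowercase/non-word splitting (a stand-in for a real analyzer).
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int f : v.values()) {
            sum += (double) f * f;
        }
        return Math.sqrt(sum);
    }

    // Cosine of the angle between two term-frequency vectors:
    // dot(a, b) / (|a| * |b|); returns 0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer fb = b.get(e.getKey());
            if (fb != null) {
                dot += (double) e.getValue() * fb;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }
}
```

The idea would be to run this only against a small candidate set (for example, the top hits of a "LikeThis"-style query) rather than against the whole collection, and to treat a pair as similar when the score exceeds some threshold that I would still have to tune.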