Re: About DuplicateFilter

2019-04-23 Thread Erick Erickson
How is the score being calculated? Because if it’s the usual scoring algorithm, there will be very few scores that are exactly identical. And the usual BM25 scores really don’t mean the documents are “similar”. This feels like an XY problem. How is “similarity” determined here? Best, Erick

About DuplicateFilter

2019-04-23 Thread kongchao...@163.com
Hi! I have some questions about DuplicateFilter. I use Lucene to search news. Each news document contains 'id', 'title', 'content', 'pubtime', 'score', and so on. The 'score' value type is Long, and the same 'score' means similar news. I want to filter the result set so that only the first document is kept when 'score' is the same. The indexed e
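
For illustration only, the behaviour asked for above (keep just the first hit for each 'score' value) might be sketched in plain Java as a post-processing step over hits that are already in ranked order. This is not DuplicateFilter and uses no Lucene API; the `News` class, the `firstPerScore` method, and the sample values are all hypothetical names made up for the sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FirstPerScore {

    // Hypothetical stand-in for a search hit; only the fields the sketch needs.
    static final class News {
        final String id;
        final long score; // the stored 'score' field, not the relevance score

        News(String id, long score) {
            this.id = id;
            this.score = score;
        }
    }

    // Keeps the first hit seen for each distinct 'score' value and
    // preserves the incoming (ranked) order of the hits that survive.
    static List<News> firstPerScore(List<News> rankedHits) {
        Set<Long> seenScores = new HashSet<>();
        List<News> kept = new ArrayList<>();
        for (News hit : rankedHits) {
            // Set.add returns false when the value is already present.
            if (seenScores.add(hit.score)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<News> hits = new ArrayList<>();
        hits.add(new News("a", 10L));
        hits.add(new News("b", 10L)); // same 'score' as "a", dropped
        hits.add(new News("c", 20L));
        hits.add(new News("d", 10L)); // same 'score' as "a", dropped

        for (News n : firstPerScore(hits)) {
            System.out.println(n.id + " " + n.score);
        }
    }
}
```

Filtering after the search like this changes the number of hits per page, so a real application would probably need to over-fetch or deduplicate inside the search itself; the sketch only pins down what "first one when 'score' is the same" means.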