The problem with your approach is that Lucene does not provide a score that tells you how similar a document is to a query in absolute terms. The score is based on the (default) TF-IDF algorithm and is only a relative measure. You can score one document against all others, and those scores will be comparable with each other for that one document, but the range of scores can vary greatly from one document to the next.
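To make the "not an absolute measure" point concrete, here is a rough Python sketch. It is not Lucene code. It assumes the classic TF-IDF shape used by Lucene's default similarity (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1))) and leaves out length norms, coord and query normalization. The function names are made up for illustration.

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF in the shape of Lucene's classic similarity (assumed, simplified)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def toy_score(term_freq, doc_freq, num_docs):
    """Single-term TF-IDF score: sqrt(tf) * idf^2.

    Norms, coord and queryNorm are deliberately ignored here.
    """
    return math.sqrt(term_freq) * classic_idf(doc_freq, num_docs) ** 2


if __name__ == "__main__":
    # Same document, same term, same term frequency.
    # Only the size of the index changes.
    for num_docs in (2, 1_000, 1_000_000):
        score = toy_score(term_freq=4, doc_freq=1, num_docs=num_docs)
        print(f"num_docs={num_docs:>9}  score={score:.2f}")
```

The document and the term never change, yet the number grows with the index, so a fixed cutoff chosen against one index would not mean the same thing against another.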
For example, the range of scores of one document against all others might be 0.5 - 30. The range of scores for another document against the same documents might be 1.2 - 24. It would be difficult to establish an overall threshold. You can, of course, always take the top percentage of documents.

The other issue is that the similarity will change as you index more documents. If you only have one document in your index, the similarity score for the next document will be different than if you scored it against an index with millions of documents, because of the IDF values.

Even if your range of scores is comparable between documents, there is nothing in Elasticsearch to help you with this task. The better question is: why do you need to calculate relevancy between documents, rather than simply ranking documents according to a query?

-- 
Ivan

On Mon, Apr 28, 2014 at 12:34 AM, Rgs wrote:
> Could you guys please help on this?
