The problem with your approach is that Lucene does not produce a score that
measures, in absolute terms, how similar a document is to a query. The score
comes from the (default) TF-IDF algorithm and is only a relative measure. If
you score one document against all others, those scores are comparable with
each other, but the scale can vary greatly from one document to the next.

For example, the scores of one document against all others might range from
0.5 to 30, while the scores of another document against the same documents
might range from 1.2 to 24. That makes it difficult to establish a single
overall threshold. You can, of course, always take a top percentage of the
documents for each one.
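
To make the "top percentage" idea concrete, here is a minimal sketch in
Python. It assumes you have already collected the scores for one source
document into a dict; the function name top_fraction, the document ids and
the score values are all made up for illustration and are not part of any
Elasticsearch API.

```python
import math

def top_fraction(scores, fraction=0.1):
    """Return the ids of the best-scoring `fraction` of documents.

    `scores` maps doc id -> score for ONE source document, so the values
    are only compared with each other, never across source documents.
    """
    if not scores:
        return []
    # Highest score first.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Always keep at least one document.
    keep = max(1, math.ceil(len(ranked) * fraction))
    return [doc_id for doc_id, _ in ranked[:keep]]

# Hypothetical score sets on two different scales.
scores_a = {"d1": 0.5, "d2": 12.0, "d3": 30.0, "d4": 7.5}
scores_b = {"d1": 1.2, "d2": 24.0, "d3": 3.1, "d4": 9.9}

print(top_fraction(scores_a, 0.25))
print(top_fraction(scores_b, 0.25))
```

Because the cut is made by rank within each score set, it does not matter
that the two sets sit on different scales.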

The other issue is that the similarity will change as you index more
documents. If you only have one document in your index, the similarity
score for the next document will be different than if you scored it against
an index with millions of documents, because the IDF values differ.
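
As a rough illustration of the IDF effect, the sketch below assumes the idf
formula used by Lucene's classic TF-IDF similarity,
1 + ln(numDocs / (docFreq + 1)). The function name classic_idf and the index
sizes are invented for the example; the real score also includes other
factors (term frequency, norms, and so on) that are left out here.

```python
import math

def classic_idf(doc_freq, num_docs):
    """idf as in Lucene's classic TF-IDF similarity (assumed formula)."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# The same term, appearing in exactly one document, in two index sizes.
small_index = classic_idf(doc_freq=1, num_docs=1)
large_index = classic_idf(doc_freq=1, num_docs=1_000_000)

print(f"idf with 1 document:          {small_index:.3f}")
print(f"idf with 1,000,000 documents: {large_index:.3f}")
```

The term's weight grows with the size of the index, so the same pair of
documents will score differently as more documents are added.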

Even if your ranges of scores were comparable between documents, there is
nothing in Elasticsearch to help you with this task. The better question is:
why do you need to calculate relevancy between documents rather than
simply rank documents according to a query?

-- 
Ivan




On Mon, Apr 28, 2014 at 12:34 AM, Rgs <[email protected]> wrote:

> Could you guys please help on this?
>
>
>
> --
> View this message in context:
> http://elasticsearch-users.115913.n3.nabble.com/Need-help-on-similarity-ranking-approach-tp4054847p4054889.html
> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/1398670453057-4054889.post%40n3.nabble.com
> .
> For more options, visit https://groups.google.com/d/optout.
>
