Re: Lucene cosine similarity score for more like this query

Koji Sekiguchi Tue, 03 Feb 2015 02:10:16 -0800

Lucene uses the TFIDFSimilarity class to calculate similarity.
It is based on the idea of cosine similarity, but it modifies the cosine
formula, so the resulting score is not confined to the cosine range and can be
greater than 1.
Please take a look at "Lucene Practical Scoring Function" in the following
Javadoc:


http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
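
To illustrate why such a score can exceed 1, here is a minimal Python sketch of
the practical scoring function. It assumes the DefaultSimilarity choices as I
understand them from the Javadoc (tf = sqrt(freq), idf = 1 + ln(numDocs /
(docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)) and ignores index-time boosts
and the lossy one-byte encoding of norms. The function names and the example
numbers are hypothetical, not Lucene API.

```python
import math

def tf(freq):
    # Term-frequency factor: square root of the raw count (assumed default).
    return math.sqrt(freq)

def idf(doc_freq, num_docs):
    # Inverse document frequency (assumed default formula).
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def length_norm(num_terms):
    # Field length normalization, without the one-byte encoding Lucene applies.
    return 1.0 / math.sqrt(num_terms)

def practical_score(query_boosts, doc_term_freqs, doc_len, doc_freqs, num_docs):
    """Simplified score(q, d) =
    coord * queryNorm * sum(tf * idf^2 * boost * norm) over matching terms.

    query_boosts:   dict of query term -> boost
    doc_term_freqs: dict of term -> frequency in the document
    doc_len:        number of terms in the document field
    doc_freqs:      dict of term -> number of documents containing the term
    """
    idfs = {t: idf(doc_freqs[t], num_docs) for t in query_boosts}
    sum_sq = sum((idfs[t] * b) ** 2 for t, b in query_boosts.items())
    query_norm = 1.0 / math.sqrt(sum_sq)
    matched = [t for t in query_boosts if doc_term_freqs.get(t, 0) > 0]
    coord = len(matched) / len(query_boosts)
    norm = length_norm(doc_len)
    total = sum(
        tf(doc_term_freqs[t]) * idfs[t] ** 2 * query_boosts[t] * norm
        for t in matched
    )
    return coord * query_norm * total

# Hypothetical index: 1000 documents, a rare term that occurs 9 times
# in a 16-term field.
example = practical_score({"lucene": 1.0}, {"lucene": 9}, 16, {"lucene": 1}, 1000)
print(example)
```

With these made-up numbers the score is roughly 5.4. Nothing in the formula
divides by the full document vector length the way a true cosine does, so there
is no upper bound of 1.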

Koji
--
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html

On 2015/02/03 5:39, Ali Nazemian wrote:

Dear Erik,
Thank you for your response. Would you please tell me why this score can
be higher than 1, while cosine similarity cannot be higher than 1?
On Feb 2, 2015 7:32 PM, "Erik Hatcher" <erik.hatc...@gmail.com> wrote:

The scoring is the same as Lucene's.  To get deeper insight into how a score
is computed, use Solr’s debug=true mode to see the explain details in the
response.
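
As a rough sketch of what such a request could look like, the Python snippet
below only builds the URL. The host, the core name "collection1", the /mlt
handler path, the document id and the field names are all assumptions for
illustration; a MoreLikeThis handler has to be configured in solrconfig.xml for
that path to exist, and how much explain output comes back may depend on the
Solr version.

```python
from urllib.parse import urlencode

def build_mlt_debug_url(base_url, doc_query, fields):
    """Build a MoreLikeThis request URL with debug output enabled.

    base_url:  URL of the (assumed) MoreLikeThis handler
    doc_query: query selecting the source document, e.g. "id:doc1"
    fields:    fields to use for similarity (sent as mlt.fl)
    """
    params = {
        "q": doc_query,
        "mlt.fl": ",".join(fields),
        "fl": "id,score",   # return the score next to each similar document
        "debug": "true",    # ask Solr for the explain details
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)

# Hypothetical host, core, handler and field names.
url = build_mlt_debug_url(
    "http://localhost:8983/solr/collection1/mlt", "id:doc1", ["title", "body"]
)
print(url)
```

The explain section of the response breaks each score down into its tf, idf,
queryNorm and fieldNorm factors, which shows where a value such as 12 comes
from.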

         Erik

On Feb 2, 2015, at 10:49 AM, Ali Nazemian <alinazem...@gmail.com> wrote:

Hi,
I was wondering what the range of the score returned by a more like this
query in Solr is. I know that Lucene uses cosine similarity in the vector
space model to calculate the similarity between two documents. I also know
that cosine similarity is between -1 and 1, but what I don't understand is
why the score returned by a more like this query could be "12", for example.
Would you please explain what the calculation process is in Solr?
Thank you very much.

Best regards.

--
A.Nazemian
