Hi there,

Lucene calculates the similarity between two strings s1 and s2 according to the formula

Similarity = 1 - Levenshtein-Distance(s1,s2)/min(Length(s1),Length(s2))

I would have thought Lucene would divide by the length of the longer string. In particular, the above formula could, in my understanding, lead to a negative similarity, since the Levenshtein distance can be as large as the length of the longer string.

Why does Lucene calculate the similarity in this way?

Cheers,
Damian
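
P.S. To make the concern concrete, here is a minimal sketch. It is my own illustration and not Lucene's code: the class and method names are made up, and it assumes the similarity is computed as 1 - distance / min(length), without any prefix handling. It compares dividing by the shorter length with dividing by the longer one.

```java
public class SimilaritySketch {

    // Plain dynamic-programming Levenshtein distance (insert, delete, substitute, each cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed form of the formula in question: normalise by the SHORTER string.
    // Empty strings are not handled here (division by zero).
    static float similarityMin(String a, String b) {
        return 1.0f - (float) levenshtein(a, b) / Math.min(a.length(), b.length());
    }

    // The alternative I expected: normalise by the LONGER string, which keeps the result in [0, 1].
    static float similarityMax(String a, String b) {
        return 1.0f - (float) levenshtein(a, b) / Math.max(a.length(), b.length());
    }

    public static void main(String[] args) {
        String s1 = "ab";
        String s2 = "wxyz";
        System.out.println("distance       = " + levenshtein(s1, s2));   // 4
        System.out.println("divide by min  = " + similarityMin(s1, s2)); // -1.0
        System.out.println("divide by max  = " + similarityMax(s1, s2)); // 0.0
    }
}
```

For "ab" and "wxyz" the distance is 4 (two substitutions plus two insertions), so dividing by the shorter length 2 gives 1 - 4/2 = -1, while dividing by the longer length 4 gives 0.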