Extended javadocs in spellchecker --------------------------------- Key: LUCENE-786 URL: https://issues.apache.org/jira/browse/LUCENE-786 Project: Lucene - Java Issue Type: Improvement Components: Javadocs Affects Versions: 2.0.0 Reporter: Karl Wettin Priority: Trivial
Added some javadocs that explains why the spellchecker does not work as one might expect it to. http://www.nabble.com/SpellChecker%3A%3AsuggestSimilar%28%29-Question-tf3118660.html#a8640395 > Without having looked at the code for a long time, I think the problem is > what the > lucene scoring consider to be best. First the grams are searched, resulting > in a number > of hits. Then the edit-distance is calculated on each hit. "Genetics" is > appearently the > third most similar hit according to Lucene, but the best according to > Levenshtein. > > I.e. Lucene does not use edit-distance as similarity. You need to get a bunch > of best hits > in order to find the one with the smallest edit-distance. I took a look at the code, and my assessment seems to be right. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]