Date: 2004-10-11T10:21:52
Editor: NicolasMaisonneuve <[EMAIL PROTECTED]>
Wiki: Jakarta Lucene Wiki
Page: SpellChecker
URL: http://wiki.apache.org/jakarta-lucene/SpellChecker
no comment

New Page:

SpellChecker

A spell checker suggests a list of words close to a misspelled word. This implementation uses the n-gram technique and the Levenshtein distance.

An index (the dictionary) containing all the possible words must be created as a Lucene index. The structure of this index, for 3-grams and 4-grams, is:

  word:
  gram3:
  gram4:
  3start:
  4start:
  3end:
  4end:
  transposition:

This index is independent of the user index, so words can be added from several fields of several indexes, for example, or from a file containing a list of words.

Source:

  SpellChecker spellChecker= new SpellChecker();

The suggestSimilar method returns a list of suggested words sorted by Levenshtein distance and, optionally, by the popularity of the word in a specific field of a user index. In addition, this list can be restricted to words present in a specific field of a user index.

See the test case code for an example. The file can be downloaded from [http://issues.apache.org/bugzilla/show_bug.cgi?id=31617]
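
As a rough illustration of the n-gram fields and the Levenshtein distance mentioned above, here is a minimal Java sketch. It does not use Lucene and is not the code attached to the Bugzilla entry: the class NGramSketch and its methods grams, fields and levenshtein are hypothetical names chosen for this example. It assumes that gram3 and gram4 hold every 3-gram and 4-gram of the word, and that 3start, 4start, 3end and 4end hold the first and last n-gram. The transposition field is left out because the page does not say what it contains.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the SpellChecker contribution.
class NGramSketch {

    // All contiguous substrings of length n, in order of appearance.
    static List<String> grams(String word, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            out.add(word.substring(i, i + n));
        }
        return out;
    }

    // Field name -> value for one dictionary word, using the field names
    // listed on the page (assumed meaning: all grams, first gram, last gram).
    static Map<String, String> fields(String word) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("word", word);
        for (int n = 3; n <= 4; n++) {
            List<String> g = grams(word, n);
            if (g.isEmpty()) {
                continue; // word is shorter than n, so it has no n-grams
            }
            doc.put("gram" + n, String.join(" ", g));
            doc.put(n + "start", g.get(0));
            doc.put(n + "end", g.get(g.size() - 1));
        }
        return doc;
    }

    // Dynamic-programming Levenshtein distance: the minimum number of
    // single-character insertions, deletions and substitutions.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1),
                                  prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[b.length()];
    }
}
```

Under these assumptions, the word "lucene" would get gram3 = "luc uce cen ene", 3start = "luc" and 3end = "ene". Candidates found through the gram fields could then be ranked with levenshtein(misspelled, candidate), smallest distance first.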