[Jakarta Lucene Wiki] Updated: SpellChecker

lucene-cvs Thu, 21 Oct 2004 19:10:54 -0700

   Date: 2004-10-21T19:10:49
   Editor: NicolasMaisonneuve <[EMAIL PROTECTED]>
   Wiki: Jakarta Lucene Wiki
   Page: SpellChecker
   URL: http://wiki.apache.org/jakarta-lucene/SpellChecker


   no comment

Change Log:

------------------------------------------------------------------------------
@@ -1,6 +1,6 @@
 === SpellChecker ===
 
-A spell checker suggests a list of words close to a misspelled word. This implementation is based on David Spencer's code and uses the n-gram technique and the Levenshtein distance. 
+A spell checker suggests a list of words close to a misspelled word. This implementation is based on David Spencer's code and uses the n-gram technique and the Levenshtein distance.
 
 == Structure of a dictionary index ==
 A dictionary index (a Lucene index) containing all the possible words must be created. The structure of this index (for 3-grams and 4-grams) is:
@@ -25,12 +25,12 @@
 == Get a list of suggested words ==
 The suggestSimilar method returns a list of suggested words sorted by:
   1.   the Levenshtein distance (the word closest to the misspelled word comes first in the list).
-  2.   (optionally) the popularity of the word for a specific field in a user index. 
+  2.   (optionally) the popularity of the word for a specific field in a user index.
 
 In addition, this list can be restricted to words present in a specific field of a user index.
 
  * First example: the suggestSimilar(misspelled_word, num_list) method.
-  The ''num_list'' is the maximum number of words returned. 
+  The ''num_list'' is the maximum number of words returned.
   In this example the list is sorted by the Levenshtein distance only.
  {{{
    String[] l=spellChecker.suggestSimilar("sevanty", 2);
@@ -41,26 +41,26 @@
 ''''Note'''': if myIndex_reader and myField are null, this method behaves the same as the first method
 
   1.   The returned words are restricted to the words present in the field ''myField'' of the user index "myIndex_Reader"
-  2.   the list is also sorted by a second criterion: the popularity (the frequency) of the word in the user field 
+  2.   the list is also sorted by a second criterion: the popularity (the frequency) of the word in the user field
   3.   If ''morePopular'' is true and the misspelled word exists in the user field, return only the words that are more frequent than it.
 
- See the test case code for an example 
+ See the test case code for an example
 
 == Download ==
 attachment:spellchecker1.1.zip
 
-== Changes == 
-1.1 
-- sort fixed (the sort order was reversed!) 
-- set the gram size dynamically (depending on the length of the word) 
-- use the FuzzyQuery score: ((edit distance)/(length of word))
-- new Dictionary interface + LuceneDictionary and PlaintextDictionary implementations
-- replace the addWords method with indexDictionary(Dictionary dic)
-- add a new public method: boolean exist(word) 
-- add a build.xml
+
+== Changes ==
+Version 1.1:
+ * sort fixed (the sort order was reversed!)
+ * set the gram size dynamically (depending on the length of the word)
+ * use the FuzzyQuery score: ((edit distance)/(length of word))
+ * new Dictionary interface + LuceneDictionary and PlaintextDictionary implementations
+ * replace the addWords method with indexDictionary(Dictionary dic)
+ * add a new public method: boolean exist(word)
+ * add a build.xml
 
 == Credits ==
  *   Maisonneuve Nicolas
  *   Spencer David
-
 
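As a rough illustration of the n-gram technique the SpellChecker page refers to, the sketch below splits a word into its 3-grams and 4-grams. The class and method names are hypothetical and are not taken from the attached spellchecker code; how the real implementation stores grams in the dictionary index may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class NGramSketch {
    // Returns every contiguous substring of length n, in left-to-right order.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("seventy", 3)); // [sev, eve, ven, ent, nty]
        System.out.println(ngrams("seventy", 4)); // [seve, even, vent, enty]
    }
}
```

A misspelled word such as "sevanty" still shares some grams with "seventy" (for example "sev" and "nty"), which is what lets an n-gram index retrieve nearby candidates before they are ranked.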

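The ordering described for suggestSimilar (Levenshtein distance first, then optionally word popularity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached implementation: the class name, the rank method, and the use of a plain frequency map in place of a user index are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical ranking helper, for illustration only.
class SuggestionRankingSketch {
    // Dynamic-programming Levenshtein distance: insert, delete and substitute each cost 1.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Sorts candidates by distance to the misspelled word; if a frequency map is given
    // (an assumed stand-in for a user index field), ties go to the more frequent word.
    static List<String> rank(String misspelled, List<String> candidates,
                             Map<String, Integer> frequency, int max) {
        List<String> sorted = new ArrayList<>(candidates);
        Comparator<String> byDistance = Comparator.comparingInt(w -> levenshtein(misspelled, w));
        Comparator<String> order = frequency == null
                ? byDistance
                : byDistance.thenComparingInt(w -> -frequency.getOrDefault(w, 0));
        sorted.sort(order);
        return new ArrayList<>(sorted.subList(0, Math.min(max, sorted.size())));
    }
}
```

Passing null for the frequency map mirrors the note that the second method behaves like the first when no user index is supplied.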
