Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change 
notification.

The "SpellCheckComponent" page has been changed by YonikSeeley.
The comment on this change is: fix URLs, update syntax, move some stuff around.
http://wiki.apache.org/solr/SpellCheckComponent?action=diff&rev1=36&rev2=37

--------------------------------------------------

  <!> [[Solr1.3]]
  
- /!\ :TODO: /!\  HOOK in links to Javadocs.
- 
  <<TableOfContents>>
  
  = Introduction =
  
  The SpellCheckComponent is designed to provide inline spell checking of 
queries without having to issue separate requests. Another and possibly clearer 
way of stating this is that it makes query suggestions (as do well-known web 
search engines), for example if it thinks the input query might have been 
misspelled. (Some people tend to think that "spellchecker" is actually a 
misnomer, and something along the lines of "query suggest" would have been more 
appropriate.)
- 
- For discussion of the development of this feature, see 
[[https://issues.apache.org/jira/browse/SOLR-572|SOLR-572]].
  
  The SpellCheckComponent can use the 
[[http://wiki.apache.org/jakarta-lucene/SpellChecker|Lucene SpellChecker]] to 
give suggestions for given words, or one can implement their own spell checker 
using the SolrSpellChecker abstract base class.
  
@@ -92, +88 @@

  
  }}}
  
- When adding <str name="field">FieldName</str> be aware all fieldType 
processing is done prior to the dictionary creation.  It is best to avoid a 
heavily processed field (ie synonyms and stemming) to get more accurate 
results.  If the field has many word variations from processing then the 
dictionary will be created with those in addition to more valid spell checking 
data.
+ When adding {{{<str name="field">FieldName</str>}}} be aware all fieldType 
processing is done prior to the dictionary creation.  It is best to avoid a 
heavily processed field (ie synonyms and stemming) to get more accurate 
results.  If the field has many word variations from processing then the 
dictionary will be created with those in addition to more valid spell checking 
data.
  
  Multiple "spellchecker" instances can be configured in the same way. The 
currently available spellchecker implementations are:
   * org.apache.solr.spelling.IndexBasedSpellChecker -- Create and use a 
spelling dictionary that is based on the Solr index or an existing Lucene index
@@ -159, +155 @@

  A simple result using the spellcheck.q parameter. Note the 
spellcheck.build=true which is needed only once to build the index. It should 
not be specified for each request.
  
  {{{
- 
http://localhost:8983/solr/spellCheckCompRH?q=*:*&spellcheck.q=hell%20ultrashar&spellcheck=true&spellcheck.build=true
+ 
http://localhost:8983/solr/spell?q=*:*&spellcheck.build=true&spellcheck.q=hell%20ultrashar&spellcheck=true
  }}}
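
A minimal sketch of assembling the request URL above programmatically, using only the Python standard library. The helper name `spellcheck_url` and its arguments are hypothetical; the host, handler path and parameter names are taken from the example URL, and nothing here contacts a server.

```python
from urllib.parse import quote, urlencode

# Base handler URL copied from the example above; adjust for your own setup.
BASE_URL = "http://localhost:8983/solr/spell"


def spellcheck_url(spellcheck_q, build=False, base_url=BASE_URL):
    """Return a spellcheck request URL (hypothetical helper, not part of Solr)."""
    params = [("q", "*:*")]
    if build:
        # Only needed once, to build the spelling index.
        params.append(("spellcheck.build", "true"))
    params.append(("spellcheck.q", spellcheck_q))
    params.append(("spellcheck", "true"))
    # quote_via=quote encodes spaces as %20; safe keeps "*:*" readable.
    return base_url + "?" + urlencode(params, safe="*:", quote_via=quote)


first_request = spellcheck_url("hell ultrashar", build=True)
later_request = spellcheck_url("hell ultrashar")
```

Here `first_request` carries spellcheck.build=true and `later_request` omits it, matching the advice that the build flag is needed only once.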
  
  {{{
@@ -189, +185 @@

  
  The spellcheck.extendedResults=true parameter provides frequency of each 
original term in the index (origFreq) as well as the frequency of each 
suggestion in the index (freq).
  
- '''''NOTE''': This result format differs from the non-extended one as the 
returned suggestions is actually an array of lists, where each list holds the 
suggested term and its frequency.'' <!> [[Solr1.4]]
+ '''''NOTE''': This result format differs from the non-extended one as the 
returned suggestion for a word is actually an array of lists, where each list 
holds the suggested term and its frequency.'' <!> [[Solr1.4]]
  
  {{{
- 
http://localhost:8983/solr/spellCheckCompRH?q=*:*&spellcheck.q=hell+ultrashar&spellcheck=true&spellcheck.extendedResults=true
+ 
http://localhost:8983/solr/spell?q=*:*&spellcheck.q=hell+ultrashar&spellcheck=true&spellcheck.extendedResults=true
  }}}
  
  {{{
  <lst name="spellcheck">
-       <lst name="suggestions">
+  <lst name="suggestions">
-               <lst name="hell">
+   <lst name="hell">
-                       <int name="numFound">1</int>
+       <int name="numFound">1</int>
-                       <int name="startOffset">0</int>
+       <int name="startOffset">0</int>
-                       <int name="endOffset">4</int>
+       <int name="endOffset">4</int>
-                       <int name="origFreq">0</int>
+       <int name="origFreq">0</int>
-                       <arr name="suggestion">
+       <arr name="suggestion">
-                                 <lst>
-                                       <int name="frequency">1</int>
+        <lst>
+ 
-                                       <str name="word">dell</str>
+         <str name="word">dell</str>
+         <int name="freq">2</int>
-                                 </lst>
-                       </arr>
-               </lst>
+        </lst>
+       </arr>
+   </lst>
-               <lst name="ultrashar">
+   <lst name="ultrashar">
-                       <int name="numFound">1</int>
+       <int name="numFound">1</int>
+ 
-                       <int name="startOffset">5</int>
+       <int name="startOffset">5</int>
-                       <int name="endOffset">14</int>
+       <int name="endOffset">14</int>
-                       <int name="origFreq">0</int>
+       <int name="origFreq">0</int>
-                       <arr name="suggestion">
+       <arr name="suggestion">
+        <lst>
-                                 <lst>
-                                       <int name="frequency">1</int>
-                                       <str name="word">ultrasharp</str>
+         <str name="word">ultrasharp</str>
-                                 </lst>
-                       </arr>
+         <int name="freq">2</int>
+ 
-               </lst>
+        </lst>
+       </arr>
+   </lst>
-               <bool name="correctlySpelled">false</bool>
+   <bool name="correctlySpelled">false</bool>
-       </lst>
+  </lst>
  </lst>
  }}}
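
As an illustration of the extended result format, here is a small sketch that reads such a response with Python's standard `xml.etree.ElementTree`. The function name `parse_suggestions` and the shape of the returned dictionary are assumptions made for this example; the XML is the sample response shown above.

```python
import xml.etree.ElementTree as ET

# Sample extended response, copied from the example above.
SAMPLE = """
<lst name="spellcheck">
 <lst name="suggestions">
  <lst name="hell">
   <int name="numFound">1</int>
   <int name="startOffset">0</int>
   <int name="endOffset">4</int>
   <int name="origFreq">0</int>
   <arr name="suggestion">
    <lst>
     <str name="word">dell</str>
     <int name="freq">2</int>
    </lst>
   </arr>
  </lst>
  <lst name="ultrashar">
   <int name="numFound">1</int>
   <int name="startOffset">5</int>
   <int name="endOffset">14</int>
   <int name="origFreq">0</int>
   <arr name="suggestion">
    <lst>
     <str name="word">ultrasharp</str>
     <int name="freq">2</int>
    </lst>
   </arr>
  </lst>
  <bool name="correctlySpelled">false</bool>
 </lst>
</lst>
"""


def parse_suggestions(xml_text):
    """Map each original term to its origFreq and (word, freq) suggestions."""
    root = ET.fromstring(xml_text)
    suggestions = root.find("lst[@name='suggestions']")
    result = {}
    # Each child <lst> is one original term; <bool> and <str> entries are skipped.
    for term in suggestions.findall("lst"):
        entries = [
            (item.find("str[@name='word']").text,
             int(item.find("int[@name='freq']").text))
            for item in term.findall("arr[@name='suggestion']/lst")
        ]
        result[term.get("name")] = {
            "origFreq": int(term.find("int[@name='origFreq']").text),
            "suggestions": entries,
        }
    return result


parsed = parse_suggestions(SAMPLE)
```

Each suggestion is read as a (word, freq) pair because, in the extended format, the suggestion for a word is an array of lists.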
  
@@ -232, +231 @@

  Adding the spellcheck.collate=true parameter returns a query with the 
misspelled terms replaced by the top suggestions. Note that the 
non-spellcheckable terms such as those for range queries, prefix queries etc. 
are detected and excluded for spellchecking. Such non-spellcheckable terms are 
preserved in the collated output so that the original query can be run again, 
as is.
  
  {{{
- http://localhost:8983/solr/spellCheckCompRH?q=price:[80 TO 100] hell 
ultrashar&spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true
+ http://localhost:8983/solr/spell?q=price:[80 TO 100] hell 
ultrashar&spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true
  }}}
  
  {{{
  <lst name="spellcheck">
-       <lst name="suggestions">
+  <lst name="suggestions">
-               <lst name="hell">
+   <lst name="hell">
-                       <int name="numFound">1</int>
+       <int name="numFound">1</int>
-                       <int name="startOffset">18</int>
+       <int name="startOffset">18</int>
-                       <int name="endOffset">22</int>
+       <int name="endOffset">22</int>
-                       <int name="origFreq">0</int>
+       <int name="origFreq">0</int>
-                       <lst name="suggestion">
+       <arr name="suggestion">
-                               <int name="frequency">1</int>
+        <lst>
-                               <str name="word">dell</str>
+         <str name="word">dell</str>
-                       </lst>
+         <int name="freq">2</int>
-               </lst>
+        </lst>
+       </arr>
+   </lst>
-               <lst name="ultrashar">
+   <lst name="ultrashar">
-                       <int name="numFound">1</int>
+       <int name="numFound">1</int>
-                       <int name="startOffset">23</int>
+       <int name="startOffset">23</int>
-                       <int name="endOffset">32</int>
+       <int name="endOffset">32</int>
-                       <int name="origFreq">0</int>
+       <int name="origFreq">0</int>
-                       <lst name="suggestion">
+       <arr name="suggestion">
-                               <int name="frequency">1</int>
+        <lst>
-                               <str name="word">ultrasharp</str>
+         <str name="word">ultrasharp</str>
-                       </lst>
+         <int name="freq">2</int>
-               </lst>
+        </lst>
+       </arr>
+   </lst>
-               <bool name="correctlySpelled">false</bool>
+   <bool name="correctlySpelled">false</bool>
-               <str name="collation">price:[80 TO 100] dell ultrasharp</str>
+   <str name="collation">price:[80 TO 100] dell ultrasharp</str>
-       </lst>
+  </lst>
  </lst>
  }}}
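
To show how the collated string relates to the offsets in the response, the sketch below replaces each misspelled span with its top suggestion. This is only an illustration of the result, not Solr's implementation; the function name `collate` is hypothetical, and the offsets and suggestions are those from the sample response above.

```python
def collate(query, corrections):
    """Replace (start, end) spans of query with the given replacements.

    corrections is a list of (startOffset, endOffset, replacement) tuples.
    Spans are applied right to left so earlier offsets stay valid even
    when a replacement is longer or shorter than the original term.
    """
    result = query
    for start, end, replacement in sorted(corrections, reverse=True):
        result = result[:start] + replacement + result[end:]
    return result


query = "price:[80 TO 100] hell ultrashar"
collated = collate(query, [(18, 22, "dell"), (23, 32, "ultrasharp")])
```

The range clause `price:[80 TO 100]` has no suggestion entry, so it passes through unchanged, which is why the collated query can be run again as is.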
  
- = Implementing a SolrSpellChecker =
+ = Implementing a new Java SolrSpellChecker =
+ 
+ /!\ :TODO: /!\  HOOK in links to Javadocs.
  
  The SolrSpellChecker class provides an abstract base class for defining 
common spelling constructs for use in the SpellCheckComponent.  Implementing
  classes need to define the following methods:
@@ -309, +314 @@

  <str name="buildOnOptimize">true</str>
  }}}
  
+ = History =
+ For discussion of the development of this feature, see 
[[https://issues.apache.org/jira/browse/SOLR-572|SOLR-572]].
+ 
