[Solr Wiki] Update of "AnalyzersTokenizersTokenFilters" by JasonRutherglen

Apache Wiki Tue, 19 Jan 2010 07:56:24 -0800

The "AnalyzersTokenizersTokenFilters" page has been changed by JasonRutherglen.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters?action=diff&rev1=67&rev2=68

--------------------------------------------------

  </fieldtype>
  }}}
  
+ <<Anchor(CommonGramsFilter)>>
+ ==== solr.CommonGramsFilterFactory ====
+ 
+ Creates `org.apache.solr.analysis.CommonGramsFilter`.
+ 
+ Makes shingles (e.g. the_cat) by combining common tokens (usually the same as 
the stop words list) with regular tokens.  CommonGramsFilter is useful for 
issuing phrase queries (e.g. "the cat") that contain stop words.  Normally, 
phrases containing stop words do not match their intended target; instead, 
the query "the cat" matches all documents containing "cat", which can be 
undesirable behavior.  Phrase query slop (e.g. "the cat"~2) will not match any 
documents because common grams are indexed as shingled tokens that are adjacent 
to each other (e.g. the_cat is indexed as a single term).
+ 
+ A customized common word list may be specified with the "words" attribute in 
the schema.
+ Optionally, the "ignoreCase" attribute may be used to ignore the case of 
tokens when comparing to the common words list.
+ 
+ {{{
+ <fieldtype name="testcommongrams" class="solr.TextField">
+    <analyzer>
+      <tokenizer class="solr.LowerCaseTokenizerFactory"/>
+      <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
+    </analyzer>
+ </fieldtype>
+ }}}
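
To make the shingling described above concrete, here is a rough Python sketch of the idea. It is not Solr's implementation: the function name `common_grams`, its parameters, and the choice to emit each original token alongside the shingle are assumptions made for illustration, and token positions are not modeled.

```python
def common_grams(tokens, common_words, ignore_case=True, separator="_"):
    """Illustrative only: emit each token, plus a joined shingle for every
    adjacent pair in which at least one token is a common word."""
    if ignore_case:
        # Mirror the "ignoreCase" attribute: compare in lower case.
        common = {w.lower() for w in common_words}
        is_common = lambda t: t.lower() in common
    else:
        common = set(common_words)
        is_common = lambda t: t in common

    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            # Shingle only when the pair involves a common word.
            if is_common(tok) or is_common(nxt):
                out.append(tok + separator + nxt)
    return out


print(common_grams(["the", "cat", "sat"], {"the"}))
```

In this sketch, "the cat" yields the single term the_cat, which is why a phrase query can match it exactly while a sloppy variant has nothing adjacent to relax.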
  
  <<Anchor(KeepWordFilter)>>
  ==== solr.KeepWordFilterFactory ====
