[Solr Wiki] Update of "AnalyzersTokenizersTokenFilters" by HossMan

Apache Wiki Wed, 05 Jul 2006 23:07:02 -0700

The following page has been changed by HossMan:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

------------------------------------------------------------------------------
  
  A query for `text:TV` will expand into `(text:TV text:Television)`, and the 
lower docFreq for `text:Television` will give documents that match 
"Television" a much higher score than comparable documents that match "TV", 
which may be somewhat counterintuitive to the client.  Index-time expansion (or 
reduction) will result in the same idf for all documents, regardless of which 
term the original text contained.
  
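The idf effect described above can be illustrated with a small sketch. It assumes the formula classically used by Lucene's `DefaultSimilarity`, `1 + ln(numDocs / (docFreq + 1))`; the document counts and the helper name `classic_idf` are hypothetical and chosen only for illustration.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Assumed idf formula: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical index: "TV" is common, "Television" is rare.
num_docs = 10_000
df_tv = 2_000
df_television = 50

# Query-time expansion: each term keeps its own docFreq, so idf differs.
idf_tv = classic_idf(df_tv, num_docs)
idf_television = classic_idf(df_television, num_docs)

# Index-time expansion: every document containing either term is indexed
# under both, so both terms share one docFreq (at most the sum of the two).
df_expanded = df_tv + df_television
idf_tv_expanded = classic_idf(df_expanded, num_docs)
idf_television_expanded = classic_idf(df_expanded, num_docs)

print(f"query-time: TV={idf_tv:.3f} Television={idf_television:.3f}")
print(f"index-time: TV={idf_tv_expanded:.3f} Television={idf_television_expanded:.3f}")
```

With query-time expansion the rarer term dominates the score; with index-time expansion both terms contribute the same idf.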
+ ==== solr.RemoveDuplicatesTokenFilterFactory ====
+ 
+ Creates `org.apache.solr.analysis.RemoveDuplicatesTokenFilter`.
+ 
+ Filters out any token that is at the same logical position in the 
tokenstream as a previous token with the same text.  This can happen in a 
number of situations, depending on what the "upstream" token filters are 
-- notably when stemming synonyms with similar roots.  It is useful to remove 
the duplicates to prevent `idf` inflation at index time, or `tf` inflation (in 
a !MultiPhraseQuery) at query time.
+
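The filtering rule described above can be sketched in a few lines. This is a minimal illustration, not the actual `RemoveDuplicatesTokenFilter` code: tokens are modelled as hypothetical `(text, position_increment)` pairs, where an increment of 0 means "same logical position as the previous token", and the stemmed text `televis` is an assumed example of two synonyms collapsing to one root.

```python
from typing import List, Set, Tuple

# Hypothetical token model: (text, position_increment).
Token = Tuple[str, int]


def remove_duplicates(tokens: List[Token]) -> List[Token]:
    """Drop a token if an earlier token at the same position has the same text."""
    result: List[Token] = []
    seen_at_position: Set[str] = set()
    for text, pos_inc in tokens:
        if pos_inc > 0:
            # The stream moved to a new logical position; start over.
            seen_at_position = set()
        if text in seen_at_position:
            # Same text, same position: a duplicate, so skip it.
            continue
        seen_at_position.add(text)
        result.append((text, pos_inc))
    return result


# Assumed upstream output: two synonyms stemmed to the same root at one
# position, followed by an unrelated token at the next position.
stream = [("televis", 1), ("televis", 0), ("tv", 0), ("show", 1)]
deduped = remove_duplicates(stream)
print(deduped)
```

Only a token that arrives with an increment of 0 can be dropped, so the positions of the remaining tokens are unchanged. The same text at a different position is kept.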
