Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The following page has been changed by HossMan:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

------------------------------------------------------------------------------
  A query for `text:TV` will expand into `(text:TV text:Television)`, and the lower docFreq for `text:Television` will give documents that match "Television" a much higher score than comparable documents that match "TV", which may be somewhat counterintuitive to the client. Index time expansion (or reduction) will result in the same idf for all documents regardless of which term the original text contained.

+ ==== solr.RemoveDuplicatesTokenFilterFactory ====
+
+ Creates `org.apache.solr.analysis.RemoveDuplicatesTokenFilter`.
+
+ Filters out any tokens which are at the same logical position in the tokenstream as a previous token with the same text. This can happen for a number of reasons depending on what the "upstream" token filters are, notably when stemming synonyms with similar roots. It is useful to remove the duplicates to prevent `idf` inflation at index time, or `tf` inflation (in a !MultiPhraseQuery) at query time.
+
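
The duplicate-removal behavior described above can be illustrated with a small standalone sketch. This is not the actual `RemoveDuplicatesTokenFilter` source and does not use the Lucene API: the `Token` class, the `removeDuplicates` method, and the sample token texts are all hypothetical names chosen for illustration. It assumes that a position increment of 0 means "same logical position as the previous token".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RemoveDuplicatesSketch {

    /** Hypothetical stand-in for a token: its text plus a position increment. */
    static final class Token {
        final String text;
        final int positionIncrement; // assumption: 0 = same logical position as previous token

        Token(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return text + "(+" + positionIncrement + ")";
        }
    }

    /** Drops any token whose text was already seen at the current logical position. */
    static List<Token> removeDuplicates(List<Token> tokens) {
        List<Token> kept = new ArrayList<>();
        Set<String> seenAtPosition = new HashSet<>();
        for (Token t : tokens) {
            if (t.positionIncrement > 0) {
                // Moving to a new position: forget the texts seen at the old one.
                seenAtPosition.clear();
            }
            // Set.add returns false when this text is already present at this position.
            if (seenAtPosition.add(t.text)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Token> stream = new ArrayList<>();
        stream.add(new Token("watch", 1));
        stream.add(new Token("televis", 1)); // illustrative stemmed form
        stream.add(new Token("tv", 0));      // synonym stacked at the same position
        stream.add(new Token("televis", 0)); // second synonym stemmed to the same root
        // prints [watch(+1), televis(+1), tv(+0)]
        System.out.println(removeDuplicates(stream));
    }
}
```

In this sketch the same text at a different position is always kept; only repeats stacked on one position are dropped, which is the case that arises when synonyms are stemmed to a shared root.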
