Christoph Straßer created SOLR-4873:
---------------------------------------

             Summary: star-wildcard (*) does not work together with stemming
                 Key: SOLR-4873
                 URL: https://issues.apache.org/jira/browse/SOLR-4873
             Project: Solr
          Issue Type: Bug
          Components: search
    Affects Versions: 4.2
         Environment: Windows 7, Java 7
            Reporter: Christoph Straßer


Without a stemming filter (e.g. solr.SnowballPorterFilterFactory), the query
http://localhost:8983/solr/collection1/select?q=Tochter*
matches "Tochter", "Tochterunternehmen" and "Töchter".

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
                <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
                <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
                <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>

With a stemming filter, the same query
http://localhost:8983/solr/collection1/select?q=Tochter*
matches only "Tochterunternehmen", but not "Tochter" or "Töchter". (Stemming is
applied for both type="index" and type="query".)

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
                <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
                <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.SnowballPorterFilterFactory" language="German2" protected="protwords.txt" />
        </analyzer>
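
One possible reading of the behaviour described above, offered as an assumption rather than a confirmed diagnosis: the indexed terms are stemmed, while the wildcard prefix is only lowercased and accent-folded before it is compared against them. The Python sketch below imitates that situation. The `toy_stem` function is a hypothetical stand-in that strips a single suffix; it is not the Snowball German2 algorithm, and none of the names here are Solr APIs.

```python
# Toy model of prefix matching against an index, with and without stemming.
# Everything here is illustrative; it does not call or reproduce Solr/Lucene.

# Hypothetical stand-in for mapping-ISOLatin1Accent.txt (only three entries).
ACCENT_FOLD = str.maketrans({"ö": "o", "ä": "a", "ü": "u"})

DOCS = ["Tochter", "Tochterunternehmen", "Töchter"]


def normalize(token: str) -> str:
    """Fold accents and lowercase, like the char filter plus lowercase filter."""
    return token.translate(ACCENT_FOLD).lower()


def toy_stem(token: str) -> str:
    """Hypothetical stemmer: strip one trailing 'er' or 'en' (NOT Snowball)."""
    for suffix in ("er", "en"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def prefix_matches(docs, prefix, stem):
    """Return the documents whose indexed term starts with the query prefix.

    Assumption: the prefix is normalized but never stemmed, while the
    indexed term is stemmed only when `stem` is True.
    """
    needle = normalize(prefix)
    hits = []
    for doc in docs:
        term = normalize(doc)
        if stem:
            term = toy_stem(term)
        if term.startswith(needle):
            hits.append(doc)
    return hits


if __name__ == "__main__":
    print(prefix_matches(DOCS, "Tochter", stem=False))
    print(prefix_matches(DOCS, "Tochter", stem=True))
```

Under these assumptions the stemmed index holds a term shorter than the prefix for "Tochter" and "Töchter", so the prefix can no longer match them, while the longer stem of "Tochterunternehmen" still begins with it.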
        
Sample-Files attached.

