[
https://issues.apache.org/jira/browse/SOLR-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexandre Rafalovitch closed SOLR-10145.
----------------------------------------
Resolution: Information Provided
> Alphanumeric text getting indexed separately for WhitespaceTokenizerFactory
> ----------------------------------------------------------------------------
>
> Key: SOLR-10145
> URL: https://issues.apache.org/jira/browse/SOLR-10145
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Lakshmi Kanta Nandi
>
> Alphanumeric text is getting indexed separately for the
> WhitespaceTokenizerFactory. I have tried tokenizer class
> solr.KeywordTokenizerFactory too but still my text getting splitted and
> indexed.
> Scenario
> Input string: ABCD1234EFGH
> Generated index: ABCD, 1234, EFGH
> Expected index: ABCD1234EFGH
> Search
> Input: ABC* returns success
> Input: ABCD123* returns fail (success expected)
> Inout: ABCD1234 returns success
> Configuration
> {code}
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> <analyzer type="index">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <!-- in this example, we will only use synonyms at query time
> <filter class="solr.SynonymFilterFactory"
> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> -->
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
> generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> catenateAll="0"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> catenateAll="0"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> </analyzer>
> </fieldType>
> {code}
> Solr version: 4.3
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]