[ 
https://issues.apache.org/jira/browse/SOLR-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Rosher updated SOLR-876:
----------------------------

    Attachment: SOLR-876.patch

Added the ability to 'protect' words against further tokenizing.

> Add ability to optionally splitOnNumerics WordDelimiterFilter/Factory
> ---------------------------------------------------------------------
>
>                 Key: SOLR-876
>                 URL: https://issues.apache.org/jira/browse/SOLR-876
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Dan Rosher
>            Priority: Minor
>         Attachments: SOLR-876.patch, SOLR-876.patch
>
>
> Add the ability to optionally splitOnNumerics in WordDelimiterFilter/Factory.
> The default behaviour is to split on numerics, as WordDelimiterFilter/Factory
> does now.
> I was having issues with e.g. Java/J2SE being split into the tokens 'Java',
> 'J', '2' and 'SE', which isn't the desired behaviour in my case; I wanted the
> tokens 'Java' and 'J2SE'. Another option I thought about, but have not
> implemented, was a protected list of words like the one used by
> solr.EnglishPorterFilterFactory.
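
The behaviour described above can be illustrated with a small standalone
sketch. This is not the code in SOLR-876.patch and it does not use the
Lucene/Solr APIs; the class name SplitOnNumericsSketch, the split() method and
its parameters are hypothetical names chosen for illustration. It only models
two rules, splitting on non-alphanumeric delimiters and on letter/digit
transitions, and ignores everything else WordDelimiterFilter does (case
changes, catenation and so on).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SplitOnNumericsSketch {

    /**
     * Hypothetical helper: breaks the input on non-alphanumeric delimiters and,
     * when splitOnNumerics is true, also at every letter/digit boundary.
     * Words found in protectedWords are never split further.
     */
    static List<String> split(String input, boolean splitOnNumerics,
                              Set<String> protectedWords) {
        List<String> tokens = new ArrayList<>();
        // First pass: break on any run of non-alphanumeric characters.
        for (String word : input.split("[^A-Za-z0-9]+")) {
            if (word.isEmpty()) {
                continue; // produced by a leading delimiter
            }
            if (!splitOnNumerics || protectedWords.contains(word)) {
                tokens.add(word); // keep the word whole
                continue;
            }
            // Second pass: cut at each letter<->digit transition.
            int start = 0;
            for (int i = 1; i < word.length(); i++) {
                boolean prevDigit = Character.isDigit(word.charAt(i - 1));
                boolean currDigit = Character.isDigit(word.charAt(i));
                if (prevDigit != currDigit) {
                    tokens.add(word.substring(start, i));
                    start = i;
                }
            }
            tokens.add(word.substring(start));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Current default: split on numerics.
        System.out.println(split("Java/J2SE", true, Set.of()));        // [Java, J, 2, SE]
        // splitOnNumerics switched off.
        System.out.println(split("Java/J2SE", false, Set.of()));       // [Java, J2SE]
        // Splitting left on, but 'J2SE' is protected.
        System.out.println(split("Java/J2SE", true, Set.of("J2SE")));  // [Java, J2SE]
    }
}
```

In this sketch the two options are independent: turning splitOnNumerics off
affects every token, while a protected list leaves splitting on for everything
except the listed words.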

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
