[
https://issues.apache.org/jira/browse/LUCENE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849582#comment-13849582
]
Ryan McKinley commented on LUCENE-5369:
---------------------------------------
bq. Maybe add a boolean option in the factory/filter? To remove code
duplication?
Are you suggesting adding a flag to LowerCaseFilter? I think that is more
confusing than having a distinct UpperCaseFilter -- and the code duplication is
essentially the minimum code required for a functioning Filter.
bq. to me the analysis chain is not really the best tool to do the job of
cleaning up faceting labels
I understand and often agree that other tools are more appropriate. But there
are lots of cases where the search analysis chain gets you so close to the
desired display that duplicating things to a specific facet field seems
redundant.
This is the analyzer I am working with:
{code}
<analyzer>
  <charFilter class="solr.MappingCharFilterFactory"
              mapping="normalize-my-field-chars.txt"/>
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.TrimFilterFactory"/>
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="xxx.UpperCaseFilterFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="path/to/synonyms.txt"
          ignoreCase="false" expand="false"/>
</analyzer>
{code}
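To make the effect of that chain on a single facet value concrete, here is a rough plain-Java approximation. It does not use any Lucene or Solr classes. The class name FacetLabelSketch, the one-entry synonym table, and the accent stripping via java.text.Normalizer are illustrative assumptions, not what the Solr factories do internally, and the MappingCharFilter step is left out because the contents of the mapping file are not shown here.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class FacetLabelSketch {
    // Hypothetical stand-in for synonyms.txt (the real file is not shown).
    private static final Map<String, String> SYNONYMS = new HashMap<>();
    static {
        SYNONYMS.put("UNITED STATES", "USA");
    }

    // Treats the whole input as one token, as KeywordTokenizer would.
    public static String normalize(String raw) {
        // Roughly what TrimFilter does: drop leading/trailing whitespace.
        String s = raw.trim();
        // Crude approximation of ASCII folding: decompose, then strip
        // combining marks. The real filter covers many more characters.
        s = Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
        // The upper-casing step under discussion.
        s = s.toUpperCase(Locale.ROOT);
        // Case-sensitive lookup, mirroring ignoreCase="false"; a match is
        // replaced by a single target, mirroring expand="false".
        return SYNONYMS.getOrDefault(s, s);
    }

    public static void main(String[] args) {
        System.out.println(normalize("  united states ")); // USA
        System.out.println(normalize("caf\u00e9"));          // CAFE
    }
}
```

Because upper-casing happens before the case-sensitive synonym lookup, the synonym table in this sketch only needs upper-case keys, which is the kind of acronym normalization the issue description has in mind.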
> Add an UpperCaseFilter
> ----------------------
>
> Key: LUCENE-5369
> URL: https://issues.apache.org/jira/browse/LUCENE-5369
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Ryan McKinley
> Assignee: Ryan McKinley
> Priority: Minor
> Attachments: LUCENE-5369-uppercase-filter.patch
>
>
> We should offer a standard way to force upper-case tokens. I understand that
> lowercase is safer for general search quality because some uppercase
> characters can represent multiple lowercase ones.
> However, having upper-case tokens is often nice for faceting (consider
> normalizing to standard acronyms)