[ https://issues.apache.org/jira/browse/SOLR-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12498817 ]
J.J. Larrea commented on SOLR-248:
----------------------------------

While I fully agree that faceting does raise some odd issues stemming from the display of normally-invisible indexed values to humans, and that it theoretically should be the responsibility of the front-end to translate index values into human-readable values, there are great practical advantages in both efficiency and convenience to making the indexed values "pretty", and to centralizing as much of that as possible in the Analysis stage. In particular, I will try this and am very likely to put this into use this weekend, so thank you Ryan!

So I'm +1 to adding it to the Solr distribution, though to avoid confusing people it should have a JavaDoc comment explaining that the main use is in faceting, to avoid having to introduce such common logic into the presentation layer.

Regarding the implementation,

1. For 'keep' and 'okPrefix' (and, were it not for backward-compatibility issues, for 'words' in StopFilter), it would be nice to have a means to specify either a direct list or a filename in the same parameter. A simple approach might be something like keep="word word word..." vs. keep="<file", or even keep="<file <file word word" (with the requirement for backslash-escaping spaces in either)... Or alternately something like txt:filename (vs. xml:filename, json:filename, etc.) with an unescaped : being significant.

2. Why is so much of the logic in the Factory? This drags Solr-specific stuff in when a user might want to use just the Analyzer in a non-Solr context. Wouldn't it be better in general for Solr Analyzers to be self-contained, with the Factory merely being an adaptor between SolrParams & external resources and the Analyzer's constructor?

Also, why is keep in a synchronized map, since there is no mutator? (I know, picky picky...)

Good luck with the deadline!
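As a rough illustration of the mixed-list syntax suggested in point 1, here is a minimal sketch of how a value such as keep="<file word word" might be parsed. Everything in it is an assumption for illustration only: the class name KeepSpec, the methods tokenize and resolve, and the loadFile callback are hypothetical and are not part of the attached patch or of Solr. It treats a token with an unescaped leading < as a file reference, lets a backslash escape the following character (including a space), and returns an unmodifiable set, since nothing mutates the keep words after construction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of the SOLR-248 patch.
public class KeepSpec {

    // One parsed token: its text, and whether it was written as "<name".
    static final class Token {
        final String text;
        final boolean isFile;

        Token(String text, boolean isFile) {
            this.text = text;
            this.isFile = isFile;
        }
    }

    // Split on unescaped whitespace; a backslash escapes the next character.
    static List<Token> tokenize(String spec) {
        List<Token> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inToken = false;
        boolean isFile = false;
        for (int i = 0; i < spec.length(); i++) {
            char c = spec.charAt(i);
            if (c == '\\' && i + 1 < spec.length()) {
                cur.append(spec.charAt(++i)); // escaped character is taken literally
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {
                    out.add(new Token(cur.toString(), isFile));
                }
                cur.setLength(0);
                inToken = false;
                isFile = false;
            } else if (c == '<' && !inToken) {
                isFile = true; // unescaped leading '<' marks a file reference
                inToken = true;
            } else {
                cur.append(c);
                inToken = true;
            }
        }
        if (inToken) {
            out.add(new Token(cur.toString(), isFile));
        }
        return out;
    }

    // Expand file references through the supplied loader and collect all words.
    static Set<String> resolve(String spec, Function<String, List<String>> loadFile) {
        Set<String> words = new LinkedHashSet<>();
        for (Token t : tokenize(spec)) {
            if (t.isFile) {
                words.addAll(loadFile.apply(t.text));
            } else {
                words.add(t.text);
            }
        }
        return Collections.unmodifiableSet(words);
    }
}
```

Under this arrangement a Factory would only need to supply the loadFile callback (for example, one backed by Solr's resource loading) and hand the resulting set to the filter's constructor, which is the adaptor role described in point 2.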
> Capitalization Filter Factory
> -----------------------------
>
>                 Key: SOLR-248
>                 URL: https://issues.apache.org/jira/browse/SOLR-248
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ryan McKinley
>            Priority: Minor
>         Attachments: SOLR-248-CapitalizationFilter.patch
>
>
> For tokens that are used in faceting, it is nice to have standard capitalization.
> I want "Aerial views" and "Aerial Views" to both be: "Aerial Views"

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.