[ 
https://issues.apache.org/jira/browse/SOLR-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12498817
 ] 

J.J. Larrea commented on SOLR-248:
----------------------------------

While I fully agree that faceting raises some odd issues by displaying 
normally-invisible indexed values to humans, and that in theory it should be 
the responsibility of the front-end to translate index values into 
human-readable ones, there are great practical advantages in both efficiency 
and convenience to making the indexed values "pretty", and to centralizing as 
much of that as possible in the Analysis stage.

In particular, I will try this and am very likely to put it into use this 
weekend, so thank you Ryan!  So I'm +1 to adding it to the Solr distribution, 
though to avoid confusing people it should have a JavaDoc comment explaining 
that its main use is in faceting, where it saves having to put such common 
logic into the presentation layer.

Regarding the implementation,

1. For 'keep' and 'okPrefix' (and, were it not for backward-compatibility 
issues, for 'words' in StopFilter), it would be nice to have a means to specify 
either a direct list or a filename in the same parameter.  A simple approach 
might be something like keep="word word word..." vs. keep="<file", or even 
keep="<file <file word word" (with backslash-escaping required for spaces in 
either file names or words)...  Or alternatively something like txt:filename 
(vs. xml:filename, json:filename, etc.) with an unescaped : being significant.

2. Why is so much of the logic in the Factory?  This drags Solr-specific stuff 
in when a user might want to use just the Analyzer in a non-Solr context. 
Wouldn't it be better in general for Solr Analyzers to be self-contained, with 
the Factory merely being an adaptor between SolrParams & external resources on 
one side and the Analyzer's constructor on the other?
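
To make point 1 concrete, here is a minimal sketch of how a combined 
keep="<file word word" value might be parsed.  It is not taken from the patch: 
the class name KeepSpecParser, the readFile callback, and the exact escaping 
rules are all assumptions made for illustration.  File loading is passed in as 
a function so that the parsing logic does not depend on any Solr resource 
loader.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper, not part of Solr or of the SOLR-248 patch. */
class KeepSpecParser {

    /** Splits on unescaped whitespace; a backslash makes the next character literal. */
    static List<String> tokenize(String spec) {
        List<String> tokens = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean escaped = false;
        boolean inToken = false;
        for (int i = 0; i < spec.length(); i++) {
            char c = spec.charAt(i);
            if (escaped) {                       // previous char was a backslash
                cur.append(c);
                escaped = false;
                inToken = true;
            } else if (c == '\\') {
                escaped = true;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {                   // unescaped space ends the token
                    tokens.add(cur.toString());
                    cur.setLength(0);
                    inToken = false;
                }
            } else {
                cur.append(c);
                inToken = true;
            }
        }
        if (inToken) {
            tokens.add(cur.toString());
        }
        return tokens;
    }

    /**
     * Tokens starting with '<' name a file whose words are merged in; all
     * other tokens are taken as literal words.  Limitation of this sketch:
     * a literal word cannot begin with '<', even if the '<' is escaped.
     */
    static Set<String> parse(String spec, Function<String, List<String>> readFile) {
        Set<String> words = new LinkedHashSet<>();
        for (String tok : tokenize(spec)) {
            if (tok.length() > 1 && tok.startsWith("<")) {
                words.addAll(readFile.apply(tok.substring(1)));
            } else {
                words.add(tok);
            }
        }
        return words;
    }
}
```

A factory could call KeepSpecParser.parse(args.get("keep"), loader) once at 
init time, where loader is whatever reads a word list from the config 
directory.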

Also, why is keep in a synchronized map, since nothing mutates it after 
construction?  (I know, picky picky...)
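
As a rough illustration of point 2 and of the synchronized-map question, the 
split might look something like the following.  This is only a sketch: 
Capitalizer and CapitalizerFactory are hypothetical stand-ins with no Lucene 
or Solr types, and the capitalization rule (first letter upper-cased, keep 
words left lower-case) is an assumption rather than the behavior of the 
attached patch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical core logic: usable on its own, with no Solr dependencies. */
class Capitalizer {
    // Populated once in the constructor and never modified afterwards, so it
    // can be read from many threads without a synchronized wrapper.
    private final Set<String> keep;

    Capitalizer(Collection<String> keepWords) {
        Set<String> copy = new HashSet<>();
        for (String w : keepWords) {
            copy.add(w.toLowerCase(Locale.ROOT));
        }
        this.keep = Collections.unmodifiableSet(copy);
    }

    /** Assumed rule: keep words stay lower-case, everything else gets an initial capital. */
    String capitalize(String word) {
        if (word.isEmpty()) {
            return word;
        }
        String lower = word.toLowerCase(Locale.ROOT);
        if (keep.contains(lower)) {
            return lower;
        }
        return Character.toUpperCase(lower.charAt(0)) + lower.substring(1);
    }

    /** Capitalizes each whitespace-separated word and rejoins with single spaces. */
    String capitalizePhrase(String phrase) {
        String[] parts = phrase.trim().split("\\s+");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(capitalize(parts[i]));
        }
        return sb.toString();
    }
}

/** Hypothetical adaptor: only turns string arguments into constructor arguments. */
class CapitalizerFactory {
    static Capitalizer fromArgs(Map<String, String> args) {
        String keep = args.getOrDefault("keep", "").trim();
        List<String> words = keep.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(keep.split("\\s+"));
        return new Capitalizer(words);
    }
}
```

With this shape, someone outside Solr can construct a Capitalizer directly, 
and the factory carries nothing beyond argument parsing.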

Good luck with the deadline!


> Capitalization Filter Factory
> -----------------------------
>
>                 Key: SOLR-248
>                 URL: https://issues.apache.org/jira/browse/SOLR-248
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ryan McKinley
>            Priority: Minor
>         Attachments: SOLR-248-CapitalizationFilter.patch
>
>
> For tokens that are used in faceting, it is nice to have standard 
> capitalization.  
> I want "Aerial views" and "Aerial Views" to both be: "Aerial Views"
