[ https://issues.apache.org/jira/browse/SOLR-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12498817 ]
J.J. Larrea commented on SOLR-248:
----------------------------------
While I fully agree that faceting raises some odd issues, stemming from the
display of normally invisible indexed values to humans, and that in theory it
should be the responsibility of the front end to translate index values into
human-readable values, there are great practical advantages, in both
efficiency and convenience, to making the indexed values "pretty" and to
centralizing as much of that as possible in the Analysis stage.
In particular, I will try this and am very likely to put it into use this
weekend, so thank you Ryan! I'm +1 on adding it to the Solr distribution,
though to avoid confusing people it should have a JavaDoc comment explaining
that its main use is in faceting, to avoid having to put such common logic
into the presentation layer.
Regarding the implementation:
1. For 'keep' and 'okPrefix' (and, were it not for backward-compatibility
issues, for 'words' in StopFilter), it would be nice to be able to specify
either a direct list or a filename in the same parameter. A simple approach
might be something like keep="word word word..." vs. keep="<file", or even
keep="<file <file word word" (with spaces needing to be backslash-escaped in
either form)... Or alternatively something like txt:filename (vs.
xml:filename, json:filename, etc.), with an unescaped : being significant.
2. Why is so much of the logic in the Factory? This drags in Solr-specific
stuff when a user might want to use just the Analyzer in a non-Solr context.
Wouldn't it be better in general for Solr Analyzers to be self-contained, with
the Factory merely being an adapter between SolrParams & external resources on
one side and the Analyzer's constructor on the other?
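The first syntax suggested in point 1 (literal words and "<file" references
mixed in one parameter) might be parsed roughly as follows. This is only a
sketch: WordListSpec, tokenize, and parse are hypothetical names, not part of
the SOLR-248 patch or of Solr, and it assumes one word per line in referenced
files, with blank lines and lines starting with # skipped. A literal word that
itself begins with < cannot be expressed in this simplified version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for a combined "words or <file" parameter value. */
final class WordListSpec {
    private WordListSpec() {}

    /** Splits on unescaped whitespace; a backslash escapes the next character. */
    static List<String> tokenize(String spec) {
        List<String> tokens = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inToken = false;
        for (int i = 0; i < spec.length(); i++) {
            char c = spec.charAt(i);
            if (c == '\\' && i + 1 < spec.length()) {
                cur.append(spec.charAt(++i)); // keep the escaped character literally
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {                // unescaped whitespace ends a token
                    tokens.add(cur.toString());
                    cur.setLength(0);
                    inToken = false;
                }
            } else {
                cur.append(c);
                inToken = true;
            }
        }
        if (inToken) {
            tokens.add(cur.toString());
        }
        return tokens;
    }

    /** Expands "<name" tokens into the words of that file, resolved against baseDir. */
    static Set<String> parse(String spec, Path baseDir) {
        Set<String> words = new LinkedHashSet<>();
        for (String token : tokenize(spec)) {
            if (token.length() > 1 && token.startsWith("<")) {
                Path file = baseDir.resolve(token.substring(1));
                try {
                    for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                        String word = line.trim();
                        if (!word.isEmpty() && !word.startsWith("#")) {
                            words.add(word);
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            } else {
                words.add(token); // plain literal word
            }
        }
        return words;
    }
}
```

With this, a value such as keep="and of <keep.txt New\ York" would yield the
literal words "and", "of", and "New York" plus whatever keep.txt lists.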
Also, why is 'keep' in a synchronized map, given that there is no mutator? (I
know, picky picky...)
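To make point 2 and the synchronization question concrete, here is a minimal
sketch of the split I have in mind. It is plain Java with no Lucene or Solr
types: Capitalizer stands in for the filter (which in practice would extend
Lucene's TokenFilter) and CapitalizerFactory for the Solr factory. Both names,
and the assumption that words listed in 'keep' are passed through unchanged,
are mine for illustration and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Stand-in for the filter: all capitalization logic lives here, with no Solr types. */
final class Capitalizer {
    // Never modified after construction, so no synchronized wrapper is needed.
    private final Set<String> keep;

    Capitalizer(Set<String> keep) {
        this.keep = Collections.unmodifiableSet(new HashSet<>(keep));
    }

    /** Capitalizes each whitespace-separated word, except words listed in keep. */
    String capitalize(String text) {
        StringBuilder out = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for blank input
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            if (keep.contains(word)) {
                out.append(word); // assumed meaning of 'keep': leave the word as-is
            } else {
                out.append(word.substring(0, 1).toUpperCase(Locale.ROOT))
                   .append(word.substring(1).toLowerCase(Locale.ROOT));
            }
        }
        return out.toString();
    }
}

/** Stand-in for the factory: only turns string arguments into constructor arguments. */
final class CapitalizerFactory {
    private Set<String> keep = Collections.emptySet();

    void init(Map<String, String> args) {
        String value = args.get("keep");
        if (value != null && !value.trim().isEmpty()) {
            keep = new HashSet<>(Arrays.asList(value.trim().split("\\s+")));
        }
    }

    Capitalizer create() {
        return new Capitalizer(keep);
    }
}
```

A non-Solr user could then call new Capitalizer(words) directly, while Solr
would go through the factory; either way "Aerial views" and "Aerial Views"
both come out as "Aerial Views".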
Good luck with the deadline!
> Capitalization Filter Factory
> -----------------------------
>
> Key: SOLR-248
> URL: https://issues.apache.org/jira/browse/SOLR-248
> Project: Solr
> Issue Type: New Feature
> Reporter: Ryan McKinley
> Priority: Minor
> Attachments: SOLR-248-CapitalizationFilter.patch
>
>
> For tokens that are used in faceting, it is nice to have standard
> capitalization.
> I want "Aerial views" and "Aerial Views" to both be: "Aerial Views"
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.