[ https://issues.apache.org/jira/browse/SOLR-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783466#action_12783466 ]
Robert Muir commented on SOLR-1571:
-----------------------------------

Shalin, yes, I think the ICUCollationFilter is much better (faster and smaller index, more languages), but it should be a separate factory imo.

I figured I would start with the JDK impl.: since there is no external dependency, it's the simplest.

The ICU impl has slightly different options and behavior, and doing something fancy like detecting which impl to use with reflection I don't much like either... if the ICU jar file was no longer in the classpath or its version changed, things could suddenly, silently stop working correctly.

> unicode collation support
> -------------------------
>
>                 Key: SOLR-1571
>                 URL: https://issues.apache.org/jira/browse/SOLR-1571
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Robert Muir
>            Priority: Minor
>         Attachments: SOLR-1571.patch
>
>
> This patch adds support for unicode collation (searching and sorting).
> Unicode collation is helpful in a search engine: for many languages you want
> things to match or sort differently.
> You might even want to use copyField and support different sort
> orders/matching schemes if you need to support multiple languages.
> This is simply a factory for Lucene's CollationKeyFilter, which indexes
> binary collation keys in a special format that preserves binary sort order.
> I've added support for creating a Collator in two ways:
> * system collator from a Locale spec (language + country + variant)
> * tailored collator from custom rules in a text file
> There is deliberately no option to use the "default" locale of the JVM (I
> consider this a bit dangerous).
> In this patch, it is mandatory to define the locale explicitly for a system
> collator.
> The required lucene-collation-2.9.1.jar is only 12KB.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
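
For readers following the issue description above, here is a minimal sketch of the two collator-creation paths it lists, using only the JDK's `java.text` classes. It is not the code from SOLR-1571.patch: the class name `CollatorSketch` and its method names are hypothetical, and the rules are passed as a string here rather than read from a text file. The sketch also does not use Lucene's CollationKeyFilter or its index encoding; it only shows that the raw JDK collation key bytes compare in the same order as the collator itself.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical helper class, for illustration only (not part of the patch).
final class CollatorSketch {

    // Path 1: system collator from an explicit locale spec.
    // The language is mandatory, so the JVM default locale is never used.
    static Collator fromLocale(String language, String country, String variant) {
        if (language == null || language.isEmpty()) {
            throw new IllegalArgumentException("language must be specified explicitly");
        }
        Locale locale = new Locale(language,
                country == null ? "" : country,
                variant == null ? "" : variant);
        return Collator.getInstance(locale);
    }

    // Path 2: tailored collator from custom rules (JDK RuleBasedCollator syntax).
    static Collator fromRules(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid collation rules", e);
        }
    }

    // Binary collation key for a string; these bytes are what would be indexed.
    static byte[] key(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }

    // Unsigned, most-significant-byte-first comparison of two collation keys.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

With a tailoring such as `"< c < b < a"`, `fromRules` yields a collator that sorts "c" before "a", and comparing the two key byte arrays with `compareKeys` gives the same ordering. Rejecting an empty language in `fromLocale` mirrors the issue's stated choice to make the locale explicit.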