I tried using solr.CollationKeyFilterFactory in my facets:

    <fieldType name="text_navegador_collation" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory" />
        <filter class="solr.CollationKeyFilterFactory" language="en" strength="primary" />
      </analyzer>
    </fieldType>

I got this:

    <lst name="facet_counts">
      <lst name="facet_queries"/>
      <lst name="facet_fields">
        <lst name="nv_descricao_ue_nv_sigla_uf">
          <int name=")䀖䀍#5;᠃᠁Ⰰ堀㌀ᓀఠٰʘŐ¦e*#20;怌倅᠂ࠁ䰀挀#0;#0;#1;">16</int>
          <int name=")䀖䀍#5;᠃᠁Ⰰ堀㌀ᓀీذˀŘÊb!#25;䀌 #0;#0;#0;">9</int>
          <int name=")䀗怌瀆ဃ᠁☀愀㎀ᢀࡀՐˀ#0;#0;#1;">4</int>
          <int name=")䀘#12;々᠃ᐁ䐀嘀ᦀِ̨Ō`-#0;#0;#0;#0;">6</int>
          <int name=")䀙 々⠃ࠁ㠀匀⨀ᓀી̰ŠÊe)䀐䀌怆᠀#0;#0;#0;">14</int>
        </lst>
      </lst>
      <lst name="facet_dates"/>
      <lst name="facet_ranges"/>
    </lst>

If I remove the solr.CollationKeyFilterFactory, I get:

    <lst name="facet_counts">
      <lst name="facet_queries"/>
      <lst name="facet_fields">
        <lst name="nv_descricao_ue_nv_sigla_uf">
          <int name="ALTO SANTO|CE">4</int>
          <int name="AMPARO DO SERRA|MG">6</int>
          <int name="ARAÇOIABA DA SERRA|SP">14</int>
          <int name="BANDEIRA DO SUL|MG">4</int>
          <int name="BARRA DE SANTA ROSA|PB">5</int>
        </lst>
      </lst>
      <lst name="facet_dates"/>
      <lst name="facet_ranges"/>
    </lst>

Is this a bug in Solr? I am using Solr 3.5.0 (stable).
Could anyone help me?

-----Original Message-----
From: Claudio Ranieri [mailto:claudio.rani...@estadao.com]
Sent: Monday, September 10, 2012 08:29
To: solr-user@lucene.apache.org
Subject: Problem with accented words sorting

Hi,

I have a facet (type = "string") and I want to sort it.
The problem is that accented words appear at the end of the sequence.
Example of the sorted sequence: "Santa Catarina", "Sergipe", "São Paulo".
I would like to get them in this order: "Santa Catarina", "São Paulo", "Sergipe".
I cannot normalize the input, because I want to show users the original, non-normalized text.
Is there an easy way to set this up?
If there is no easy way, how could I customize the comparison of Strings?

Thanks,
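
To make the ordering I am asking for concrete, here is a minimal sketch in plain Java, outside Solr. It assumes the JDK's java.text.Collator at primary strength is an acceptable stand-in for the comparison I want. The class name AccentSort, the method sortIgnoringAccents and the pt-BR locale are only illustrative; none of this is Solr API.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class AccentSort {

    // Returns a sorted copy of the values; the input list is left untouched,
    // so the original (accented) text is what gets displayed.
    static List<String> sortIgnoringAccents(List<String> values, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // PRIMARY strength compares base letters only, so accent and case
        // differences do not affect the order.
        collator.setStrength(Collator.PRIMARY);
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted, collator);
        return sorted;
    }

    public static void main(String[] args) {
        // "S\u00e3o Paulo" is Sao Paulo with a-tilde, escaped to keep the file ASCII.
        List<String> states = Arrays.asList("Sergipe", "S\u00e3o Paulo", "Santa Catarina");
        // Assumption: pt-BR is the locale of the data.
        for (String s : sortIgnoringAccents(states, Locale.forLanguageTag("pt-BR"))) {
            System.out.println(s);
        }
    }
}
```

With a comparison like this, the accented entry should land between the other two instead of at the end, while the displayed strings stay exactly as stored.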