RES: Problem with accented words sorting

Claudio Ranieri Mon, 10 Sep 2012 05:29:54 -0700

I tried using solr.CollationKeyFilterFactory in my facets:

<fieldType name="text_navegador_collation" class="solr.TextField">
        <analyzer>
                <tokenizer class="solr.KeywordTokenizerFactory" />
                <filter class="solr.CollationKeyFilterFactory" language="en" strength="primary" />
        </analyzer>
</fieldType>


I got this:

<lst name="facet_counts">
        <lst name="facet_queries"/>
                <lst name="facet_fields">
                        <lst name="nv_descricao_ue_nv_sigla_uf">
                                <int 
name=")䀖䀍#5;᠃᠁Ⰰ堀㌀ᓀఠٰʘŐ¦e*#20;怌倅᠂ࠁ䰀挀#0;#0;#1;">16</int>
                                <int name=")䀖䀍#5;᠃᠁Ⰰ堀㌀ᓀీذˀŘÊb!#25;䀌　
#0;#0;#0;">9</int>
                                <int name=")䀗怌瀆ဃ᠁☀愀㎀ᢀࡀՐˀ#0;#0;#1;">4</int>
                                <int 
name=")䀘#12;々᠃ᐁ䐀嘀㄀ᦀ଀ِ̨Ō`-#0;#0;#0;#0;">6</int>
                                <int name=")䀙 
々⠃ࠁ㠀匀⨀ᓀી԰̰ŠÊe)䀐䀌怆᠀#0;#0;#0;">14</int>
                        </lst>
                </lst>
        <lst name="facet_dates"/>
        <lst name="facet_ranges"/>
</lst>

If I remove the solr.CollationKeyFilterFactory, I get:

<lst name="facet_counts">
        <lst name="facet_queries"/>
                <lst name="facet_fields">
                        <lst name="nv_descricao_ue_nv_sigla_uf">
                                <int name="ALTO SANTO|CE">4</int>
                                <int name="AMPARO DO SERRA|MG">6</int>
                                <int name="ARAÇOIABA DA SERRA|SP">14</int>
                                <int name="BANDEIRA DO SUL|MG">4</int>
                                <int name="BARRA DE SANTA ROSA|PB">5</int>
                        </lst>
                </lst>
        <lst name="facet_dates"/>
        <lst name="facet_ranges"/>
</lst>

Is this a bug in Solr?
I am using Solr 3.5.0 (stable).
Could anyone help me?
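
For what it is worth, the unreadable facet names look consistent with how collation keys work in general: a collation key is an opaque binary sort key, not text, and as far as I understand facet values are taken from the indexed terms. Here is a minimal sketch in plain Java (no Solr involved) of that behaviour. The class and method names are hypothetical, and using java.text.Collator with Locale.ENGLISH at PRIMARY strength is only an assumption meant to approximate language="en" strength="primary".

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollationKeyDemo {
    // Assumption: approximates language="en" strength="primary".
    static Collator primaryCollator() {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        return collator;
    }

    // Compares two strings through their collation keys.
    static int compareByKey(String a, String b) {
        Collator collator = primaryCollator();
        return collator.getCollationKey(a).compareTo(collator.getCollationKey(b));
    }

    // Raw bytes of the key: useful for ordering, not for display.
    static byte[] keyBytes(String text) {
        return primaryCollator().getCollationKey(text).toByteArray();
    }

    public static void main(String[] args) {
        String saoPaulo = "S\u00e3o Paulo"; // "São Paulo"
        // The keys order the accented name before "Sergipe" ...
        System.out.println(compareByKey(saoPaulo, "Sergipe") < 0);
        // ... but the key itself is binary data, not the original string.
        System.out.println(Arrays.toString(keyBytes(saoPaulo)));
    }
}
```

If that reading is right, the keys sort the way I want but cannot be shown to users as they are.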


-----Original Message-----
From: Claudio Ranieri [mailto:claudio.rani...@estadao.com] 
Sent: Monday, September 10, 2012 08:29
To: solr-user@lucene.apache.org
Subject: Problem with accented words sorting

Hi,

I have a facet (type = "string") and I want to sort it.
The problem is that accented words appear at the end of the sequence. 
Example of the current sorted sequence: "Santa Catarina", "Sergipe", "São Paulo".
I would like to get this order: "Santa Catarina", "São Paulo", "Sergipe".
I cannot normalize the input because I want to show users the original, 
non-normalized text. Is there an easy way to set this up?
If there is no easy way, how could I customize the String comparison?
Thanks
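
To illustrate the ordering asked for in this message, here is a minimal sketch of an accent-aware comparison in plain Java, applied on the client side rather than inside Solr. It assumes java.text.Collator at PRIMARY strength is an acceptable comparator; the class name, the method name, and the choice of the pt-BR locale are hypothetical and only for illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class AccentAwareSort {
    // Returns a sorted copy; the displayed strings are never normalized.
    static List<String> sortAccentAware(List<String> values, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // PRIMARY strength compares base letters only, so "ã" sorts with "a".
        collator.setStrength(Collator.PRIMARY);
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> states = Arrays.asList(
                "Santa Catarina", "Sergipe", "S\u00e3o Paulo"); // "São Paulo"
        System.out.println(
                sortAccentAware(states, Locale.forLanguageTag("pt-BR")));
    }
}
```

With the three example names, this should place "São Paulo" between "Santa Catarina" and "Sergipe" while leaving the text itself untouched.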
