Maciej Niemczyk created SOLR-5153:
-------------------------------------

             Summary: CollationKeyFilter returns unexpected output
                 Key: SOLR-5153
                 URL: https://issues.apache.org/jira/browse/SOLR-5153
             Project: Solr
          Issue Type: Bug
          Components: SearchComponents - other
    Affects Versions: 4.3
         Environment: Mac OS X
            Reporter: Maciej Niemczyk


Using the default setup and the example from the Solr wiki:
http://wiki.apache.org/solr/UnicodeCollation
the Solr analysis reports unexpected output for the CollationKeyFilter (CKF).
Settings:
{code}
<fieldType name="germanText" class="solr.TextField">
        <analyzer type="index">
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.CollationKeyFilterFactory" language="de" strength="primary"/>
        </analyzer>
        <analyzer type="query">
                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                <filter class="solr.CollationKeyFilterFactory" language="de" strength="primary"/>
        </analyzer>
</fieldType>

<field name="germanText" type="germanText" indexed="true" stored="false" multiValued="true"/>

<copyField source="title" dest="germanText"/>
{code}

Output:
{code}

WT
text | raw_bytes | start | end | position | type
Peter | [50 65 74 65 72] | 0 | 5 | 1 | word

CKF
text | raw_bytes | position | start | end | type
1䀖瀅䀃᐀ | [31 e4 80 96 c e7 80 85 e4 80 83 e1 90 80 0 0 0] | 1 | 0 | 5 | word
{code}
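
For comparison, here is a minimal sketch of what a primary-strength German collation key looks like when produced with the JDK's java.text.Collator alone. It does not call Solr or Lucene, and the class name CollationKeyDemo and the method primaryKey are hypothetical names chosen for illustration. The sketch assumes that the filter's language="de" and strength="primary" settings correspond to Locale.GERMAN and Collator.PRIMARY. A collation key is an opaque byte array intended for sorting and comparison, not for display, which may be relevant when reading the CKF row above.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollationKeyDemo {

    /** Returns the primary-strength German collation key bytes for the given text. */
    static byte[] primaryKey(String text) {
        // Assumption: mirrors language="de" strength="primary" from the field type above.
        Collator collator = Collator.getInstance(Locale.GERMAN);
        collator.setStrength(Collator.PRIMARY);
        return collator.getCollationKey(text).toByteArray();
    }

    public static void main(String[] args) {
        byte[] upper = primaryKey("Peter");
        byte[] lower = primaryKey("peter");

        // The key is binary sort data, so it is printed as numbers rather than as text.
        System.out.println(Arrays.toString(upper));

        // At primary strength, case differences are not part of the key.
        System.out.println(Arrays.equals(upper, lower));
    }
}
```

Comparing the keys of two inputs, rather than reading a key as text, is the more meaningful check of whether collation behaves as intended.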

