Maciej Niemczyk created SOLR-5153:
-------------------------------------
Summary: CollationKeyFilter returns unexpected output
Key: SOLR-5153
URL: https://issues.apache.org/jira/browse/SOLR-5153
Project: Solr
Issue Type: Bug
Components: SearchComponents - other
Affects Versions: 4.3
Environment: Mac OS X
Reporter: Maciej Niemczyk
Using the default setup and the example from the Solr wiki:
http://wiki.apache.org/solr/UnicodeCollation
the Solr analysis page reports unexpected output for the CollationKeyFilter (CKF): the token "Peter" comes out of the filter as unreadable text.
Settings:
{code}
<fieldType name="germanText" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language="de" strength="primary"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language="de" strength="primary"/>
  </analyzer>
</fieldType>
<field name="germanText" type="germanText" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="germanText"/>
{code}
Output:
{code}
WT
text | raw_bytes | start | end | position | type
Peter | [50 65 74 65 72] | 0 | 5 | 1 | word

CKF
text | raw_bytes | position | start | end | type
1䀖瀅䀃᐀ | [31 e4 80 96 c e7 80 85 e4 80 83 e1 90 80 0 0 0] | 1 | 0 | 5 | word
{code}
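For comparison, the collation key for the same term can be computed outside Solr. The sketch below is only an illustration and rests on an assumption not confirmed in this report: that the filter derives its tokens from java.text.Collator sort keys. The class name CollationKeyDemo and its methods are hypothetical. It prints the raw key bytes for "Peter" with a German collator at primary strength, so they can be set beside the raw_bytes shown in the analysis output.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollationKeyDemo {
    // Build the binary sort key for a term with a German collator at PRIMARY strength.
    // Assumption: this mirrors language="de" strength="primary" in the field type above.
    static byte[] primaryKey(String term) {
        Collator collator = Collator.getInstance(Locale.GERMAN);
        collator.setStrength(Collator.PRIMARY);
        return collator.getCollationKey(term).toByteArray();
    }

    // Render bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The key is binary data meant for comparison, not for display.
        System.out.println(toHex(primaryKey("Peter")));
        // At PRIMARY strength, case differences are ignored, so the keys match.
        System.out.println(Arrays.equals(primaryKey("Peter"), primaryKey("peter")));
    }
}
```

If the filter does work this way, a token that looks like random characters would be a sort key rendered as text rather than a corrupted term. Whether that is the intended behaviour of the analysis page is the question this issue raises.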