[ https://issues.apache.org/jira/browse/PHOENIX-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208145#comment-16208145 ]
ASF GitHub Bot commented on PHOENIX-4237:
-----------------------------------------
GitHub user snakhoda-sfdc commented on the issue:
https://github.com/apache/phoenix/pull/275
@JamesRTaylor Thanks for your comments. I added two further commits:
199c389: This addresses your comment about the byte array comparison. You
were right! I must have been confused earlier because what sqlline.py
displayed did not match the sort order. I also added a formatter for
PVarBinary, because without one sqlline.py shows only a Java hash code,
which is hard to do anything with.
8cc2b5c: This adds the end-to-end tests you mentioned and also changes the
unit test to use the hex representation of the byte array, which makes it
easier to read.
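
For readers following the byte array discussion, here is a minimal sketch
(not code from the pull request) of comparing collation keys as unsigned
bytes and printing them in hex. It assumes the JDK's java.text.Collator as a
stand-in for the ICU4J-based function; the class and method names
(CollationKeyBytesSketch, toHex, compareUnsigned) are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical illustration; not part of Phoenix or of the pull request.
public class CollationKeyBytesSketch {

    // Render a byte array as lowercase hex, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Lexicographic comparison that treats each byte as unsigned (0..255).
    // When one array is a prefix of the other, the shorter one sorts first.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Assumption: the JDK collator for German stands in for ICU4J here.
        Collator collator = Collator.getInstance(Locale.GERMAN);
        String s1 = "\u00c4pfel"; // "Apfel" with A-umlaut
        String s2 = "zebra";
        byte[] k1 = collator.getCollationKey(s1).toByteArray();
        byte[] k2 = collator.getCollationKey(s2).toByteArray();

        System.out.println(s1 + " -> " + toHex(k1));
        System.out.println(s2 + " -> " + toHex(k2));

        // The two signs printed below are expected to agree.
        System.out.println("byte order sign: " + Integer.signum(compareUnsigned(k1, k2)));
        System.out.println("collator sign:   " + Integer.signum(collator.compare(s1, s2)));
    }
}
```

Printing the key in hex avoids the default array rendering, which shows only
the array's type and identity hash rather than its contents.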
> Allow sorting on (Java) collation keys for non-English locales
> --------------------------------------------------------------
>
> Key: PHOENIX-4237
> URL: https://issues.apache.org/jira/browse/PHOENIX-4237
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Shehzaad Nakhoda
> Fix For: 4.12.0
>
>
> Strings stored via Phoenix can be composed from a subset of the entire set of
> Unicode characters. The natural sort order for strings for different
> languages often differs from the order dictated by the binary representation
> of the characters of these strings. Java provides the idea of a Collator
> which, given an input string and a (language) locale, can generate a
> Collation Key that can then be used to compare strings in that natural order.
> Salesforce has recently open-sourced grammaticus. IBM open-sourced ICU4J
> some time ago. These technologies can be combined to provide a robust new
> Phoenix function that can be used in an ORDER BY clause to sort strings
> according to the user's locale.
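
To illustrate the difference the description refers to, here is a minimal
sketch that sorts the same strings once by their binary (UTF-16 code unit)
order and once by collation key for a given locale. It assumes the JDK's
java.text.Collator rather than ICU4J or grammaticus, and the names
LocaleSortSketch, sortBinary and sortByCollationKey are hypothetical; it is
not the proposed Phoenix function.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; not the Phoenix implementation.
public class LocaleSortSketch {

    // Sort by String.compareTo, i.e. by UTF-16 code unit values.
    static List<String> sortBinary(List<String> input) {
        List<String> out = new ArrayList<>(input);
        Collections.sort(out);
        return out;
    }

    // Sort by the collation keys produced by the locale's Collator.
    static List<String> sortByCollationKey(List<String> input, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> out = new ArrayList<>(input);
        out.sort((a, b) ->
                collator.getCollationKey(a).compareTo(collator.getCollationKey(b)));
        return out;
    }

    public static void main(String[] args) {
        // "\u00c4pfel" is "Apfel" with A-umlaut, written as an escape so the
        // source file stays ASCII.
        List<String> words = Arrays.asList("zebra", "\u00c4pfel", "apple");

        System.out.println("binary order:   " + sortBinary(words));
        System.out.println("German collate: " + sortByCollationKey(words, Locale.GERMAN));
    }
}
```

In binary order the word starting with A-umlaut lands after "zebra" because
its first code unit is larger than any ASCII letter, whereas a German
collator places it among the words starting with "a".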