[ 
https://issues.apache.org/jira/browse/IGNITE-14545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326584#comment-17326584
 ] 

Aleksey Plekhanov commented on IGNITE-14545:
--------------------------------------------

It's relatively easy to support Unicode characters that fit in 16 bits, i.e. 
the Basic Multilingual Plane (I've raised a pull request). Support for 
characters outside that range is limited, though, because Java strings are 
sequences of 16-bit {{char}} values and each such character is stored as two 
of them (a surrogate pair). {{length()}}, {{substring()}}, and some other 
methods treat each of these characters as two 16-bit characters. For example, 
the result of {{"🦆".length()}} will be 2. String functions in Calcite reuse 
Java {{String}} methods and have the same problem: the result of 
{{SELECT CHAR_LENGTH('🦆')}} will be 2. We cannot change this behavior without 
rewriting the Calcite string functions, and I'm not sure we need that right now.

> Calcite engine. Unicode literal not supported
> ---------------------------------------------
>
>                 Key: IGNITE-14545
>                 URL: https://issues.apache.org/jira/browse/IGNITE-14545
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Taras Ledkov
>            Assignee: Aleksey Plekhanov
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Unicode literal not supported.
>  e.g. {{SELECT <any_string_with_unicode_symbols>}}
> Tests:
>  {{aggregate/aggregates/test_aggr_string.test}}
> {{types/string/test_unicode.test_ignored}}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
