[
https://issues.apache.org/jira/browse/IMPALA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898861#comment-16898861
]
Philip commented on IMPALA-2019:
--------------------------------
String lengths also seem to be an issue.
length() appears to return the *byte length* rather than the *number of characters*.
I would suggest this is +not a minor issue+.
select length('€')
Hive returns 1.
Impala returns 3.
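The two results line up with the two ways of measuring the same string: '€' (U+20AC) is one character but takes three bytes in UTF-8. A minimal Python sketch of that difference follows; the helper names are hypothetical and this is only an illustration, not how either Hive or Impala implements length().

```python
# Illustration only: the helper names below are hypothetical and do not
# correspond to any Hive or Impala internals.

def utf8_byte_length(s: str) -> int:
    """Number of bytes in the UTF-8 encoding of s."""
    return len(s.encode("utf-8"))


def code_point_length(s: str) -> int:
    """Number of Unicode code points in s."""
    return len(s)


euro = "\u20ac"  # the euro sign from the query above

print(utf8_byte_length(euro))   # 3, the value reported for Impala
print(code_point_length(euro))  # 1, the value reported for Hive
```

For pure ASCII input the two counts agree, which may be why the difference only shows up with non-ASCII data.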
> Proper UTF-8 support in string functions
> ----------------------------------------
>
> Key: IMPALA-2019
> URL: https://issues.apache.org/jira/browse/IMPALA-2019
> Project: IMPALA
> Issue Type: New Feature
> Components: Backend
> Affects Versions: Impala 2.1, Impala 2.2
> Reporter: Andrés Cordero
> Priority: Minor
> Labels: sql-language
>
> As documented here:
> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_string.html
> Impala does not properly handle non-ASCII UTF-8 characters, and will return
> results in string functions such as length that are inconsistent with Hive.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)