[ 
https://issues.apache.org/jira/browse/IMPALA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-2019.
------------------------------------
    Fix Version/s: Impala 4.1.0
       Resolution: Fixed

Resolving this. String functions are UTF-8 aware when the query option 
UTF8_MODE is set to true. The affected functions are:
* length(), substring() and reverse()
* instr() and locate()
* mask functions, i.e. mask(), mask_first_n(), mask_last_n(), 
mask_show_first_n(), mask_show_last_n()
* case conversion functions, i.e. upper(), lower(), initcap()
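
To illustrate what UTF-8 awareness changes for these functions, here is a small Python sketch contrasting byte-oriented and character-oriented results. It is not Impala code (the actual change lives in the Impala backend). The helper names are hypothetical, and it assumes that "character" means a Unicode code point and that positions are 1-based and positive.

```python
# Illustration only: byte-oriented vs. UTF-8 aware string operations.
# All helper names are hypothetical and are not part of Impala.

def byte_length(s: str) -> int:
    """Number of bytes in the UTF-8 encoding (byte-oriented view)."""
    return len(s.encode("utf-8"))

def char_length(s: str) -> int:
    """Number of Unicode code points (UTF-8 aware view)."""
    return len(s)

def byte_substring(s: str, pos: int, n: int) -> bytes:
    """1-based substring over raw bytes; may cut a multi-byte character."""
    raw = s.encode("utf-8")
    return raw[pos - 1 : pos - 1 + n]

def char_substring(s: str, pos: int, n: int) -> str:
    """1-based substring over code points; never cuts a character."""
    return s[pos - 1 : pos - 1 + n]

def byte_reverse(s: str) -> bytes:
    """Reverse the raw bytes; multi-byte sequences end up scrambled."""
    return s.encode("utf-8")[::-1]

def char_reverse(s: str) -> str:
    """Reverse by code point, keeping each character intact."""
    return s[::-1]

if __name__ == "__main__":
    sample = "你好"  # two characters, three UTF-8 bytes each
    print(byte_length(sample), char_length(sample))
    print(byte_substring(sample, 1, 2), char_substring(sample, 1, 1))
    print(byte_reverse(sample), char_reverse(sample))
```

For the two-character sample, the byte-oriented length is 6 while the character-oriented length is 2, and a 2-byte substring or a byte-wise reverse yields bytes that are no longer valid UTF-8.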

> Proper UTF-8 support in string functions
> ----------------------------------------
>
>                 Key: IMPALA-2019
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2019
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>    Affects Versions: Impala 2.1, Impala 2.2
>            Reporter: Andrés Cordero
>            Assignee: Quanlong Huang
>            Priority: Critical
>              Labels: sql-language
>             Fix For: Impala 4.1.0
>
>
> As documented here: 
> https://impala.apache.org/docs/build/html/topics/impala_string.html
> Impala does not properly handle non-ASCII UTF-8 characters, and will return 
> results in string functions such as length that are inconsistent with Hive.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
