Joe McDonnell created IMPALA-11492:
--------------------------------------

             Summary: ExprTest.Utf8MaskTest fails when en_US.UTF-8 is not present
                 Key: IMPALA-11492
                 URL: https://issues.apache.org/jira/browse/IMPALA-11492
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 4.2.0
            Reporter: Joe McDonnell


In the Docker-based tests on Red Hat 8 and Ubuntu 20, ExprTest.Utf8MaskTest 
fails:
{noformat}
/home/impdev/Impala/be/src/exprs/expr-test.cc:369
Value of: GetValue(expr, ColumnType(TYPE_STRING))
  Actual: "xxxx \xC3\xA1\xC3\xA4\xC3\xA8\xC3\xBC XXXX \xC3\x81\xC3\x84\xC3\x88\xC3\x9C"
Expected: expected_result
Which is: "xxxx xxxx XXXX XXXX"
mask('abcd áäèü ABCD ÁÄÈÜ'){noformat}
These Docker images ship with the C.UTF-8 locale but not en_US.UTF-8. The 
error goes away if I change bin/bootstrap_system.sh to install langpacks-us 
(CentOS) or language-pack-en (Ubuntu), which installs the en_US.UTF-8 locale.
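
To make the locale difference concrete, here is a minimal sketch of how a 
process could probe which locale names are usable at runtime. 
FirstAvailableLocale is a hypothetical helper for illustration, not something 
in the Impala tree. It relies only on the standard behaviour that 
std::setlocale returns a null pointer when it cannot honor a request.

```cpp
#include <clocale>
#include <string>
#include <vector>

// Hypothetical helper (not part of Impala): returns the first candidate name
// that setlocale() accepts for LC_CTYPE, or an empty string if none is
// accepted. The previous LC_CTYPE setting is restored before returning.
std::string FirstAvailableLocale(const std::vector<std::string>& candidates) {
  const char* current = std::setlocale(LC_CTYPE, nullptr);
  const std::string saved = (current != nullptr) ? current : "C";

  std::string found;
  for (const std::string& name : candidates) {
    // A null return means this locale is not usable on this system.
    if (std::setlocale(LC_CTYPE, name.c_str()) != nullptr) {
      found = name;
      break;
    }
  }

  // Put the process back the way we found it.
  std::setlocale(LC_CTYPE, saved.c_str());
  return found;
}
```

Calling FirstAvailableLocale({"en_US.UTF-8", "C.UTF-8"}) on one of the failing 
images would be expected to skip the first name and report the second, though 
I have not run this there.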

This might be related to this code: 
[https://github.com/apache/impala/blob/master/be/src/exprs/mask-functions-ir.cc#L150]
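
I have not confirmed what the linked code does. As an illustration only, the 
sketch below shows one way such a locale dependency can arise, assuming 
characters are decoded and classified with the C library's multibyte and 
wide-character functions under the current LC_CTYPE. 
MaskLettersWithCurrentLocale is a hypothetical function, handles only letters, 
and is not Impala's implementation.

```cpp
#include <clocale>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <cwctype>
#include <string>

// Hypothetical illustration (not Impala's code): replaces lowercase letters
// with 'x' and uppercase letters with 'X'. Decoding and classification both
// depend on the process's current LC_CTYPE locale.
std::string MaskLettersWithCurrentLocale(const std::string& in) {
  std::string out;
  std::mbstate_t state;
  std::memset(&state, 0, sizeof(state));

  const char* p = in.data();
  const char* end = p + in.size();
  while (p < end) {
    wchar_t wc = L'\0';
    std::size_t n =
        std::mbrtowc(&wc, p, static_cast<std::size_t>(end - p), &state);
    if (n == static_cast<std::size_t>(-1) ||
        n == static_cast<std::size_t>(-2)) {
      // The current locale cannot decode this byte sequence: copy one byte
      // through unmasked and reset the conversion state.
      out.push_back(*p++);
      std::memset(&state, 0, sizeof(state));
      continue;
    }
    if (n == 0) n = 1;  // mbrtowc reports 0 for an embedded NUL byte.

    if (std::iswlower(static_cast<std::wint_t>(wc))) {
      out.push_back('x');
    } else if (std::iswupper(static_cast<std::wint_t>(wc))) {
      out.push_back('X');
    } else {
      out.append(p, n);  // Not a letter in this locale: keep the bytes.
    }
    p += n;
  }
  return out;
}
```

With code shaped like this, ASCII letters are masked in any locale, while 
whether the accented characters are recognised as letters depends on which 
locale is active. That would be consistent with the output above, where only 
the ASCII parts were masked.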

Installing the language packs is easy, but I'm not sure whether users would 
have them installed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
