Joe McDonnell created IMPALA-11492:
--------------------------------------
Summary: ExprTest.Utf8MaskTest fails when en_US.UTF-8 is not
present
Key: IMPALA-11492
URL: https://issues.apache.org/jira/browse/IMPALA-11492
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 4.2.0
Reporter: Joe McDonnell
In the docker-based tests on Red Hat 8 and Ubuntu 20, ExprTest.Utf8MaskTest
fails:
{noformat}
/home/impdev/Impala/be/src/exprs/expr-test.cc:369
Value of: GetValue(expr, ColumnType(TYPE_STRING))
Actual: "xxxx \xC3\xA1\xC3\xA4\xC3\xA8\xC3\xBC XXXX \xC3\x81\xC3\x84\xC3\x88\xC3\x9C"
Expected: expected_result
Which is: "xxxx xxxx XXXX XXXX"
mask('abcd áäèü ABCD ÁÄÈÜ')
{noformat}
These Docker images come with the C.UTF-8 locale installed, but not
en_US.UTF-8. The error goes away if I change bin/bootstrap_system.sh to install
langpacks-us (CentOS) or language-pack-en (Ubuntu), which installs the
en_US.UTF-8 locale.
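To confirm whether a given image actually has the locale, a small probe like the one below could be used. This is a minimal sketch, not Impala code: the helper name IsLocaleAvailable is hypothetical, and it assumes that setlocale() returning a null pointer means the locale is not installed.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper (not part of Impala): reports whether the C library
// accepts `name` as an LC_CTYPE locale. The previously active LC_CTYPE
// setting is restored before returning. Not thread-safe, since setlocale()
// modifies process-wide state.
bool IsLocaleAvailable(const char* name) {
  // Passing nullptr queries the current locale without changing it.
  const char* current = std::setlocale(LC_CTYPE, nullptr);
  const std::string saved = (current != nullptr) ? current : "C";

  // setlocale() returns nullptr when the request cannot be honored.
  const bool available = std::setlocale(LC_CTYPE, name) != nullptr;

  // Restore whatever was active before the probe.
  std::setlocale(LC_CTYPE, saved.c_str());
  return available;
}
```

Calling IsLocaleAvailable("en_US.UTF-8") and IsLocaleAvailable("C.UTF-8") in the affected containers would show which of the two locales the test environment can use.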
This might be related to this code:
[https://github.com/apache/impala/blob/master/be/src/exprs/mask-functions-ir.cc#L150]
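As an illustration of how character classification can depend on which locale is installed, here is a rough sketch. It is not taken from mask-functions-ir.cc; the enum CharClass and the function ClassifyWithLocale are hypothetical names, and the sketch only assumes that std::locale throws std::runtime_error when the named locale is unavailable.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical result type for this sketch.
enum class CharClass { kUnavailable, kLower, kUpper, kOther };

// Hypothetical helper (not Impala code): classifies a wide character using
// the ctype<wchar_t> facet of the named locale. Returns kUnavailable when
// the locale cannot be constructed, for example because it is not installed.
CharClass ClassifyWithLocale(wchar_t c, const char* locale_name) {
  try {
    const std::locale loc(locale_name);
    if (std::islower(c, loc)) return CharClass::kLower;
    if (std::isupper(c, loc)) return CharClass::kUpper;
    return CharClass::kOther;
  } catch (const std::runtime_error&) {
    // std::locale(const char*) throws if the name is not a valid locale.
    return CharClass::kUnavailable;
  }
}
```

Comparing ClassifyWithLocale(L'\u00E1', "en_US.UTF-8") with ClassifyWithLocale(L'\u00E1', "C.UTF-8") on an affected image would show whether the non-ASCII letters in the test string are recognized as letters under each locale.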
Installing the language packs is easy, but I'm not sure whether users would
have them installed.