Grant Henke created IMPALA-9583:
-----------------------------------
Summary: Add automated tests for Kudu VARCHAR multibyte truncation
Key: IMPALA-9583
URL: https://issues.apache.org/jira/browse/IMPALA-9583
Project: IMPALA
Issue Type: Improvement
Reporter: Grant Henke
Kudu VARCHAR support was added in IMPALA-5092; however, no automated test was
added to validate that multibyte values are truncated when they are too wide.
Instead, a manual test was performed.
Something like the test below should be added to _test_kudu.py_, along with
updates to the test framework to support non-ASCII characters:
{code:python}
@SkipIfKudu.no_hybrid_clock
def test_kudu_multibyte_vc(self, vector, cursor, kudu_client, unique_database):
  """Test multibyte Kudu VARCHAR values that are wider than Impala's Varchar
  length."""
  cursor.execute("""CREATE TABLE %s.multibyte (a INT PRIMARY KEY, vc VARCHAR(8))
      PARTITION BY HASH(a) PARTITIONS 3 STORED AS KUDU""" % unique_database)
  assert kudu_client.table_exists(
      KuduTestSuite.to_kudu_table_name(unique_database, "multibyte"))
  table = kudu_client.table(
      KuduTestSuite.to_kudu_table_name(unique_database, "multibyte"))
  session = kudu_client.new_session()
  # Not truncated: 1 character in Kudu, 4 bytes in Impala.
  session.apply(table.new_insert((0, "测")))
  # Truncated: 2 characters in Kudu, 8 bytes in Impala.
  session.apply(table.new_insert((1, "测试")))
  session.flush()
  self.run_test_case('QueryTest/kudu_multibyte_vc', vector,
      use_db=unique_database)
{code}
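
The distinction the proposed test relies on, a length counted in characters on
the Kudu side versus a length counted in bytes on the Impala side, can be
illustrated with a small standalone sketch. It assumes Python 3 and UTF-8
encoding. The helper name truncate_utf8 and the sample string are hypothetical
and are not part of Impala or Kudu. The sketch cuts on a character boundary;
whether Impala's own truncation does the same is not stated in this issue and
is an assumption here.

```python
def truncate_utf8(value, max_bytes):
    """Hypothetical helper: shorten value so its UTF-8 encoding fits max_bytes."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # Slicing bytes may split a multibyte character; errors="ignore" drops
    # the incomplete trailing sequence so the result is still valid UTF-8.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# Illustrative value only (not taken from the issue): 7 characters in UTF-8.
sample = "日本語テキスト"
char_count = len(sample)
byte_count = len(sample.encode("utf-8"))
truncated = truncate_utf8(sample, 8)

print(char_count, byte_count, truncated)
```

A value can therefore satisfy a character-based limit while still exceeding a
byte-based limit of the same number, which is the case an automated test would
need to cover.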