Hello Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/21950
to look at the new patch set (#2).
Change subject: IMPALA-13463: Impala should ignore case of Iceberg schema
elements
......................................................................
IMPALA-13463: Impala should ignore case of Iceberg schema elements
Schema element names are case insensitive in Impala. Spark, however, can
create schema elements with upper-case or mixed-case names and store
them as-is in the metadata JSON files of Iceberg, e.g.:
  "schemas" : [ {
    "type" : "struct",
    "schema-id" : 0,
    "fields" : [ {
      "id" : 1,
      "name" : "ID",
      "required" : false,
      "type" : "string"
    }, {
      "id" : 2,
      "name" : "OWNERID",
      "required" : false,
      "type" : "string"
    } ]
  } ],
This can break predicate pushdown in Impala: Impala pushes down
predicates with lower-case column names, while the Iceberg schema
contains upper-case names, so the Iceberg library can throw a
ValidationException.
With this patch Impala invokes Scan.caseSensitive(boolean caseSensitive)
with 'false' on the TableScan object, so that Iceberg resolves column
names case-insensitively.
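
The effect of the case-sensitivity flag can be illustrated with a small
standalone sketch. This is a simplified, hypothetical model of column
name binding, not Iceberg's implementation and not the code in this
patch: the class ColumnBindingSketch, its methods, and the exception
type are made up for illustration. Only the field names and IDs are
taken from the metadata JSON above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of binding a pushed-down predicate's
// column name to a schema field ID. NOT Iceberg's actual implementation.
class ColumnBindingSketch {
    // Field names exactly as Spark stored them in the metadata JSON.
    static Map<String, Integer> sparkSchema() {
        Map<String, Integer> fields = new LinkedHashMap<>();
        fields.put("ID", 1);
        fields.put("OWNERID", 2);
        return fields;
    }

    // Returns the field ID for `column`, or throws if no field matches.
    static int bind(Map<String, Integer> schema, String column,
                    boolean caseSensitive) {
        for (Map.Entry<String, Integer> field : schema.entrySet()) {
            boolean matches = caseSensitive
                ? field.getKey().equals(column)
                : field.getKey().equalsIgnoreCase(column);
            if (matches) {
                return field.getValue();
            }
        }
        throw new IllegalArgumentException(
            "Cannot find field '" + column + "'");
    }

    public static void main(String[] args) {
        Map<String, Integer> schema = sparkSchema();
        // Impala pushes the predicate down with a lower-case column name.
        // Case-insensitive binding finds the upper-case field.
        System.out.println("bound to field id "
            + bind(schema, "ownerid", false));
        // Case-sensitive binding fails for the same name.
        try {
            bind(schema, "ownerid", true);
        } catch (IllegalArgumentException e) {
            System.out.println("case-sensitive binding failed: "
                + e.getMessage());
        }
    }
}
```

In the sketch, the lower-case name "ownerid" only resolves to field 2
when caseSensitive is false, which mirrors why the scan needs to be
switched to case-insensitive mode before predicates are pushed down.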
Testing:
* added e2e test
Change-Id: Iedaf152d8a0c02a124c3dcf8acb59b4ba4e81cf4
---
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/data/id_bucket=3/5b4ef6d2e91c0129-f018b1d800000000_872469098_data.0.parq
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/data/id_bucket=7/504c5f5ae97c4c65-c9fce43a00000000_1852333400_data.0.parq
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/metadata/1a457d69-768a-4bfd-8da5-c080d3b88e50-m0.avro
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/metadata/96461a99-3b56-4573-ab6d-8b8ba3fbcae2-m0.avro
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/metadata/snap-1855055649619147667-1-96461a99-3b56-4573-ab6d-8b8ba3fbcae2.avro
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/metadata/snap-7743982156242154468-1-1a457d69-768a-4bfd-8da5-c080d3b88e50.avro
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/metadata/v3.metadata.json
A testdata/data/iceberg_test/iceberg_column_case_sensitivity_issue/metadata/version-hint.text
A testdata/workloads/functional-query/queries/QueryTest/iceberg-column-case-sensitivity-issue.test
M tests/query_test/test_iceberg.py
11 files changed, 161 insertions(+), 2 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/21950/2
--
To view, visit http://gerrit.cloudera.org:8080/21950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iedaf152d8a0c02a124c3dcf8acb59b4ba4e81cf4
Gerrit-Change-Number: 21950
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>