Hello Tamas Mate, Zoltan Borok-Nagy, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/20951
to look at the new patch set (#4).
Change subject: IMPALA-12598: Allow multiple equality field id lists for
Iceberg tables
......................................................................
IMPALA-12598: Allow multiple equality field id lists for Iceberg tables
This patch adds support for reading Iceberg tables that have
different equality field ID lists associated to different equality
delete files. In practice this is a use case when one equality delete
file deletes by e.g. columnA and columnB while another one deletes by
columnB and columnC.
In order to achieve such functionality the plan tree creation needed
some adjustments so that it can create separate LEFT ANTI JOIN nodes
for the different equality field ID lists.
Testing:
- Flink and NiFi was used for creating some test tables with the
desired equality field IDs. Coverage on these tables are added to
the test suite.
Change-Id: I3e52d7a5800bf1b479f0c234679be92442d09f79
---
M common/fbs/IcebergObjects.fbs
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergEqualityDeleteTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/data/README
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/data/af4e128ee3256830-d9bd9e2f00000000_1372039299_data.0.parq
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/data/delete-41417e7df44b347b-e035009600000001_138281890_data.0.parq
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/data/delete-61438487836ebfcc-95c9ce7a00000000_909175610_data.0.parq
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/2d3fafd7-bce6-483f-be82-e0ccce9203fc-m0.avro
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/57a963d3-0e4e-4540-8080-a57afd51ba99-m0.avro
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/8bd425d8-25fb-4603-8cc7-aeb5ad2a3917-m0.avro
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/snap-397031335297740726-1-2d3fafd7-bce6-483f-be82-e0ccce9203fc.avro
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/snap-6117850509763739078-1-57a963d3-0e4e-4540-8080-a57afd51ba99.avro
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/snap-8494861454990126958-1-8bd425d8-25fb-4603-8cc7-aeb5ad2a3917.avro
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/v3.metadata.json
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/v4.metadata.json
D
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_different_equality_ids/metadata/version-hint.text
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/data/00000-22-09b63f85-c950-4d32-a9fd-0abd5229d768-00001.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/data/00000-23-1365d982-ca94-4e1d-9b86-2f973c000cf0-00001.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/data/00000-23-1365d982-ca94-4e1d-9b86-2f973c000cf0-00002.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/data/00000-24-94d478ff-204b-4eb4-9a71-9876a31daf64-00001.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/data/00000-24-94d478ff-204b-4eb4-9a71-9876a31daf64-00002.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/data/00000-25-8814a344-4216-4d37-9e9b-091de336e3fa-00001.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/data/00000-25-8814a344-4216-4d37-9e9b-091de336e3fa-00002.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/2eb78a39-190a-4eab-b24a-df07f32f2cc0-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/2eb78a39-190a-4eab-b24a-df07f32f2cc0-m1.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/58a6295d-0076-4e3e-ac77-84ce48c406cf-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/58a6295d-0076-4e3e-ac77-84ce48c406cf-m1.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/b8fe0a34-e755-4ba0-92f1-c72bef85a82a-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/b8fe0a34-e755-4ba0-92f1-c72bef85a82a-m1.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/ea25da34-c91b-4a0f-a003-3958e87caffd-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/snap-2374780975027972430-1-ea25da34-c91b-4a0f-a003-3958e87caffd.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/snap-4077234998626563290-1-58a6295d-0076-4e3e-ac77-84ce48c406cf.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/snap-5777805847908928861-1-2eb78a39-190a-4eab-b24a-df07f32f2cc0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/snap-8127619959873391049-1-b8fe0a34-e755-4ba0-92f1-c72bef85a82a.avro
R
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/v1.metadata.json
R
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/v2.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/v3.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/v4.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/v5.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_equality_multi_eq_ids/metadata/version-hint.text
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/data/00000-0-7788dcf5-a880-466d-ae9d-2dd332f98412-00001.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/data/00000-0-7788dcf5-a880-466d-ae9d-2dd332f98412-00002.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/data/00000-0-ddf90527-66f7-41de-bd3a-a6ef952918fc-00001.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/data/00000-0-ddf90527-66f7-41de-bd3a-a6ef952918fc-00002.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/data/00000-0-e93b89d3-fcf6-4847-8fd1-68e5b33d0ad6-00001.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/data/00000-0-e93b89d3-fcf6-4847-8fd1-68e5b33d0ad6-00002.parquet
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/data/delete-3e480099fc20aca4-23ae231a00000001_738940911_data.0.parq
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/103b5b20-fb15-41bb-a97d-1e2ddc147650-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/c0500e2e-00c0-48fb-9c29-31bbafc91d57-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/c0500e2e-00c0-48fb-9c29-31bbafc91d57-m1.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/d7fa3972-f84c-4b35-aa37-2079458ccea8-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/d7fa3972-f84c-4b35-aa37-2079458ccea8-m1.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/f9fa006c-0078-4caf-8eaf-f9d499fc6939-m0.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/f9fa006c-0078-4caf-8eaf-f9d499fc6939-m1.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/snap-152862018760071153-1-c0500e2e-00c0-48fb-9c29-31bbafc91d57.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/snap-2066775081852432762-1-f9fa006c-0078-4caf-8eaf-f9d499fc6939.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/snap-6283211732171745116-1-103b5b20-fb15-41bb-a97d-1e2ddc147650.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/snap-7591397613223797435-1-d7fa3972-f84c-4b35-aa37-2079458ccea8.avro
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/v1.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/v2.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/v3.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/v4.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/v5.metadata.json
A
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_pos_and_multi_eq_ids/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-equality-deletes.test
M tests/query_test/test_iceberg.py
73 files changed, 1,599 insertions(+), 426 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/20951/4
--
To view, visit http://gerrit.cloudera.org:8080/20951
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3e52d7a5800bf1b479f0c234679be92442d09f79
Gerrit-Change-Number: 20951
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>