Hello Zoltan Borok-Nagy, Impala Public Jenkins,
I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/18912
to review the following change.
Change subject: IMPALA-11034: Resolve schema of old data files in migrated
Iceberg tables
......................................................................
IMPALA-11034: Resolve schema of old data files in migrated Iceberg tables
When external tables are converted to Iceberg, the data files remain
intact, so they lack field IDs. Previously, Impala used name-based
column resolution in this case.
Added a feature that traverses the data files before column
resolution and assigns field IDs the same way Iceberg would, so that
field-ID-based column resolution can be used.
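As an illustration only, the following is a minimal sketch of how field IDs could be assigned to a file schema that has none. It assumes the ordering follows Iceberg's fresh-ID assignment, in which all fields of a struct are numbered before their nested types are visited. The dictionary-based schema layout and the function name `assign_field_ids` are hypothetical. The actual C++ logic in `orc-metadata-utils.cc` and `parquet-metadata-utils.cc` is not shown in this message and may differ.

```python
from itertools import count


def assign_field_ids(schema):
    """Return a copy of `schema` (a list of field dicts) with an 'id' on every field.

    Hypothetical schema layout: a field is {"name": str, "type": T}, where T is
    a primitive type name (str) or a dict with "kind" in {"struct", "list", "map"}.
    """
    counter = count(1)  # assumed: IDs start at 1 and increase by 1

    def visit_struct(fields):
        # Number every field at this level first, then descend into nested types.
        numbered = [dict(f, id=next(counter)) for f in fields]
        for f in numbered:
            f["type"] = visit_type(f["type"])
        return numbered

    def visit_type(t):
        if isinstance(t, str):  # primitive type, nothing nested to number
            return t
        kind = t["kind"]
        if kind == "struct":
            return {"kind": "struct", "fields": visit_struct(t["fields"])}
        if kind == "list":
            element_id = next(counter)
            return {"kind": "list", "element_id": element_id,
                    "element": visit_type(t["element"])}
        if kind == "map":
            key_id = next(counter)
            value_id = next(counter)
            return {"kind": "map", "key_id": key_id, "value_id": value_id,
                    "key": visit_type(t["key"]), "value": visit_type(t["value"])}
        raise ValueError("unknown type kind: %r" % kind)

    return visit_struct(schema)


# Example file schema without IDs (column names are made up for illustration).
example_schema = [
    {"name": "id", "type": "int"},
    {"name": "s", "type": {"kind": "struct", "fields": [
        {"name": "a", "type": "int"},
        {"name": "b", "type": "string"},
    ]}},
    {"name": "arr", "type": {"kind": "list", "element": "int"}},
]
example_with_ids = assign_field_ids(example_schema)
```

Under this assumed ordering, the three top-level columns receive IDs 1 to 3 before the struct members and the list element are numbered.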
Testing:
The default resolution method was changed to field ID for migrated tables,
so existing tests now use it.
Added new tests to cover edge cases with complex types and schema
evolution.
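To illustrate why schema evolution is an edge case worth covering, here is a small sketch that contrasts name-based and field-ID-based resolution after a column rename. The column names, the dictionary layout, and both helper functions are hypothetical and are not taken from the test files in this change.

```python
def resolve_by_name(table_cols, file_cols):
    """Map each table column name to the file column with the same name, or None."""
    file_by_name = {c["name"]: c for c in file_cols}
    return {t["name"]: file_by_name.get(t["name"]) for t in table_cols}


def resolve_by_field_id(table_cols, file_cols):
    """Map each table column name to the file column with the same field ID, or None."""
    file_by_id = {c["id"]: c for c in file_cols}
    return {t["name"]: file_by_id.get(t["id"]) for t in table_cols}


# Columns of an old data file, with field IDs assumed to have been assigned already.
file_cols = [{"name": "a", "id": 1}, {"name": "b", "id": 2}]
# Table schema after column "b" was renamed to "c"; the field ID is unchanged.
table_cols = [{"name": "a", "id": 1}, {"name": "c", "id": 2}]

by_name = resolve_by_name(table_cols, file_cols)
by_id = resolve_by_field_id(table_cols, file_cols)
```

In this sketch, name-based resolution finds no file column for `c`, while field-ID-based resolution still maps it to the data written under the old name `b`.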
Change-Id: I77570bbfc2fcc60c2756812d7210110e8cc11ccc
Reviewed-on: http://gerrit.cloudera.org:8080/18639
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M testdata/data/README
A testdata/data/iceberg_test/iceberg_migrated_alter_test/000000_0
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/c9f83a82-60f4-443b-9ca4-359cad16fe12-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/snap-2941076094076108396-1-c9f83a82-60f4-443b-9ca4-359cad16fe12.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/000000_0
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/340a3b82-71e3-4f50-b030-aecb5a5ea730-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/snap-2205107170480729038-1-340a3b82-71e3-4f50-b030-aecb5a5ea730.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_migrated_complex_test/000000_0
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/152e384f-2851-44b7-9ada-1bfbec74e9fc-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/snap-3911840040574896148-1-152e384f-2851-44b7-9ada-1bfbec74e9fc.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/000000_0
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/8588fd4b-13c1-4451-80ad-5cf71a959b94-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/snap-3622599918649152504-1-8588fd4b-13c1-4451-80ad-5cf71a959b94.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/version-hint.text
A testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-table-field-id-resolution.test
M tests/common/file_utils.py
M tests/query_test/test_iceberg.py
32 files changed, 1,874 insertions(+), 21 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/18912/1
--
To view, visit http://gerrit.cloudera.org:8080/18912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: branch-4.1.1
Gerrit-MessageType: newchange
Gerrit-Change-Id: I77570bbfc2fcc60c2756812d7210110e8cc11ccc
Gerrit-Change-Number: 18912
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>