Quanlong Huang has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18912 )

Change subject: IMPALA-11034: Resolve schema of old data files in migrated Iceberg tables
......................................................................

IMPALA-11034: Resolve schema of old data files in migrated Iceberg tables

When external tables are converted to Iceberg, the existing data files
are left untouched and therefore lack field IDs. Previously, Impala fell
back to name-based column resolution in this case.

Added a feature that traverses the data files before column resolution
and assigns field IDs the same way Iceberg would, so that field-ID-based
column resolution can be used.
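
A minimal sketch of the idea, not the code from this change. It assumes
the numbering follows the order Iceberg uses when it assigns fresh IDs to
a schema: every field of a struct gets its ID first, then each field's
nested type is visited in turn. The names Node and assign_field_ids are
hypothetical, and the actual C++ logic lives in the ORC and Parquet
metadata utilities listed below.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical schema node; a list has one child, a map has two."""
    name: str
    children: List["Node"] = field(default_factory=list)
    field_id: Optional[int] = None


def assign_field_ids(fields: List[Node], next_id: int = 1) -> int:
    """Number all fields on this level, then recurse into their children.

    Returns the next unused ID so callers can continue the sequence.
    """
    # First pass: every sibling on this level receives an ID.
    for f in fields:
        f.field_id = next_id
        next_id += 1
    # Second pass: descend into nested types in declaration order.
    for f in fields:
        next_id = assign_field_ids(f.children, next_id)
    return next_id


# Example schema: id, info STRUCT<name, tags ARRAY<element>>, score
element = Node("element")
tags = Node("tags", [element])
name = Node("name")
info = Node("info", [name, tags])
schema = [Node("id"), info, Node("score")]
next_free = assign_field_ids(schema)

for node in (schema[0], info, schema[2], name, tags, element):
    print(node.name, node.field_id)
```

Once every node in the file schema carries an ID, a reader can match
table columns to file columns by ID, which stays stable across renames,
whereas name-based matching does not.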

Testing:

The default resolution method for migrated tables was changed to field
ID, so existing tests now exercise it.

Added new tests to cover edge cases with complex types and schema
evolution.

Change-Id: I77570bbfc2fcc60c2756812d7210110e8cc11ccc
Reviewed-on: http://gerrit.cloudera.org:8080/18639
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-on: http://gerrit.cloudera.org:8080/18912
Tested-by: Quanlong Huang <[email protected]>
Reviewed-by: Tamas Mate <[email protected]>
---
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M testdata/data/README
A testdata/data/iceberg_test/iceberg_migrated_alter_test/000000_0
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/c9f83a82-60f4-443b-9ca4-359cad16fe12-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/snap-2941076094076108396-1-c9f83a82-60f4-443b-9ca4-359cad16fe12.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/000000_0
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/340a3b82-71e3-4f50-b030-aecb5a5ea730-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/snap-2205107170480729038-1-340a3b82-71e3-4f50-b030-aecb5a5ea730.avro
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_alter_test_orc/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_migrated_complex_test/000000_0
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/152e384f-2851-44b7-9ada-1bfbec74e9fc-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/snap-3911840040574896148-1-152e384f-2851-44b7-9ada-1bfbec74e9fc.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/000000_0
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/8588fd4b-13c1-4451-80ad-5cf71a959b94-m0.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/snap-3622599918649152504-1-8588fd4b-13c1-4451-80ad-5cf71a959b94.avro
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_migrated_complex_test_orc/metadata/version-hint.text
A testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-table-field-id-resolution.test
M tests/common/file_utils.py
M tests/query_test/test_iceberg.py
32 files changed, 1,874 insertions(+), 21 deletions(-)

Approvals:
  Quanlong Huang: Verified
  Tamas Mate: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/18912

Gerrit-Project: Impala-ASF
Gerrit-Branch: branch-4.1.1
Gerrit-MessageType: merged
Gerrit-Change-Id: I77570bbfc2fcc60c2756812d7210110e8cc11ccc
Gerrit-Change-Number: 18912
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
