Gergely Fürnstáhl has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/18639 )
Change subject: IMPALA-11034: Resolve schema of old data files in migrated Iceberg tables
......................................................................

IMPALA-11034: Resolve schema of old data files in migrated Iceberg tables

When external tables are converted to Iceberg, the data files remain
intact and therefore lack field IDs. Previously, Impala used name-based
column resolution in this case.

Added a feature that traverses the data files before column resolution
and assigns field IDs the same way Iceberg would, so that field-ID-based
column resolution can be used.

Testing:

The default resolution method was changed to field ID for migrated
tables; existing tests now use it.

Added new tests to cover edge cases with complex types and schema
evolution.

Change-Id: I77570bbfc2fcc60c2756812d7210110e8cc11ccc
---
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M testdata/data/README
A testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-table-field-id-resolution.test
M tests/common/file_utils.py
M tests/query_test/test_iceberg.py
8 files changed, 402 insertions(+), 21 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/18639/10
--
To view, visit http://gerrit.cloudera.org:8080/18639
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77570bbfc2fcc60c2756812d7210110e8cc11ccc
Gerrit-Change-Number: 18639
Gerrit-PatchSet: 10
Gerrit-Owner: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
