Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/22610 )
Change subject: IMPALA-13853: Don't adjust Iceberg field IDs for data files that don't have complex types
......................................................................

IMPALA-13853: Don't adjust Iceberg field IDs for data files that don't have complex types

In migrated Iceberg tables we can have data files with missing field
IDs. We assume that their schema corresponds to the table schema at
the point when the table migration happened. This means that we can
generate the field IDs at runtime.

The logic is more complicated when there are complex types in the
table and the table is partitioned. In such cases we need to make some
adjustments during field ID generation, and when we do so we verify
that the file schema corresponds to the table schema.

These adjustments are not needed when the table doesn't have complex
types, hence we can be a bit more relaxed and skip schema verification,
because field ID generation for top-level columns is not affected.
This means Impala would still be able to read the table if there were
trivial schema changes before migration.

With this change we allow all data files that have a schema compatible
with the table schema, which was the case before IMPALA-13364. This
behavior is also aligned with Hive.
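
The decision described above can be sketched roughly as follows. This is an illustrative Python model only, not the C++ code in the changed files. The names `Column`, `has_complex_types` and `resolve_field_ids` are hypothetical. The commit message does not spell out what the adjustment is, so the sketch assumes an Iceberg-style numbering in which all top-level columns (including partition columns that are absent from the data file) are numbered first and nested fields afterwards. It also handles only one level of nesting and compares schemas by column name.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Column:
    """Hypothetical column model; non-empty `children` means a complex type."""
    name: str
    children: List["Column"] = field(default_factory=list)


def has_complex_types(columns: List[Column]) -> bool:
    return any(c.children for c in columns)


def resolve_field_ids(file_columns: List[Column],
                      table_columns: List[Column],
                      num_partition_cols: int) -> Dict[str, int]:
    """Generate field IDs for a data file that carries none.

    `table_columns` holds the non-partition columns of the table schema.
    """
    # Top-level columns are numbered by position in the file.
    ids = {c.name: i for i, c in enumerate(file_columns, start=1)}

    if not has_complex_types(file_columns):
        # Relaxed path: top-level IDs are unaffected by any adjustment,
        # so the file schema is not verified against the table schema.
        return ids

    # Strict path: the adjustment is only safe if the schemas match.
    if [c.name for c in file_columns] != [c.name for c in table_columns]:
        raise ValueError("file schema does not match table schema")

    # Assumed adjustment: nested IDs start after every top-level column,
    # including the partition columns that are not stored in the file.
    next_id = len(file_columns) + num_partition_cols + 1
    for col in file_columns:
        for child in col.children:
            ids[f"{col.name}.{child.name}"] = next_id
            next_id += 1
    return ids
```

Under these assumptions, a flat file written before a column was added to the table still resolves through the relaxed path, while a file with a struct column must match the table schema and has its nested IDs shifted by the number of partition columns.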

Testing:
 * e2e tests added for both Parquet and ORC files

Change-Id: Ib1f1d0cf36792d0400de346c83e999fa50c0fa67
Reviewed-on: http://gerrit.cloudera.org:8080/22610
Reviewed-by: Daniel Becker <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/orc/orc-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
A testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-table-field-id-resolution-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-table-field-id-resolution.test
M tests/query_test/test_iceberg.py
5 files changed, 98 insertions(+), 3 deletions(-)

Approvals:
  Daniel Becker: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/22610
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib1f1d0cf36792d0400de346c83e999fa50c0fa67
Gerrit-Change-Number: 22610
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
