Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21993 )
Change subject: IMPALA-13484: Don't call alter_table() on HMS when loading Iceberg table
......................................................................

IMPALA-13484: Don't call alter_table() on HMS when loading Iceberg table

When Impala loads an Iceberg table, it also loads the table's metastore
representation from HMS. Additionally, when Impala loads the Iceberg
metadata of the table, it constructs a second metastore representation
from that metadata. If these two metastore representations don't match,
Impala calls alter_table() on HMS to persist the differences.

In practice this behaviour is triggered, for instance, when another
engine creates a table with column types that Impala can't read, and
Impala adjusts the column type instead. E.g. if an engine creates a
'timestamp with local time zone' column, Impala changes the column type
in the HMS representation to 'timestamp'.

There are some issues with this approach:
1: Impala calls HMS.alter_table() directly and doesn't change the table
   through the Iceberg API. As a result, no conflict checks are
   performed. Since the metadata location is a simple table property,
   this alter_table() call can overwrite the metadata pointer if there
   have been other changes to the table since Impala started to load
   it. This can result in data correctness issues.
2: Even in use cases where Impala only reads a table, it can still
   modify the table, which is surprising behaviour for a read path.
3: With this approach Impala changes the table schema in HMS but
   doesn't change the Iceberg schema in the Iceberg metadata.

As a solution, we can simply get rid of the logic that makes the
alter_table() call to HMS at the end of loading an Iceberg table. This
avoids a lot of confusion, and in fact persisting the schema
adjustments Impala made during table loading is not necessary.

Testing:
Ran the whole exhaustive test suite.
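
The lost update described in issue 1 can be illustrated with a small,
self-contained Java sketch. It does not use the real HMS or Iceberg
APIs: FakeMetastore, alterTable(), commitIfUnchanged() and the file
names are hypothetical stand-ins that model only a property map holding
the metadata location. It contrasts a blind overwrite with a
conflict-checked update of the kind the message says is skipped.

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataPointerRace {
    static final String METADATA_LOCATION = "metadata_location";

    // Hypothetical in-memory stand-in for a metastore table entry.
    static class FakeMetastore {
        private final Map<String, String> props = new HashMap<>();

        FakeMetastore(String initialLocation) {
            props.put(METADATA_LOCATION, initialLocation);
        }

        // Returns a copy, like a loader holding its own table snapshot.
        synchronized Map<String, String> getTable() {
            return new HashMap<>(props);
        }

        // Blind overwrite: replaces all properties, no conflict check.
        synchronized void alterTable(Map<String, String> newProps) {
            props.clear();
            props.putAll(newProps);
        }

        // Checked update: succeeds only if the pointer is still the expected one.
        synchronized boolean commitIfUnchanged(String expected, String next) {
            if (!expected.equals(props.get(METADATA_LOCATION))) {
                return false;
            }
            props.put(METADATA_LOCATION, next);
            return true;
        }

        synchronized String metadataLocation() {
            return props.get(METADATA_LOCATION);
        }
    }

    // The loader writes back its stale snapshot after another writer committed.
    static String blindAlterAfterConcurrentCommit() {
        FakeMetastore hms = new FakeMetastore("metadata/v1.json");
        Map<String, String> loaded = hms.getTable();            // load starts
        hms.commitIfUnchanged("metadata/v1.json", "metadata/v2.json"); // other engine commits
        loaded.put("adjusted_column_type", "timestamp");        // loader's schema adjustment
        hms.alterTable(loaded);                                 // stale pointer written back
        return hms.metadataLocation();
    }

    // The same interleaving, but the loader's write is conflict-checked.
    static boolean checkedCommitAfterConcurrentCommit() {
        FakeMetastore hms = new FakeMetastore("metadata/v1.json");
        String seen = hms.metadataLocation();
        hms.commitIfUnchanged(seen, "metadata/v2.json");        // other engine commits
        return hms.commitIfUnchanged(seen, "metadata/v1-adjusted.json");
    }

    public static void main(String[] args) {
        System.out.println(blindAlterAfterConcurrentCommit());    // prints metadata/v1.json
        System.out.println(checkedCommitAfterConcurrentCommit()); // prints false
    }
}
```

In the blind case the pointer silently reverts to v1 and the other
engine's commit is lost; in the checked case the stale write is
rejected. Removing the alter_table() call from the load path avoids the
blind case altogether.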

Change-Id: Icd5d1eee421f22d3853833dc571b14d4e9005ab3
Reviewed-on: http://gerrit.cloudera.org:8080/21993
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
1 file changed, 0 insertions(+), 16 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icd5d1eee421f22d3853833dc571b14d4e9005ab3
Gerrit-Change-Number: 21993
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
