[
https://issues.apache.org/jira/browse/IMPALA-13173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Noémi Pap-Takács updated IMPALA-13173:
--------------------------------------
Priority: Minor (was: Major)
> Redundant Catalog Update Check in Coordinator
> ---------------------------------------------
>
> Key: IMPALA-13173
> URL: https://issues.apache.org/jira/browse/IMPALA-13173
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Reporter: Noémi Pap-Takács
> Assignee: Noémi Pap-Takács
> Priority: Minor
> Labels: impala-iceberg
>
> For DML operations, the Coordinator sends an update to the Catalog about the
> files changed in the table. Before sending the update, we check whether any
> files were created. If no files were added or deleted, we skip the catalog
> update. See the logic in _'DmlExecState::PrepareCatalogUpdate'_.
> However, for unpartitioned Iceberg tables, the check in
> _'DmlExecState::PrepareCatalogUpdate'_ always returns true, so the Coordinator
> updates the Catalog even if no files were added. Currently, this does not
> cause incorrect behavior because the presence of created files is checked
> again later in client-request-state.cc.
> On the other hand, there are cases where writing no files is not a NO-OP, for
> example overwriting a table with empty content, or an OPTIMIZE TABLE that
> merges delete files. The Catalog needs to be informed about the changes in
> such cases.
> We should filter NO-OP DMLs correctly in the Coordinator, eliminating both
> false-positive and false-negative catalog updates.
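
The filtering rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `DmlOutcome` and `NeedsCatalogUpdate` are hypothetical names, not Impala types or functions, and the real decision in `DmlExecState::PrepareCatalogUpdate` may need different inputs.

```cpp
#include <cstdint>

// Hypothetical summary of what a DML statement did (not an Impala type).
struct DmlOutcome {
  int64_t files_added = 0;            // data or delete files written
  int64_t files_removed = 0;          // files dropped from the table
  bool is_overwrite = false;          // e.g. INSERT OVERWRITE
  bool rewrote_delete_files = false;  // e.g. OPTIMIZE TABLE merging delete files
};

// Returns true when the Catalog should be informed about the DML.
// Hypothetical helper; it treats every overwrite as a change, which is
// conservative (overwriting an empty table with nothing is still reported).
inline bool NeedsCatalogUpdate(const DmlOutcome& outcome) {
  // Any file-level change must reach the Catalog.
  if (outcome.files_added > 0 || outcome.files_removed > 0) return true;
  // No files were written, but the table contents may still have changed.
  return outcome.is_overwrite || outcome.rewrote_delete_files;
}
```

With this shape, a DML that writes nothing and is neither an overwrite nor a delete-file rewrite is skipped regardless of partitioning, while an overwrite with empty content still triggers the update.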