[
https://issues.apache.org/jira/browse/IMPALA-12708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zoltán Borók-Nagy resolved IMPALA-12708.
----------------------------------------
Fix Version/s: Impala 4.4.0
Resolution: Fixed
> An UPDATE creates 2 new snapshots in Iceberg tables
> ---------------------------------------------------
>
> Key: IMPALA-12708
> URL: https://issues.apache.org/jira/browse/IMPALA-12708
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Noemi Pap-Takacs
> Assignee: Zoltán Borók-Nagy
> Priority: Major
> Labels: iceberg, impala-iceberg
> Fix For: Impala 4.4.0
>
>
> The UPDATE statement is now supported for Iceberg tables in Impala.
> The implementation creates the delete file(s) and the new data file(s) for
> the updated row(s). These files are committed in one Iceberg transaction, but
> the transaction adds two snapshots to the table: the first contains the
> delete file(s), and the second adds the new data file(s) of the updated row(s).
> This results in an unusual table history. The first, temporary snapshot of
> the transaction has no time information associated with it (the table
> spends 0 time in that state), and it does not appear as a separate entry
> when we query the table history. Therefore it cannot be queried with time
> travel based on system time. However, it does appear in the history as the
> parent of the current snapshot, and it can be queried by snapshot id, which
> returns results from an invalid table state.
> Impala should create only 1 new snapshot per UPDATE statement, so that the
> parent of the current snapshot points to the previous valid table state.
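
The parent-pointer problem described above can be illustrated with a small
toy model. This is only a sketch under stated assumptions: ToyTable, Snapshot,
commit_transaction and demo are hypothetical names invented for illustration,
not Impala or Iceberg APIs, and the operation labels are illustrative. The
model assumes that every operation in a transaction adds one snapshot, and
that only the last snapshot of a transaction gets a history entry.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]
    operation: str  # illustrative label, e.g. "append", "delete", "overwrite"


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table's snapshot metadata."""
    snapshots: List[Snapshot] = field(default_factory=list)
    # (logical timestamp, snapshot_id): one entry per committed transaction
    history: List[Tuple[int, int]] = field(default_factory=list)
    clock: int = 0

    def current(self) -> Optional[Snapshot]:
        return self.snapshots[-1] if self.snapshots else None

    def commit_transaction(self, operations: List[str]) -> None:
        # Assumption: each operation produces its own snapshot, chained to
        # the previous one, but only the final snapshot is logged in history.
        for op in operations:
            cur = self.current()
            parent_id = cur.snapshot_id if cur is not None else None
            self.snapshots.append(
                Snapshot(len(self.snapshots) + 1, parent_id, op))
        self.clock += 1
        self.history.append((self.clock, self.current().snapshot_id))

    def in_history(self, snapshot_id: Optional[int]) -> bool:
        return any(sid == snapshot_id for _, sid in self.history)


def demo(update_ops: List[str]) -> Tuple[Optional[int], bool]:
    """Return (parent id of current snapshot, whether that parent is in history)."""
    table = ToyTable()
    table.commit_transaction(["append"])  # snapshot 1: initial valid state
    table.commit_transaction(update_ops)  # the UPDATE statement
    parent = table.current().parent_id
    return parent, table.in_history(parent)


if __name__ == "__main__":
    # Two snapshots per UPDATE: the parent is the hidden intermediate state.
    print(demo(["delete", "append"]))  # (2, False)
    # One snapshot per UPDATE: the parent is the previous valid state.
    print(demo(["overwrite"]))         # (1, True)
```

In the two-snapshot case the parent of the current snapshot (id 2) never
appears in the history, which mirrors the invalid intermediate state described
in the issue. In the one-snapshot case the parent (id 1) is the previous valid
table state.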