Zoltán Borók-Nagy created IMPALA-12640:
------------------------------------------
Summary: Remove IcebergDeleteSink
Key: IMPALA-12640
URL: https://issues.apache.org/jira/browse/IMPALA-12640
Project: IMPALA
Issue Type: Bug
Components: Backend, Frontend
Reporter: Zoltán Borók-Nagy
UPDATE part 3 CR (https://gerrit.cloudera.org/#/c/20760/) introduces a new sink
operator for position delete records: IcebergBufferedDeleteSink.
The new operator can be used for UPDATEs even when a partition column value is
updated or the table has SORT BY properties.
IcebergBufferedDeleteSink doesn't require its input to be sorted by delete
partitions, file paths, and positions, because it does that ordering itself.
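To illustrate why a buffering sink does not need pre-sorted input, here is a
minimal Python sketch. It is not the Impala implementation (which lives in the
C++ backend); the class name BufferedDeleteSketch and its methods are
hypothetical. It buffers (partition, file path, position) records in arrival
order and emits them ordered by all three keys at flush time.

```python
from collections import defaultdict


class BufferedDeleteSketch:
    """Hypothetical stand-in for a buffered position-delete sink."""

    def __init__(self):
        # partition -> file path -> list of row positions (unsorted while buffering)
        self._buf = defaultdict(lambda: defaultdict(list))

    def add(self, partition, file_path, position):
        # Records may arrive in any order; each file path is stored once as a key.
        self._buf[partition][file_path].append(position)

    def flush(self):
        """Yield (partition, file_path, position) ordered by all three keys."""
        for partition in sorted(self._buf):
            files = self._buf[partition]
            for file_path in sorted(files):
                for position in sorted(files[file_path]):
                    yield partition, file_path, position


sink = BufferedDeleteSketch()
for row in [
    ("p=2", "b.parquet", 7),
    ("p=1", "a.parquet", 42),
    ("p=1", "a.parquet", 3),
    ("p=2", "b.parquet", 1),
]:
    sink.add(*row)

ordered = list(sink.flush())
for record in ordered:
    print(record)
```

Because sorting happens inside the sink, the planner would not need to place a
sort node in front of it, which is what makes it usable alongside partition
column updates and SORT BY tables.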
The only area where IcebergBufferedDeleteSink lags behind IcebergDeleteSink is
that it cannot spill to disk. But since it stores file paths and positions in a
compact format, it is unlikely that it would ever need to spill to disk in a
real-life situation. E.g. even if 100M rows need to be deleted per Impala
executor, the amount of memory required is not much larger than 800 MB per
executor.
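The 800 MB figure is consistent with a back-of-the-envelope estimate, sketched
below in Python. The assumptions are mine, not taken from the issue: each
position is held as a 64-bit integer (8 bytes), and each file path is stored
once per data file rather than once per row. The file count and average path
length are purely illustrative numbers.

```python
BYTES_PER_POSITION = 8  # assumption: one 64-bit integer per deleted row


def estimate_buffer_bytes(num_rows, num_files=0, avg_path_bytes=0):
    """Rough buffer size: positions plus one stored path per data file."""
    return num_rows * BYTES_PER_POSITION + num_files * avg_path_bytes


# 100M deleted rows per executor, positions only.
positions_only = estimate_buffer_bytes(100_000_000)

# Same rows, plus 10,000 files with ~200-byte paths (hypothetical values).
with_paths = estimate_buffer_bytes(
    100_000_000, num_files=10_000, avg_path_bytes=200
)

print(positions_only / 1e6)  # 800.0 (MB)
print(with_paths / 1e6)      # 802.0 (MB)
```

Under these assumptions the positions dominate, and the file paths add only a
few MB on top, which matches the "not much larger than 800 MB" estimate.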
--
This message was sent by Atlassian Jira
(v8.20.10#820010)