[
https://issues.apache.org/jira/browse/IMPALA-13655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17911552#comment-17911552
]
Noemi Pap-Takacs edited comment on IMPALA-13655 at 1/9/25 3:10 PM:
-------------------------------------------------------------------
Same root cause as IMPALA-13598:
When IcebergUpdateImpl created the insert table sink it didn't set
'inputIsClustered' to true. Therefore HdfsTableSink expects random input and
keeps the output writers open for every partition, which results in high memory
consumption and potentially an OOM error when the number of updated rows and
the number of partitions are high.
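
To illustrate why the flag matters, here is a toy model of the behaviour described above. It is not Impala code: the function name peak_open_writers and its parameters are hypothetical, and it only counts how many per-partition writers would be open at once. It assumes that with clustered input the sink can close a partition's writer as soon as the partition key changes.

```python
def peak_open_writers(partition_keys, input_is_clustered):
    """Return the maximum number of partition writers open at the same time.

    Toy model only (hypothetical, not Impala's HdfsTableSink): each distinct
    partition key gets one writer, and a writer is closed early only when the
    input is declared clustered.
    """
    open_writers = set()
    peak = 0
    current_key = None
    for key in partition_keys:
        if input_is_clustered and key != current_key:
            # Clustered input: all rows of a partition arrive together, so
            # the previous partition's writer can be closed right away.
            open_writers.clear()
        current_key = key
        open_writers.add(key)
        peak = max(peak, len(open_writers))
    return peak


# 10,000 rows spread over 1,000 partitions, arriving interleaved.
rows = [row_id % 1000 for row_id in range(10_000)]

# Sink treats the input as random: every writer stays open until the end.
random_peak = peak_open_writers(rows, input_is_clustered=False)

# Input sorted by partition key and declared clustered: one writer at a time.
clustered_peak = peak_open_writers(sorted(rows), input_is_clustered=True)

print(random_peak, clustered_peak)
```

In this model the peak writer count grows with the number of partitions when the input is treated as random, and stays at one when the input is clustered, which matches the memory pattern reported in this issue.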
> UPDATE redundantly accumulates memory in HDFS WRITER
> ----------------------------------------------------
>
> Key: IMPALA-13655
> URL: https://issues.apache.org/jira/browse/IMPALA-13655
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Noemi Pap-Takacs
> Assignee: Noemi Pap-Takacs
> Priority: Major
> Labels: impala-iceberg
> Fix For: Impala 4.5.0
>
>
> When we have an Iceberg table that has lots of partitions, and we want to
> update lots of values in the table, the UPDATE will use much more memory than needed.
> Repro steps:
> {noformat}
> create table tmp_ice_tpch
> partitioned by spec(truncate(500, l_orderkey))
> stored by iceberg as
> select * from tpch.lineitem;
> UPDATE tmp_ice_tpch SET l_partkey=l_partkey+1;
> # We likely get a Memory Limit Exceeded error here{noformat}