[ https://issues.apache.org/jira/browse/IMPALA-13882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17942329#comment-17942329 ]

ASF subversion and git services commented on IMPALA-13882:
----------------------------------------------------------

Commit 7aa4d50484e5508ac2253f4b7e08bdccbaa43d54 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7aa4d5048 ]

IMPALA-13882: Fix Iceberg v2 deletes with tuple caching

A variety of Iceberg statements (including v2 deletes) rely
on getting information from the scan node child of the
delete node. Since tuple caching can insert a TupleCacheNode
above that scan, the logic is currently failing, because
it doesn't know how to bypass the TupleCacheNode and get
to the scan node below.

This modifies the logic in multiple places to detect a
TupleCacheNode and go past it to get the scan node
below it.
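
The bypass described above can be sketched roughly as follows. This is an
illustration only, not the code from the commit: Impala's planner is written
in Java, and the `PlanNode`, `ScanNode`, and `TupleCacheNode` classes and the
`find_scan_node` helper below are hypothetical stand-ins that assume a
TupleCacheNode always has exactly one child.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not Impala's real class."""
    node_id: int
    children: List["PlanNode"] = field(default_factory=list)


class ScanNode(PlanNode):
    """Hypothetical stand-in for the scan node below a delete node."""


class TupleCacheNode(PlanNode):
    """Hypothetical stand-in for the caching node inserted above a scan."""


def find_scan_node(node: PlanNode) -> ScanNode:
    """Return the scan node at or below `node`, skipping TupleCacheNodes."""
    # Walk down through any caching wrappers. Assumption: each
    # TupleCacheNode wraps exactly one child.
    while isinstance(node, TupleCacheNode):
        if len(node.children) != 1:
            raise ValueError("expected a TupleCacheNode with exactly one child")
        node = node.children[0]
    # After skipping the wrappers, the caller expects a scan node.
    if not isinstance(node, ScanNode):
        raise TypeError(f"plan node {node.node_id} is not a scan node")
    return node
```

With a helper like this, a caller that previously assumed the scan node was
its immediate child would get the same result whether or not a
TupleCacheNode sits in between.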

Testing:
 - Added a basic Iceberg test with v2 deletes for the
  frontend test and custom cluster test

Change-Id: I162e738c4e4449a536701a740272aaac56ce8fd8
Reviewed-on: http://gerrit.cloudera.org:8080/22666
Reviewed-by: Kurt Deschler <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Iceberg deletes don't work with tuple caching
> ---------------------------------------------
>
>                 Key: IMPALA-13882
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13882
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 5.0.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Critical
>
> Comparing enable_tuple_cache=false to enable_tuple_cache=true shows that 
> tuple caching hits an error for Iceberg tables with deletes.
> Examples:
> {noformat}
> set enable_tuple_cache=false;
> [localhost:21050] functional_parquet> select * from iceberg_v2_positional_update_all_rows;
> +---+---+
> | i | s |
> +---+---+
> | 1 | A |
> | 2 | B |
> | 3 | C |
> +---+---+
> Fetched 3 row(s) in 0.11s
> set enable_tuple_cache=true;
> [localhost:21050] functional_parquet> select * from iceberg_v2_positional_update_all_rows;
> 2025-03-20 13:31:53 [Exception] ERROR: Query 054e755acc4622e2:702eca1400000000 failed:
> Failed to find file to hosts mapping for plan node: 9
> {noformat}
> One theory is that some logic can depend on the scan node being the immediate 
> child of the delete node. Tuple caching can insert a TupleCacheNode in 
> between, so maybe the logic wasn't built to handle that case. We should add 
> tests of various Iceberg scenarios with tuple caching.
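
The comparison used in the report (the same query with enable_tuple_cache
off and on) could be turned into a reusable check along these lines. This is
a minimal sketch under assumptions: `run_query` is a hypothetical callable
that takes a SQL string and a dict of query options and returns a list of
row tuples; it is not an Impala test-framework API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical executor signature: (sql, query_options) -> list of row tuples.
RunQuery = Callable[[str, Dict[str, str]], List[Tuple]]


def check_tuple_cache_consistency(run_query: RunQuery, sql: str) -> None:
    """Run `sql` with tuple caching off and on and compare the rows."""
    baseline = run_query(sql, {"enable_tuple_cache": "false"})
    # Any error raised here (like the one in the report) propagates to
    # the caller and fails the check.
    cached = run_query(sql, {"enable_tuple_cache": "true"})
    # A query without ORDER BY has no guaranteed row order, so compare
    # the sorted rows rather than the raw lists.
    if sorted(baseline) != sorted(cached):
        raise AssertionError(
            f"results differ with tuple caching enabled for query: {sql}")
```

A test would call this once per Iceberg scenario, passing whatever query
runner the test environment provides.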



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
