[ https://issues.apache.org/jira/browse/IMPALA-14951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080478#comment-18080478 ]
ASF subversion and git services commented on IMPALA-14951:
----------------------------------------------------------
Commit 53f5d74b9c905aad36c54f9251e0d25fc21d80bc in impala's branch
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=53f5d74b9 ]
IMPALA-14951: Fix hang during Iceberg delete with tuple cache
With mt_dop, IcebergDeleteNode has the same independent
builder as PartitionedHashJoinNode. It needs the same logic
used for IMPALA-13660 to notify the build side when a probe
side thread closes before probing. This is hard to
demonstrate with a select: the coordinator cancels the query
once it has received all the rows, which wakes the builder
from its wait. A delete gets no such cancellation, so the
builder can hang indefinitely.
This centralizes the necessary logic to share it between
PartitionedHashJoinNode, NestedLoopJoinNode, and
IcebergDeleteNode. Applying it to IcebergDeleteNode fixes
the hang.
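The notification pattern described above can be sketched roughly
as follows. This is a minimal illustration, not Impala's actual
code: SharedBuilder, ProbeSide, RunScenario, and all of their
methods are hypothetical names invented for this example. The
point it shows is that a probe-side thread must report to the
builder from its close path even if it never probed, otherwise
the builder's wait never completes.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared, independent join builder.
class SharedBuilder {
 public:
  explicit SharedBuilder(int num_probe_threads)
      : outstanding_probes_(num_probe_threads) {}

  // Called once per probe thread, whether it probed or closed early.
  void ProbeThreadDone() {
    std::lock_guard<std::mutex> l(lock_);
    assert(outstanding_probes_ > 0);
    if (--outstanding_probes_ == 0) all_probes_done_.notify_all();
  }

  // Build side blocks here until every probe thread has reported in.
  void WaitForProbes() {
    std::unique_lock<std::mutex> l(lock_);
    all_probes_done_.wait(l, [this] { return outstanding_probes_ == 0; });
  }

  int outstanding_probes() {
    std::lock_guard<std::mutex> l(lock_);
    return outstanding_probes_;
  }

 private:
  std::mutex lock_;
  std::condition_variable all_probes_done_;
  int outstanding_probes_;
};

// Hypothetical probe-side node. Close() always notifies the builder,
// so a thread that never called Probe() cannot leave it waiting.
class ProbeSide {
 public:
  explicit ProbeSide(SharedBuilder* builder) : builder_(builder) {}

  void Probe() { /* a real node would read the build side here */ }

  void Close() {
    if (closed_) return;  // make repeated Close() calls harmless
    closed_ = true;
    builder_->ProbeThreadDone();
  }

 private:
  SharedBuilder* builder_;
  bool closed_ = false;
};

// Starts num_threads probe threads; the first num_skipping of them close
// without probing. Returns true once the builder's wait has completed.
bool RunScenario(int num_threads, int num_skipping) {
  SharedBuilder builder(num_threads);
  std::vector<std::thread> probes;
  for (int i = 0; i < num_threads; ++i) {
    probes.emplace_back([&builder, i, num_skipping] {
      ProbeSide probe(&builder);
      if (i >= num_skipping) probe.Probe();
      probe.Close();
    });
  }
  builder.WaitForProbes();
  for (auto& t : probes) t.join();
  return builder.outstanding_probes() == 0;
}
```

In this sketch, calling RunScenario(3, 3) models the case where
every probe thread closes before probing. If Close() only
notified the builder after a probe had happened, WaitForProbes()
would block forever unless something external, such as a query
cancellation, woke it up.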
Testing:
- Added a test case that consistently reproduced the hang
before the fix
Change-Id: Iff9228446f69ce43ed303c96893a91b99474800d
Reviewed-on: http://gerrit.cloudera.org:8080/24279
Reviewed-by: Yida Wu <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Iceberg delete can hang when using the tuple cache
> --------------------------------------------------
>
> Key: IMPALA-14951
> URL: https://issues.apache.org/jira/browse/IMPALA-14951
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Blocker
> Attachments: tuple_cache_iceberg_delete_hang_repo.txt
>
>
> With mt_dop, IcebergDeleteNode has the same independent builder as
> PartitionedHashJoinNode. It needs the same logic as used for IMPALA-13660 to
> notify the build side when a probe side thread closes before probing.
> This is hard to demonstrate with a select: the coordinator cancels the
> query once it has received all the rows, which wakes the builder from its
> wait.
> A delete DML gets no such cancellation, so the builder can hang
> indefinitely. The hang reproduces with these steps:
> 0. Use mt_dop>2 and enable_tuple_cache=true
> 1. Create a partitioned Iceberg v2 table with some rows
> 2. Delete some rows
> 3. Delete more rows - this hangs
> A more specific reproducing case is attached (it needs to be run with the
> tuple caching startup flags set).