[
https://issues.apache.org/jira/browse/IMPALA-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17942147#comment-17942147
]
ASF subversion and git services commented on IMPALA-13932:
----------------------------------------------------------
Commit 5c14877b05149ed9ddb6b8734cc77946774d4c0c in impala's branch
refs/heads/master from Peter Rozsa
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5c14877b0 ]
IMPALA-13932: Add file path and position-based duplicate check for
IcebergMergeNode
IcebergMergeNode's duplicate checking mechanism was based on comparing
pointers of the target table's rows. This mechanism produces false
positives when a new row batch reuses the memory of the previous row
batch provided to the merge node. This change adds an additional check
that validates the file position and file path as well.
Change-Id: I71b47414321675958c05438ef3aeeb5df0128033
Reviewed-on: http://gerrit.cloudera.org:8080/22761
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
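
The failure mode and the fix described in the commit message above can be
illustrated with a minimal, standalone C++ sketch. It is not Impala's
code: TargetRow, PointerOnlyChecker and PathAndPositionChecker are
hypothetical names invented for this illustration, and the real
IcebergMergeNode works on tuples and row batches rather than a simple
struct. The sketch only assumes what the commit message states: the old
check compared row pointers, and the new check also compares file path and
file position.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a target-table row (not an Impala type).
struct TargetRow {
  std::string file_path;  // data file the row was read from
  int64_t file_pos;       // position of the row within that file
};

// Old behaviour as described: a row is a duplicate if its address equals
// the address of the previously seen row.
class PointerOnlyChecker {
 public:
  bool IsDuplicate(const TargetRow* row) {
    assert(row != nullptr);
    bool dup = (row == prev_);
    prev_ = row;
    return dup;
  }

 private:
  const TargetRow* prev_ = nullptr;
};

// New behaviour as described: the address must match AND the file path and
// file position must match. Path and position are copied, because the
// memory behind the previous pointer may have been overwritten.
class PathAndPositionChecker {
 public:
  bool IsDuplicate(const TargetRow* row) {
    assert(row != nullptr);
    bool dup = (row == prev_) && (row->file_pos == prev_pos_) &&
               (row->file_path == prev_path_);
    prev_ = row;
    prev_pos_ = row->file_pos;
    prev_path_ = row->file_path;
    return dup;
  }

 private:
  const TargetRow* prev_ = nullptr;
  int64_t prev_pos_ = -1;
  std::string prev_path_;
};
```

As a usage example, take one TargetRow variable and treat it as a reused
buffer. Pass its address to both checkers, overwrite it with a row at a
different file position (simulating a new batch that lands in the same
memory), and pass the address again. The pointer-only checker reports a
duplicate for the second call, while the path-and-position checker does
not. Passing the unchanged row a third time is reported as a duplicate by
both.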
> MERGE duplicate check reports false-positive if the incoming row batch's
> memory is reused
> -----------------------------------------------------------------------------------------
>
> Key: IMPALA-13932
> URL: https://issues.apache.org/jira/browse/IMPALA-13932
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 4.5.0
> Reporter: Peter Rozsa
> Assignee: Peter Rozsa
> Priority: Critical
> Labels: impala-iceberg
>
> The Iceberg merge node uses a duplicate check mechanism that compares the
> current target row's pointer with the previous target row's pointer. If the
> first target-table tuple of a new row batch points to the same memory region
> as the previous row, then a duplicate row is reported erroneously.
> The duplicate check should be aware of whether the merge join's probe batch
> has ended; in that case, resetting the incoming row batch would solve the
> problem.
>