[ 
https://issues.apache.org/jira/browse/IMPALA-12327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750502#comment-17750502
 ] 

ASF subversion and git services commented on IMPALA-12327:
----------------------------------------------------------

Commit 8638255e5074f1342dfc452bca39f649a76612d6 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8638255e5 ]

IMPALA-12327: Iceberg V2 operator wrong results in PARTITIONED mode

The Iceberg delete node tries to do mini merge-joins between data
records and delete records. This works in BROADCAST mode, and most of
the time in PARTITIONED mode as well. However, the Iceberg delete node
wrongly assumed that rows in a row batch that belong to the same file
always arrive in ascending order. Under that assumption it relied on the
previous delete having advanced IcebergDeleteState to the next deleted
row id, and skipped the binary search whenever that id was greater than
or equal to the current probe row id.

When PARTITIONED mode is used, we cannot rely on ascending row order,
not even inside row batches, not even when the previous file path is the
same as the current one. This is because files with multiple blocks can
be processed by multiple hosts in parallel, and the rows are then
hash-exchanged based on their file paths. The exchange receiver at the
LHS coalesces the row batches from multiple senders, so the row IDs
arrive unordered.
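
The loss of ordering can be illustrated with a small simulation. This is
only a sketch and not Impala code: scan_block, coalesce, the file name
and the round-robin interleaving are all assumptions made for
illustration. Two hypothetical hosts scan different blocks of the same
file, and a receiver interleaves their row batches.

```python
from itertools import zip_longest


def scan_block(file_path, start, end, batch_size):
    """Yield row batches of (file_path, position) for one block of a file."""
    for lo in range(start, end, batch_size):
        hi = min(lo + batch_size, end)
        yield [(file_path, pos) for pos in range(lo, hi)]


def coalesce(*senders):
    """Interleave batches from several senders (assumed round-robin order)."""
    merged = []
    for batches in zip_longest(*senders):
        for batch in batches:
            if batch is not None:
                merged.extend(batch)
    return merged


# Two hosts each scan one block of the same (hypothetical) file.
host_a = scan_block("f.parquet", 0, 4, batch_size=2)
host_b = scan_block("f.parquet", 4, 8, batch_size=2)

rows = coalesce(host_a, host_b)
positions = [pos for _, pos in rows]
print(positions)
```

Every row carries the same file path, yet the positions are no longer
ascending after the batches are coalesced.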

This patch drops that assumption: in PARTITIONED mode, a binary search
is done whenever the position of the current row and the position of the
previous row do not differ by exactly one.
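
A minimal sketch of the idea, assuming a single data file and a sorted
list of deleted positions. This is not the actual backend code:
filter_deleted, its parameters and the cursor variable are hypothetical
stand-ins for the delete node and IcebergDeleteState.

```python
from bisect import bisect_left


def filter_deleted(probe_positions, deleted_positions, partitioned):
    """Return the probe positions that are not deleted.

    deleted_positions must be sorted ascending. next_idx plays the role
    of the probing state: it points at the next candidate delete record.
    """
    kept = []
    next_idx = 0
    prev_pos = None
    for pos in probe_positions:
        consecutive = prev_pos is not None and pos == prev_pos + 1
        if prev_pos is None or (partitioned and not consecutive):
            # Do not trust the cursor: re-position it with a binary search.
            next_idx = bisect_left(deleted_positions, pos)
        else:
            # Merge-join fast path: assume ascending order, move forward only.
            while (next_idx < len(deleted_positions)
                   and deleted_positions[next_idx] < pos):
                next_idx += 1
        if (next_idx < len(deleted_positions)
                and deleted_positions[next_idx] == pos):
            next_idx += 1  # row is deleted, skip it
        else:
            kept.append(pos)
        prev_pos = pos
    return kept


unordered = [0, 1, 4, 5, 2, 3, 6, 7]
deletes = [2, 5]
print(filter_deleted(unordered, deletes, partitioned=True))
print(filter_deleted(unordered, deletes, partitioned=False))
```

With the unordered input, trusting the cursor (partitioned=False) keeps
position 2 even though it is deleted, because the cursor has already
moved past it. Falling back to the binary search on the position jump
removes it as expected.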

Tests:
 * added e2e tests

Change-Id: Ib89a53e812af8c3b8ec5bc27bca0a50dcac5d924
Reviewed-on: http://gerrit.cloudera.org:8080/20295
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Iceberg V2 operator wrong results in PARTITIONED mode
> -----------------------------------------------------
>
>                 Key: IMPALA-12327
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12327
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> The Iceberg delete node tries to do mini merge-joins between data records and 
> delete records. This works in BROADCAST mode, and most of the time in 
> PARTITIONED mode as well. The Iceberg delete node had the wrong assumption 
> that if the rows in a row batch belong to the same file, and come in 
> ascending order, we don't need to update the IcebergDeleteState which tracks 
> the state of the probing.
> But when PARTITIONED mode is used, we cannot rely on ascending row order, not 
> even inside row batches, not even when the previous file path is the same as 
> the current one.
> This is because files with multiple blocks can be processed by multiple hosts 
> in parallel, and the rows are then hash-exchanged based on their file 
> paths. The exchange receiver at the LHS coalesces the row batches from 
> multiple senders, so the row IDs arrive unordered.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
