Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21718 )


Change subject: IMPALA-13325: Use RowBatch::CopyRows in IcebergDeleteNode
......................................................................

IMPALA-13325: Use RowBatch::CopyRows in IcebergDeleteNode

Typically there are many more data records than delete records in a
healthy Iceberg table. This means it is suboptimal to copy probe rows
one by one in the IcebergDeleteNode. With this patch we switch to the
RowBatch::CopyRows method to copy tuple rows in batches.
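
To illustrate the idea only, here is a minimal self-contained sketch of
copying surviving rows in contiguous runs instead of one at a time. It does
not use Impala's classes: Row, CopySurvivingRows and deleted_positions are
hypothetical names, and the sketch assumes the deleted positions are sorted
and unique. The actual patch relies on RowBatch::CopyRows, whose signature
is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a probe row; this is not Impala's TupleRow.
using Row = int64_t;

// Copies every row of `probe` whose index is not listed in
// `deleted_positions` into the result. Assumes `deleted_positions` is
// sorted ascending and contains no duplicates.
std::vector<Row> CopySurvivingRows(
    const std::vector<Row>& probe,
    const std::vector<size_t>& deleted_positions) {
  std::vector<Row> out;
  out.reserve(probe.size());
  size_t run_start = 0;
  for (size_t pos : deleted_positions) {
    if (pos >= probe.size()) break;  // Ignore positions past this batch.
    // Bulk-copy the run of surviving rows that precedes the deleted row.
    out.insert(out.end(), probe.begin() + run_start, probe.begin() + pos);
    run_start = pos + 1;  // Skip the deleted row itself.
  }
  // Bulk-copy the tail after the last deleted position.
  out.insert(out.end(), probe.begin() + run_start, probe.end());
  return out;
}
```

When deletes are sparse, the loop body runs once per deleted row rather than
once per probe row, so most of the work is done by a few large range copies.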

Measurements
I measured locally on a densely deleted table (2 billion records,
1 billion position delete records). As expected, this optimization does
not help much in this case, but it also does not regress performance.
Time spent in the IcebergDeleteNode operator was between 16 and 18
seconds with both implementations.

TODO:
 - measure on One Trillion Row Challenge

Change-Id: I46487fefa300027e9df6cd7fb36c78af01dd56c1
---
M be/src/exec/iceberg-delete-node.cc
M be/src/exec/iceberg-delete-node.h
M be/src/runtime/row-batch.h
3 files changed, 54 insertions(+), 72 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/21718/1
--
To view, visit http://gerrit.cloudera.org:8080/21718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I46487fefa300027e9df6cd7fb36c78af01dd56c1
Gerrit-Change-Number: 21718
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
