Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21718 )

Change subject: IMPALA-13325: Use RowBatch::CopyRows in IcebergDeleteNode
......................................................................

IMPALA-13325: Use RowBatch::CopyRows in IcebergDeleteNode

Typically there are much more data records than delete records in a
healthy Iceberg table. This means it is suboptimal to copy probe rows
one by one in the IcebergDeleteNode. With this patch we switch to
RowBatch::CopyRows method to copy tuple rows in batches.

We also switch to an iterator based approach when we test the deleted
rows which seem to be more efficient than ContainsBulk().

I measured the Avg Time of DELETE EVENTS ICEBERG DELETE operator.

Local Measurements
+--------------+----------------+--------------------+--------------------+
| Data records | Delete records | Old implementation | New implementation |
+--------------+----------------+--------------------+--------------------+
| 2 Billion    | 1 Billion      | 15.82s             | 14.73s             |
| 1.2 Billion  | 70 Million     | 5.64s              | 2.4s               |
+--------------+----------------+--------------------+--------------------+

Large scale measurements

1 Coordinator, 10 executors.
+--------------+----------------+--------------------+--------------------+
| Data records | Delete records | Old implementation | New implementation |
+--------------+----------------+--------------------+--------------------+
| 405 Billion  | 68.5 Billion   | 87.30s             | 54.76s             |
| 301 Billion  | 18 Billion     | 67.38s             | 25.31s             |
+--------------+----------------+--------------------+--------------------+

1 Coordinator, 40 executors.
+--------------+----------------+--------------------+--------------------+
| Data records | Delete records | Old implementation | New implementation |
+--------------+----------------+--------------------+--------------------+
| 405 Billion  | 68.5 Billion   | 23.18s             | 14.72s             |
| 301 Billion  | 18 Billion     | 16.52s             | 6.09s              |
+--------------+----------------+--------------------+--------------------+

Testing
 * added unit tests for the new methods of RoaringBitmap

Change-Id: I46487fefa300027e9df6cd7fb36c78af01dd56c1
Reviewed-on: http://gerrit.cloudera.org:8080/21718
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/src/exec/iceberg-delete-node.cc
M be/src/exec/iceberg-delete-node.h
M be/src/runtime/row-batch.h
M be/src/util/roaring-bitmap-test.cc
M be/src/util/roaring-bitmap.h
5 files changed, 303 insertions(+), 74 deletions(-)

Approvals:
  Csaba Ringhofer: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I46487fefa300027e9df6cd7fb36c78af01dd56c1
Gerrit-Change-Number: 21718
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
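
Illustrative note on the batched-copy idea from the commit message: the sketch below is not the Impala code and does not use RowBatch, IcebergDeleteNode or RoaringBitmap. It is a minimal standalone model, assuming a probe batch is a flat vector of rows with contiguous file positions and that the deleted positions are available as a sorted vector. The names Rows and CopySurvivingRows are hypothetical. It walks an iterator over the deleted positions once and copies each run of surviving rows with a single bulk call, rather than testing and copying rows one at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a row batch: a flat vector of row values.
using Rows = std::vector<int64_t>;

// Appends to 'out' every row of 'probe' whose file position is not deleted.
// Assumptions: probe[i] has file position first_pos + i, and 'deleted' is
// sorted in ascending order. Returns the number of bulk copies performed.
size_t CopySurvivingRows(const Rows& probe, int64_t first_pos,
                         const std::vector<int64_t>& deleted, Rows* out) {
  size_t copies = 0;
  const int64_t end_pos = first_pos + static_cast<int64_t>(probe.size());
  // Position the iterator at the first deleted position inside this batch.
  auto it = std::lower_bound(deleted.begin(), deleted.end(), first_pos);
  int64_t run_start = first_pos;
  while (run_start < end_pos) {
    const bool has_delete = it != deleted.end() && *it < end_pos;
    // The run of surviving rows ends at the next deleted position,
    // or at the end of the batch if no deleted position remains.
    const int64_t run_end = has_delete ? *it : end_pos;
    if (run_end > run_start) {
      // One bulk copy for the whole run instead of one copy per row.
      out->insert(out->end(), probe.begin() + (run_start - first_pos),
                  probe.begin() + (run_end - first_pos));
      ++copies;
    }
    run_start = run_end + 1;  // Skip the deleted row, if there was one.
    if (has_delete) ++it;
  }
  return copies;
}
```

With few deleted positions per batch, the number of copy calls is close to the number of deletes in the batch rather than the number of rows, which matches the commit message's reasoning for tables where data records far outnumber delete records.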
