Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21718 )
Change subject: IMPALA-13325: Use RowBatch::CopyRows in IcebergDeleteNode ...................................................................... Patch Set 6: (7 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc File be/src/exec/iceberg-delete-node.cc: http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc@309 PS5, Line 309: > I think that this is a bit confusing and doesn't remove much duplication. W I got rid of NextProbePos. Using Tuple** did not improve the performance, but complicated the code a tiny bit (casts, and we need to Advance() probe_it separately) http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc@319 PS5, Line 319: // by more than one, so let's adjust 'next_deleted_pos'. > When most rows are not deleted it could be faster to use binary search in t It significantly complicates the code as you'd need to do two searches in parallel, each of them can jump over the other (probe side can be filtered), and we need to calculate 'rows_to_copy' correctly. For now I'd stick to this simpler algorithm. It also wouldn't save much time if there are lots of deletes, or probe side is filtered. http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc@341 PS5, Line 341: builder_->deleted_rows(); : : const RoaringBitmap64* deletes = nullptr; > It's been a long time since I looked at this area of the code - can we assu Yes, we have this assumption, added a DCHECK to the body of the while-loop. http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap-test.cc File be/src/util/roaring-bitmap-test.cc: http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap-test.cc@88 PS5, Line 88: > If I didn't miss something then currently we don't test the case when MoveA Done http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap-test.cc@94 PS5, Line 94: r jump_it(rbm); > Can you add a similar test where the second loop calls GetEqualOrLarger les Sure, added new test cases. http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap.h File be/src/util/roaring-bitmap.h: http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap.h@71 PS5, Line 71: HasValue > This doesn't seem to be used at other places, could be private. Value() and Advance() are also not used. I can move them if you feel strong about it, but these functions could be useful later if we start using roaring bitmaps elsewhere, as these functionalities are quite basic for an Iterator class. (sure we could make them public later as well, but that'll require modification, code review, re-compilation) http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap.h@87 PS5, Line 87: // Incoming 'x' values must be in ascending order. > Can you add a DCHECK for this? My understanding is that x > Value() must be Initially the iterator points to the first element of the bitmap, e.g. to 42, or to 1000000. The very first 'x' value we test will be probably smaller than this. What we could probably DCHECK is that the iterator either points to the first element, or x > Value(). With the C API we can only test that the iterator points to the first element if we try to invoke roaring64_iterator_previous() on it. Then we need to invoke roaring64_iterator_advance() to reset its state. Alternatively the Iterator could hold a reference to the roaring bitamp, and check x against the Min() value: DCHECK(x <= rb.Min() || x > Value()); But this increases the size of the Iterator object. -- To view, visit http://gerrit.cloudera.org:8080/21718 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46487fefa300027e9df6cd7fb36c78af01dd56c1 Gerrit-Change-Number: 21718 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 18 Oct 2024 16:18:55 +0000 Gerrit-HasComments: Yes
