Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21718 )

Change subject: IMPALA-13325: Use RowBatch::CopyRows in IcebergDeleteNode
......................................................................


Patch Set 6:

(7 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc
File be/src/exec/iceberg-delete-node.cc:

http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc@309
PS5, Line 309:
> I think that this is a bit confusing and doesn't remove much duplication. W
I got rid of NextProbePos.
Using Tuple** did not improve the performance, but complicated the code a tiny 
bit (casts, and we need to Advance() probe_it separately)


http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc@319
PS5, Line 319:     // by more than one, so let's adjust 'next_deleted_pos'.
> When most rows are not deleted it could be faster to use binary search in t
It significantly complicates the code as you'd need to do two searches in 
parallel, each of them can jump over the other (probe side can be filtered), 
and we need to calculate 'rows_to_copy' correctly.

For now I'd stick to this simpler algorithm. It also wouldn't save much time if 
there are lots of deletes, or probe side is filtered.


http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/exec/iceberg-delete-node.cc@341
PS5, Line 341:       builder_->deleted_rows();
             :
             :   const RoaringBitmap64* deletes = nullptr;
> It's been a long time since I looked at this area of the code - can we assu
Yes, we have this assumption, added a DCHECK to the body of the while-loop.


http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap-test.cc
File be/src/util/roaring-bitmap-test.cc:

http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap-test.cc@88
PS5, Line 88:
> If I didn't miss something then currently we don't test the case when MoveA
Done


http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap-test.cc@94
PS5, Line 94: r jump_it(rbm);
> Can you add a similar test where the second loop calls GetEqualOrLarger les
Sure, added new test cases.


http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap.h
File be/src/util/roaring-bitmap.h:

http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap.h@71
PS5, Line 71: HasValue
> This doesn't seem to be used at other places, could be private.
Value() and Advance() are also not used. I can move them if you feel strong 
about it, but these functions could be useful later if we start using roaring 
bitmaps elsewhere, as these functionalities are quite basic for an Iterator 
class.

(sure we could make them public later as well, but that'll require 
modification, code review, re-compilation)


http://gerrit.cloudera.org:8080/#/c/21718/5/be/src/util/roaring-bitmap.h@87
PS5, Line 87:       // Incoming 'x' values must be in ascending order.
> Can you add a DCHECK for this? My understanding is that x > Value() must be
Initially the iterator points to the first element of the bitmap, e.g. to 42, 
or to 1000000. The very first 'x' value we test will be probably smaller than 
this.

What we could probably DCHECK is that the iterator either points to the first 
element, or x > Value(). With the C API we can only test that the iterator 
points to the first element if we try to invoke roaring64_iterator_previous() 
on it. Then we need to invoke roaring64_iterator_advance() to reset its state.

Alternatively the Iterator could hold a reference to the roaring bitamp, and 
check x against the Min() value:

 DCHECK(x <= rb.Min() || x > Value());

But this increases the size of the Iterator object.



--
To view, visit http://gerrit.cloudera.org:8080/21718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46487fefa300027e9df6cd7fb36c78af01dd56c1
Gerrit-Change-Number: 21718
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 18 Oct 2024 16:18:55 +0000
Gerrit-HasComments: Yes

Reply via email to