Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21563 )

Change subject: IMPALA-13194: Fast-serialize position delete records
......................................................................


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/21563/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21563/1//COMMIT_MSG@13
PS1, Line 13: e
Nit: tuples?


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.h@308
PS1, Line 308:   std::unordered_map<Channel*, std::unique_ptr<IcebergPositionDeleteChannel>>
Could add a comment that describes this variable.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@633
PS1, Line 633: unique_ptr
I think it would be cleaner if we returned OutboundRowBatch*. AFAICS 
KrpcDataStreamSender::Channel::TransmitData() could also take a raw pointer.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@797
PS1, Line 797:     if (row_count_ == capacity_) {
Can channel_->RowBatchCapacity() ever be 0 at L775? If it can, this check 
comes too late; if it can't, we can add a DCHECK.
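
To illustrate the concern, here is a minimal sketch. It assumes the row count is incremented before it is compared for equality with the capacity, which may not match the actual control flow in the patch. `BatchBuffer` and `Append()` are hypothetical names, and `assert` stands in for a DCHECK.

```cpp
#include <cassert>

// Hypothetical stand-in for the batching logic; not the code in the patch.
class BatchBuffer {
 public:
  explicit BatchBuffer(int capacity) : capacity_(capacity) {
    // Stand-in for a DCHECK: with a capacity of 0 the equality test in
    // Append() would never be true, because the count is already 1 when it
    // is first compared.
    assert(capacity_ > 0);
  }

  // Adds one row. Returns true if the batch became full and was flushed.
  bool Append() {
    ++row_count_;
    if (row_count_ == capacity_) {
      row_count_ = 0;
      ++flush_count_;
      return true;
    }
    return false;
  }

  int row_count() const { return row_count_; }
  int flush_count() const { return flush_count_; }

 private:
  int capacity_;
  int row_count_ = 0;
  int flush_count_ = 0;
};
```

With the precondition checked up front, the equality test is safe. Without it, a zero capacity would let the count grow past the limit unnoticed.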


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@857
PS1, Line 857: Ubsan::MemSet
What if 'tuple_data_size' is 0? In this case 'tuple_data' may be a nullptr 
according to https://en.cppreference.com/w/cpp/container/vector/data and I'm 
not sure whether calling memset with a nullptr is undefined behaviour, even 
when the size is 0.
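
One defensive option, sketched below, is to skip the call when there is nothing to write. `ZeroFill` is a hypothetical helper and plain `std::memset` is used for illustration. The sketch says nothing about how Ubsan::MemSet behaves internally, which may already cover this case.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of the patch. It zeroes a buffer but returns
// early when the buffer is empty, so a null data() pointer from an empty
// vector never reaches memset.
inline void ZeroFill(std::vector<std::uint8_t>& buf) {
  if (buf.empty()) return;
  std::memset(buf.data(), 0, buf.size());
}
```

The early return costs one branch and removes the question of what memset does with a null destination.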


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@943
PS1, Line 943: auto
I think writing the actual type (Channel*) is easier to read.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@945
PS1, Line 945: row_desc_->tuple_descriptors()[0]
Could extract into a variable before the loop.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/krpc-data-stream-sender.cc@1282
PS1, Line 1282: row
Not changed in this patch, but it should be 'tuple'.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h
File be/src/runtime/string-value.h:

http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h@200
PS1, Line 200: inline std::size_t hash_value(const StringValue& v) {
Not changed in this patch, but how does this work with small strings? 
StringValue::Eq() first converts the values to SimpleStrings to eliminate the 
difference between small and normal strings.

Or if we only use it with non-small strings, we should add a DCHECK.


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h@204
PS1, Line 204: struct StringValueHashWrapper {
Do you think we should consider specialising std::hash for StringValue or 
should we keep this explicit?
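
For comparison, the two options look roughly like this. The sketch uses a hypothetical `demo::Key` type rather than StringValue. Option A keeps the hasher explicit at each use site, while option B makes it the default for standard unordered containers.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

namespace demo {

// Hypothetical key type used only for this comparison.
struct Key {
  std::string text;
};

inline bool operator==(const Key& a, const Key& b) { return a.text == b.text; }

inline std::size_t hash_value(const Key& k) {
  return std::hash<std::string>{}(k.text);
}

// Option A: explicit wrapper, named as a template argument where needed.
struct KeyHashWrapper {
  std::size_t operator()(const Key& k) const { return hash_value(k); }
};

}  // namespace demo

// Option B: specialise std::hash so the hasher is picked up implicitly.
namespace std {
template <>
struct hash<demo::Key> {
  size_t operator()(const demo::Key& k) const { return demo::hash_value(k); }
};
}  // namespace std

// Usage: option A names the hasher, option B does not need to.
using ExplicitMap = std::unordered_map<demo::Key, int, demo::KeyHashWrapper>;
using ImplicitMap = std::unordered_map<demo::Key, int>;
```

The explicit wrapper makes the choice of hash function visible at each declaration. The specialisation is shorter to use but applies to every container of that key type.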


http://gerrit.cloudera.org:8080/#/c/21563/1/be/src/runtime/string-value.h@206
PS1, Line 206: impala::
Do we need this qualifier?



--
To view, visit http://gerrit.cloudera.org:8080/21563
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6095f318e3d06dedb4197681156b40dd2a326c6f
Gerrit-Change-Number: 21563
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Wed, 10 Jul 2024 11:01:31 +0000
Gerrit-HasComments: Yes
