Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21932 )

Change subject: IMPALA-13509: Copy rows directly to OutboundRowBatch during 
hash partitioning
......................................................................


Patch Set 12:

(5 comments)

Great improvement!

http://gerrit.cloudera.org:8080/#/c/21932/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21932/12//COMMIT_MSG@67
PS12, Line 67: TPCH benchmarks improved significantly
Nice improvements! Did you also measure TPC-DS?


http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/descriptors.h@845
PS12, Line 845:   int IR_NO_INLINE num_tuples_no_inline() const { return 
tuple_desc_map_.size(); }
nit: Could you please add comment to this function?


http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender-ir.cc
File be/src/runtime/krpc-data-stream-sender-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender-ir.cc@35
PS12, Line 35:   while (row_idx < num_rows) {
             :     int row_count = 0;
             :     FOREACH_ROW_LIMIT(batch, row_idx, RowBatch::HASH_BATCH_SIZE, 
row_batch_iter) {
             :       TupleRow* row = row_batch_iter.Get();
             :       channel_ids[row_count++] = HashRow(row, 
exchange_hash_seed_) % num_channels;
             :     }
             :     row_count = 0;
             :     FOREACH_ROW_LIMIT(batch, row_idx, RowBatch::HASH_BATCH_SIZE, 
row_batch_iter) {
             :       int channel_id = channel_ids[row_count++];
             :       PartitionRowCollector& collector = 
partition_row_collectors_[channel_id];
             :       RETURN_IF_ERROR(collector.AppendRow(row_batch_iter.Get(), 
row_desc_));
             :     }
             :     row_idx += row_count;
             :   }
This code is a bit subtle due to the micro-optimization introduced by 
https://gerrit.cloudera.org/#/c/9221/
Probably it'd worth a short comment and pointer to the original Jira 
(IMPALA-6461).


http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/krpc-data-stream-sender.h@184
PS12, Line 184: TransmiteData
typo


http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/outbound-row-batch.inline.h
File be/src/runtime/outbound-row-batch.inline.h:

http://gerrit.cloudera.org:8080/#/c/21932/12/be/src/runtime/outbound-row-batch.inline.h@76
PS12, Line 76: &tuple_data_[0]) + tuple_data_.size()
Maybe 'tuple_data_.back()' is simpler.



--
To view, visit http://gerrit.cloudera.org:8080/21932
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I81a16c2f0fcfc1f3adef7077b3932a29a0f15a8f
Gerrit-Change-Number: 21932
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 08 Nov 2024 17:01:02 +0000
Gerrit-HasComments: Yes

Reply via email to