Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15497 )
Change subject: IMPALA-8005: Randomize partitioning exchanges. ...................................................................... Patch Set 4: (4 comments) Couple small comments http://gerrit.cloudera.org:8080/#/c/15497/4/be/src/runtime/data-stream-test.cc File be/src/runtime/data-stream-test.cc: http://gerrit.cloudera.org:8080/#/c/15497/4/be/src/runtime/data-stream-test.cc@496 PS4, Line 496: unordered_map<int, multiset<int64_t>> GetHashPartitionedReceiversDataMap( : unordered_map<int, multiset<int64_t>>& receiver_data_map, int num_receivers, : bool reset_hash_seed) { Couple small style things: 1. When returning a value, there are two options: a) You can return void and have an output parameter. For this option, we prefer output parameters to be pointers rather than references. (This is purely a style thing) b) You can return the map and not have an output parameter. I don't really have a strong preference between the two, but you shouldn't do both. It looks like we don't use the return value right now, and instead use the output parameter. 2. In general, when using output parameters, we order arguments with inputs first and outputs last. So, if using an output parameter, this should be num_receivers, reset_hash_seed, then receiver_data_map. For more detail, the google c++ style guide is what we use: https://google.github.io/styleguide/cppguide.html#Output_Parameters http://gerrit.cloudera.org:8080/#/c/15497/4/be/src/runtime/data-stream-test.cc@516 PS4, Line 516: for (int i = 0; i < receiver_info_.size(); i++) { : // Store a map of receiver and list of it's data values. : ReceiverInfo* info = receiver_info_[i].get(); : multiset<int64_t> data_set; : for (multiset<int64_t>::iterator j = info->data_values.begin(); : j != info->data_values.end(); ++j) { : data_set.insert(*j); : } : receiver_data_map[info->receiver_num] = data_set; : } Nit: For the output parameter receiver_data_map, go ahead and do receiver_data_map.clear() before you use it. http://gerrit.cloudera.org:8080/#/c/15497/4/be/src/runtime/data-stream-test.cc@596 PS4, Line 596: *(static_cast<const KrpcDataStreamSenderConfig*>(data_sink)) Since you did this above for the 'config' variable, I think it would be clearer to go ahead and pass in the 'config' variable here. http://gerrit.cloudera.org:8080/#/c/15497/4/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/15497/4/be/src/runtime/krpc-data-stream-sender.cc@739 PS4, Line 739: } else { : exchange_hash_seed_ = sink_config.exchange_hash_seed_; Do we need this? We set exchange_hash_seed_ above. -- To view, visit http://gerrit.cloudera.org:8080/15497 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1936e6cc3e8d66420a5a9301f49221ca38f3e468 Gerrit-Change-Number: 15497 Gerrit-PatchSet: 4 Gerrit-Owner: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Comment-Date: Thu, 26 Mar 2020 20:57:01 +0000 Gerrit-HasComments: Yes
