Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/13882 )
Change subject: IMPALA-7984: Port runtime filter from Thrift RPC to KRPC ...................................................................... Patch Set 25: (26 comments) http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator-backend-state.cc@556 PS25, Line 556: LOG(ERROR) << "PublishFilter() rpc failed: " << rpc_status.ToString(); return here as there is no point in continuing to print the result status if RPC failed. http://gerrit.cloudera.org:8080/#/c/13882/24/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/13882/24/be/src/runtime/coordinator.cc@1134 PS24, Line 1134: r is neither an always true filter nor an : // always false filter, then it must be the case that a non-empty sidecar slice > I actually found out that if we are using "std::move(sidecar_slice.ToString Sorry for the wrong advice. I believe the RVO should take care of the copy-elision so no need for the std::move. In other words, it should be sufficient to just call sidecar_slice.ToString(); http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc@1010 PS25, Line 1010: // >>>>>>> IMPALA-7984: Port runtime filter from Thrift RPC to KRPC To be removed ?! http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc@1022 PS25, Line 1022: std:: nit: no need for std:: Also, may make sense to move this declaration to somewhere closer to its use (e.g. line 1068 below). Same for rpc_params and target_fragment_idxs above. http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc@1072 PS25, Line 1072: BloomFilterPB& aggregated_filter = state->bloom_filter(); : aggregated_filter.Swap(rpc_params.mutable_bloom_filter()); This swapping tricks seems unnecessary now that the bloom_filter_directory is a separate structure. http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc@1074 PS25, Line 1074: bloom_filter_directory Is there a reason to assign state->bloom_filter_directory() to this local variable ? Why cannot we use state->bloom_filter_directory() directly below ? http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc@1098 PS25, Line 1098: reinterpret_cast<const char*>(&(bloom_filter_directory[0]) state->bloom_filter_directory().data() http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc@1153 PS25, Line 1153: bloom_filter_ = BloomFilterPB(params.bloom_filter()); bloom_filter_ = params.bloom_filter(); The above should be sufficient. http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/coordinator.cc@1155 PS25, Line 1155: std::string( : reinterpret_cast<const char*>(sidecar_slice.data()), sidecar_slice.size()) Can keep the original suggestion of sidecar_slice.ToString(). C++ ROV should take care of avoiding the copy so the TODO can be removed. There is no easy way to avoid copying the sidecar from the network buffer so we have to copy at least once. http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/runtime-filter-bank.cc File be/src/runtime/runtime-filter-bank.cc: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/runtime-filter-bank.cc@190 PS25, Line 190: std:: nit: no need for std:: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/runtime-filter-bank.cc@204 PS25, Line 204: std:: nit: no need for std:: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/runtime-filter-bank.cc@205 PS25, Line 205: ++num_inflight_rpcs_; DCHECK_GE(num_inflight_rpcs_, 0); before increment. http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/runtime-filter-bank.cc@323 PS25, Line 323: std:: nit: no need for std:: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/timestamp-value.h@170 PS25, Line 170: // Store the binary representation of this TimestampValue in 'tvalue'. : void ToTColumnValue(TColumnValue* tvalue) const { : const uint8_t* data = reinterpret_cast<const uint8_t*>(this); : tvalue->timestamp_val.assign(data, data + Size()); : tvalue->__isset.timestamp_val = true; : } Is this not used anymore ? http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/runtime/timestamp-value.h@183 PS25, Line 183: // Returns a new TimestampValue created from the value in 'tvalue'. : static TimestampValue FromTColumnValue(const TColumnValue& tvalue) { : TimestampValue value; : memcpy(&value, tvalue.timestamp_val.c_str(), Size()); : value.Validate(); : return value; : } Is this not used anymore ? http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/service/impala-server.h@35 PS25, Line 35: #include "gen-cpp/ImpalaInternalService.h" This can be removed. http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/service/impala-server.h@70 PS25, Line 70: class TReportExecStatusArgs; : class TReportExecStatusResult; Not your change but these can be removed. http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter-test.cc File be/src/util/bloom-filter-test.cc: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter-test.cc@75 PS25, Line 75: std::shared_ptr<RpcController Why shared_ptr ? http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter-test.cc@352 PS25, Line 352: std::shared_ptr Why shared_ptr ? http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter.cc File be/src/util/bloom-filter.cc: http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter.cc@125 PS25, Line 125: DCHECK_EQ(protobuf->always_false(), false); DCHECK(!protobuf->always_false()); http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter.cc@236 PS25, Line 236: &(directory_in[0]) directory_in http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter.cc@237 PS25, Line 237: &(directory_out[0]) directory_out http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter.cc@239 PS25, Line 239: &(directory_in[0]) directory_in http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter.cc@241 PS25, Line 241: &(directory_in[0] directory_in http://gerrit.cloudera.org:8080/#/c/13882/25/be/src/util/bloom-filter.cc@242 PS25, Line 242: &(directory_out[0]) directory_out http://gerrit.cloudera.org:8080/#/c/13882/25/common/protobuf/common.proto File common/protobuf/common.proto: http://gerrit.cloudera.org:8080/#/c/13882/25/common/protobuf/common.proto@43 PS25, Line 43: // This is a union over all possible return types. If we upgrade to proto3 above, we can use the oneof feature in protobuf3 if only one of the fields below is set at a time. This saves some memory. May be worth considering but could be a TODO. https://developers.google.com/protocol-buffers/docs/proto3#oneof -- To view, visit http://gerrit.cloudera.org:8080/13882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a Gerrit-Change-Number: 13882 Gerrit-PatchSet: 25 Gerrit-Owner: Fang-Yu Rao <fangyu....@cloudera.com> Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Comment-Date: Thu, 24 Oct 2019 01:35:21 +0000 Gerrit-HasComments: Yes