Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13882 )

Change subject: IMPALA-7984: Port runtime filter from Thrift RPC to KRPC
......................................................................


Patch Set 19:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/rpc/thrift-server-test.cc
File be/src/rpc/thrift-server-test.cc:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/rpc/thrift-server-test.cc@22
PS19, Line 22: #include "gen-cpp/ImpalaInternalService.h"
Why is this necessary?


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator-backend-state.cc@572
PS19, Line 572: rpc_status.message().ToString()
I think you can just call ToString() directly on the kudu::Status without 
calling .message()


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator-backend-state.cc@573
PS19, Line 573: controller.Reset();
Not needed, since we don't reuse the controller


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator-backend-state.cc@576
PS19, Line 576: res.status().error_msgs(0)
We'll want to print the status code and other error messages. Easiest way is to 
just construct a regular Status object from the StatusPB, i.e. replace this 
with something like "Status(res.status()).GetDetail()"

That's of course less efficient, but this shouldn't happen very often


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.h@44
PS19, Line 44: class RpcController;
not used


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1070
PS19, Line 1070: if (rpc_params.has_bloom_filter()) {
               :         if (!rpc_params.bloom_filter().always_false()
               :             && !rpc_params.bloom_filter().always_true()) {
combine these with '&&'


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1107
PS19, Line 1107: } else if (!params->bloom_filter().always_false()) {
probably more straightforward to flip the conditions, i.e. put the 
always_false() block here and put this block in the 'else'


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1118
PS19, Line 1118: return;
I don't think we want to just return here - we still need to hit the "if 
(pending_count_ == 0..." below


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1122
PS19, Line 1122: // There are two cases in which 'bloom_filter_.always_false()'
               :         // evaluates to true: (1) 'bloom_filter_' has never 
been initialized
               :         // by any incoming Bloom filter previously, (2) 
'bloom_filter_' has
               :         // been initialized to an always false Bloom filter by 
an incoming
               :         // Bloom filter previously. We note that
               :         // 'bloom_filter_.has_log_bufferpool_space()' 
evaluates to false when
               :         // 'bloom_filter_' has never been initialized by any 
incoming Bloom
               :         // filter and to true otherwise.
               :         // In either case, we will always perform the swap 
operation.
I think this whole comment is unnecessary


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1141
PS19, Line 1141: // We want to move the payload from the request rather than 
copy it and
               :           // take double the memory cost.
this can be removed


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1157
PS19, Line 1157: non_const_filter = 
&const_cast<BloomFilterPB&>(params->bloom_filter());
might as well put this is in the 'if' for clarity


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1158
PS19, Line 1158: // Even if the current incoming Bloom filter is an always 
false filter,
               :       // we still need to perform the swap operation to 
properly initialize
               :       // the field of 'log_bufferpool_space_' if 
'bloom_filter_' has not been
               :       // initialized by any incoming Bloom filter yet. 
Otherwise we will hit a
               :       // DCHECK error in 'buffer-allocator.cc' that requires 
that the buffer
               :       // length of the directory associated with the Bloom 
filter to be
               :       // published is greater than some lower bound when 
attempting to call
               :       // BloomFilter::Init() in 
RuntimeFilterBank::PublishGlobalFilter().
Unnecessarily long comment. Lets just put a comment inside the 'if' that says 
something like 'bloom_filter_ hasn't been initialized yet, so do so now'


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/coordinator.cc@1167
PS19, Line 1167: // We want to move the payload from the request rather than 
copy it and
               :         // take double the memory cost.
This can be removed. It was referring to swapping the directory to avoid a 
large copy, but the directory isn't being swapped here.


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/data-stream-test.cc@127
PS19, Line 127:
nit: extra whitespace. You can just remove all of it and do '{}'


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/fragment-instance-state.h@48
PS19, Line 48: using kudu::rpc::RpcContext;
don't use 'using' in headers


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/krpc-data-stream-recvr.cc@469
PS19, Line 469: LOG(INFO) << "KrpcDataStreamRecvr::SenderQueue::AddBatch(), "
              :           "rpc_context->GetTransferSize(): " << 
rpc_context->GetTransferSize();
I'm guessing you did this for debugging? It should be removed


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/krpc-data-stream-recvr.cc@569
PS19, Line 569:     LOG(INFO) << 
"KrpcDataStreamRecvr::SenderQueue::TakeOverEarlySender, "
same here


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/query-state.h@45
PS19, Line 45: using kudu::rpc::RpcContext;
don't use using in header files


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/runtime-filter-bank.h
File be/src/runtime/runtime-filter-bank.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/runtime-filter-bank.h@156
PS19, Line 156: num_inflight_rpcs_
nit: when referring to variables in comments, we put single quotes around them


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/runtime/runtime-filter-bank.cc@121
PS19, Line 121: DCHECK_EQ(res->status().status_code(), TErrorCode::OK);
Brief comment about why this is correct, i.e. "UpdateFilter should never set an 
error status"


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/service/data-stream-service.h
File be/src/service/data-stream-service.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/service/data-stream-service.h@23
PS19, Line 23: //#include "kudu/rpc/rpc_controller.h"
remove


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/service/data-stream-service.h@71
PS19, Line 71: ::
nit: remove this, here and elsewhere, to be consistent with the rest of this 
file


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/service/data-stream-service.cc
File be/src/service/data-stream-service.cc:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/service/data-stream-service.cc@137
PS19, Line 137: LOG(ERROR)
I think this isn't necessarily an error, eg. it can happen if the query was 
cancelled, so lets do LOG(INFO)


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/service/impala-server.h@60
PS19, Line 60: using kudu::rpc::RpcContext;
don't use 'using' in header files


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/util/bloom-filter.h@43
PS19, Line 43: class Coordinator;
not used


http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/13882/19/be/src/util/min-max-filter.h@22
PS19, Line 22: #include "gen-cpp/control_service.pb.h"
shouldn't this be 'data_stream_service'?



--
To view, visit http://gerrit.cloudera.org:8080/13882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a
Gerrit-Change-Number: 13882
Gerrit-PatchSet: 19
Gerrit-Owner: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Tue, 17 Sep 2019 21:50:43 +0000
Gerrit-HasComments: Yes

Reply via email to