Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16412 )

Change subject: IMPALA-9930 (part 2): Introduce new admission control rpc 
service
......................................................................


Patch Set 5:

(22 comments)

still going through it all, but my comments so far.

http://gerrit.cloudera.org:8080/#/c/16412/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16412/5//COMMIT_MSG@12
PS5, Line 12: This patch adds some simple configuration flags that make it 
possible
not sure I understand the plan here. this patch adds flags is 
--is_admission_controller to exec-env, which if set will expose the 
AdmissionControllerService. what about this will change exactly in IMPALA-9975?


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h
File be/src/scheduling/admission-control-service.h:

http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h@76
PS5, Line 76: AdmitQueryRequstPB
nit: typo


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h@78
PS5, Line 78:     UniqueIdPB query_id;
            :     UniqueIdPB coord_id;
can these both be const ref?


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h@105
PS5, Line 105:     RuntimeProfile* summary_profile;
might be worth mentioning that this is passed to 
AdmissionController::SubmitForAdmission, which is responsible for updating it.


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.h@118
PS5, Line 118: fo
nit: typo


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc
File be/src/scheduling/admission-control-service.cc:

http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@85
PS5, Line 85: Adqmission
nit: typo


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@142
PS5, Line 142:   VLOG(1) << "GetQueryStatus " << req->query_id();
might be too verbose to log.


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@155
PS5, Line 155: 100
why wait at all? won't waiting tie up one of the RPC threads? I think the 
client already waits 100 ms.


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@159
PS5, Line 159:           DCHECK(query_info->admit_status.ok());
is this expected? do want to DCHECK if the admit_status is not Status::OK()? if 
AC rejects the query due to cancellation, won't it returned a Status::CANCELLED?


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@197
PS5, Line 197:     lock_guard<mutex> l(query_info->lock);
why does the lock need to be acquired here?


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@209
PS5, Line 209:   VLOG(1) << "ReleaseQueryBackends: query_id=" << 
req->query_id();
might be too verbose to log


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@219
PS5, Line 219:     lock_guard<mutex> l(query_info->lock);
why does the lock need to be acquired here?


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-control-service.cc@249
PS5, Line 249:             {query_info->query_id, query_info->coord_id, 
query_info->query_exec_request,
             :                 query_info->query_options, 
query_info->summary_profile,
             :                 query_info->blacklisted_executor_addresses},
might make the code easier to read if this just created a AdmissionRequest and 
then passed it in.


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-controller.h@316
PS5, Line 316: (usually they
             :   /// are owned by the ClientRequestState).
might need to revise this now.


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/admission-controller.h@345
PS5, Line 345:   Status WaitOnQueued(const UniqueIdPB& query_id,
not clear to me what the returned Status is suppose to indicate.


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/remote-admission-control-client.h
File be/src/scheduling/remote-admission-control-client.h:

http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/remote-admission-control-client.h@53
PS5, Line 53:   std::mutex lock_;
what is the lock used to protect?


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/remote-admission-control-client.cc
File be/src/scheduling/remote-admission-control-client.cc:

http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/remote-admission-control-client.cc@63
PS5, Line 63:   KrpcSerializer serializer;
            :   int sidecar_idx1;
            :   RETURN_IF_ERROR(
            :       serializer.SerializeToSidecar(&request.request, 
&rpc_controller, &sidecar_idx1));
            :   req.set_query_exec_request_sidecar_idx(sidecar_idx1);
            :
            :   KrpcSerializer serializer2;
            :   int sidecar_idx2;
            :   RETURN_IF_ERROR(serializer2.SerializeToSidecar(
            :       &request.query_options, &rpc_controller, &sidecar_idx2));
            :   req.set_query_options_sidecar_idx(sidecar_idx2);
isn't TQueryOptions accessible from a TQueryExecRequest?

TQueryExecRequest --> ImpalaInternalService.TQueryCtx --> TClientRequest --> 
TQueryOptions


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/scheduling/remote-admission-control-client.cc@86
PS5, Line 86:     KUDU_RETURN_IF_ERROR(
            :         proxy->AdmitQuery(req, &resp, &rpc_controller), 
"AdmitQuery rpc failed");
            :     Status admit_status(resp.status());
            :     RETURN_IF_ERROR(admit_status);
is it necessary to hold the lock while making the RPC to Admit the query? that 
seems like it could limit throughput since only one query can be submitted at a 
time.


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/service/impala-http-handler.cc@1057
PS5, Line 1057: void ImpalaHttpHandler::AdmissionStateHandler(
perhaps this will done in another patch, but will the /admission endpoint on 
the coordinator be moved from the coordinator web ui to the 
admission-controller-service web ui?


http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/util/sharded-query-map-util.cc
File be/src/util/sharded-query-map-util.cc:

http://gerrit.cloudera.org:8080/#/c/16412/5/be/src/util/sharded-query-map-util.cc@21
PS5, Line 21: #include "scheduling/admission-control-service.h"
seems odd that this needs to be imported here. can the templates be moved into 
the admission-control-service?

same applies to the query-driver


http://gerrit.cloudera.org:8080/#/c/16412/5/common/protobuf/admission_control_service.proto
File common/protobuf/admission_control_service.proto:

http://gerrit.cloudera.org:8080/#/c/16412/5/common/protobuf/admission_control_service.proto@162
PS5, Line 162: TQueryExwcRequest
nit: typo


http://gerrit.cloudera.org:8080/#/c/16412/5/common/protobuf/admission_control_service.proto@226
PS5, Line 226: service AdmissionControlService {
it might be nice to have Tim and/or Bikram review the service definition, since 
its a key part of the change.



--
To view, visit http://gerrit.cloudera.org:8080/16412
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I594fc593a27b24b6952e381a9bc1a9a5c6b757ae
Gerrit-Change-Number: 16412
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Wed, 30 Sep 2020 19:33:18 +0000
Gerrit-HasComments: Yes

Reply via email to