Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12260 )
Change subject: IMPALA-7985: Port RemoteShutdown() to KRPC. ...................................................................... Patch Set 3: (10 comments) Thanks Michael for comments http://gerrit.cloudera.org:8080/#/c/12260/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12260/1//COMMIT_MSG@13 PS1, Line 13: impalad > typo Done http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/client-request-state.cc@653 PS2, Line 653: // KRPC relies on resolved IP address, so convert hostname. : IpAddr ip_address; : Status ip_status = HostnameToIpAddr(request.backend.hostname, &ip_address); : if (!ip_status.ok()) { : VLOG(1) << "Could not convert hostname " << request.backend.hostname : << " to ip address, error: " << ip_status.GetDetail(); : return ip_status; : } : TNetworkAddress addr = MakeNetworkAddress(ip_address, port); > Instead of resolving the IP for the backend, it seems more consistent to re In my original (unpublished) prototype I was keeping the original UX where the user specified the backed thrift port. I then exposed LookUpBackendDesc and used it to find the thrift port of the impalad. As discussed in the Jira, the trouble with this is that it relies on the current cluster membership being correct. The corollary is that the user can't shutdown a failing remote impalad that has left the cluster. So I'm not sure that it is best to rely on the Scheduler's knowledge of the world here. http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h File be/src/service/control-service.h: http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h@79 PS2, Line 79: /// Retry the Rpc 'rpc_call' up to 3 times. : /// Pass 'debug_action' to DebugAction() to potentially inject errors. : template <typename F> : static Status DoRpcWithRetry(F&& rpc_call, const TQueryCtx& query_ctx, : const char* debug_action, const char* error_msg) { > Not specifically related to this change but it may be useful to factor this Thanks. The idea (suggested by Thomas) was for DoRpcWithRetry to be reusable. The KrpcDataStreamSender code looks very specific to its class, but the QueryState code may be able to use DoRpcWithRetry. The flexibility of DoRpcWithRetry is that it uses a lambda, but that may make that QueryState sidecar code tricky as the fun part of the lambda is the capture clause. Maybe I should look at this in a separate jira? http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.h@87 PS2, Line 87: MonoDelta::FromSeconds(10) > Not sure if it makes sense to hard code the timeout here as it seems to be Yes, I wondered about this too. I decided that while my 2 clients are using the same numbers that I wouldn't expose these as parameters, but it is easy to do if we want. http://gerrit.cloudera.org:8080/#/c/12260/1/be/src/service/control-service.h File be/src/service/control-service.h: http://gerrit.cloudera.org:8080/#/c/12260/1/be/src/service/control-service.h@81 PS1, Line 81: name F> > typo ? DoRpcWithRetry ? Done http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.cc@182 PS2, Line 182: Remot > Why is this needed ? It's not, I removed it http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/control-service.cc@183 PS2, Line 183: Remot > Why is this needed ? Done http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/impala-server.cc@2435 PS2, Line 2435: "queries registered on coordinator: $2, queries executing: $3, " > The original formatting seems fine to me. I'm must doing what clang-format wants :-) http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/impala-server.cc@2481 PS2, Line 2481: if (set_deadline) { : shutdown_status->set_deadline_remaining_ms(relative_deadline_s * 1000L); > Coding style nit: Done http://gerrit.cloudera.org:8080/#/c/12260/2/be/src/service/impala-server.cc@2489 PS2, Line 2489: SleepForMs(1000) > const ShutdownStatusPB& Is that safe? GetShutdownStatus() allocates the ShutdownStatusPB on the stack so I think I want to return it by value. -- To view, visit http://gerrit.cloudera.org:8080/12260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e Gerrit-Change-Number: 12260 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Thomas Marshall <[email protected]> Gerrit-Comment-Date: Tue, 29 Jan 2019 18:52:23 +0000 Gerrit-HasComments: Yes
