Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14662 )
Change subject: IMPALA-9128: improved diagnostics for slow data stream RPCs.
......................................................................


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/14662/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14662/1//COMMIT_MSG@16
PS1, Line 16: a typo


http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/common/global-flags.cc@291
PS1, Line 291: DEFINE_int64(impala_slow_rpc_threshold_ms, 2 * 60 * 1000,
(Advanced)? or maybe note in the "help" output that lowering the value may result in a lot of false positives


http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.h
File be/src/runtime/krpc-data-stream-sender.h:

http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.h@225
PS1, Line 225: RuntimeProfile::SummaryStatsCounter* recvr_time_stats_ = nullptr;
Does this provide us with enough info to determine if slow rpcs from the receiver's perspective are the result of Impala issues or krpc issues? 'receiver_latency_ns' looks like it probably includes both Impala processing time and potential krpc queueing/processing time.

Maybe it would be helpful to add some similar tracking on the receiver side itself?

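To make the receiver-side suggestion above concrete, here is a rough sketch of what separate tracking could look like: record when an RPC is queued, when its handler starts, and when it responds, then report queue time and processing time separately. All names below (ReceiverTimestampsNs, ReceiverLatencyBreakdown, SplitReceiverLatency) are hypothetical and are not taken from Impala or KRPC; where such timestamps could actually be captured is an open question for the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only; not Impala or KRPC identifiers.
struct ReceiverTimestampsNs {
  int64_t enqueued_ns;   // RPC arrived and was queued by the RPC layer
  int64_t dequeued_ns;   // receiver-side handler started running
  int64_t responded_ns;  // handler sent its response
};

struct ReceiverLatencyBreakdown {
  int64_t queue_ns;       // time spent waiting before the handler ran
  int64_t processing_ns;  // time spent inside the handler
  int64_t total_ns() const { return queue_ns + processing_ns; }
};

// Splits the total receiver-side latency into its two components so that a
// slow RPC can be attributed to queueing or to handler processing.
inline ReceiverLatencyBreakdown SplitReceiverLatency(const ReceiverTimestampsNs& t) {
  assert(t.enqueued_ns <= t.dequeued_ns && t.dequeued_ns <= t.responded_ns);
  return {t.dequeued_ns - t.enqueued_ns, t.responded_ns - t.dequeued_ns};
}
```

With a breakdown like this, a slow-RPC log line on the receiver could say which component dominated, instead of reporting one combined latency.
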
http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.cc@342
PS1, Line 342: LOG(INFO) << "slow " << rpc_name << " RPC to " << TNetworkAddressToString(address_)
nit: Slow


http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.cc@361
PS1, Line 361: int64_t elapsed_time_ms = elapsed_time_ns / NANOS_PER_MICRO / MICROS_PER_MILLI;
             : if (elapsed_time_ms > FLAGS_impala_slow_rpc_threshold_ms) {
IsSlowRpc


http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.cc@453
PS1, Line 453: DoRpcFn rpc_fn =
Do we want to collect these timing stats for failed rpcs too?


http://gerrit.cloudera.org:8080/#/c/14662/1/be/src/runtime/krpc-data-stream-sender.cc@565
PS1, Line 565: DoRpcFn rpc_fn =
Same as above: do we want these stats for failed rpcs?


--
To view, visit http://gerrit.cloudera.org:8080/14662
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I258ac91b9fbbdbc86d0e8091c34f511f8957c4cd
Gerrit-Change-Number: 14662
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Fri, 08 Nov 2019 00:51:47 +0000
Gerrit-HasComments: Yes
