Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14641 )
Change subject: IMPALA-8138: Reintroduce rpc debugging options ...................................................................... Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/14641/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14641/3//COMMIT_MSG@26 PS3, Line 26: IMPALA_SERVICE_P > IMPALA_SERVICE_POOL? Done http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/impala-service-pool.cc@207 PS3, Line 207: FailAndReleaseRpc > perhaps not in this patch, but would it be good to have an option to just c For sure. Lots of potential for additional testing in this basic area. http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/rpc/rpc-mgr.cc@193 PS3, Line 193: // Convert 'address_' to Kudu's Sockaddr > move this to Init? since that is when address is passed in? Done http://gerrit.cloudera.org:8080/#/c/14641/2/be/src/util/debug-util.cc File be/src/util/debug-util.cc: http://gerrit.cloudera.org:8080/#/c/14641/2/be/src/util/debug-util.cc@395 PS2, Line 395: } : string error_msg = tokens.size() == 3 ? : tokens[2] : > Yeah, didn't see the check for "iequals(cmd, "FAIL")" above. Done http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/util/debug-util.cc File be/src/util/debug-util.cc: http://gerrit.cloudera.org:8080/#/c/14641/3/be/src/util/debug-util.cc@400 PS3, Line 400: if (ImpaladMetrics::DEBUG_ACTION_NUM_FAIL != nullptr) { > this should never be nullptr right? add a DCHECK asserting it is never a nu No - as currently written, this patch only creates this if FLAGS_debug_actions is set, but its possible there could still be a debug_action set as a query option instead. I was thinking that if we want these metrics for tests that use the debug_action query option, we could do something like extend TestInfo to support is_ee_test() via a flag that gets passed in to start-impala-cluster.py in run-all-tests.sh or something like that, I just hadn't done it because its not required for this patch. http://gerrit.cloudera.org:8080/#/c/14641/2/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/14641/2/common/thrift/ImpalaService.thrift@92 PS2, Line 92: <global label>:<arg0>:...:<argN>:<command>@<param0>@<param1>@...<paramN>" > Yeah, maybe passing in a path to a file that contains JSON is the right way Done http://gerrit.cloudera.org:8080/#/c/14641/3/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/14641/3/common/thrift/ImpalaService.thrift@101 PS3, Line 101: ability>][ > should mention that the DebugAction might change code paths depending on th Done http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@63 PS3, Line 63: > is this used anywhere? Done http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@63 PS3, Line 63: execute_test_query > Not sure how easy this would be, more of a suggestion / something to think Sure. One issue of course is that there may be different behavior depending on if it fails the first time or a subsequent time, and we ideally want to test both. I thought about ways to make this more controllable, like maybe you could specify a number of times the rpc should succeed before it starts to fail, but its hard to come up with something that covers all possible cases and isn't super complicated. For now I think its best to keep it fairly simple and rely on a combination of randomness and running a bunch of times to ensure good coverage. And, all of the tests here have been designed where we really expect that they'll only run this loop once most of the time. http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@67 PS3, Line 67: # times. Each test in this file has at least a 50% chance of hitting the action per > should there be a limit or a timeout to the number of times a query is atte Done http://gerrit.cloudera.org:8080/#/c/14641/3/tests/custom_cluster/test_rpc_exception.py@76 PS3, Line 76: except ImpalaBeeswaxException as e: > is this necessary? if the while loop exits this should always be true, righ That was true in this version, but with adding a limit to the number of iterations above, I'll leave this here. -- To view, visit http://gerrit.cloudera.org:8080/14641 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c047ebce6d32c5ae461f70279391fa2df4c2029 Gerrit-Change-Number: 14641 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com> Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Comment-Date: Wed, 13 Nov 2019 19:51:13 +0000 Gerrit-HasComments: Yes