Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12672 )
Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). ...................................................................... Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc File be/src/rpc/rpc-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@333 PS5, Line 333: // Call DoRpcWithRetry with no retry on our busy service. : const int retries = 3; : PingRequestPB request3; : PingResponsePB response3; : impala::Status rpc_status = RpcMgr::DoRpcWithRetry(proxy, &PingServiceProxy::Ping, : request3, &response3, query_ctx, "ping failed", retries, 100 * MILLIS_PER_SEC); : EXPECT_ERROR(rpc_status, TErrorCode::GENERAL); : EXPECT_STR_CONTAINS(rpc_status.msg().msg(), : "dropped due to backpressure. The service queue contains 1 items out of a maximum " : "of 1; memory consumption is "); : ASSERT_EQ(overflow_metric->GetValue(), retries); : : // Call DoRpcWithRetry, but this time we do retey on a busy service, and this waiting : // allows the outstanding async calls to complete, so that then this call will succeed. : PingRequestPB request4; : PingResponsePB response4; : impala::Status rpc_status_backoff = : RpcMgr::DoRpcWithRetry(proxy, &PingServiceProxy::Ping, request4, &response4, : query_ctx, "ping failed", 10, 100 * MILLIS_PER_SEC, 2 * MILLIS_PER_SEC); : ASSERT_OK(rpc_status_backoff); : ASSERT_GT(overflow_metric->GetValue( > I reviewed the test and tried to make to more bulletproof. I think any test How about the following idea: - the RPC handler can share some sort of condition variable or barrier with the main test. The test can then indirectly control when the RPC handler will complete by signaling those condition variables. In this way, as long as the main thread doesn't signal the CV, the first RPC will never finish and the second RPC will also block forever. This makes the test not timing dependent and less prone to failure due to random timing issue. What do you think ? http://gerrit.cloudera.org:8080/#/c/12672/7/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/12672/7/be/src/rpc/rpc-mgr.h@178 PS7, Line 178: // A no-op method that is used as part of overloading DoRpcWithRetry(). : static void logging_noop() {} : : // The type of the log method passed to DoRpcWithRetry(). : typedef boost::function<void()> RpcLogFn; Not needed. http://gerrit.cloudera.org:8080/#/c/12672/7/be/src/rpc/rpc-mgr.h@186 PS7, Line 186: timeout timeout_ms http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@178 PS5, Line 178: // A no-op method that is used as part of overloading DoRpcWithRetry(). : static void logging_noop() {} : : // The type of the log method passed to DoRpcWithRetry(). : typedef boost::function<void()> RpcLogFn; > Removed They still seem to be in PS7 http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@201 PS5, Line 201: : private: : /// One pool per registered service. scoped_refptr<> is dictated by the Kudu interface. : std::vector<scoped_refptr<ImpalaServicePool>> service_pools_; : > OK I'll leave this alone and Thomas will clean up the mess? :-) Sounds good to me. -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew Sherman <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Thomas Marshall <[email protected]> Gerrit-Comment-Date: Tue, 02 Apr 2019 02:08:54 +0000 Gerrit-HasComments: Yes
