Michael Ho has uploaded a new patch set (#4).

Change subject: IMPALA-5537: Retry RPC on some exceptions with SSL connection
......................................................................

IMPALA-5537: Retry RPC on some exceptions with SSL connection

After the fix for IMPALA-5388, every TSSLException thrown is treated as a fatal error and the query fails. It turns out that this is too strict: in a secure cluster under load, queries can easily hit a timeout while waiting for an RPC response.

When running without SSL, we call RetryRpcRecv() to retry the recv part of an RPC if the TSocket underlying the RPC gets an EAGAIN during recv(). This change extends that logic to cover secure connections. In particular, we pattern match against the exception string "SSL_read: Resource temporarily unavailable", which corresponds to the EAGAIN error code being thrown in the SSL_read() path.

Similarly, we handle a closed connection in the send() path of a secure connection by pattern matching against the exception string "TTransportException: Transport not open". To verify that the exception is thrown during the send part of an RPC call, the RPC client interface has been augmented to take a bool* argument which is set to true after the send part of the RPC has completed but before the recv part starts. If DoRPC() catches an exception and the send part isn't done yet, the entire RPC is retried if the exception string matches certain substrings which are safe to retry.

The fault injection utility has also been updated to distinguish between a timeout and a lost connection, in order to exercise the different error handling paths in the send and recv paths.
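
The retry decision described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this patch: DoRpcSketch, RpcOutcome, the two Is...() helpers and the callback parameters are hypothetical names, and only the two exception substrings and the bool* "send done" flag are taken from the commit message. It also assumes a single retry attempt per RPC.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helpers; only the matched substrings come from the commit message.
bool IsRecvTimeout(const std::string& msg) {
  return msg.find("SSL_read: Resource temporarily unavailable") != std::string::npos;
}

bool IsClosedConnection(const std::string& msg) {
  return msg.find("TTransportException: Transport not open") != std::string::npos;
}

// Hypothetical result type, used here only to make the chosen path observable.
enum class RpcOutcome { kOk, kRetriedWholeRpc, kRetriedRecv, kFailed };

// rpc:        performs send then recv, setting *send_done = true in between.
// reopen:     re-establishes the connection before the whole RPC is resent.
// retry_recv: repeats only the recv part of the RPC.
RpcOutcome DoRpcSketch(const std::function<void(bool*)>& rpc,
                       const std::function<void()>& reopen,
                       const std::function<void()>& retry_recv) {
  bool send_done = false;
  try {
    rpc(&send_done);
    return RpcOutcome::kOk;
  } catch (const std::exception& e) {
    const std::string msg = e.what();
    try {
      if (!send_done && IsClosedConnection(msg)) {
        // The request was never fully sent, so resending the whole RPC is safe.
        reopen();
        send_done = false;
        rpc(&send_done);
        return RpcOutcome::kRetriedWholeRpc;
      }
      if (send_done && IsRecvTimeout(msg)) {
        // The request went out; only the wait for the response timed out.
        retry_recv();
        return RpcOutcome::kRetriedRecv;
      }
    } catch (const std::exception&) {
      // The single retry attempt failed as well; fall through to failure.
    }
    // Any other exception, or a matching one in the wrong phase, stays fatal.
    return RpcOutcome::kFailed;
  }
}
```

The point of the flag in this sketch is that a "Transport not open" error seen after the send has completed is not retried, since the receiver may already have processed the request.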

Change-Id: I8243d4cac93c453e9396b0e24f41e147c8637b8c
---
A be/src/catalog/catalog-service-client-wrapper.h
M be/src/exec/catalog-op-executor.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-util.cc
M be/src/runtime/backend-client.h
M be/src/runtime/client-cache-types.h
M be/src/runtime/client-cache.h
M be/src/service/client-request-state.cc
A be/src/statestore/statestore-service-client-wrapper.h
A be/src/statestore/statestore-subscriber-client-wrapper.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/testutil/fault-injection-util.cc
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_exception.py
17 files changed, 387 insertions(+), 85 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/7229/4
--
To view, visit http://gerrit.cloudera.org:8080/7229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8243d4cac93c453e9396b0e24f41e147c8637b8c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
