Riza Suminto has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19157 )
Change subject: IMPALA-11674: Fix timeout detection for TSSLSocket
......................................................................

IMPALA-11674: Fix timeout detection for TSSLSocket

The functions IsPeekTimeoutTException() and IsReadTimeoutTException() in
be/src/rpc/thrift-util.cc make assumptions about the implementation of
read(), peek(), write() and write_partial() in TSocket.cpp and
TSSLSocket.cpp. The functions read() and peek() in TSSLSocket.cpp were
changed in Thrift versions 0.11.0 and 0.16.0 to throw a different
exception on timeout. As a result, IsPeekTimeoutTException() and
IsReadTimeoutTException() returned the wrong value after the Thrift
upgrade. This in turn caused TAcceptQueueServer::Peek() to rethrow the
exception to its caller, TAcceptQueueServer::run(), which then closed the
connection, ignoring the idle_session_timeout query option.

The issue was reproducible through the following scenario:

1. From the local development environment, start the Impala cluster with
   SSL enabled and idle_client_poll_period_s set to 5 seconds.

   export CERT_DIR="$IMPALA_HOME/be/src/testutil"
   export SSL_ARGS="--ssl_client_ca_certificate=$CERT_DIR/server-cert.pem --ssl_server_certificate=$CERT_DIR/server-cert.pem --ssl_private_key=$CERT_DIR/server-key.pem --hostname=localhost"
   ./bin/start-impala-cluster.py --state_store_args="$SSL_ARGS" \
     --catalogd_args="$SSL_ARGS" \
     --impalad_args="$SSL_ARGS --idle_client_poll_period_s=5"

2. Run impala-shell with a higher idle_session_timeout query option.

   impala-shell.sh --ssl -Q idle_session_timeout=100

3. Run a simple query like "show databases" and rerun it after 15 seconds
   have passed.

The second query run fails with the following error message in
impala-shell:

   [localhost:21050] default> show databases;
   Caught exception TLS/SSL connection has been closed (EOF) (_ssl.c:1829), type=<class 'ssl.SSLZeroReturnError'> in CloseSession.
   Warning: close session RPC failed: TLS/SSL connection has been closed (EOF) (_ssl.c:1829), <class 'ssl.SSLZeroReturnError'>

This patch fixes the expected error messages in IsReadTimeoutTException()
and IsPeekTimeoutTException() so that they correctly detect timeout errors
from TSSLSocket. Additionally, this patch fixes a typo in
NEW_THRIFT_VERSION_MSG.

Testing:
- Repeated the scenario manually, with and without SSL, and confirmed
  that the second query completes without error.
- Added test_thrift_socket.py to begin verifying the
  IsPeekTimeoutTException() function.

Change-Id: I6ad168a1c96d751a3c50d924e6ecaf6404e589ab
Reviewed-on: http://gerrit.cloudera.org:8080/19157
Reviewed-by: Wenzhe Zhou <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
---
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/thrift-util.cc
A tests/custom_cluster/test_thrift_socket.py
3 files changed, 116 insertions(+), 9 deletions(-)

Approvals:
  Wenzhe Zhou: Looks good to me, but someone else must approve
  Impala Public Jenkins: Verified
  Zoltan Borok-Nagy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6ad168a1c96d751a3c50d924e6ecaf6404e589ab
Gerrit-Change-Number: 19157
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
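
For readers unfamiliar with the failure mode, the message-matching logic described in the commit message can be illustrated with a small Python sketch. This is not the Impala or Thrift code. TransportError, TIMEOUT_MARKERS, is_timeout_error(), peek() and failing_read() are hypothetical names, and the marker strings are placeholders rather than the real text thrown by TSocket.cpp or TSSLSocket.cpp.

```python
class TransportError(Exception):
    """Hypothetical stand-in for Thrift's TTransportException."""


# Placeholder fragments only (assumption): the real strings come from
# Thrift's TSocket.cpp and TSSLSocket.cpp and differ between versions.
TIMEOUT_MARKERS = (
    "plain socket timed out",  # assumed marker for the non-SSL socket
    "ssl socket timed out",    # assumed marker for the SSL socket
)


def is_timeout_error(err):
    """Return True if the exception text contains a known timeout marker."""
    return any(marker in str(err) for marker in TIMEOUT_MARKERS)


def peek(read_fn):
    """Return read_fn()'s result, or False if the read merely timed out.

    Any other transport error propagates, so the caller can close the
    connection.
    """
    try:
        return read_fn()
    except TransportError as err:
        if is_timeout_error(err):
            return False  # idle client: keep the connection open
        raise


def failing_read(message):
    """Build a read function that always raises TransportError(message)."""
    def _read():
        raise TransportError(message)
    return _read
```

In this sketch, dropping the SSL entry from TIMEOUT_MARKERS makes peek(failing_read("ssl socket timed out")) rethrow instead of returning False. That mirrors the bug: a timeout whose message no longer matches is treated as a fatal error and the idle connection is closed.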
