[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. IMPALA-3875: Thrift threaded server hang in some cases We use socket timeouts for our backend connections. We set these timeouts only after we've open()'d the connection, which ideally should be fine. However, our TSaslTransport stack does read()'s and write()'s over the network on an open(), which means that on a secure cluster we send and recieve non-TCP-handshake packets on open(). This is because the current code tries to establish a SASL handshake during open(). If for any reason the peer server does not respond to the read()'s during the open() call (after connect() is successful), the client will wait on read() indefinitely. This patch sets the socket timeout before we call open(), so that the read()'s and write()'s during the open() are subject to the timeout as well. We should also consider making a larger change where this SASL handshake does not take place during an open(), but instead after the open() call is completed, so as to have the open() semantics be the same for both secure and insecure clusters. Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Reviewed-on: http://gerrit.cloudera.org:8080/5263 Reviewed-by: Sailesh MukilTested-by: Internal Jenkins --- M be/src/rpc/thrift-client.h M be/src/runtime/client-cache.cc M be/src/statestore/statestore-test.cc 3 files changed, 14 insertions(+), 9 deletions(-) Approvals: Internal Jenkins: Verified Sailesh Mukil: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. Patch Set 4: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/585/ -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. Patch Set 3: Code-Review+2 (2 comments) Carry +2. http://gerrit.cloudera.org:8080/#/c/5263/3/be/src/runtime/client-cache.cc File be/src/runtime/client-cache.cc: PS3, Line 114: Status status = client_impl->socket_create_status(); > if (!client_impl->socket_create_status().ok()) ? Done PS3, Line 115: goto error; > Let's avoid goto unless it's the only good way to avoid very complex contro Done -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Hello Henry Robinson, Internal Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5263 to look at the new patch set (#4). Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. IMPALA-3875: Thrift threaded server hang in some cases We use socket timeouts for our backend connections. We set these timeouts only after we've open()'d the connection, which ideally should be fine. However, our TSaslTransport stack does read()'s and write()'s over the network on an open(), which means that on a secure cluster we send and recieve non-TCP-handshake packets on open(). This is because the current code tries to establish a SASL handshake during open(). If for any reason the peer server does not respond to the read()'s during the open() call (after connect() is successful), the client will wait on read() indefinitely. This patch sets the socket timeout before we call open(), so that the read()'s and write()'s during the open() are subject to the timeout as well. We should also consider making a larger change where this SASL handshake does not take place during an open(), but instead after the open() call is completed, so as to have the open() semantics be the same for both secure and insecure clusters. Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 --- M be/src/rpc/thrift-client.h M be/src/runtime/client-cache.cc M be/src/statestore/statestore-test.cc 3 files changed, 14 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/5263/4 -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Henry Robinson has posted comments on this change. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. Patch Set 3: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/5263/3/be/src/runtime/client-cache.cc File be/src/runtime/client-cache.cc: PS3, Line 114: Status status = client_impl->socket_create_status(); if (!client_impl->socket_create_status().ok()) ? PS3, Line 115: goto error; Let's avoid goto unless it's the only good way to avoid very complex control flow. Here, you can just leave it as it was. -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Hello Henry Robinson, Internal Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5263 to look at the new patch set (#3). Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. IMPALA-3875: Thrift threaded server hang in some cases We use socket timeouts for our backend connections. We set these timeouts only after we've open()'d the connection, which ideally should be fine. However, our TSaslTransport stack does read()'s and write()'s over the network on an open(), which means that on a secure cluster we send and recieve non-TCP-handshake packets on open(). This is because the current code tries to establish a SASL handshake during open(). If for any reason the peer server does not respond to the read()'s during the open() call (after connect() is successful), the client will wait on read() indefinitely. This patch sets the socket timeout before we call open(), so that the read()'s and write()'s during the open() are subject to the timeout as well. We should also consider making a larger change where this SASL handshake does not take place during an open(), but instead after the open() call is completed, so as to have the open() semantics be the same for both secure and insecure clusters. Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 --- M be/src/rpc/thrift-client.h M be/src/runtime/client-cache.cc M be/src/statestore/statestore-test.cc 3 files changed, 14 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/5263/3 -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. Patch Set 2: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/580/ -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. Patch Set 2: Code-Review+2 Rebase, carry +2. -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Henry Robinson has posted comments on this change. Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases
Sailesh Mukil has uploaded a new change for review. http://gerrit.cloudera.org:8080/5263 Change subject: IMPALA-3875: Thrift threaded server hang in some cases .. IMPALA-3875: Thrift threaded server hang in some cases We use socket timeouts for our backend connections. We set these timeouts only after we've open()'d the connection, which ideally should be fine. However, our TSaslTransport stack does read()'s and write()'s over the network on an open(), which means that on a secure cluster we send and recieve non-TCP-handshake packets on open(). This is because the current code tries to establish a SASL handshake during open(). If for any reason the peer server does not respond to the read()'s during the open() call (after connect() is successful), the client will wait on read() indefinitely. This patch sets the socket timeout before we call open(), so that the read()'s and write()'s during the open() are subject to the timeout as well. We should also consider making a larger change where this SASL handshake does not take place during an open(), but instead after the open() call is completed, so as to have the open() semantics be the same for both secure and insecure clusters. Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 --- M be/src/runtime/client-cache.cc 1 file changed, 5 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/5263/1 -- To view, visit http://gerrit.cloudera.org:8080/5263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil