mridulm commented on code in PR #38959:
URL: https://github.com/apache/spark/pull/38959#discussion_r1067516221
##########
common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java:
##########
@@ -338,6 +342,15 @@ public String toString() {
.toString();
}
+ /**
+ * Exception thrown when sasl request times out.
+ */
+ public static class SaslTimeoutException extends RuntimeException {
+ public SaslTimeoutException(Throwable cause) {
+ super((cause));
+ }
+ }
Review Comment:
Remove this ?
##########
common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java:
##########
@@ -294,6 +294,15 @@ public void onFailure(Throwable e) {
}
}
+ /**
+ * Exception thrown when sasl request times out.
+ */
+ public static class SaslTimeoutException extends RuntimeException {
+ public SaslTimeoutException(Throwable cause) {
+ super((cause));
+ }
+ }
+
Review Comment:
Move this to ` org.apache.spark.network.sasl` package.
Also add the common constructors for an Exception -
`SaslTimeoutException()`, `SaslTimeoutException(String)`,
`SaslTimeoutException(String, Throwable)`, in addition to what we have here.
##########
common/network-common/src/main/java/org/apache/spark/network/sasl/SaslClientBootstrap.java:
##########
@@ -65,9 +67,18 @@ public void doBootstrap(TransportClient client, Channel
channel) {
SaslMessage msg = new SaslMessage(appId, payload);
ByteBuf buf = Unpooled.buffer(msg.encodedLength() + (int)
msg.body().size());
msg.encode(buf);
+ ByteBuffer response;
buf.writeBytes(msg.body().nioByteBuffer());
-
- ByteBuffer response = client.sendRpcSync(buf.nioBuffer(),
conf.authRTTimeoutMs());
+ try {
+ response = client.sendRpcSync(buf.nioBuffer(),
conf.authRTTimeoutMs());
+ } catch (RuntimeException ex) {
+ // We know it is a Sasl timeout here if it is a TimeoutException.
+ if (ex.getCause() instanceof TimeoutException) {
+ throw Throwables.propagate(new
TransportClient.SaslTimeoutException(ex.getCause()));
Review Comment:
Why `Throwables.propagate` ? Simply create and throw the new
`SaslTimeoutException` directly ?
##########
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RetryingBlockTransferor.java:
##########
@@ -192,8 +197,12 @@ private synchronized void initiateRetry() {
private synchronized boolean shouldRetry(Throwable e) {
boolean isIOException = e instanceof IOException
|| e.getCause() instanceof IOException;
+ boolean isSaslTimeout = enableSaslRetries &&
+ (e instanceof TransportClient.SaslTimeoutException ||
+ (e.getCause() != null && e.getCause() instanceof
TransportClient.SaslTimeoutException));
boolean hasRemainingRetries = retryCount < maxRetries;
- return isIOException && hasRemainingRetries &&
errorHandler.shouldRetryError(e);
+ return (isSaslTimeout || isIOException) &&
+ hasRemainingRetries && errorHandler.shouldRetryError(e);
Review Comment:
If we had sasl failures which triggered a retry - and we finally succeeded -
we should reset `retryCount` to `0`.
sasl is part of bootstrap, unlike other failures we retry for.
##########
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RetryingBlockTransferor.java:
##########
@@ -192,8 +197,12 @@ private synchronized void initiateRetry() {
private synchronized boolean shouldRetry(Throwable e) {
boolean isIOException = e instanceof IOException
|| e.getCause() instanceof IOException;
+ boolean isSaslTimeout = enableSaslRetries &&
+ (e instanceof TransportClient.SaslTimeoutException ||
+ (e.getCause() != null && e.getCause() instanceof
TransportClient.SaslTimeoutException));
Review Comment:
With the proposed changes, this will simply become:
```suggestion
(e instanceof TransportClient.SaslTimeoutException);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]