[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in the last retry IO wait
URL: https://github.com/apache/spark/pull/27943#discussion_r397623831

## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ##
@@ -112,6 +114,7 @@ public TransportClientFactory(
     }
     this.metrics = new NettyMemoryMetrics(
       this.pooledAllocator, conf.getModuleName() + "-client", conf);
+    fastFailTimeWindow = conf.ioRetryWaitTimeMs() * 0.95;

Review comment:
thanks

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in the last retry IO wait
URL: https://github.com/apache/spark/pull/27943#discussion_r397566176

## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ##
@@ -192,7 +193,20 @@ public TransportClient createClient(String remoteHost, int remotePort)
           logger.info("Found inactive connection to {}, creating a new one.", resolvedAddress);
         }
       }
-      clientPool.clients[clientIndex] = createClient(resolvedAddress);
+      double fastFailTimeWindow = conf.ioRetryWaitTimeMs() * 0.95;

Review comment:
thanks
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in the last retry IO wait
URL: https://github.com/apache/spark/pull/27943#discussion_r394829873

## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ##
@@ -192,7 +194,19 @@ public TransportClient createClient(String remoteHost, int remotePort)
           logger.info("Found inactive connection to {}, creating a new one.", resolvedAddress);
         }
       }
-      clientPool.clients[clientIndex] = createClient(resolvedAddress);
+      if (System.currentTimeMillis() - clientPool.lastConnectionFailed[clientIndex]

Review comment:
I think I get your point. Do you mean that we need to define a new exception type, so that when retryBlockFetcher catches this exception, the retry count does not increase?
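One possible reading of the question in this comment is sketched below in Java. It is only an illustration, not code from the pull request: `FastFailException`, `RetryLoopSketch`, `Attempt`, and `Sleeper` are hypothetical names, and the real retry logic in Spark may be structured differently. The idea is that a fast-fail is signalled by its own exception type, and the retry loop waits after it without consuming a retry.

```java
import java.io.IOException;

/** Hypothetical exception type marking a connection that was rejected without a real attempt. */
class FastFailException extends IOException {
  FastFailException(String message) {
    super(message);
  }
}

/** Hypothetical retry loop; not Spark's actual block fetcher. */
class RetryLoopSketch {
  interface Attempt<T> {
    T run() throws IOException;
  }

  interface Sleeper {
    void sleep(long millis);
  }

  static <T> T runWithRetries(Attempt<T> attempt, int maxRetries, long waitMs, Sleeper sleeper)
      throws IOException {
    int retryCount = 0;
    while (true) {
      try {
        return attempt.run();
      } catch (FastFailException e) {
        // No real connection attempt was made, so wait but do not consume a retry.
        sleeper.sleep(waitMs);
      } catch (IOException e) {
        // A real failure: give up once the retry budget is used, otherwise count it and wait.
        if (retryCount >= maxRetries) {
          throw e;
        }
        retryCount++;
        sleeper.sleep(waitMs);
      }
    }
  }
}
```

In this sketch the wait after a fast-fail matters: without it the loop would spin until the fast-fail window expires.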
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in the last retry IO wait
URL: https://github.com/apache/spark/pull/27943#discussion_r394487813

## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ##
@@ -192,7 +194,18 @@ public TransportClient createClient(String remoteHost, int remotePort)
           logger.info("Found inactive connection to {}, creating a new one.", resolvedAddress);
         }
       }
-      clientPool.clients[clientIndex] = createClient(resolvedAddress);
+      if (System.currentTimeMillis() - clientPool.lastConnectionFailed[clientIndex]
+        < conf.ioRetryWaitTimeMs()) {
+        throw new IOException(
+          String.format("Connecting to %s failed in the last %s ms, fail this connection directly",
+            resolvedAddress, conf.ioRetryWaitTimeMs()));
+      }
+      try {
+        clientPool.clients[clientIndex] = createClient(resolvedAddress);

Review comment:
Thanks, I will set `clientPool.lastConnectionFailed[clientIndex] = 0;` after the client is created successfully.
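The fast-fail check in the hunk above, together with the reset described in this comment, can be sketched as a small self-contained Java class. This is an illustration under assumptions, not the code that was merged: `FastFailPoolSketch`, `Connector`, and the injected clock are hypothetical stand-ins, and plain strings take the place of `TransportClient` instances. Because the sketch uses an injected clock that may start near zero, it also checks that a failure was actually recorded; the hunk above does not need that guard with `System.currentTimeMillis()`.

```java
import java.io.IOException;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a per-address client pool; not Spark's actual ClientPool. */
class FastFailPoolSketch {
  interface Connector {
    String connect(String address) throws IOException;
  }

  private final String[] clients;
  private final long[] lastConnectionFailed; // 0 means "no recorded failure"
  private final long fastFailWindowMs;
  private final LongSupplier clock;

  FastFailPoolSketch(int size, long fastFailWindowMs, LongSupplier clock) {
    this.clients = new String[size];
    this.lastConnectionFailed = new long[size];
    this.fastFailWindowMs = fastFailWindowMs;
    this.clock = clock;
  }

  String getOrCreate(int index, String address, Connector connector) throws IOException {
    if (clients[index] != null) {
      return clients[index];
    }
    long now = clock.getAsLong();
    // Fail immediately if the previous attempt for this slot failed within the window.
    if (lastConnectionFailed[index] != 0
        && now - lastConnectionFailed[index] < fastFailWindowMs) {
      throw new IOException(String.format(
          "Connecting to %s failed in the last %s ms, fail this connection directly",
          address, fastFailWindowMs));
    }
    try {
      clients[index] = connector.connect(address);
      // Reset after a successful connection so a stale failure cannot trigger a fast-fail.
      lastConnectionFailed[index] = 0;
      return clients[index];
    } catch (IOException e) {
      lastConnectionFailed[index] = now;
      throw e;
    }
  }
}
```

Injecting the clock is a choice made for this sketch only, so that the window can be exercised without sleeping.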