[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402167657 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402149106 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: I think I get your point. Do you mean that add `expectedException = ExpectedException.none();` after this test case complete? I just refer the usage in SparkSubmitCommandBuilderSuite, it is the only suite uses **Rule** before this pr. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402158870 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: https://github.com/junit-team/junit4/blob/435d41f0d45cfdbc1a38e1ad4eb1d5300da533f9/src/main/java/org/junit/Rule.java#L27-L42 Here is an example, this folder is a RULER. As described in the comments, it would create a temporary folder before each test method, and deletes it after each. So, I think \@RULE should be unique for each test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402158870 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: https://github.com/junit-team/junit4/blob/435d41f0d45cfdbc1a38e1ad4eb1d5300da533f9/src/main/java/org/junit/Rule.java#L27-L42 Here is an example, this folder is a **Rule**. As described in the comments, it would create a temporary folder before each test method, and deletes it after each. So, I think **Rule** should be unique for each test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402155245 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: https://github.com/junit-team/junit4/blob/435d41f0d45cfdbc1a38e1ad4eb1d5300da533f9/src/main/java/org/junit/Rule.java#L27-L30 Here is the comments, it is just for each \@Test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402155245 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: https://github.com/junit-team/junit4/blob/435d41f0d45cfdbc1a38e1ad4eb1d5300da533f9/src/main/java/org/junit/Rule.java#L27-L30 Here is the comments, it is just for each \@Test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402149106 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: I think I get your point. Do you mean that add `expectedException = ExpectedException.none();` after this test case complete? I just refer the usage in SparkSubmitCommandBuilderSuite, it is only used before this pr. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402107095 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: It seems that we do not need reset the expectedException, it is unique for each test and there is also no reset method. Yes, junit5 provides assertThrows api, but our junit version is junit4. And junit4 just provides `@Test(expected = Exception.class)`, it is not suit for this test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402107095 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: It seems that we do not need reset the expectedException, it is unique for each test and there is also no reset method. Yes, junit5 provide assertThrows api, but our junit version is junit4. And junit4 just provides `@Test(expected = Exception.class)`, it is not suit for this test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402107095 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: It seems that we do not need reset the expectedException, it is unique for each test and there is also no reset method. Yes, junit5 provide assertThrows api, but our junit version is junit4. And junit4 just provide `@Test(expected = Exception.class)`, it is not suit for this test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402107095 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: It seems that we do not need reset the expectedException, it is unique for each test. Yes, junit5 provide assertThrows api, but our junit version is junit4. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402107095 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: It seems that we do not need reset the expectedException, it is unique for each test and there is also no reset method. Yes, junit5 provide assertThrows api, but our junit version is junit4. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402107095 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +int unreachablePort = server.getPort(); +server.close(); +try { + factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); +} catch (Exception e) { + assert(e instanceof IOException); +} +expectedException.expect(IOException.class); +expectedException.expectMessage("fail this connection directly"); +factory.createClient(TestUtils.getLocalHost(), unreachablePort, true); Review comment: It seems that we do not need reset the expectedException, it is uniq for each test. Yes, junit5 provide assertThrows api, but our junit version is junit4. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r402100291 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ## @@ -192,11 +206,30 @@ public TransportClient createClient(String remoteHost, int remotePort) logger.info("Found inactive connection to {}, creating a new one.", resolvedAddress); } } - clientPool.clients[clientIndex] = createClient(resolvedAddress); + // If this connection should fast fail when last connection failed in last fast fail time + // window and it did, fail this connection directly. + if (fastFail && System.currentTimeMillis() - clientPool.lastConnectionFailed < +fastFailTimeWindow) { +throw new IOException( + String.format("Connecting to %s failed in the last %s ms, fail this connection directly", +resolvedAddress, fastFailTimeWindow)); + } + try { +clientPool.clients[clientIndex] = createClient(resolvedAddress); +clientPool.lastConnectionFailed = 0; + } catch (IOException e) { +clientPool.lastConnectionFailed = System.currentTimeMillis(); +throw e; + } return clientPool.clients[clientIndex]; } } + public TransportClient createClient(String remoteHost, int remotePort) Review comment: I think it is not necessary, because it is a public method. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r401327864 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ## @@ -121,18 +126,22 @@ public MetricSet getAllMetrics() { /** * Create a {@link TransportClient} connecting to the given remote host / port. * - * We maintains an array of clients (size determined by spark.shuffle.io.numConnectionsPerPeer) + * We maintain an array of clients (size determined by spark.shuffle.io.numConnectionsPerPeer) * and randomly picks one to use. If no client was previously created in the randomly selected * spot, this function creates a new client and places it there. * + * We also maintain a last connection failed time of these clients and a fast fail time window + * based on io retry wait time. If this connection request can be retried and the last connection + * failed time of these clients in the fast fail time window, fail this connection directly. + * * Prior to the creation of a new TransportClient, we will execute all * {@link TransportClientBootstrap}s that are registered with this factory. * * This blocks until a connection is successfully established and fully bootstrapped. * * Concurrency: This method is safe to call from multiple threads. */ - public TransportClient createClient(String remoteHost, int remotePort) + public TransportClient createClient(String remoteHost, int remotePort, boolean withRetry) Review comment: thanks, have named this variable to `fastFail` and add java doc and comments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r398591598 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ## @@ -112,6 +116,7 @@ public TransportClientFactory( } this.metrics = new NettyMemoryMetrics( this.pooledAllocator, conf.getModuleName() + "-client", conf); +fastFailTimeWindow = conf.maxIORetries() > 0 ? (int)(conf.ioRetryWaitTimeMs() * 0.95) : 0; Review comment: thanks, make sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r398579036 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +server.close(); +int unreachablePort = server.getPort(); Review comment: got This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r398574423 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +server.close(); +int unreachablePort = server.getPort(); Review comment: have modified the code, I think it would not affect the result of ci. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r398562861 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +server.close(); +int unreachablePort = server.getPort(); Review comment: I think It is not necessary. I have checked the code, the port would only been set during init() This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window
turboFei commented on a change in pull request #27943: [SPARK-31179] Fast fail the connection while last connection failed in fast fail time window URL: https://github.com/apache/spark/pull/27943#discussion_r398562861 ## File path: common/network-common/src/test/java/org/apache/spark/network/client/TransportClientFactorySuite.java ## @@ -224,4 +226,23 @@ public void closeFactoryBeforeCreateClient() throws IOException, InterruptedExce factory.close(); factory.createClient(TestUtils.getLocalHost(), server1.getPort()); } + + @Rule + public ExpectedException expectedException = ExpectedException.none(); + + @Test + public void fastFailConnectionInTimeWindow() throws IOException, InterruptedException { +TransportClientFactory factory = context.createClientFactory(); +TransportServer server = context.createServer(); +server.close(); +int unreachablePort = server.getPort(); Review comment: I think It is not necessary. I have checked the code, the port would be only set value during init() This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org