tgravescs commented on a change in pull request #27943: [SPARK-31179] Fast fail
the connection while last connection failed in fast fail time window
URL: https://github.com/apache/spark/pull/27943#discussion_r398585034
##########
File path:
common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java
##########
@@ -112,6 +116,7 @@ public TransportClientFactory(
}
this.metrics = new NettyMemoryMetrics(
this.pooledAllocator, conf.getModuleName() + "-client", conf);
+ fastFailTimeWindow = conf.maxIORetries() > 0 ? (int)(conf.ioRetryWaitTimeMs() * 0.95) : 0;
Review comment:
I think you may have misunderstood my comment. My concern is really about
the callers that don't go through RetryFetcher and don't use maxIORetries,
such as code that just sends RPC messages: for instance, the external block
client fetching the local host dirs, or any other RPC message. I guess the RPC
messages sent through the outbox all cache the client, so that isn't as much
of a concern. Looking some more, those cases look pretty limited, so it should
be OK.
Also, this doesn't really disable it when max retries is 0, because you
could still theoretically hit the case where two attempts happen in the same
millisecond and the second one then fails fast. How about setting it to -1 in
that case?
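
To illustrate the same-millisecond case, here is a minimal sketch rather than
the PR's actual code. The class and method names are hypothetical, and the
inclusive `<=` comparison is an assumption implied by the concern above; the
real check in TransportClientFactory may differ. Under that assumption, a
window of 0 still fails fast when the elapsed time is 0 ms, while -1 can never
match a non-negative elapsed time.

```java
// Hypothetical sketch: none of these names come from the Spark code base.
class FastFailWindowSketch {

    // Mirrors the line under review: a window of 0 when retries are disabled.
    static int windowWithZero(int maxIORetries, int ioRetryWaitTimeMs) {
        return maxIORetries > 0 ? (int) (ioRetryWaitTimeMs * 0.95) : 0;
    }

    // Suggested variant: -1 as the "disabled" sentinel.
    static int windowWithSentinel(int maxIORetries, int ioRetryWaitTimeMs) {
        return maxIORetries > 0 ? (int) (ioRetryWaitTimeMs * 0.95) : -1;
    }

    // ASSUMPTION: fail fast when the time since the last failed connection
    // is within the window, using an inclusive comparison.
    static boolean shouldFailFast(long nowMs, long lastFailedMs, int windowMs) {
        return nowMs - lastFailedMs <= windowMs;
    }

    public static void main(String[] args) {
        long lastFailedMs = 1_000L;
        long nowMs = 1_000L; // second attempt lands in the same millisecond

        // Retries disabled (maxIORetries == 0) in both cases.
        System.out.println(
            shouldFailFast(nowMs, lastFailedMs, windowWithZero(0, 5000)));     // true
        System.out.println(
            shouldFailFast(nowMs, lastFailedMs, windowWithSentinel(0, 5000))); // false
    }
}
```

With a strict `<` comparison, a window of 0 would already never match, so
whether the sentinel is needed depends on how the check is written.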