otterc commented on code in PR #41071:
URL: https://github.com/apache/spark/pull/41071#discussion_r1194381617
##########
common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java:
##########
@@ -163,14 +163,11 @@ public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exception {
       if (e.state() == IdleState.ALL_IDLE && isActuallyOverdue) {
         if (responseHandler.hasOutstandingRequests()) {
           String address = getRemoteAddress(ctx.channel());
-          logger.error("Connection to {} has been quiet for {} ms while there are outstanding " +
-            "requests. Assuming connection is dead; please adjust" +
-            " spark.{}.io.connectionTimeout if this is wrong.",
-            address, requestTimeoutNs / 1000 / 1000, transportContext.getConf().getModuleName());
-          client.timeOut();
-          ctx.close();
Review Comment:
I believe this change alters the semantics of the system. Previously, when a
client had pending requests but had not received any response from the server
within the last requestTimeoutNs, we inferred that the channel was inactive and
closed it. The client would then initiate a new fetch request. If we do not
close the channel, what would happen? Would the client wait indefinitely for
the shuffle response?
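
To make the concern concrete, here is a minimal, hypothetical sketch (not the actual `TransportChannelHandler` code) of the decision the removed lines implemented: when the channel has been fully idle past the request timeout while requests are still outstanding, the connection was assumed dead and closed, which let the client retry on a fresh channel. The class and method names below are illustrative only.

```java
// Hypothetical, simplified model of the prior idle-timeout semantics.
// In the real handler, "close" corresponds to client.timeOut() + ctx.close().
public class IdleTimeoutSketch {

  /**
   * Returns true when the prior behavior would have closed the channel:
   * the channel is fully idle (no reads or writes), the idle period has
   * actually exceeded requestTimeoutNs, and responses are still pending.
   */
  static boolean shouldCloseChannel(boolean allIdle,
                                    boolean isActuallyOverdue,
                                    int outstandingRequests) {
    return allIdle && isActuallyOverdue && outstandingRequests > 0;
  }

  public static void main(String[] args) {
    // Pending requests on a dead-quiet channel: close and let the client re-fetch.
    System.out.println(shouldCloseChannel(true, true, 3));
    // No outstanding requests: the idle channel is left alone.
    System.out.println(shouldCloseChannel(true, true, 0));
  }
}
```

If this check is removed without a replacement, the question above stands: nothing closes the channel, so a client with pending fetches would have no trigger to retry.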
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]