Aaron Davidson created SPARK-7003:
-------------------------------------
Summary: Improve reliability of connection failure between Netty
block transfer service endpoints
Key: SPARK-7003
URL: https://issues.apache.org/jira/browse/SPARK-7003
Project: Spark
Issue Type: Bug
Affects Versions: 1.3.1
Reporter: Aaron Davidson
Assignee: Aaron Davidson
Currently we rely on the assumption that an exception will be raised and the
channel closed if two endpoints cannot communicate over a Netty TCP channel.
However, this guarantee does not hold in all network environments, and
SPARK-6962 seems to point to a case where only the server side of the
connection detected a fault.
We should improve the robustness of fetch/RPC requests by adding an explicit
timeout in the transport layer that closes the connection if there is a period
of inactivity while requests are still outstanding.
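
To make the proposal concrete, here is a minimal sketch of the timeout rule in
plain Java. It is not the Spark or Netty implementation. The class
IdleRequestTimeout, its method names, and the injected clock are all
hypothetical, and the sketch assumes that "inactivity" means no data read from
the peer since the first outstanding request was sent.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Spark or Netty) that decides when a
 * connection should be closed: only if requests are outstanding AND
 * nothing has been read from the peer for longer than the timeout.
 */
final class IdleRequestTimeout {
    private final long timeoutMillis;
    private final LongSupplier clock; // injected so the logic can be tested
    private final AtomicInteger outstanding = new AtomicInteger();
    private final AtomicLong lastActivityMillis;

    IdleRequestTimeout(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.lastActivityMillis = new AtomicLong(clock.getAsLong());
    }

    /** Call when a fetch/RPC request is written to the channel. */
    void onRequestSent() {
        // Assumption: idle time before the first outstanding request does
        // not count, so the timer restarts when going from 0 to 1.
        if (outstanding.getAndIncrement() == 0) {
            lastActivityMillis.set(clock.getAsLong());
        }
    }

    /** Call whenever any data is read from the peer. */
    void onDataRead() {
        lastActivityMillis.set(clock.getAsLong());
    }

    /** Call when a request has been fully answered (or has failed). */
    void onResponseReceived() {
        outstanding.decrementAndGet();
        lastActivityMillis.set(clock.getAsLong());
    }

    /** True if the caller should close the connection now. */
    boolean shouldClose() {
        long idleMillis = clock.getAsLong() - lastActivityMillis.get();
        return outstanding.get() > 0 && idleMillis > timeoutMillis;
    }
}
```

In this sketch a periodic task or an idle-event callback would call
shouldClose() and close the channel when it returns true. An idle connection
with no outstanding requests is never closed by this rule, which matches the
condition described above.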
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)