Todd Lipcon created HDFS-3357:
---------------------------------

             Summary: DataXceiver reads from client socket with incorrect/no timeout
                 Key: HDFS-3357
                 URL: https://issues.apache.org/jira/browse/HDFS-3357
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node
    Affects Versions: 1.0.2, 2.0.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
            Priority: Critical

In DataXceiver, we currently use Socket.setSoTimeout to try to manage the read timeout when switching between three kinds of read: the initial opcode, a keepalive opcode, and the status after a successfully sent block. However, since all of these reads go through the same underlying DataInputStream, the change to the socket timeout isn't respected. Thus, they all occur with whatever timeout was set on the socket at the time of DataXceiver construction. In practice that timeout is 0, meaning no timeout, so a read from a silent client can block forever and leave the xceiver hung indefinitely.
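
To make the failure mode concrete, here is a minimal sketch of a bounded opcode read. It is not the DataXceiver code or the eventual patch. It uses plain java.net sockets, where setSoTimeout does apply to later reads on a stream that was already obtained, so it does not reproduce the DataNode's stream wrapping. The class ReadTimeoutSketch and the helpers readOpWithTimeout and readFromPeer are hypothetical names invented for this illustration. The sketch only shows the intended behavior: a read from a silent peer should fail after a positive timeout rather than block forever, which is what a timeout of 0 allows.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {

    /**
     * Hypothetical helper: reads one opcode byte, giving up after timeoutMs
     * of silence. Rejects non-positive values because a socket timeout of 0
     * means "block forever".
     */
    public static int readOpWithTimeout(Socket s, DataInputStream in, int timeoutMs)
            throws IOException {
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("timeout must be positive");
        }
        // Must be set before entering the blocking read to have any effect.
        s.setSoTimeout(timeoutMs);
        return in.readUnsignedByte();
    }

    /**
     * Opens a loopback connection, optionally sends one opcode byte from the
     * client side, and reads it on the accepted side.
     * Returns the opcode read, or -1 if the read timed out.
     */
    public static int readFromPeer(Integer opToSend, int timeoutMs) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            if (opToSend != null) {
                client.getOutputStream().write(opToSend);
                client.getOutputStream().flush();
            }
            DataInputStream in = new DataInputStream(accepted.getInputStream());
            try {
                return readOpWithTimeout(accepted, in, timeoutMs);
            } catch (SocketTimeoutException e) {
                // The peer stayed silent: the read ended instead of hanging.
                return -1;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("silent peer: " + readFromPeer(null, 200));
        System.out.println("peer sent 81: " + readFromPeer(81, 2000));
    }
}
```

The opcode value 81 and the timeouts are arbitrary test values. Taking the timeout as an argument to each read keeps the three read phases described above from silently sharing whatever value the socket had at construction.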