Binglin Chang created HDFS-4273:
-----------------------------------

             Summary: Problem in DFSInputStream read retry logic may cause early failure
                 Key: HDFS-4273
                 URL: https://issues.apache.org/jira/browse/HDFS-4273
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Binglin Chang
            Assignee: Binglin Chang
            Priority: Minor

Assume the following call sequence:
{noformat}
readWithStrategy()
  -> blockSeekTo()
  -> readBuffer()
     -> reader.doRead()
     -> seekToNewSource() adds currentNode to deadNodes, hoping to get a different datanode
        -> blockSeekTo()
           -> chooseDataNode()
              -> block missing, so it clears deadNodes and picks currentNode again
        seekToNewSource() returns false
     readBuffer() re-throws the exception and quits its loop
readWithStrategy() gets the exception, and may fail the read call before MaxBlockAcquireFailures attempts have been made.
{noformat}

Some issues with this logic:
1. seekToNewSource() is broken because deadNodes may be cleared in the middle of the call.
2. The variable "int retries=2" in readWithStrategy() seems to conflict with MaxBlockAcquireFailures; should it be removed?
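
To make the failure path easier to follow, here is a small self-contained model of the call sequence described above. It is only a sketch and not the actual DFSInputStream code: the class ReadRetrySketch, the String datanode names, the always-failing doRead() and the budget of 3 are assumptions made for illustration. The method names mirror the ones in the trace, but their bodies are heavily simplified.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the retry logic; NOT the real DFSInputStream.
class ReadRetrySketch {
    // Hypothetical stand-in for MaxBlockAcquireFailures.
    static final int MAX_BLOCK_ACQUIRE_FAILURES = 3;

    final List<String> replicas;
    final Set<String> deadNodes = new HashSet<>();
    String currentNode;
    int blockAcquireFailures = 0;

    ReadRetrySketch(List<String> replicas) {
        this.replicas = replicas;
    }

    // Returns the first replica not marked dead. If every replica is dead,
    // count one acquire failure, clear deadNodes and start over.
    String chooseDataNode() throws IOException {
        while (true) {
            for (String node : replicas) {
                if (!deadNodes.contains(node)) {
                    return node;
                }
            }
            if (blockAcquireFailures >= MAX_BLOCK_ACQUIRE_FAILURES) {
                throw new IOException("block missing: acquire budget exhausted");
            }
            deadNodes.clear(); // this is what lets the failed node come back
            blockAcquireFailures++;
        }
    }

    String blockSeekTo() throws IOException {
        return chooseDataNode();
    }

    // Assumption for the sketch: every read from every datanode fails.
    int doRead() throws IOException {
        throw new IOException("read from " + currentNode + " failed");
    }

    // Marks currentNode dead and asks for another node. Because
    // chooseDataNode() may clear deadNodes, the same node can be returned.
    boolean seekToNewSource() throws IOException {
        String oldNode = currentNode;
        deadNodes.add(oldNode);
        String newNode = blockSeekTo();
        if (newNode.equals(oldNode)) {
            return false;
        }
        currentNode = newNode;
        return true;
    }

    int readBuffer() throws IOException {
        while (true) {
            try {
                return doRead();
            } catch (IOException e) {
                if (!seekToNewSource()) {
                    throw e; // no new source: re-throw and leave the loop
                }
            }
        }
    }

    // Outer loop with its own fixed retry count, independent of the budget.
    int readWithStrategy() throws IOException {
        int retries = 2;
        while (true) {
            try {
                currentNode = blockSeekTo();
                return readBuffer();
            } catch (IOException e) {
                if (--retries == 0) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        ReadRetrySketch in = new ReadRetrySketch(Arrays.asList("dn1"));
        try {
            in.readWithStrategy();
        } catch (IOException e) {
            // prints: read failed after 2 of 3 acquire failures
            System.out.println("read failed after " + in.blockAcquireFailures
                    + " of " + MAX_BLOCK_ACQUIRE_FAILURES + " acquire failures");
        }
    }
}
```

With a single replica whose reads always fail, the model gives up after two acquire failures even though the budget is three. The fixed "retries = 2" loop ends the call first, and seekToNewSource() never yields a different node because deadNodes is cleared underneath it.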