Liang Xie created HDFS-6286:
-------------------------------

             Summary: adding a timeout setting for local read io
                 Key: HDFS-6286
                 URL: https://issues.apache.org/jira/browse/HDFS-6286
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client
    Affects Versions: 2.4.0, 3.0.0
            Reporter: Liang Xie
            Assignee: Liang Xie

Currently, if a write or a remote read is issued against a sick disk, DFSClient.hdfsTimeout gives the caller a guaranteed upper bound on how long the call takes to return. But it does not apply to local reads. Take an HBase scan for example:

DFSInputStream.read -> readWithStrategy -> readBuffer -> BlockReaderLocal.read -> dataIn.read -> FileChannelImpl.read

If it hits a bad disk, the low-level read I/O will probably take tens of seconds, and what's worse, DFSInputStream.read holds a lock the whole time.

To my knowledge, there is no good mechanism to cancel a running read I/O (please correct me if I'm wrong), so my suggestion is to wrap the read request in a future and set a timeout on it. If the threshold is reached, we could probably add the local node to the dead-node list...

Any thoughts?
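
To make the proposal concrete, here is a rough sketch of what "a future around the read request" might look like. It is not taken from any Hadoop patch: the class `TimedLocalRead`, its method names, and the executor passed in are all hypothetical, and only standard `java.util.concurrent` and `java.nio` calls are used. It assumes the read can be expressed as a positional `FileChannel.read(ByteBuffer, long)`.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of Hadoop): bounds how long a caller waits for a local read. */
public class TimedLocalRead {

    /** Runs {@code io} on {@code pool} and waits at most {@code timeoutMs} for its result. */
    public static <T> T callWithTimeout(Callable<T> io, long timeoutMs, ExecutorService pool)
            throws IOException, TimeoutException, InterruptedException {
        Future<T> future = pool.submit(io);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(false): do not interrupt the worker. Interrupting a thread that is
            // blocked on a FileChannel closes the channel, and a read stuck on a bad
            // disk would most likely not return any sooner anyway.
            future.cancel(false);
            throw e;
        } catch (ExecutionException e) {
            // Surface the underlying failure as an IOException for the caller.
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException(cause);
        }
    }

    /** Positional read, so the channel's own position is not touched by the worker thread. */
    public static int readWithTimeout(FileChannel ch, ByteBuffer buf, long pos,
                                      long timeoutMs, ExecutorService pool)
            throws IOException, TimeoutException, InterruptedException {
        Callable<Integer> io = () -> ch.read(buf, pos);
        return callWithTimeout(io, timeoutMs, pool);
    }
}
```

Note that this only bounds the caller's wait. The worker thread stays blocked until the underlying read returns, so the pool would need to be sized with that in mind, and the buffer must not be reused after a timeout because the late read may still write into it. What to do on `TimeoutException` (for example, treating the local node as dead and retrying on another replica) is left to the caller.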