[ 
https://issues.apache.org/jira/browse/HDFS-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505636#comment-14505636
 ] 

Nick Dimiduk commented on HDFS-7005:
------------------------------------

Any chance of bringing this to a 2.5.x patch release? Over on HBASE-13339 we're 
trying to work out how best to support users with minimal impact on 
dependencies for our next minor release (1.1). Bumping the Hadoop minor version 
would (I think) break our semantic versioning compatibility guidelines.

FYI [~eclark], [~busbey], [~cnauroth]

> DFS input streams do not timeout
> --------------------------------
>
>                 Key: HDFS-7005
>                 URL: https://issues.apache.org/jira/browse/HDFS-7005
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.5.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>             Fix For: 2.6.0
>
>         Attachments: HDFS-7005.patch
>
>
> Input streams lost their timeout.  The problem appears to be that 
> {{DFSClient#newConnectedPeer}} does not set the read timeout.  During a 
> temporary network interruption the server will close the socket, unbeknownst 
> to the client host, which then blocks on a read forever.
> The results are dire.  Services such as the RM, JHS, NMs, Oozie servers, etc. 
> all need to be restarted to recover, unless you want to wait many hours for 
> the TCP stack keepalive to detect the broken socket.
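
For readers unfamiliar with the failure mode described above, here is a minimal 
sketch in plain java.net, not the HDFS-7005 patch and not DFSClient code. It 
assumes a peer that accepts a connection and then goes silent, which stands in 
for the broken socket in the report. The class name ReadTimeoutSketch and the 
method readTimesOut are hypothetical; Socket#setSoTimeout is the standard JDK 
call for a read timeout. Without that call, the read below would block 
indefinitely.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {

    /**
     * Connects to a local server that never sends data and attempts a read.
     * Returns true if the read fails with SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // The server socket is never accept()ed; the OS still completes the
        // TCP handshake via the listen backlog, so the client connects but
        // no byte is ever written to it.
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(server.getInetAddress(), server.getLocalPort())) {

            // The read timeout. Omitting this line leaves the timeout at 0,
            // which means "wait forever".
            client.setSoTimeout(timeoutMs);

            InputStream in = client.getInputStream();
            try {
                in.read();      // blocks until data, EOF, or the timeout
                return false;   // data or EOF arrived (not expected here)
            } catch (SocketTimeoutException e) {
                return true;    // the timeout fired instead of hanging
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

The timeout only bounds how long a single read may block; what a client does 
afterwards (retry, fail over to another node, surface the error) is a separate 
design decision that this sketch does not cover.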



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
