[ 
https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-13571:
-------------------------------
    Release Note: When a dead node blocks a DFSInputStream, dead node detection 
can find it and share this information with the other DFSInputStreams in the 
same DFSClient. As a result, those DFSInputStreams will not read from the dead 
node and will not be blocked by it.

> Deadnode detection
> ------------------
>
>                 Key: HDFS-13571
>                 URL: https://issues.apache.org/jira/browse/HDFS-13571
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.4.0, 2.6.0, 3.0.2
>            Reporter: Gang Xie
>            Assignee: Lisheng Sun
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: DeadNodeDetectorDesign.pdf, HDFS-13571-2.6.diff, node 
> status machine.png
>
>
> Currently, information about dead datanodes is stored locally in each 
> DFSInputStream, so it cannot be shared among the input streams of the same 
> DFSClient. In our production environment, some datanodes die every day for 
> various reasons. When this happens, the first input stream that is blocked 
> detects the dead node but cannot share this information with the others in 
> the same DFSClient, so the other input streams are still blocked by the dead 
> node for some time, which can cause bad service latency.
> To eliminate this impact, we designed a dead datanode detector that detects 
> dead datanodes in advance and shares this information among all the input 
> streams in the same client. This improvement has been running in production 
> for some months and works well, so we have decided to port it to 3.0 (our 
> production environment uses 2.4 and 2.6).
> I will do the porting work and upload the code later.
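
The sharing idea in the description can be illustrated with a minimal sketch. This is not the HDFS-13571 patch or the Hadoop client API: the class names SharedDeadNodes and SketchInputStream, their methods, and the datanode ids are hypothetical, chosen only for illustration. The sketch covers only the sharing of dead-node state among streams of one client; it does not model the detector probing datanodes in advance.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-client registry of dead datanodes (not the Hadoop API).
// One instance is shared by every stream created from the same client.
final class SharedDeadNodes {
    // Concurrent set so several streams can report and query from different threads.
    private final Set<String> dead = ConcurrentHashMap.newKeySet();

    void markDead(String datanodeId) { dead.add(datanodeId); }

    void markAlive(String datanodeId) { dead.remove(datanodeId); }

    boolean isDead(String datanodeId) { return dead.contains(datanodeId); }
}

// Hypothetical stand-in for an input stream that must pick a replica to read from.
final class SketchInputStream {
    private final SharedDeadNodes shared;

    SketchInputStream(SharedDeadNodes shared) { this.shared = shared; }

    // Returns the first replica not known to be dead, or null if none remain.
    String chooseReplica(List<String> replicas) {
        for (String datanodeId : replicas) {
            if (!shared.isDead(datanodeId)) {
                return datanodeId;
            }
        }
        return null;
    }

    // Called by a stream after a read from this datanode failed or timed out.
    void reportFailure(String datanodeId) { shared.markDead(datanodeId); }
}
```

With two streams built on the same SharedDeadNodes instance, a failure reported by one stream makes the other skip that datanode on its next chooseReplica call instead of blocking on it first.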



