[ 
https://issues.apache.org/jira/browse/HDFS-7163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated HDFS-7163:
-----------------------------
    Attachment: HDFS-7163-branch-2.7.004.patch
                HDFS-7163-branch-2.004.patch
                HDFS-7163.004.patch

Attaching version 004 of patches for trunk, branch-2, and branch-2.7.

Version 003 had an issue that could result in the NN sending the client to a 
bad DN one extra time. If the client received an IOException while reading 
from the DN, version 003 failed to put that DN in the excluded nodes list, so 
the NN could send the client back to the same DN. When that occurred, the open 
would fail and send the client back to the NN, this time with the bad DN in 
the excluded nodes list. The read would still succeed, but it would take a bit 
longer due to the extra attempt to open a bad DN.

Version 004 fixes that issue and supplies the bad DN in the excluded nodes list 
during a read when an IOException occurs.
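
For readers following along, the behavior described above can be sketched as 
a small standalone simulation. This is not code from the patch: the class and 
the helpers chooseDataNode, readFrom, and readWithRetry are hypothetical 
stand-ins for the NN redirect, the DN read, and the client retry loop, and a 
plain attempt counter replaces the configured retry policy.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative simulation only; none of these names come from the HDFS-7163 patch.
public class ExcludedNodesRetrySketch {

    // Hypothetical stand-in for the NN: returns the first DN that is not excluded.
    static String chooseDataNode(List<String> dataNodes, Set<String> excluded)
            throws IOException {
        for (String dn : dataNodes) {
            if (!excluded.contains(dn)) {
                return dn;
            }
        }
        throw new IOException("no DataNode left to try");
    }

    // Hypothetical stand-in for a DN read: fails for every DN listed in badNodes.
    static String readFrom(String dn, Set<String> badNodes) throws IOException {
        if (badNodes.contains(dn)) {
            throw new IOException("read failed on " + dn);
        }
        return "data from " + dn;
    }

    // Retry loop: a DN whose read throws IOException is added to the excluded
    // set before going back to the (simulated) NN, so it is not chosen again.
    static String readWithRetry(List<String> dataNodes, Set<String> badNodes,
                                int maxAttempts, List<String> attempted)
            throws IOException {
        Set<String> excluded = new LinkedHashSet<>();
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String dn = chooseDataNode(dataNodes, excluded);
            attempted.add(dn); // record which DNs were contacted, in order
            try {
                return readFrom(dn, badNodes);
            } catch (IOException e) {
                excluded.add(dn); // the step that was missing on the read path
                last = e;
            }
        }
        throw last != null ? last : new IOException("no attempts made");
    }

    public static void main(String[] args) throws IOException {
        List<String> attempted = new ArrayList<>();
        String result = readWithRetry(List.of("dn1", "dn2"), Set.of("dn1"), 3, attempted);
        System.out.println(result + " after trying " + attempted);
    }
}
```

In this sketch the bad DN is contacted once and then skipped. Dropping the 
excluded.add(dn) call would let the simulated NN hand back the same DN on 
every attempt, which loosely mirrors the extra attempt seen with version 003.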

> WebHdfsFileSystem should retry reads according to the configured retry policy.
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-7163
>                 URL: https://issues.apache.org/jira/browse/HDFS-7163
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 3.0.0, 2.5.1
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: HDFS-7163-branch-2.003.patch, 
> HDFS-7163-branch-2.004.patch, HDFS-7163-branch-2.7.003.patch, 
> HDFS-7163-branch-2.7.004.patch, HDFS-7163.001.patch, HDFS-7163.002.patch, 
> HDFS-7163.003.patch, HDFS-7163.004.patch, WebHDFS Read Retry.pdf
>
>
> In the current implementation of WebHdfsFileSystem, opens are retried 
> according to the configured retry policy, but reads are not. Therefore, if a 
> connection goes down while data is being read, the read fails and has to be 
> retried by the client code.
> Also, after a connection has been established, the next read (or seek/read) 
> will fail and the read will have to be restarted by the client code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
