[jira] [Commented] (HDFS-4273) Fix some issue in DFSInputstream

Masatake Iwasaki (JIRA) Tue, 19 May 2015 00:53:23 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549996#comment-14549996
 ]


Masatake Iwasaki commented on HDFS-4273:
----------------------------------------

I'm looking into this and writing my understanding for other reviewers here:

HDFS-4273, HDFS-5917 and HDFS-6022 all address improving how {{deadNodes}} 
is refreshed. I think HDFS-6022 is the most promising. (In fact, the newest v8 
patch omits the {{deadNodes}} part.)

The other issues addressed here are:
# There is a case where the read should retry but doesn't.
# There is a race condition around {{failures}}.

HDFS-5776 changed a lot of the relevant code around the {{chooseDataNode}} 
method, so the v8 patch is difficult to rebase.

The fixes around {{seekToNewSource}} alone may work, so I'll try to update the 
patch.
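
To illustrate the race around {{failures}}: the issue description proposes 
keeping the counter in a local variable instead of a shared field. A minimal 
sketch of that idea follows. It is not taken from any of the attached patches; 
the class name, the {{Attempt}} interface and the limit of 3 are hypothetical 
stand-ins for illustration only.

```java
import java.io.IOException;

class LocalFailureCounterSketch {
    /** Hypothetical stand-in for one attempt to acquire and read a block. */
    interface Attempt {
        byte[] run() throws IOException;
    }

    /** Assumed limit for this sketch; the real value is configurable in HDFS. */
    static final int MAX_BLOCK_ACQUIRE_FAILURES = 3;

    /**
     * Retries the attempt until it succeeds or the limit is reached.
     * The counter lives on this call's stack, so a concurrent reader
     * cannot reset it while this call is still counting.
     */
    static byte[] readWithRetries(Attempt attempt) throws IOException {
        int failures = 0; // local to this call, not a shared field
        while (true) {
            try {
                return attempt.run();
            } catch (IOException e) {
                failures++;
                if (failures >= MAX_BLOCK_ACQUIRE_FAILURES) {
                    throw new IOException(
                        "giving up after " + failures + " failures", e);
                }
            }
        }
    }
}
```

With a shared field, one thread resetting the counter to 0 can keep another 
thread below the limit indefinitely; with a local counter each read call 
terminates after at most the configured number of failures.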


> Fix some issue in DFSInputstream
> --------------------------------
>
>                 Key: HDFS-4273
>                 URL: https://issues.apache.org/jira/browse/HDFS-4273
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.2-alpha
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>            Priority: Minor
>         Attachments: HDFS-4273-v2.patch, HDFS-4273.patch, HDFS-4273.v3.patch, 
> HDFS-4273.v4.patch, HDFS-4273.v5.patch, HDFS-4273.v6.patch, 
> HDFS-4273.v7.patch, HDFS-4273.v8.patch, TestDFSInputStream.java
>
>
> The following issues in DFSInputStream are addressed in this jira:
> 1. read may not retry enough in some cases, causing early failure.
> Assume the following call sequence:
> {noformat} 
> readWithStrategy()
>   -> blockSeekTo()
>   -> readBuffer()
>      -> reader.doRead()
>      -> seekToNewSource() adds currentNode to deadNodes, hoping to get a different datanode
>         -> blockSeekTo()
>            -> chooseDataNode()
>               -> block missing, clears deadNodes and picks the currentNode again
>         seekToNewSource() returns false
>      readBuffer() re-throws the exception and quits the loop
> readWithStrategy() gets the exception and may fail the read call before
> MaxBlockAcquireFailures attempts have been tried.
> {noformat} 
> 2. In a multi-threaded scenario (like HBase), DFSInputStream.failures has a 
> race condition: it is reset to 0 while it is still in use by another thread, 
> so it is possible that some read threads never quit. Changing failures to a 
> local variable solves this issue.
> 3. If the local datanode is added to deadNodes, it is not removed from 
> deadNodes when the DN comes back alive. We need a way to remove the local 
> datanode from deadNodes once it becomes live again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
