[ https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828244#action_12828244 ]

Todd Lipcon commented on HDFS-927:
----------------------------------

Hey Nicholas. Thanks for taking a look.

bq. Should the failure count be reset per block, but not per read?

This doesn't match my expectation. Consider the case of HBase, where a region 
server opens a single region (which may very well be a single block) and holds 
it open for days at a time. During the time while it's open, it may experience 
sporadic errors every once in a while due to a network blip or what have you. 
Just because the reader saw an error at 12pm, 3pm, and 6pm doesn't mean it 
should fail when it sees one at 9pm. Any successful read operation should reset 
the count, regardless of which block is being accessed, don't you think?
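To make the semantics I'm arguing for concrete, here is a minimal sketch (hypothetical names, not the actual DFSInputStream code) of a failure counter that any successful read resets, so sporadic errors spread over hours never accumulate to the limit:

```java
// Hypothetical sketch of the proposed retry semantics: the failure
// count is shared across blocks and cleared by ANY successful read,
// so only *consecutive* failures can reach the configured limit.
public class RetryCounter {
    private final int maxFailures; // e.g. dfs client's max block acquire failures
    private int failures = 0;

    public RetryCounter(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    /** Record a failed fetch; returns true once the limit is exceeded. */
    public boolean recordFailureAndCheckExceeded() {
        failures++;
        return failures > maxFailures;
    }

    /** A successful read clears the count, regardless of which block it hit. */
    public void recordSuccess() {
        failures = 0;
    }
}
```

Under this scheme, the HBase region server that sees one blip at 12pm, 3pm, and 6pm (each followed by successful reads) is back at zero failures by 9pm, whereas a per-stream count that never resets would eventually trip the limit on an otherwise healthy file.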

> DFSInputStream retries too many times for new block locations
> -------------------------------------------------------------
>
>                 Key: HDFS-927
>                 URL: https://issues.apache.org/jira/browse/HDFS-927
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to go back to the NN at most max.block.acquires times, but in trunk it goes 
> back twice that many: the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.