[
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12828244#action_12828244
]
Todd Lipcon commented on HDFS-927:
----------------------------------
Hey Nicholas. Thanks for taking a look.
bq. Should the failure count be reset per block, but not per read?
This doesn't match my expectation. Consider the case of HBase, where a region
server opens a single region (which may very well be a single block) and holds
it open for days at a time. While it's open, it may see a sporadic error every
once in a while due to a network blip or what have you. Just because the reader
saw an error at 12pm, 3pm, and 6pm doesn't mean it should fail when it sees one
at 9pm. Any successful read operation should reset the count, regardless of
which block is being accessed, don't you think?
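The reset-on-success policy argued for above can be sketched roughly like this. This is a minimal illustration only, not the actual DFSInputStream code; the class and method names are made up:

```java
// Hypothetical sketch of a per-stream failure counter that resets on any
// successful read, so sporadic errors spread over hours never accumulate
// to the failure limit. Not the real HDFS client code.
public class RetryCounter {
    private final int maxFailures;
    private int failures = 0;

    public RetryCounter(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    // Called when a read attempt fails; returns true once the stream
    // has exceeded its failure budget and should give up.
    public boolean recordFailure() {
        failures++;
        return failures > maxFailures;
    }

    // Called after any successful read, regardless of which block was
    // read: a success proves the stream is healthy again.
    public void recordSuccess() {
        failures = 0;
    }
}
```

Under this policy, three failures at 12pm, 3pm, and 6pm followed by a successful read leave the budget fully replenished for the 9pm error.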
> DFSInputStream retries too many times for new block locations
> -------------------------------------------------------------
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 0.21.0, 0.22.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed
> to go back to the NN at most max.block.acquires times, but in trunk it goes
> back more than twice that many - the default is 3, but I am counting 7 calls
> to getBlockLocations before an exception is thrown.
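One way a count like 7 can arise from a limit of 3 (purely illustrative of the arithmetic; this is not necessarily the actual HDFS-127 change, and all names below are hypothetical):

```java
// Illustrative only: if both the outer retry loop and an inner helper
// refetch block locations, plus one final refetch before failing, a
// limit of n attempts produces 2n + 1 NameNode calls - 7 when n = 3.
public class RetryCallCount {
    static int calls = 0;

    // Stand-in for a getBlockLocations RPC to the NameNode.
    static void fetchLocations() {
        calls++;
    }

    static void read(int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            fetchLocations(); // outer path refetches locations
            fetchLocations(); // inner helper refetches them again
        }
        fetchLocations();     // one last refetch before giving up
    }
}
```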