[
https://issues.apache.org/jira/browse/HDFS-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847897#action_12847897
]
Todd Lipcon commented on HDFS-1054:
-----------------------------------
FYI, the situation where I'm running into this is an HBase edits stress test
where I'm killing a DN that's local to one of the region servers. That region
server immediately starts acting up because all file creates take an extra 6
seconds (it still thinks the local DN is up until the NN marks it down). In
this case it's getting an immediate "Connection Refused" from the local DN
anyway, and the second attempt always works fine since it makes it into the
HDFS-630 excludedNodes list.
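
To make the idea concrete, here is a rough sketch of what a "smarter" sleep
could look like: no wait when the first failure is an immediate connection
refusal (the bad node is excluded on the retry anyway), and a capped
exponential backoff otherwise. This is not the existing DFSOutputStream code.
The class name, method name, and both settings are hypothetical, and it
assumes the failure surfaces as a java.net.ConnectException.

```java
import java.net.ConnectException;

/**
 * Hypothetical helper, not part of HDFS: computes how long to sleep
 * before retrying pipeline creation after a failure.
 */
final class PipelineRetryBackoff {
    private final long baseSleepMs; // hypothetical setting: delay after the first slow failure
    private final long maxSleepMs;  // hypothetical setting: upper bound on any single delay

    PipelineRetryBackoff(long baseSleepMs, long maxSleepMs) {
        if (baseSleepMs < 0 || maxSleepMs < baseSleepMs) {
            throw new IllegalArgumentException("need 0 <= baseSleepMs <= maxSleepMs");
        }
        this.baseSleepMs = baseSleepMs;
        this.maxSleepMs = maxSleepMs;
    }

    /**
     * @param failedAttempts number of failed attempts so far (1 after the first failure)
     * @param cause          the exception from the most recent attempt
     * @return milliseconds to sleep before the next attempt
     */
    long sleepMillis(int failedAttempts, Throwable cause) {
        // A refused connection fails fast, and the retry skips the excluded
        // node, so waiting after the first such failure buys nothing.
        if (failedAttempts <= 1 && cause instanceof ConnectException) {
            return 0L;
        }
        // Double the delay for each further failure, capped at maxSleepMs.
        int shift = Math.min(Math.max(failedAttempts - 1, 0), 30);
        if (baseSleepMs > (maxSleepMs >> shift)) {
            return maxSleepMs; // also guards against overflow in the shift below
        }
        return baseSleepMs << shift;
    }
}
```

With a base of 500 ms and a cap of 6000 ms, the first refused connection
retries immediately, while repeated failures wait 500, 1000, 2000, 4000, and
then 6000 ms. Setting both values to 6000 keeps the fixed 6 second wait for
anything other than a first-attempt connection refusal.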
> Make sleep after failure in nextBlockOutputStream smarter and configurable
> --------------------------------------------------------------------------
>
> Key: HDFS-1054
> URL: https://issues.apache.org/jira/browse/HDFS-1054
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs client
> Affects Versions: 0.22.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
>
> If DFSOutputStream fails to create a pipeline, it currently sleeps 6 seconds
> before retrying. I don't see a great reason to wait at all, much less 6
> seconds (especially now that HDFS-630 ensures that a retry won't go back to
> the bad node). We should at least make it configurable, and perhaps something
> like backoff makes some sense.