[
https://issues.apache.org/jira/browse/HADOOP-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639929#action_12639929
]
Raghu Angadi commented on HADOOP-4278:
--------------------------------------
The fact that TestDatanodeDeath is unpredictable is probably a good thing,
though it is painful to diagnose. It points out rare race conditions.
HADOOP-3416 is about an actual bug in DFSClient and not a bug in the test. I
don't think this jira is about a bug in the test either. I see a couple of
issues pointed out by this failure:
- Datanode (not client) does not detect the problem datanode in the pipeline
in some cases.
-- We could live with this since the write should succeed (though the block
gets replicated to fewer datanodes than was possible). This is also the reason
why HADOOP-3416 was not a blocker but HADOOP-3339 was.
- In such cases, DFSClient does not seem to be able to recover (see the sketch
after this list). I think this is a pretty important bug to fix since it
results in hard failures on the write.
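To make the second point concrete, below is a rough Java sketch of the kind of
recovery loop the client is expected to run when a datanode in the pipeline
dies: exclude the suspected datanode, rebuild the pipeline, and resend from the
last acknowledged packet. This is not actual DFSClient code; Pipeline,
chooseNewPipeline(), sendPacket() and the retry limit are made-up names used
only to illustrate the idea.

  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.List;

  class PipelineRecoverySketch {

    static class Pipeline {
      List<String> datanodes = new ArrayList<String>();
      Pipeline(List<String> dns) { datanodes.addAll(dns); }
    }

    // Hypothetical hook: build a fresh pipeline. A real client would ask the
    // namenode for targets that are not in 'exclude'; here we just return a
    // placeholder node.
    static Pipeline chooseNewPipeline(List<String> exclude) {
      List<String> dns = new ArrayList<String>();
      dns.add("dn-replacement");
      return new Pipeline(dns);
    }

    // Hypothetical hook: send one packet and wait for acks; throws when the
    // stream to any datanode in the pipeline breaks.
    static void sendPacket(Pipeline p, int seqno) throws IOException {
    }

    static void writeBlock(int numPackets, int maxRetries) throws IOException {
      List<String> excluded = new ArrayList<String>();
      Pipeline pipeline = chooseNewPipeline(excluded);
      int lastAckedSeqno = -1;
      int retries = 0;

      for (int seqno = lastAckedSeqno + 1; seqno < numPackets; seqno++) {
        try {
          sendPacket(pipeline, seqno);
          lastAckedSeqno = seqno;            // packet acked by every datanode
        } catch (IOException e) {
          if (++retries > maxRetries) {
            throw e;                         // hard failure on the write
          }
          // Suspect a datanode, rebuild the pipeline without it, and resend
          // everything after the last acknowledged packet.
          excluded.add(pipeline.datanodes.get(0));
          pipeline = chooseNewPipeline(excluded);
          seqno = lastAckedSeqno;            // loop re-sends lastAckedSeqno + 1
        }
      }
    }

    public static void main(String[] args) throws IOException {
      writeBlock(10, 3);  // write 10 packets, tolerate up to 3 pipeline failures
    }
  }

The point of the hard-failure branch is that the client should only fail the
write after it has actually tried to rebuild the pipeline; in the failures
above, DFSClient does not seem to get that far.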
Does this sound correct?
> TestDatanodeDeath failed occasionally
> -------------------------------------
>
> Key: HADOOP-4278
> URL: https://issues.apache.org/jira/browse/HADOOP-4278
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.19.0
>
>
> TestDatanodeDeath keeps failing occasionally. For example, see
> http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3365/testReport/
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.