[ 
https://issues.apache.org/jira/browse/HADOOP-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639923#action_12639923
 ] 

dhruba borthakur commented on HADOOP-4278:
------------------------------------------

@Sameer: I agree that our test is triggering a known issue that is more likely 
to occur when the BlockReceiver thread dies but the PacketResponder thread is 
still able to communicate a response to the upstream datanode (or client). This 
case typically does not occur in real life, because when a datanode dies none 
of its threads can communicate with other datanodes in the pipeline. I am 
saying that this is not likely to be a blocker for 0.19; on the other hand, if 
it is occurring frequently while running unit tests, it is better to fix it 
quickly. I am thinking of changing the way the unit test kills a datanode. BTW, 
do you ever see this scenario being triggered in real-life cases, e.g. GridMix 
and/or performance benchmarks?

@Raghu: I was thinking that both this one and HADOOP-3416 refer to the fact 
that the method the unit test uses to kill datanodes is not very 
deterministic, and that is why this JIRA is related to 3416. Fixing one issue 
probably fixes the other one too. Let me investigate how to fix it.

> TestDatanodeDeath failed occasionally
> -------------------------------------
>
>                 Key: HADOOP-4278
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4278
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.0
>
>
> TestDatanodeDeath keeps failing occasionally.  For example, see
> http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3365/testReport/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
