[ 
https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732314#comment-14732314
 ] 

Li Bo commented on HDFS-8704:
-----------------------------

Thanks for [~walter.k.su] and [~zhz]' s review.
The cause of failed test cases are caused by the unit test itself. I have fixed 
them in patch 007.
The test will hang at 
{{TestDFSStripedOutputStreamWithFailure#DFSTestUtil.waitReplication}}, I just 
omit this sentence in patch in order to make the tests pass. [~walter.k.su], 
could you help me check this problem?
I will switch to HDFS-8383 later.

> Erasure Coding: client fails to write large file when one datanode fails
> ------------------------------------------------------------------------
>
>                 Key: HDFS-8704
>                 URL: https://issues.apache.org/jira/browse/HDFS-8704
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch, 
> HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch, 
> HDFS-8704-HDFS-7285-005.patch, HDFS-8704-HDFS-7285-006.patch, 
> HDFS-8704-HDFS-7285-007.patch
>
>
> I test current code on a 5-node cluster using RS(3,2).  When a datanode is 
> corrupt, client succeeds to write a file smaller than a block group but fails 
> to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests 
> files smaller than a block group, this jira will add more test situations.
> A streamer may encounter some bad datanodes when writing blocks allocated to 
> it. When it fails to connect datanode or send a packet, the streamer needs to 
> prepare for the next block. First it removes the packets of current  block 
> from its data queue. If the first packet of next block has already been in 
> the data queue, the streamer will reset its state and start to wait for the 
> next block allocated for it; otherwise it will just wait for the first packet 
> of next block. The streamer will check periodically if it is asked to 
> terminate during its waiting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to