[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107850#comment-15107850
 ] 

Jing Zhao commented on HDFS-9646:
---------------------------------

Thanks for the review, Kai!

bq. Wonder whether recovery can, or should, also be triggered in the corrupt 
case (where the DN is live and not stopped).

Yes, recovery will also be triggered for corrupted blocks. However, for this 
test we first need a process that detects the corruption. This can be either a 
client reading the data or a datanode recovering missing blocks. Here I want to 
make sure the DataNode can correctly detect and report the corruption during 
recovery, so we first need to generate at least one missing block by 
shutting down a DN.

bq. Wonder if we could share the following utility between client and datanode

Yes, I planned to do so but could not find a good way to share this small 
piece of logic. Maybe we can handle this in a separate JIRA?
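
For context on the "length less than the first internal block" case in the 
issue title, here is a minimal sketch of how internal block lengths in a 
striped block group can differ. It is not the actual Hadoop code: the class 
and method names are hypothetical, and the layout (6 data blocks, 64 KB cells) 
is only an assumption for illustration.

```java
// Illustrative sketch only; names and layout are assumptions, not Hadoop APIs.
final class StripedLengthSketch {

    /**
     * Length of the internal block at position {@code index} in a block group
     * that holds {@code groupBytes} bytes of user data, striped cell by cell
     * across {@code dataBlocks} data blocks. Indices >= dataBlocks are parity.
     */
    static long internalBlockLength(long groupBytes, int cellSize,
                                    int dataBlocks, int index) {
        if (groupBytes <= 0) {
            return 0;
        }
        long stripeSize = (long) cellSize * dataBlocks;
        long lastStripeBytes = groupBytes % stripeSize;
        if (lastStripeBytes == 0) {
            // Only full stripes: every internal block has the same length.
            return groupBytes / dataBlocks;
        }
        long fullStripes = groupBytes / stripeSize;
        // Bytes of the partial last stripe that reach this block's cell.
        long remaining = lastStripeBytes;
        if (index < dataBlocks) {
            remaining = Math.max(0, lastStripeBytes - (long) index * cellSize);
        }
        // A parity cell is as long as the first data cell of the last stripe.
        long lastCell = Math.min(remaining, cellSize);
        return fullStripes * cellSize + lastCell;
    }

    public static void main(String[] args) {
        int cellSize = 64 * 1024;
        int dataBlocks = 6;
        long groupBytes = cellSize + 100; // one full cell plus 100 bytes
        for (int i = 0; i < dataBlocks + 3; i++) {
            System.out.println("internal block " + i + ": "
                + internalBlockLength(groupBytes, cellSize, dataBlocks, i));
        }
    }
}
```

With these assumed values, block 0 is a full cell (65536 bytes), block 1 holds 
only 100 bytes, and blocks 2 to 5 are empty, so a block being recovered can be 
shorter than the first internal block.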

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9646
>                 URL: https://issues.apache.org/jira/browse/HDFS-9646
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0
>            Reporter: Takuya Fukudome
>            Assignee: Jing Zhao
>            Priority: Critical
>         Attachments: HDFS-9646.000.patch, HDFS-9646.001.patch, 
> HDFS-9646.002.patch, HDFS-9646.003.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-9223372036854322288_29751
> java.io.IOException: Transfer failed for all targets.
>         at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
