[
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107850#comment-15107850
]
Jing Zhao commented on HDFS-9646:
---------------------------------
Thanks for the review, Kai!
bq. I wonder whether recovery can, or should, also be triggered in the corrupt
case (the DN is live, i.e. not stopped).
Yes, recovery will also be triggered for corrupted blocks. However, for this
test we first need a process that detects the corruption. This can be either a
client reading the data or a datanode recovering missing blocks. Here I want to
make sure the DataNode correctly detects and reports the corruption during
recovery, so we first need to generate at least one missing block by shutting
down a DN.
bq. Wonder if we could share the following utility between client and datanode
Yes, I planned to do so but could not find a good way to share this small piece
of logic. Maybe we can handle this in a separate jira?
> ErasureCodingWorker may fail when recovering data blocks with length less
> than the first internal block
> -------------------------------------------------------------------------------------------------------
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Affects Versions: 3.0.0
> Reporter: Takuya Fukudome
> Assignee: Jing Zhao
> Priority: Critical
> Attachments: HDFS-9646.000.patch, HDFS-9646.001.patch,
> HDFS-9646.002.patch, HDFS-9646.003.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN datanode.DataNode (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: BP-987302662-172.29.4.13-1450757377698:blk_-9223372036854322288_29751
> java.io.IOException: Transfer failed for all targets.
>     at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}
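
To illustrate the "non-full internal block" case in the description above,
here is a rough sketch of how internal block lengths can differ within one
striped block group. The class name, the method name, the 64 KB cell size and
the 6 data + 3 parity layout are all assumptions made for illustration. This
is not the code from the patch and not Hadoop's API.

```java
// Hypothetical sketch only: illustrates striped layout arithmetic,
// not the Hadoop implementation.
final class InternalBlockLengthSketch {

    /**
     * Returns the length of internal block `index` in a striped block group.
     *
     * @param groupDataSize total bytes of user data in the block group
     * @param cellSize      bytes per striping cell (assumed value, e.g. 64 KB)
     * @param numDataBlocks number of data blocks per group (assumed, e.g. 6)
     * @param index         internal block index; indices >= numDataBlocks
     *                      are treated as parity blocks
     */
    static long internalBlockLength(long groupDataSize, long cellSize,
                                    int numDataBlocks, int index) {
        if (groupDataSize < 0 || cellSize <= 0 || numDataBlocks <= 0 || index < 0) {
            throw new IllegalArgumentException("invalid striping parameters");
        }
        long stripeSize = cellSize * numDataBlocks;
        long lastStripeLen = groupDataSize % stripeSize;
        if (lastStripeLen == 0) {
            // Only full stripes: every internal block has the same length.
            return groupDataSize / numDataBlocks;
        }
        long fullStripes = groupDataSize / stripeSize;
        // Bytes of the partial last stripe that are left for this block.
        // A parity block covers the whole partial stripe, so it ends up as
        // long as the first data block.
        long remaining = lastStripeLen;
        if (index < numDataBlocks) {
            remaining = Math.max(0L, lastStripeLen - index * cellSize);
        }
        return fullStripes * cellSize + Math.min(cellSize, remaining);
    }

    public static void main(String[] args) {
        long cell = 64 * 1024;
        int dataBlocks = 6;
        // One full stripe, one more full cell, plus 100 bytes.
        long size = cell * dataBlocks + cell + 100;
        for (int i = 0; i < dataBlocks + 3; i++) {
            System.out.println("block " + i + ": "
                + internalBlockLength(size, cell, dataBlocks, i));
        }
    }
}
```

With these assumed parameters, data blocks 1 to 5 come out shorter than block
0, which is the kind of block the issue title refers to as having "length less
than the first internal block".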
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)