[
https://issues.apache.org/jira/browse/HDFS-9818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151731#comment-15151731
]
Rakesh R commented on HDFS-9818:
--------------------------------
[~zhz] yes, correct. I came across a similar {{HadoopIllegalArgumentException}}
during the HDFS-8786 work. I think [~jingzhao]'s second point is a very good
one, and it also applies when decommissioning a live datanode: instead of
scheduling an EC reconstruction command, we can just issue a block transfer to
another target datanode. That simplifies the reconstruction logic.
By the way, I'm also trying to understand the purpose of the
{{erasedIndexes}} == {{erasedOrNotToReadIndexes}} check.
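To make the transfer-versus-reconstruction idea concrete, here is a minimal sketch of how the choice could be made per internal block. It is not the actual NameNode code: {{StripedWorkSketch}}, {{ReplicaState}}, {{Action}}, {{chooseAction}} and {{erasedIndexes}} are hypothetical names used only for illustration. The assumption is that an internal block that is still readable (on a decommissioning node, or on a node that only violates rack placement) can be copied as-is, and only blocks that are truly missing need decoding.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not HDFS code: decides per internal block whether a
// plain block transfer is enough or EC reconstruction is really needed.
final class StripedWorkSketch {

    // State of one internal block of a striped block group (assumed model).
    enum ReplicaState { LIVE, DECOMMISSIONING, MISSING }

    // Work the NameNode could schedule for that internal block.
    enum Action { NONE, TRANSFER, RECONSTRUCT }

    // A block that is still readable only needs to be copied to a new
    // target; decoding is reserved for blocks with no readable copy.
    static Action chooseAction(ReplicaState state, boolean rackPolicyViolated) {
        if (state == ReplicaState.MISSING) {
            return Action.RECONSTRUCT;
        }
        if (state == ReplicaState.DECOMMISSIONING || rackPolicyViolated) {
            return Action.TRANSFER;
        }
        return Action.NONE;
    }

    // Only truly missing internal blocks are reported as erased, so the
    // decoder is never asked to rebuild an index that can still be read.
    static List<Integer> erasedIndexes(ReplicaState[] group) {
        List<Integer> erased = new ArrayList<>();
        for (int i = 0; i < group.length; i++) {
            if (group[i] == ReplicaState.MISSING) {
                erased.add(i);
            }
        }
        return erased;
    }
}
```

With this split, a rack-placement violation or a decommissioning node would produce only {{TRANSFER}} work, and the erased index list handed to the decoder would contain just the internal blocks that have no readable copy.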
> Correctly handle EC reconstruction work caused by not enough racks
> ------------------------------------------------------------------
>
> Key: HDFS-9818
> URL: https://issues.apache.org/jira/browse/HDFS-9818
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: 3.0.0
> Reporter: Takuya Fukudome
> Assignee: Jing Zhao
> Attachments: HDFS-9818.000.patch
>
>
> This is reported by [~tfukudom]:
> In a system test where 1 of 7 datanode racks was stopped,
> {{HadoopIllegalArgumentException}} was seen on DataNode side while
> reconstructing missing EC blocks:
> {code}
> 2016-02-16 11:09:06,672 WARN datanode.DataNode
> (ErasureCodingWorker.java:run(482)) - Failed to recover striped block:
> BP-480558282-172.29.4.13-1453805190696:blk_-9223372036850962784_278270
> org.apache.hadoop.HadoopIllegalArgumentException: Inputs not fully
> corresponding to erasedIndexes in null places. erasedOrNotToReadIndexes: [1,
> 2, 6], erasedIndexes: [3]
> at
> org.apache.hadoop.io.erasurecode.rawcoder.RSRawDecoder.doDecode(RSRawDecoder.java:166)
> at
> org.apache.hadoop.io.erasurecode.rawcoder.AbstractRawErasureDecoder.decode(AbstractRawErasureDecoder.java:84)
> at
> org.apache.hadoop.io.erasurecode.rawcoder.RSRawDecoder.decode(RSRawDecoder.java:89)
> at
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.recoverTargets(ErasureCodingWorker.java:683)
> at
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:465)
> {code}