Jing Zhao created HDFS-9818:
-------------------------------
Summary: Correctly handle EC reconstruction work caused by not enough racks
Key: HDFS-9818
URL: https://issues.apache.org/jira/browse/HDFS-9818
Project: Hadoop HDFS
Issue Type: Sub-task
Components: datanode, namenode
Affects Versions: 3.0.0
Reporter: Takuya Fukudome
Assignee: Jing Zhao
This was reported by [~tfukudom]:
In a system test where 1 of 7 datanode racks was stopped, a
{{HadoopIllegalArgumentException}} was seen on the DataNode side while
reconstructing missing EC blocks:
{code}
2016-02-16 11:09:06,672 WARN datanode.DataNode (ErasureCodingWorker.java:run(482)) - Failed to recover striped block: BP-480558282-172.29.4.13-1453805190696:blk_-9223372036850962784_278270
org.apache.hadoop.HadoopIllegalArgumentException: Inputs not fully corresponding to erasedIndexes in null places. erasedOrNotToReadIndexes: [1, 2, 6], erasedIndexes: [3]
    at org.apache.hadoop.io.erasurecode.rawcoder.RSRawDecoder.doDecode(RSRawDecoder.java:166)
    at org.apache.hadoop.io.erasurecode.rawcoder.AbstractRawErasureDecoder.decode(AbstractRawErasureDecoder.java:84)
    at org.apache.hadoop.io.erasurecode.rawcoder.RSRawDecoder.decode(RSRawDecoder.java:89)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.recoverTargets(ErasureCodingWorker.java:683)
    at org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:465)
{code}
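The exception message suggests that the decoder expects every entry of erasedIndexes to be a null position in the inputs. In the log, the null positions are [1, 2, 6] while the unit to reconstruct is [3], so the two disagree. The following is a minimal Java sketch of that consistency check. It is not the actual RSRawDecoder code: the class and method names are hypothetical, and the 9-unit layout in main is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop.
final class ErasedIndexCheck {

    /** Positions whose input is null, i.e. units that are erased or were not read. */
    static int[] nullPositions(byte[][] inputs) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < inputs.length; i++) {
            if (inputs[i] == null) {
                positions.add(i);
            }
        }
        return positions.stream().mapToInt(Integer::intValue).toArray();
    }

    /**
     * Returns the first erased index that does not point at a null input,
     * or -1 if every erased index corresponds to a null input.
     */
    static int firstMismatch(byte[][] inputs, int[] erasedIndexes) {
        for (int erased : erasedIndexes) {
            boolean outOfRange = erased < 0 || erased >= inputs.length;
            if (outOfRange || inputs[erased] != null) {
                return erased;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed layout: 9 units in the block group, all present at first.
        byte[][] inputs = new byte[9][];
        for (int i = 0; i < inputs.length; i++) {
            inputs[i] = new byte[1];
        }
        // Mirror the log: inputs at positions 1, 2 and 6 are null.
        inputs[1] = null;
        inputs[2] = null;
        inputs[6] = null;

        int[] erasedIndexes = {3}; // the unit the worker was asked to rebuild
        System.out.println("null positions: " + Arrays.toString(nullPositions(inputs)));
        System.out.println("first mismatch: " + firstMismatch(inputs, erasedIndexes));
    }
}
```

With the values from the log, index 3 is reported as the mismatch because its input is non-null. In other words, the block the DataNode was asked to reconstruct still appears to be readable as a source.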