[
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517080#comment-14517080
]
Yi Liu commented on HDFS-7348:
------------------------------
{noformat}
DataRecoveryAndTransfer recovers one or more missed striped blocks in the
striped block group; the minimum number of live striped blocks must be
no less than the number of data blocks.
| <-       Striped Block Group       -> |
 blk_0      blk_1      blk_2(*)   blk_3      ...   <- A striped block group
   |          |          |          |
   v          v          v          v
+------+   +------+   +------+   +------+
|cell_0|   |cell_1|   |cell_2|   |cell_3|    ...   <- The striped cell group
+------+   +------+   +------+   +------+             (cell_0, cell_1, ...)
|cell_4|   |cell_5|   |cell_6|   |cell_7|    ...
+------+   +------+   +------+   +------+
|cell_8|   |cell_9|   |cell10|   |cell11|    ...
+------+   +------+   +------+   +------+
  ...        ...        ...        ...
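
As an illustration of the layout above, here is a minimal sketch of the
round-robin cell-to-block mapping; the class name and schema constants are
hypothetical, not the ones used by the patch:

// Illustration only: maps a cell index of the striped cell group to the
// block that stores it and to its byte offset inside that block.
public final class StripedLayoutSketch {
  static final int NUM_DATA_BLOCKS = 6;    // assumed schema, e.g. RS(6,3)
  static final int CELL_SIZE = 64 * 1024;  // assumed cell size

  /** Index (within the block group) of the block holding the i-th cell. */
  static int blockIndexOfCell(long cellIndex) {
    return (int) (cellIndex % NUM_DATA_BLOCKS);
  }

  /** Byte offset of the i-th cell inside its striped block. */
  static long offsetInBlock(long cellIndex) {
    return (cellIndex / NUM_DATA_BLOCKS) * CELL_SIZE;
  }
}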
We use the following steps to recover the striped cell groups sequentially:
step 1: read the minimum number of striped cells required by the recovery.
step 2: decode the cells for the targets.
step 3: transfer the cells to the targets.
In step 1, we try to read the minimum number of striped cells; if there are
corrupt or stale sources, reads from new sources will be scheduled. The best
sources are remembered for the next round and may be updated in each round.
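
A rough sketch of the step 1 read loop; readCell() is a hypothetical
placeholder for the remote block reader, and the source bookkeeping is
simplified compared to the patch:

import java.io.IOException;
import java.util.*;

abstract class CellGroupReaderSketch {
  final int numDataBlocks;           // minimum number of cells per round
  final List<Integer> sources;       // best sources, remembered across rounds
  final Deque<Integer> spares;       // source indexes not tried yet

  CellGroupReaderSketch(int numDataBlocks, List<Integer> sources,
                        Deque<Integer> spares) {
    this.numDataBlocks = numDataBlocks;
    this.sources = sources;
    this.spares = spares;
  }

  /** Read one cell of the given cell group from one source striped block. */
  abstract byte[] readCell(int sourceIndex, long cellGroupIndex)
      throws IOException;

  Map<Integer, byte[]> readMinimumCells(long cellGroupIndex)
      throws IOException {
    Map<Integer, byte[]> cells = new HashMap<>();
    while (cells.size() < numDataBlocks) {
      List<Integer> bad = new ArrayList<>();
      for (int src : sources) {
        if (cells.containsKey(src)) {
          continue;                                    // already read
        }
        try {
          cells.put(src, readCell(src, cellGroupIndex));
        } catch (IOException corruptOrStale) {
          bad.add(src);                                // drop this source
        }
      }
      sources.removeAll(bad);
      while (sources.size() < numDataBlocks && !spares.isEmpty()) {
        sources.add(spares.poll());                    // schedule a new source
      }
      if (cells.size() < numDataBlocks && sources.size() < numDataBlocks) {
        throw new IOException("fewer live striped blocks than data blocks");
      }
    }
    return cells;      // 'sources' is now the best set for the next round
  }
}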
In step 2, since it is blocked by HADOOP-11847, we currently only fill 11111...
into the target block for testing. Typically, if the source blocks we read are
all data blocks, we need to call encode, and if there is a parity block among
them, we need to call decode. Notice we only read once and recover all missed
striped blocks even if there is more than one.
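
A sketch of the step 2 choice between encode and decode; the coder interface
below is only a stand-in for the raw coder API being added in HADOOP-11847:

/** Stand-in for the raw erasure coder API from HADOOP-11847. */
interface RawCoderSketch {
  void encode(byte[][] dataCells, byte[][] parityOut);
  void decode(byte[][] inputCells, int[] erasedIndexes, byte[][] recoveredOut);
}

class CellDecoderSketch {
  /** Recover the cells of the missed striped blocks for one cell group. */
  byte[][] recoverCells(RawCoderSketch coder, byte[][] inputCells,
                        int[] sourceIndexes, int[] erasedIndexes,
                        int numDataBlocks, int cellSize) {
    boolean parityAmongSources = false;
    for (int idx : sourceIndexes) {
      if (idx >= numDataBlocks) {       // indexes past the data blocks are parity
        parityAmongSources = true;
        break;
      }
    }
    byte[][] out = new byte[erasedIndexes.length][cellSize];
    if (parityAmongSources) {
      coder.decode(inputCells, erasedIndexes, out);  // reconstruct missed cells
    } else {
      coder.encode(inputCells, out);                 // all-data sources: re-encode
    }
    return out;                                      // one output per target
  }
}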
In step 3, we send the recovered cells to the targets by constructing packets
and sending them directly. As with contiguous block replication, we don't
check the packet acks. Since the datanode doing the recovery work is one of
the source datanodes, the recovered cells are sent remotely.
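
A minimal sketch of the step 3 transfer; the simple length-prefixed framing
below is illustrative only, the patch builds real HDFS data transfer packets:

import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

class CellTransferSketch {
  /** Push recovered cells to one target; no ack is read back. */
  void sendCells(Socket targetSock, long startOffsetInBlock, byte[][] cells)
      throws IOException {
    DataOutputStream out = new DataOutputStream(targetSock.getOutputStream());
    long offset = startOffsetInBlock;
    for (byte[] cell : cells) {
      out.writeLong(offset);      // offset of this packet in the target block
      out.writeInt(cell.length);  // payload length
      out.write(cell);            // recovered cell data
      offset += cell.length;
    }
    out.flush();                  // fire and forget, same as replication today
  }
}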
There are some points where we can make further improvements in the next phase:
1. We can read the block file directly on the local datanode; currently we
   use a remote block reader. (Notice short-circuit is not a good choice,
   see inline comments.)
2. Do we need to check the packet ack for EC recovery? Since EC recovery is
   more expensive than contiguous block replication, as it needs to read
   from several other datanodes, should we make sure the recovered result
   is received by the targets?
{noformat}
> Erasure Coding: striped block recovery
> --------------------------------------
>
> Key: HDFS-7348
> URL: https://issues.apache.org/jira/browse/HDFS-7348
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Kai Zheng
> Assignee: Yi Liu
> Attachments: ECWorker.java, HDFS-7348.001.patch
>
>
> This JIRA is to recover one or more missed striped blocks in the striped
> block group.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)