[ 
https://issues.apache.org/jira/browse/HDFS-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359570#comment-14359570
 ] 

Zhe Zhang commented on HDFS-7369:
---------------------------------

Thanks Jing and Kai for the helpful reviews!

bq. For a striped block, I think it will be better to use BlockInfo(Striped), 
instead of its individual blocks, as the basic unit for recovery. E.g., suppose 
we lose 2 blocks for a 6+3 EC block. For recovery, I guess we want these two 
blocks are recovered in a single recovery work instead of 2.
Agreed. A striped block group should be recovered as a whole; multiple missing 
blocks should first be recovered on a single target. In the new patch, an array 
of {{targets}} is selected, and {{targets[0]}} is the DN that the recovery 
command is sent to.
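
To illustrate the idea only, here is a minimal sketch of one recovery work item that covers a whole block group. The names {{StripedRecoverySketch}}, {{GroupRecoveryWork}} and {{schedule}}, and the String DataNode ids, are hypothetical and are not the classes in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; these are not the classes used in the patch.
final class StripedRecoverySketch {

  /** One unit of recovery work covering a whole striped block group. */
  static final class GroupRecoveryWork {
    final long blockGroupId;
    final List<Integer> missingIndices; // indices of the lost blocks within the group
    final String[] targets;             // chosen DataNodes; targets[0] gets the command

    GroupRecoveryWork(long blockGroupId, List<Integer> missingIndices, String[] targets) {
      this.blockGroupId = blockGroupId;
      this.missingIndices = new ArrayList<>(missingIndices);
      this.targets = targets.clone();
    }

    /** The DataNode that the recovery command is sent to. */
    String commandTarget() {
      return targets[0];
    }
  }

  /** Builds a single work item for the group, however many blocks are missing. */
  static GroupRecoveryWork schedule(long blockGroupId, List<Integer> missingIndices,
                                    String[] targets) {
    if (missingIndices.isEmpty()) {
      throw new IllegalArgumentException("nothing to recover");
    }
    if (targets.length == 0) {
      throw new IllegalArgumentException("no targets chosen");
    }
    return new GroupRecoveryWork(blockGroupId, missingIndices, targets);
  }

  public static void main(String[] args) {
    // A 6+3 group that lost the blocks at indices 2 and 7: one work item, not two.
    GroupRecoveryWork work =
        schedule(1001L, Arrays.asList(2, 7), new String[] {"dn-a", "dn-b"});
    System.out.println("group " + work.blockGroupId + " missing " + work.missingIndices
        + " -> command sent to " + work.commandTarget());
  }
}
```

The point of the sketch is that both missing indices travel in the same work item, so the group is scheduled once rather than once per lost block.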

bq. As you mentioned in HDFS-7912, BlockManager and ReplicationMonitor never 
see individual data/parity blocks currently. But it may be better to have a 
more strict type restriction in UnderReplicatedBlocks, ReplicationWork, 
ErasureCodingWork, and computeRecoveryWorkForBlocks's parameter.
That's a good point. I don't think doing that adds any overhead. I'll rebase 
after HDFS-7912.

bq. What is the case we're targeting for, is it the block recovering in 
stripping ec case ? If so, we need to make the title clearer, since we also 
have other cases for erased block recovering in pure ec form.
Yes, this patch is for recovering striped blocks. I've updated the subject.

bq. is it possible to explicitly assemble all the necessary information in a 
BlockGroup and pass around to construct a ErasureCodingRecoveryWork
I think the current {{BlockCodecCommand}} has all that a DN needs for recovery, 
except for the schema, which is still hard-coded.

bq. I don't see {{missingBlockIdx}} is actually used.
Good catch, updated in the new patch.

One thing I'm not yet sure about is how to efficiently get the indices of 
missing blocks. Maybe something like the below?
{code}
for (DatanodeStorageInfo storage : blocksMap.getStorages(block, State.FAILED)) {
  indices.add(block.getStorageBlockIndex(storage));
}
}
{code}
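
Another possible approach, sketched here under assumptions rather than taken from the patch: if the indices of the live blocks in a group are already known, the missing ones are simply the complement within the group size (9 for a 6+3 schema). {{MissingIndexSketch}} and {{missingIndices}} are hypothetical names, and the input array stands in for whatever the NameNode actually tracks.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration only; not code from the patch.
final class MissingIndexSketch {

  /**
   * Returns the block indices in [0, groupSize) that do not appear in liveIndices.
   * Assumes the caller already knows which indices still have a live block.
   */
  static List<Integer> missingIndices(int[] liveIndices, int groupSize) {
    BitSet present = new BitSet(groupSize);
    for (int idx : liveIndices) {
      if (idx < 0 || idx >= groupSize) {
        throw new IllegalArgumentException("index out of range: " + idx);
      }
      present.set(idx);
    }
    List<Integer> missing = new ArrayList<>();
    // Walk the clear bits only, stopping at the end of the group.
    for (int i = present.nextClearBit(0); i < groupSize; i = present.nextClearBit(i + 1)) {
      missing.add(i);
    }
    return missing;
  }

  public static void main(String[] args) {
    // 6+3 group (9 blocks) with the blocks at indices 2 and 7 lost.
    int[] live = {0, 1, 3, 4, 5, 6, 8};
    System.out.println(missingIndices(live, 9)); // prints [2, 7]
  }
}
```

This costs one pass over the live blocks plus one pass over the group, and it does not depend on failed storages still being reported.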

> Erasure coding: distribute recovery work for striped blocks to DataNode
> -----------------------------------------------------------------------
>
>                 Key: HDFS-7369
>                 URL: https://issues.apache.org/jira/browse/HDFS-7369
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7369-000-part1.patch, HDFS-7369-000-part2.patch, 
> HDFS-7369-001.patch, HDFS-7369-002.patch
>
>
> This JIRA updates NameNode to handle background / offline recovery of erasure 
> coded blocks. It includes 2 parts:
> # Extend {{UnderReplicatedBlocks}} to recognize EC blocks and insert them to 
> appropriate priority levels. 
> # Update {{ReplicationMonitor}} to distinguish block codec tasks and send a 
> new DataNode command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
