[
https://issues.apache.org/jira/browse/HDFS-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jing Zhao updated HDFS-8005:
----------------------------
Attachment: HDFS-8005.000.patch
Uploaded the patch. It fixes issues #1 and #3 from the description, and also adds a new test to make
sure the recovery work computation/distribution is correct. The test also
provides a utility function that can create a file with striped blocks by using
synthetic block reports. We can use this function for testing NN side logic
before the client side work is done.
In addition, the patch makes the following simplifications:
# Instead of recording missing block indices, we can directly capture the
live/healthy block indices. This simplifies both the NN side computation and
the later DN side interpretation.
# Instead of providing precisely {{NUM_DATA_BLOCKS}} source nodes, I think we
can simply provide all the existing live nodes as sources. This avoids the
random selection logic without bringing in too much redundant
information (at most k-1 source nodes). It can also give the DN some
flexibility for later recovery, e.g., to support different EC schemas (LRC) or
to allow hedged reads of source data.
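To make the two simplifications concrete, here is a rough, self-contained sketch of the idea. It is not the code in the patch: {{Replica}}, {{RecoverySources}} and {{chooseSources}} are hypothetical names invented for illustration, and the real logic lives in {{BlockManager}} with the actual datanode/storage types. The sketch only shows recording live indices (rather than missing ones) and passing every live node as a source, with the liveness check done before an index is recorded.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not from the patch. */
class LiveIndexSketch {

    /** Stand-in for one internal block of a striped block group on a datanode. */
    static final class Replica {
        final String node;     // datanode identifier (illustrative)
        final int blockIndex;  // position of this internal block in the group
        final boolean live;    // false if the block is corrupt or otherwise unusable

        Replica(String node, int blockIndex, boolean live) {
            this.node = node;
            this.blockIndex = blockIndex;
            this.live = live;
        }
    }

    /** All live source nodes, plus the block index held by each of them. */
    static final class RecoverySources {
        final List<String> nodes = new ArrayList<>();
        final List<Integer> liveIndices = new ArrayList<>();
    }

    /**
     * Collects every live replica as a recovery source. No random selection
     * of exactly NUM_DATA_BLOCKS sources is done; the receiver can decide
     * which of the live sources it actually reads from.
     */
    static RecoverySources chooseSources(List<Replica> replicas) {
        RecoverySources out = new RecoverySources();
        for (Replica r : replicas) {
            // Check liveness before recording the index.
            if (r.live) {
                out.nodes.add(r.node);
                out.liveIndices.add(r.blockIndex);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Replica> replicas = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            // Pretend the internal blocks at indices 2 and 4 are lost.
            replicas.add(new Replica("dn" + i, i, i != 2 && i != 4));
        }
        RecoverySources sources = chooseSources(replicas);
        System.out.println("sources:      " + sources.nodes);
        System.out.println("live indices: " + sources.liveIndices);
    }
}
```

With this shape the missing indices are implied (any index not in the live list), so neither side has to track them separately.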
To apply the patch we may need to apply HDFS-7907 first.
> Erasure Coding: simplify striped block recovery work computation and add tests
> ------------------------------------------------------------------------------
>
> Key: HDFS-8005
> URL: https://issues.apache.org/jira/browse/HDFS-8005
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-8005.000.patch
>
>
> HDFS-7369 adds the functionality to distribute recovery work of striped
> blocks to datanodes. There are still some pending issues:
> # In {{BlockManager#chooseSourceNode}}, a node is added into
> {{healthyIndices}} without checking if its block is live and healthy
> # The test {{TestRecoverStripedBlcoks#testMissingStripedBlock}} has not
> tested striped blocks because the file is created before setting the storage
> policy
> # In {{computeRecoveryWorkForBlocks}}, instead of using
> {{BlockCollection#isStriped}}, we'd better use {{BlockInfo#isStriped}}