[
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518362#comment-14518362
]
Zhe Zhang commented on HDFS-7348:
---------------------------------
Please find detailed comments below:
Logic:
# Since recovering multiple missing blocks at once is a pretty rare case,
should we just reconstruct all missing blocks and use {{DataNode#DataTransfer}}
to push them out?
# I filed HDFS-8282 to move {{StripedReadResult}} and {{waitNextCompletion}} to
{{StripedBlockUtil}}.
# In foreground recovery we read in parallel to minimize latency. It's an
interesting design question whether we should do the same in background
recovery. More discussion is needed here.
# If we do choose to read source blocks in parallel, how should we design the
unit of sync-and-decode? Right now the readers read a cell at a time. Another
option is to read entire blocks and then decode. The drawback is larger
temporary memory usage. The benefits are: i) simpler logic (no need to recreate
reading threads) and no repeated overhead of initializing connections to source
DNs; ii) connections stay open for as short a time as possible (fast readers
don't need to wait for slow ones); iii) possibly less CPU, if decoding in big
chunks is cheaper. [~drankye] Could you advise?
# Should we save a copy of the reconstructed block locally? It will use more
space, but it avoids re-decoding if the push fails.
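To make the parallel-read questions above more concrete, here is a minimal Java sketch; it is not code from the patch. {{Source}}, {{readFirst}} and {{bufferBytes}} are hypothetical names, and the 6 sources, 64 KB cell and 128 MB block sizes are assumed values for illustration only. It reads from all sources in parallel using the standard {{ExecutorCompletionService}}, returns as soon as enough reads have finished, and compares the temporary buffer needed per sync-and-decode round for the two candidate units.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelReadSketch {
  /** Hypothetical stand-in for a reader of one source block; not an HDFS class. */
  interface Source {
    byte[] read(int length) throws Exception;
  }

  /**
   * Starts one read per source and returns the first `needed` results in
   * completion order, so fast readers are not held back by slow ones.
   * Any failed read aborts the round (a real implementation would retry).
   */
  static List<byte[]> readFirst(List<Source> sources, int needed, int length) {
    ExecutorService pool = Executors.newFixedThreadPool(sources.size());
    try {
      CompletionService<byte[]> done = new ExecutorCompletionService<>(pool);
      List<Future<byte[]>> pending = new ArrayList<>();
      for (Source s : sources) {
        pending.add(done.submit(() -> s.read(length)));
      }
      List<byte[]> results = new ArrayList<>(needed);
      for (int i = 0; i < needed; i++) {
        results.add(done.take().get()); // blocks until the next read finishes
      }
      for (Future<byte[]> f : pending) {
        f.cancel(true); // no-op for reads that already completed
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while reading", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("a source read failed", e);
    } finally {
      pool.shutdownNow();
    }
  }

  /** Temporary buffer bytes held while one unit is read from every source. */
  static long bufferBytes(int numSources, long unitBytes) {
    return numSources * unitBytes;
  }

  public static void main(String[] args) {
    // Assumed sizes: 6 sources, 64 KB cell, 128 MB block.
    System.out.println("per-cell round:  " + bufferBytes(6, 64 * 1024) + " bytes");
    System.out.println("per-block round: " + bufferBytes(6, 128L * 1024 * 1024) + " bytes");
  }
}
```

Under these assumed sizes the per-block unit needs roughly 2048 times the temporary memory of the per-cell unit, which is the drawback weighed against the simpler threading and connection handling.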
Nits:
# Could use the diamond operator, {{ArrayList<>}}, here:
{code}
stripedReaders = new ArrayList<StripedReader>(sources.length);
{code}
# Maybe we can move {{getBlock}} to {{StripedBlockUtil}} too; it's a useful
utility for parsing just the {{Block}}. If that sounds good to you I'll move
it in HDFS-8282.
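For the {{ArrayList<>}} nit, a small self-contained sketch of the suggested change follows. The {{StripedReader}} class here is an empty hypothetical stub standing in for the one in the patch, and {{newReaderList}} is a made-up helper; only the diamond syntax itself is the point.

```java
import java.util.ArrayList;
import java.util.List;

public class DiamondSketch {
  /** Hypothetical empty stub; stands in for the patch's StripedReader. */
  static class StripedReader { }

  static List<StripedReader> newReaderList(int numSources) {
    // Before: new ArrayList<StripedReader>(numSources)
    // After (diamond, Java 7+): the element type is inferred from the return type.
    return new ArrayList<>(numSources);
  }

  public static void main(String[] args) {
    List<StripedReader> stripedReaders = newReaderList(6);
    stripedReaders.add(new StripedReader());
    System.out.println(stripedReaders.size());
  }
}
```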
> Erasure Coding: striped block recovery
> --------------------------------------
>
> Key: HDFS-7348
> URL: https://issues.apache.org/jira/browse/HDFS-7348
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Kai Zheng
> Assignee: Yi Liu
> Attachments: ECWorker.java, HDFS-7348.001.patch
>
>
> This JIRA is to recover one or more missing striped blocks in a striped block
> group.