[
https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805089#comment-14805089
]
Walter Su commented on HDFS-9040:
---------------------------------
bq. For the last block group, we can calculate the length of each block via the
file length, so if a block doesn’t satisfy the required length, then we can
conclude it’s corrupt.
[~libo-intel], you missed a fact: file length is calculated from blocks (see
INodeFile). If lastBlock isn't successfully committed, it's not included in
the file length. So, to recover a UC lastBlock and truncate it to a
suitable length, we can only depend on the internal replica lengths obtained
from the other DNs (see DataNode.recoverBlock). How do we get the suitable
length? Does the length include the last data block of the last stripe? What
if the last data block fails? I agree with part of what [~jingzhao] said.
I think if the committed length is not in the middle of the last stripe, we can
just discard the last stripe and truncate the last UC block group to
(numOfStripe - 1) stripes. If the committed length is in the middle of the last
stripe, we can use [~libo-intel]'s method. I agree with part of what
[~libo-intel] said.
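A rough sketch of the length arithmetic above, not code from any patch. The class name, method names, and parameters (cellSize, numDataBlocks, writtenLength, committedLength) are hypothetical, and it takes one literal reading of "(numOfStripe - 1)": count the stripes touched by the written length and drop the last one.

```java
// Hypothetical helper, for illustration only; it is not part of HDFS.
final class LastStripeSketch {

    // Bytes of user data in one full stripe: one cell per data block.
    static long stripeSize(int cellSize, int numDataBlocks) {
        if (cellSize <= 0 || numDataBlocks <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (long) cellSize * numDataBlocks;
    }

    // Number of stripes touched by `length` bytes, counting a partial last stripe.
    static long numStripes(long length, long stripeSize) {
        return (length + stripeSize - 1) / stripeSize;
    }

    // Length left after discarding the last stripe: (numOfStripe - 1) stripes.
    // Note: if writtenLength ends exactly on a stripe boundary, this literal
    // reading drops a full stripe.
    static long lengthAfterDroppingLastStripe(long writtenLength, int cellSize,
                                              int numDataBlocks) {
        long size = stripeSize(cellSize, numDataBlocks);
        long stripes = numStripes(writtenLength, size);
        return stripes == 0 ? 0 : (stripes - 1) * size;
    }

    // True if the committed length reaches past the start of the last stripe,
    // i.e. the last stripe cannot simply be discarded.
    static boolean committedReachesIntoLastStripe(long committedLength,
                                                  long writtenLength,
                                                  int cellSize,
                                                  int numDataBlocks) {
        long lastStripeStart =
            lengthAfterDroppingLastStripe(writtenLength, cellSize, numDataBlocks);
        return committedLength > lastStripeStart;
    }
}
```

When committedReachesIntoLastStripe returns false, the block group could be truncated to lengthAfterDroppingLastStripe; otherwise the length has to be worked out some other way, which this sketch does not attempt.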
We can identify the failure purely based on length. This assumes that checksums
rule out bit-flip corruption.
> Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests
> to Coordinator)
> -------------------------------------------------------------------------------------------
>
> Key: HDFS-9040
> URL: https://issues.apache.org/jira/browse/HDFS-9040
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Walter Su
> Attachments: HDFS-9040-HDFS-7285.002.patch,
> HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch,
> HDFS-9040.02.bgstreamer.patch
>
>
> The general idea is to simplify error handling logic.
> Proposal 1:
> A BlockGroupDataStreamer communicates with the NN to allocate/update blocks,
> and StripedDataStreamers only have to stream blocks to DNs.
> Proposal 2:
> See below the
> [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388]
> from [~jingzhao].