[
https://issues.apache.org/jira/browse/HDFS-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404129#comment-17404129
]
ayu wulandari commented on HDFS-9256:
-------------------------------------
Thank you very much, the [information|http://namaanakbayi.com] is very useful.
> Erasure Coding: Improve failure handling of ECWorker striped block
> reconstruction
> ---------------------------------------------------------------------------------
>
> Key: HDFS-9256
> URL: https://issues.apache.org/jira/browse/HDFS-9256
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: erasure-coding
> Reporter: Rakesh Radhakrishnan
> Assignee: Rakesh Radhakrishnan
> Priority: Major
> Labels: hdfs-ec-3.0-nice-to-have
>
> Reconstructing a missing striped block is a costly operation. It involves
> the following steps:
> step-1) read the data from the minimum number of sources (remote reads)
> step-2) decode the data for the targets (CPU cycles)
> step-3) transfer the data to the targets (remote writes)
> Assume step-3 fails because a target DN is disconnected, dead, etc.
> Presently {{ECWorker}} skips the failed DN and continues transferring data
> to the other targets. In the next round, the reconstruction has to start
> again from the first step. Considering the cost of reconstruction, it would
> be good to give the failed operation another chance through a retry. The
> idea of this jira is to discuss the possible approaches and implement one.
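
One possible shape of the retry idea described above, as a minimal sketch only: keep the buffers decoded in step-2 and retry step-3 for just the targets that failed, so that step-1 and step-2 are not repeated. Every name here (TransferRetrySketch, TargetWriter, transferWithRetry, maxAttempts) is hypothetical and chosen for illustration; none of it is existing ECWorker or Hadoop API, and the issue itself leaves the actual approach open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these types are NOT part of Hadoop or ECWorker.
final class TransferRetrySketch {

    /** Assumed stand-in for step-3: write one decoded buffer to one target DN. */
    interface TargetWriter {
        /** Returns true if the transfer succeeded, false if it failed. */
        boolean write(String target, byte[] decoded);
    }

    /**
     * Transfers already-decoded buffers to their targets. Targets that fail
     * are retried in later attempts, up to maxAttempts attempts in total,
     * without re-reading sources or re-decoding.
     *
     * @return the targets that still failed after the last attempt
     */
    static List<String> transferWithRetry(Map<String, byte[]> decodedByTarget,
                                          TargetWriter writer,
                                          int maxAttempts) {
        List<String> pending = new ArrayList<>(decodedByTarget.keySet());
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            List<String> failed = new ArrayList<>();
            for (String target : pending) {
                // Reuse the buffer produced by the earlier decode step.
                if (!writer.write(target, decodedByTarget.get(target))) {
                    failed.add(target);
                }
            }
            // Only the failed targets are carried into the next attempt.
            pending = failed;
        }
        return pending;
    }
}
```

Targets still listed in the returned collection after the last attempt would fall back to the current behaviour, that is, a full reconstruction in a later round. Whether retries should go to the same DN or to a newly chosen target is a design question this sketch does not address.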