[ 
https://issues.apache.org/jira/browse/HDFS-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404129#comment-17404129
 ] 

ayu wulandari commented on HDFS-9256:
-------------------------------------

thank you very much the [information|http://namaanakbayi.com] is very useful

> Erasure Coding: Improve failure handling of ECWorker striped block 
> reconstruction
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-9256
>                 URL: https://issues.apache.org/jira/browse/HDFS-9256
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>            Reporter: Rakesh Radhakrishnan
>            Assignee: Rakesh Radhakrishnan
>            Priority: Major
>              Labels: hdfs-ec-3.0-nice-to-have
>
> Reconstructing a missing striped block is a costly operation. It involves 
> the following steps:
> step-1) read the data from the minimum number of sources (remote reads)
> step-2) decode the data for the targets (CPU cycles)
> step-3) transfer the data to the targets (remote writes)
> Assume step-3 fails because a target DN is disconnected, dead, etc. 
> Presently {{ECWorker}} skips the failed DN and continues transferring 
> data to the other targets. In the next round, it has to start the 
> reconstruction again from the first step. Considering the cost of 
> reconstruction, it would be good to give the failed operation another 
> chance by retrying it. The idea of this jira is to discuss the possible 
> approaches and implement one.
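
One possible approach from the description above, sketched in Java: keep the decoded buffer from step-2 and retry only the failed step-3 transfers, so that step-1 and step-2 are not repeated. This is an illustration under assumptions, not the actual {{ECWorker}} code. The names TransferRetrySketch, Target, transferWithRetry and maxAttempts are hypothetical and do not come from Hadoop.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Hadoop.
final class TransferRetrySketch {

    /** Stand-in for a connection to one target DataNode (assumption). */
    interface Target {
        void write(byte[] decoded) throws IOException;
    }

    /**
     * Sends an already-decoded buffer to every target and retries only the
     * transfers that failed, up to maxAttempts rounds in total.
     *
     * @return names of the targets that still failed after the last round
     */
    static List<String> transferWithRetry(Map<String, Target> targets,
                                          byte[] decoded,
                                          int maxAttempts) {
        List<String> pending = new ArrayList<>(targets.keySet());
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            List<String> failed = new ArrayList<>();
            for (String name : pending) {
                try {
                    // Reuse the decoded data; no re-read or re-decode happens here.
                    targets.get(name).write(decoded);
                } catch (IOException e) {
                    // Remember the failed target so the next round retries it.
                    failed.add(name);
                }
            }
            pending = failed;
        }
        return pending;
    }
}
```

A real implementation would also need to decide how long to hold the decoded buffers in memory and when to give up and fall back to a full reconstruction; the sketch leaves both questions open.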



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

