[ https://issues.apache.org/jira/browse/HDFS-9256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17404129#comment-17404129 ]
ayu wulandari commented on HDFS-9256:
-------------------------------------

Thank you very much, the [information|http://namaanakbayi.com] is very useful.

> Erasure Coding: Improve failure handling of ECWorker striped block reconstruction
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-9256
>                 URL: https://issues.apache.org/jira/browse/HDFS-9256
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>            Reporter: Rakesh Radhakrishnan
>            Assignee: Rakesh Radhakrishnan
>            Priority: Major
>              Labels: hdfs-ec-3.0-nice-to-have
>
> Reconstruction of a missing striped block is a costly operation. It
> involves the following steps:
> step-1) read the data from the minimum number of sources (remote reads)
> step-2) decode the data for the targets (CPU cycles)
> step-3) transfer the data to the targets (remote writes)
> Assume there is a failure in step-3 because a target DN is disconnected,
> dead, etc. Presently {{ECWorker}} skips the failed DN and continues
> transferring data to the other targets. In the next round, the
> reconstruction operation has to start again from step-1. Considering the
> cost of reconstruction, it would be good to give the failed operation
> another chance by retrying it. The idea of this jira is to discuss the
> possible approaches and implement one.
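
The retry idea in the description can be sketched roughly as follows. This is a minimal Python illustration and not Hadoop code: `transfer_with_retry`, `send` and `max_attempts` are hypothetical names, and `ConnectionError` stands in for whatever failure a disconnected or dead target DN would raise. The point is only that the buffers decoded in step-2 are kept, so a failed step-3 transfer can be retried without repeating step-1 and step-2.

```python
from typing import Callable, Dict, List


def transfer_with_retry(
    decoded: Dict[str, bytes],
    send: Callable[[str, bytes], None],
    max_attempts: int = 3,
) -> List[str]:
    """Send already-decoded buffers to their targets (step-3 only).

    `decoded` maps a target name to the buffer produced by step-2.
    Only targets whose transfer failed are retried in the next round;
    the decoded buffers are reused, so nothing is re-read or re-decoded.
    Returns the targets that still failed after `max_attempts` rounds.
    """
    pending = list(decoded)
    for _ in range(max_attempts):
        failed = []
        for target in pending:
            try:
                send(target, decoded[target])
            except ConnectionError:
                # Assumption: a transfer failure surfaces as ConnectionError.
                failed.append(target)
        pending = failed
        if not pending:
            break
    return pending


if __name__ == "__main__":
    attempts: Dict[str, int] = {}

    def flaky_send(target: str, data: bytes) -> None:
        # Hypothetical sender: "dn2" is unreachable on its first attempt.
        attempts[target] = attempts.get(target, 0) + 1
        if target == "dn2" and attempts[target] == 1:
            raise ConnectionError("target unreachable")

    still_failed = transfer_with_retry({"dn1": b"a", "dn2": b"b"}, flaky_send)
    print(still_failed, attempts)
```

Any target left in the returned list would still need a full reconstruction in a later round, so a bounded number of attempts limits how long the decoded buffers are held.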