[
https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379785#comment-14379785
]
Kai Zheng commented on HDFS-7344:
---------------------------------
Hello [~szetszwo],
Thanks for taking care of this. Let me address your comments together. Please
let me know if this works. Thanks.
bq.For 1 missing block, we may not need to recover it at all since
(6,3)-Reed-Solomon can tolerate 3 missing blocks. Also recovery is more
efficient for 2- or 3- missing blocks.
Good thoughts. I remember we had a related discussion with [~zhz]. The idea is
that recovery tasks get different priorities depending on how urgently the
erased blocks need to be recovered. As you said, groups with 2 or 3 erased
blocks are more urgent than groups with 1, so they would get higher priority
when the NN schedules. Note that a single erased block still needs to be
recovered when possible because, in existing customer deployments, in most
cases only one block is erased and needs recovering. Recovering a single erased
block can also be efficient, because in that case a simple XOR calculation can
be used and no RS overhead is incurred.
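To make the two points above concrete, here is a rough sketch in Python (not
HDFS code). The names `schedule`, `xor_blocks` and `recover_single` are
hypothetical, and the sketch assumes the block group carries a plain XOR parity
block over its data blocks, which is what the XOR shortcut relies on. It only
illustrates the ordering and the single-erasure case; 2 or 3 erasures would
still need full RS decoding.

```python
import heapq
from functools import reduce

MAX_ERASURES = 3  # (6,3) Reed-Solomon tolerates up to 3 missing blocks


def schedule(groups):
    """Yield (group_id, erased_count) pairs, most urgent first.

    `groups` maps a hypothetical block group id to its number of erased
    blocks. Groups with nothing erased, or with more erasures than the
    code can tolerate, are skipped.
    """
    heap = [(-erased, gid) for gid, erased in groups.items()
            if 0 < erased <= MAX_ERASURES]
    heapq.heapify(heap)
    while heap:
        neg_erased, gid = heapq.heappop(heap)
        yield gid, -neg_erased


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_single(surviving_data, xor_parity):
    """Rebuild the one missing data block from the survivors and XOR parity.

    Assumption: xor_parity is the XOR of all data blocks in the group.
    """
    return xor_blocks(list(surviving_data) + [xor_parity])


if __name__ == "__main__":
    # Groups with more erased blocks come out of the queue first.
    for gid, erased in schedule({"g1": 1, "g2": 3, "g3": 2}):
        print(gid, erased)

    # Single-erasure recovery: drop data block 2 and rebuild it.
    data = [bytes([i, i + 1, i + 2]) for i in range(6)]
    parity = xor_blocks(data)
    rebuilt = recover_single(data[:2] + data[3:], parity)
    print(rebuilt == data[2])
```

The negative count in the heap key is only there because `heapq` is a min-heap;
a real scheduler on the NN would also weigh other factors that this sketch
ignores.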
bq.Since you are working on the client, how about letting someone else working
on the datanode changes?
Good suggestion. I have discussed this with [~libo-intel]: I will help on the
datanode side until he can return to it after finishing the client side. On the
client side, [~libo-intel] is collaborating with [~jingzhao] and [~zhz] and the
hard part is already done; I believe we need the same kind of community
collaboration here. How about this: I will update the design doc early next
week, after discussing with [~umamaheswararao], [~vinayrpet] and others, and
incorporate the points raised here by [~zhz] and you. The doc will be open to
your review and further discussion here. Meanwhile, in another week I will also
update and refine Bo's code based on the latest design and the branch, so that
we have concrete, doable ideas for breaking the whole work down into smaller
tasks. Then people other than Bo and me can help in parallel, as you suggested.
Hope this works.
> Erasure Coding worker and support in DataNode
> ---------------------------------------------
>
> Key: HDFS-7344
> URL: https://issues.apache.org/jira/browse/HDFS-7344
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Kai Zheng
> Assignee: Li Bo
> Attachments: HDFS ECWorker Design.pdf, hdfs-ec-datanode.0108.zip,
> hdfs-ec-datanode.0108.zip
>
>
> According to HDFS-7285 and the design, this handles DataNode side extension
> and related support for Erasure Coding, and implements ECWorker. It mainly
> covers the following aspects, and separate tasks may be opened to handle each
> of them.
> * Process encoding work, calculating parity blocks as specified in block
> groups and codec schema;
> * Process decoding work, recovering data blocks according to block groups and
> codec schema;
> * Handle client requests for passive recovery blocks data and serving data on
> demand while reconstructing;
> * Write parity blocks according to storage policy.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)