[ 
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394657#comment-14394657
 ] 

Kai Zheng commented on HDFS-7715:
---------------------------------

Hi [~rashmikv],

I think you made a good point: we should make our interface general enough to 
cover codes other than just the HH code. I guess your suggestion is based on 
HDFS-RAID. In our new codec & coder framework, how to read a chunk from a block 
is up to the coder caller, since the codec & coder should not depend on a 
concrete environment such as HDFS specifics. {{ECBlock}} is an abstraction that 
is meant to be extended and customized with HDFS block details. For now, the 
coder caller by default simply reads chunk by chunk from a block. HH and other 
possible codes may need to tweak how chunk(s) are read from a block for a 
coding procedure, which is why I suggested adding a {{readChunk}} method to the 
{{ECBlock}} class so that {{HHECBlock}} can customize the behavior. The offset 
and len you mentioned would be kept internally in {{HHECBlock}} and then used 
to call the real read method on a real HDFS block. So underneath we will still 
have a method that uses the two parameters for the real read, though it is not 
exposed at the interface level. I will refine the related code this way in a 
patch to illustrate the idea. Maybe you could look at it then to see what I 
really mean.
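
Until the patch is available, here is a rough sketch of the idea. It is not the 
actual HDFS-EC code: the constructors, the {{readRaw}} helper, the 
{{getChunkSize}} accessor and the in-memory byte array standing in for a real 
HDFS block are all assumptions made for illustration. Only the names 
{{ECBlock}}, {{HHECBlock}} and {{readChunk}} come from the discussion above.

```java
import java.util.Arrays;

// Hypothetical sketch only; not the real HDFS-EC classes.
class ECBlock {
    private final byte[] data;   // stands in for a real HDFS block
    private final int chunkSize;
    private int position = 0;

    ECBlock(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.data = data;
        this.chunkSize = chunkSize;
    }

    int getChunkSize() {
        return chunkSize;
    }

    /** Default behavior: read the next chunk sequentially, or null at the end. */
    byte[] readChunk() {
        if (position >= data.length) {
            return null;
        }
        int n = Math.min(chunkSize, data.length - position);
        byte[] chunk = readRaw(position, n);
        position += n;
        return chunk;
    }

    /** The "real read": offset and len appear here, below the interface level. */
    protected byte[] readRaw(int offset, int len) {
        return Arrays.copyOfRange(data, offset, offset + len);
    }
}

// Assumed customization: only the range [offset, offset + len) is read.
class HHECBlock extends ECBlock {
    private final int end;
    private int next;

    HHECBlock(byte[] data, int chunkSize, int offset, int len) {
        super(data, chunkSize);
        if (offset < 0 || len < 0 || offset + len > data.length) {
            throw new IllegalArgumentException("range outside block");
        }
        this.next = offset;
        this.end = offset + len;
    }

    @Override
    byte[] readChunk() {
        if (next >= end) {
            return null;
        }
        int n = Math.min(getChunkSize(), end - next);
        byte[] chunk = readRaw(next, n);
        next += n;
        return chunk;
    }
}
```

With this shape the coder caller only ever calls {{readChunk}} on an 
{{ECBlock}}, and a code such as HH decides internally which bytes that call 
returns.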

> Implement the Hitchhiker erasure coding algorithm
> -------------------------------------------------
>
>                 Key: HDFS-7715
>                 URL: https://issues.apache.org/jira/browse/HDFS-7715
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: jack liuquan
>         Attachments: 7715-hitchhikerXOR-v2.patch, 
> HDFS-7715-hhxor-decoder.patch, HDFS-7715-hhxor-encoder.patch
>
>
> [Hitchhiker | 
> http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf] is 
> a new erasure coding algorithm developed as a research project at UC 
> Berkeley. It has been shown to reduce network traffic and disk I/O by 25%-45% 
> during data reconstruction. This JIRA aims to introduce Hitchhiker to the 
> HDFS-EC framework, as one of the pluggable codec algorithms.
> The existing implementation is based on HDFS-RAID. 



