[
https://issues.apache.org/jira/browse/HADOOP-11828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537790#comment-14537790
]
Kai Zheng commented on HADOOP-11828:
------------------------------------
Hi Jack, thanks for the clarification.
I would really like the first 3 points to be resolved first, as they're clear
to us now and resolving them would lay a more solid base for the subsequent
implementations of the other two modes. That way we won't have to make big
changes after the commit. I understand the process isn't very productive, but
that's the pain of open source. I really wish we could get this in sooner, but
we need more reviews from more people, so I guess you will have chances to make
the code cleaner and more elegant.
bq. HH is specific in preparing input data in decoding
I don't think so. Any erasure code is used to encode and decode arbitrary user
data, so we don't need to prepare for it specifically.
bq. Current testCoding() in TestErasureCoderBase uses the remaining 9 data
units + 4 parity units to reconstruct the one missing data unit.
Yes, it is for now; it will be corrected in HADOOP-11847. I think it's fine to
customize the {{testCoding}} logic here, but in the future we should
consolidate the code into the parent {{testCoding}}.
bq. I have no good idea, because RS encoding will erase the input data.
I see. I don't have one either; looking at the RS code, it's not easy to avoid
the erasure. Let's optimize it in the future, once we have everything working
correctly.
> Implement the Hitchhiker erasure coding algorithm
> -------------------------------------------------
>
> Key: HADOOP-11828
> URL: https://issues.apache.org/jira/browse/HADOOP-11828
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: jack liuquan
> Attachments: 7715-hitchhikerXOR-v2-testcode.patch,
> 7715-hitchhikerXOR-v2.patch, HADOOP-11828-hitchhikerXOR-V3.patch,
> HADOOP-11828-hitchhikerXOR-V4.patch, HDFS-7715-hhxor-decoder.patch,
> HDFS-7715-hhxor-encoder.patch
>
>
> [Hitchhiker |
> http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf] is
> a new erasure coding algorithm developed as a research project at UC
> Berkeley. It has been shown to reduce network traffic and disk I/O by 25%-45%
> during data reconstruction. This JIRA aims to introduce Hitchhiker to the
> HDFS-EC framework, as one of the pluggable codec algorithms.
> The existing implementation is based on HDFS-RAID.