[
https://issues.apache.org/jira/browse/HDFS-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379957#comment-14379957
]
Kai Zheng commented on HDFS-7715:
---------------------------------
Hi Jack,
bq.I just modify it for building in my machine. sorry for that.
No need to apologize. I understand.
bq.I have refined my codes with your suggestions, coding style using google
style.
I'm a little confused. I thought we should use Hadoop coding style. If you're
not clear about that, please read:
https://wiki.apache.org/hadoop/CodeReviewChecklist
bq.I think we should balance gains between using native RS raw coders and
separate them.
I'm glad you confirmed that reusing the existing raw coders for the XOR and RS
codes is doable for you. I think the performance should be acceptable for now,
particularly since we will have native implementations of the raw coders. In
any case, reusing the existing raw coders would save us a lot of low-level code
and let us focus on the HH-specific algorithms. Your current effort still makes
sense as a way for you to get familiar with the code. Regarding your
performance improvement, what is it? Maybe you can apply it to the Java
implementation of the RS raw coder? I will come up with a benchmark tool to
compare the performance of raw coders in HADOOP-11588. I hope it will help.
In your current implementation, the matrix and the parameters (10, 4) are
hard-coded. I'm wondering whether that can be resolved, since we want the code
to be configurable and flexible. Would it help if we reused the existing raw
coders?
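As a rough illustration of the configurability point, here is a minimal sketch
of a coder that takes the number of data units as a constructor argument
instead of hard-coding it. The class and method names (XorParitySketch, encode,
recover) are hypothetical and are not the existing raw coder API; it uses plain
XOR parity only to keep the example short, not the HH or RS algorithm.

```java
import java.util.Arrays;

/** Hypothetical sketch: XOR parity with a configurable number of data units. */
final class XorParitySketch {
    private final int numDataUnits;

    XorParitySketch(int numDataUnits) {
        if (numDataUnits < 1) {
            throw new IllegalArgumentException("numDataUnits must be >= 1");
        }
        this.numDataUnits = numDataUnits;
    }

    /** XORs all data units together, byte by byte, into one parity unit. */
    byte[] encode(byte[][] dataUnits) {
        if (dataUnits.length != numDataUnits) {
            throw new IllegalArgumentException("expected " + numDataUnits + " data units");
        }
        return xorAll(dataUnits, new byte[dataUnits[0].length]);
    }

    /** Rebuilds the single missing data unit from the survivors and the parity. */
    byte[] recover(byte[][] survivingUnits, byte[] parity) {
        if (survivingUnits.length != numDataUnits - 1) {
            throw new IllegalArgumentException("exactly one data unit may be missing");
        }
        return xorAll(survivingUnits, Arrays.copyOf(parity, parity.length));
    }

    /** XORs every unit into acc and returns acc; all units must share acc's length. */
    private static byte[] xorAll(byte[][] units, byte[] acc) {
        for (byte[] unit : units) {
            if (unit.length != acc.length) {
                throw new IllegalArgumentException("all units must have the same length");
            }
            for (int i = 0; i < acc.length; i++) {
                acc[i] ^= unit[i];
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        // The unit count comes from the caller, not from a constant in the coder.
        XorParitySketch coder = new XorParitySketch(3);
        byte[][] data = { {1, 2}, {3, 4}, {5, 6} };
        byte[] parity = coder.encode(data);
        byte[] rebuilt = coder.recover(new byte[][] { data[0], data[2] }, parity);
        System.out.println(Arrays.toString(parity) + " " + Arrays.toString(rebuilt));
    }
}
```

The same idea would apply to the (10, 4) parameters: pass them in at
construction time and derive the matrix from them, rather than embedding both
in the coder.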
I saw you uploaded new patches for both the encoder and the decoder. Please
merge them into one complete patch so that we can review further.
> Implement the Hitchhiker erasure coding algorithm
> -------------------------------------------------
>
> Key: HDFS-7715
> URL: https://issues.apache.org/jira/browse/HDFS-7715
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: jack liuquan
> Attachments: HDFS-7715-hhxor-decoder.patch,
> HDFS-7715-hhxor-encoder.patch
>
>
> [Hitchhiker |
> http://www.eecs.berkeley.edu/~nihar/publications/Hitchhiker_SIGCOMM14.pdf] is
> a new erasure coding algorithm developed as a research project at UC
> Berkeley. It has been shown to reduce network traffic and disk I/O by 25%-45%
> during data reconstruction. This JIRA aims to introduce Hitchhiker to the
> HDFS-EC framework, as one of the pluggable codec algorithms.
> The existing implementation is based on HDFS-RAID.