[
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086391#comment-15086391
]
Zhe Zhang commented on HADOOP-12041:
------------------------------------
Thanks Kai for the work. I'm still reviewing the patch but would like to
discuss a high level question.
bq. The old HDFS-RAID originated coder will still be there for comparison, and for converting old data from HDFS-RAID systems.
The old and new coders should generate the same results, right? In that case I
don't think we need the old coder to port data. I guess it depends on whether
{{GaloisField}} uses the same matrix at symbol size 256 as the new {{GF256}}?
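One way to sanity-check that question is to compare the two fields' arithmetic tables directly. Below is a minimal, self-contained sketch of GF(2^8) log/exp-table arithmetic (the class name {{Gf256Check}} and the choice of the common primitive polynomial 0x11d, i.e. x^8+x^4+x^3+x^2+1, are assumptions for illustration, not taken from either coder); if both implementations reduce by the same polynomial, their multiplication tables, and hence their generator matrices, should agree:

```java
// Sketch of GF(2^8) arithmetic via log/exp tables, assuming the
// primitive polynomial 0x11d. Not the actual coder code -- just a
// way to spot-check that two GF(256) implementations agree.
public class Gf256Check {
    private static final int[] EXP = new int[512];
    private static final int[] LOG = new int[256];

    static {
        // Generate powers of the generator element 2, reducing mod 0x11d.
        int x = 1;
        for (int i = 0; i < 255; i++) {
            EXP[i] = x;
            LOG[x] = i;
            x <<= 1;
            if ((x & 0x100) != 0) {
                x ^= 0x11d;
            }
        }
        // Duplicate the table so mul() can skip the "mod 255" step.
        for (int i = 255; i < 512; i++) {
            EXP[i] = EXP[i - 255];
        }
    }

    /** Field multiplication: a * b = exp(log a + log b). */
    public static int mul(int a, int b) {
        if (a == 0 || b == 0) {
            return 0;
        }
        return EXP[LOG[a] + LOG[b]];
    }

    public static void main(String[] args) {
        // Every nonzero element must have a multiplicative inverse.
        for (int a = 1; a < 256; a++) {
            int inv = EXP[255 - LOG[a]];
            if (mul(a, inv) != 1) {
                throw new AssertionError("bad inverse for " + a);
            }
        }
        // 3 * 7 = (x+1)(x^2+x+1) = x^3+1 = 9; no reduction needed,
        // so this holds in GF(256) regardless of the polynomial.
        System.out.println("mul(3, 7) = " + mul(3, 7));
    }
}
```

Dumping {{mul(a, b)}} for all 256x256 pairs from both {{GaloisField}} and {{GF256}} and diffing the tables would settle whether a data-porting path is actually needed.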
bq. The new Java RS coder will be favored and used by default
If that's our position, we should rename the existing coder as
{{RSRawEncoderLegacy}} and name the new one as {{RSRawEncoder}} (same for the
decoder). Alternatively, if we think the stability of the new coder needs more
testing, we can keep the current naming in the v5 patch, implying that the new
coder is in "beta" mode.
Some unused methods: {{genReedSolomonMatrix}}, {{gfBase}}, {{gfLogBase}}
> Implement another Reed-Solomon coder in pure Java
> -------------------------------------------------
>
> Key: HADOOP-12041
> URL: https://issues.apache.org/jira/browse/HADOOP-12041
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Attachments: HADOOP-12041-v1.patch, HADOOP-12041-v2.patch,
> HADOOP-12041-v3.patch, HADOOP-12041-v4.patch, HADOOP-12041-v5.patch
>
>
> The currently existing Java RS coders based on the {{GaloisField}}
> implementation have some drawbacks or limitations:
> * The decoder unnecessarily computes units that are not actually erased (HADOOP-11871);
> * The decoder requires parity units + data units order for the inputs in the
> decode API (HADOOP-12040);
> * Need to support or align with native erasure coders regarding concrete
> coding algorithms and matrix, so Java coders and native coders can be easily
> swapped in/out and transparent to HDFS (HADOOP-12010);
> * It's unnecessarily flexible, which incurs overhead: since HDFS erasure
> coding is entirely byte based, there is no need to support symbol sizes
> other than 256.
> This calls for implementing another RS coder in pure Java, in addition to
> the existing {{GaloisField}} based one from HDFS-RAID. The new Java RS coder
> will be favored and used by default to resolve the related issues. The old
> HDFS-RAID originated coder will still be there for comparison, and for
> converting old data from HDFS-RAID systems.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)