[ 
https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HADOOP-12041:
-------------------------------
    Description: 
The existing Java RS coders based on the {{GaloisField}} implementation have 
some drawbacks and limitations:
* The decoder unnecessarily computes units that are not actually erased 
(HADOOP-11871);
* The decode API requires its inputs in parity units + data units order 
(HADOOP-12040);
* The Java coders need to support or align with the native erasure coders in 
concrete coding algorithms and coding matrix, so that Java and native coders 
can be swapped in and out transparently to HDFS (HADOOP-12010);
* They are unnecessarily flexible, which incurs overhead: HDFS erasure coding 
is a purely byte-based system, so no symbol size other than 256 needs to be 
supported.

This issue proposes implementing another RS coder in pure Java, in addition to 
the existing {{GaloisField}}-based coder from HDFS-RAID. The new Java RS coder 
will be favored and used by default to resolve the issues above. The old 
HDFS-RAID-originated coder will remain for comparison, and for converting old 
data from HDFS-RAID systems.
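As background for the symbol-size point above: with the field fixed at GF(256), every symbol is exactly one byte, addition reduces to XOR, and multiplication reduces to small table lookups. The following is a minimal, illustrative Java sketch of GF(256) arithmetic, not the Hadoop {{GaloisField}} implementation; the class name and the choice of the 0x11D primitive polynomial are assumptions made here for illustration only:

```java
// Minimal GF(256) arithmetic sketch using log/antilog tables.
// Illustrative only; not the Hadoop GaloisField implementation.
public class Gf256Sketch {
    // x^8 + x^4 + x^3 + x^2 + 1, one common primitive polynomial for GF(256).
    private static final int PRIM = 0x11D;
    private static final int[] EXP = new int[512];
    private static final int[] LOG = new int[256];

    static {
        // Build tables by repeatedly multiplying by the generator x (i.e. 2),
        // reducing modulo PRIM whenever the result overflows 8 bits.
        int x = 1;
        for (int i = 0; i < 255; i++) {
            EXP[i] = x;
            LOG[x] = i;
            x <<= 1;
            if ((x & 0x100) != 0) {
                x ^= PRIM;
            }
        }
        // Duplicate the table so mul() needs no modulo on the log sum.
        for (int i = 255; i < 512; i++) {
            EXP[i] = EXP[i - 255];
        }
    }

    // Multiplication in GF(256): add discrete logs, look up the antilog.
    static int mul(int a, int b) {
        if (a == 0 || b == 0) {
            return 0;
        }
        return EXP[LOG[a] + LOG[b]];
    }

    // Addition (and subtraction) in GF(2^m) is plain XOR.
    static int add(int a, int b) {
        return a ^ b;
    }

    public static void main(String[] args) {
        System.out.println(mul(3, 7));  // 9
        System.out.println(add(5, 5));  // 0: an element is its own inverse
    }
}
```

A coder fixed to GF(256) can hard-code such byte-indexed tables, avoiding the generality, and the overhead, of a configurable symbol size.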

  was:
The existing Java RS coders based on the {{GaloisField}} implementation have 
some drawbacks and limitations:
* The decoder unnecessarily computes units that are not actually erased 
(HADOOP-11871);
* The decode API requires its inputs in parity units + data units order 
(HADOOP-12040);
* The Java coders need to support or align with the native erasure coders in 
concrete coding algorithms and coding matrix, so that Java and native coders 
can be swapped in and out transparently to HDFS (HADOOP-12010);
* They are unnecessarily flexible, which incurs overhead: HDFS erasure coding 
is a purely byte-based system, so no symbol size other than 256 needs to be 
supported.

This issue proposes re-implementing the underlying facilities for the Java RS 
coders, getting rid of the existing {{GaloisField}} from HDFS-RAID. Based on 
that work, the Java RS coders can then easily be re-implemented as well to 
resolve the related issues.


> Implement another Reed-Solomon coder in pure Java
> -------------------------------------------------
>
>                 Key: HADOOP-12041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12041
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>
> The existing Java RS coders based on the {{GaloisField}} implementation 
> have some drawbacks and limitations:
> * The decoder unnecessarily computes units that are not actually erased 
> (HADOOP-11871);
> * The decode API requires its inputs in parity units + data units order 
> (HADOOP-12040);
> * The Java coders need to support or align with the native erasure coders 
> in concrete coding algorithms and coding matrix, so that Java and native 
> coders can be swapped in and out transparently to HDFS (HADOOP-12010);
> * They are unnecessarily flexible, which incurs overhead: HDFS erasure 
> coding is a purely byte-based system, so no symbol size other than 256 
> needs to be supported.
> This issue proposes implementing another RS coder in pure Java, in addition 
> to the existing {{GaloisField}}-based coder from HDFS-RAID. The new Java RS 
> coder will be favored and used by default to resolve the issues above. The 
> old HDFS-RAID-originated coder will remain for comparison, and for 
> converting old data from HDFS-RAID systems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
