[ https://issues.apache.org/jira/browse/HADOOP-12041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127239#comment-15127239 ]
Zhe Zhang commented on HADOOP-12041:
------------------------------------

Thanks for the great work, Kai. Patch LGTM overall. A few minor issues:
# {{makeValidIndexes}} should be static and maybe moved to {{CodecUtil}}.
# The reported {{checkstyle}} issue looks valid.
# When you say the new coder is compatible with the ISA-L coder, you mean I can use the new Java coder to decode data encoded with ISA-L, right?
# The name {{gftbls}} is hard to understand; does it mean {{gfTables}}?
# Any reason to allocate {{encodeMatrix}}, {{decodeMatrix}}, and {{invertMatrix}} as 1D arrays but use them as matrices? Can we use 2D arrays?
# The code below is not easy to understand. Why don't we need to prepare {{decodeMatrix}} if the cached indexes haven't changed?
{code}
if (Arrays.equals(this.cachedErasedIndexes, erasedIndexes) &&
    Arrays.equals(this.validIndexes, tmpValidIndexes)) {
  return; // Optimization. Nothing to do
}
{code}
# {{RSUtil2#genReedSolomonMatrix}} is unused.
# The patch is already very large; I think we should add {{package-info}} separately.
# About the incompatibility between the HDFS-RAID coder and the new Java coder: is it because they use different GF matrices?
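For readers following the {{gfTables}} and GF-matrix discussion above: both coders do their math in GF(2^8), where table lookups replace polynomial multiplication. Below is a minimal, hypothetical sketch of byte-wise GF(2^8) multiplication via log/exp tables; the class name {{GF256}}, the method {{mul}}, and the choice of the 0x11D primitive polynomial (the one commonly used by ISA-L-style RS coders) are illustrative assumptions, not code from the patch.

```java
// Hypothetical sketch, NOT the HADOOP-12041 patch code: byte-oriented
// GF(2^8) arithmetic using log/exp tables. Fixing the symbol size at 256
// is what lets a pure-Java RS coder precompute everything as flat tables.
public class GF256 {
    // Primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D); assumed here
    // because it is the field commonly used by ISA-L-compatible RS coders.
    private static final int PRIM = 0x11D;
    private static final int[] LOG = new int[256];
    private static final int[] EXP = new int[512];

    static {
        // Generate powers of the generator element 2: EXP[i] = 2^i.
        int x = 1;
        for (int i = 0; i < 255; i++) {
            EXP[i] = x;
            LOG[x] = i;
            x <<= 1;               // multiply by the generator
            if (x >= 256) {
                x ^= PRIM;         // reduce modulo the primitive polynomial
            }
        }
        // Duplicate the table so mul() can add logs without a modulo.
        for (int i = 255; i < 512; i++) {
            EXP[i] = EXP[i - 255];
        }
    }

    /** Multiply two field elements in GF(2^8): a*b = 2^(log a + log b). */
    public static int mul(int a, int b) {
        if (a == 0 || b == 0) {
            return 0;
        }
        return EXP[LOG[a] + LOG[b]];
    }
}
```

If two coders build their encode matrices over the same field and polynomial, their parity bytes agree; differing matrices (as with HDFS-RAID vs. the new coder) make the outputs mutually unreadable even though the field arithmetic is identical.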
> Implement another Reed-Solomon coder in pure Java
> -------------------------------------------------
>
>                 Key: HADOOP-12041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12041
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-12041-v1.patch, HADOOP-12041-v2.patch, HADOOP-12041-v3.patch, HADOOP-12041-v4.patch, HADOOP-12041-v5.patch, HADOOP-12041-v6.patch
>
> The existing Java RS coders based on the {{GaloisField}} implementation have some drawbacks and limitations:
> * The decoder unnecessarily computes units that are not actually erased (HADOOP-11871);
> * The decoder requires the inputs to the decode API in parity units + data units order (HADOOP-12040);
> * We need to support or align with the native erasure coders regarding the concrete coding algorithms and matrices, so Java coders and native coders can be swapped in/out easily and transparently to HDFS (HADOOP-12010);
> * It is unnecessarily flexible and incurs some overhead: HDFS erasure coding is an entirely byte-based data system, so we don't need to consider any symbol size other than 256.
> This JIRA proposes implementing another RS coder in pure Java, in addition to the existing {{GaloisField}} coder from HDFS-RAID. The new Java RS coder will be favored and used by default to resolve the related issues. The old HDFS-RAID-originated coder will remain for comparison and for converting old data from HDFS-RAID systems.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)