[
https://issues.apache.org/jira/browse/HADOOP-19180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871201#comment-17871201
]
ASF GitHub Bot commented on HADOOP-19180:
-----------------------------------------
zhengchenyu opened a new pull request, #6813:
URL: https://github.com/apache/hadoop/pull/6813
### Description of PR
I found that when `erasedIndexes` lists a parity index before a data index,
EC decoding produces wrong results.
[HDFS-15186](https://issues.apache.org/jira/browse/HDFS-15186) already
described this problem, but did not fundamentally solve it.
The cause is that the code assumes `erasedIndexes` lists all data indexes
first, followed by the parity indexes. If the array passed in places a parity
index ahead of a data index, a calculation error occurs.
When we decode a data unit, we multiply the surviving units by the decoding
matrix. (See the formula in 1.2 of the
[doc](https://zhengchenyu.github.io/2024/05/17/ErasuceCode%E7%AE%97%E6%B3%95%E5%AE%9E%E7%8E%B0/).)
When we decode a parity unit, we multiply the surviving units by the decoding
matrix to recover the data units, then multiply those by the encoding matrix.
(See the formulas in 1.1 and 1.2 of the
[doc](https://zhengchenyu.github.io/2024/05/17/ErasuceCode%E7%AE%97%E6%B3%95%E5%AE%9E%E7%8E%B0/).)
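The two cases can be written out as follows. This is a sketch in generic Reed-Solomon notation, which is an assumption of this note and not necessarily the notation of the linked doc: $k$ is the number of data units, $G$ is the $(k+m) \times k$ encoding matrix whose first $k$ rows form the identity, $d$ is the data vector, $c = G d$ is the full stripe, $S$ is a set of $k$ surviving indexes, and $G_S$, $c_S$ are the corresponding rows of $G$ and entries of $c$.

$$
\begin{aligned}
c_S &= G_S \, d \quad\Longrightarrow\quad d = G_S^{-1} \, c_S \\
\text{erased data unit } i < k:\quad d_i &= \left(G_S^{-1}\right)_{i} \, c_S \\
\text{erased parity unit } j \ge k:\quad c_j &= G_j \, d = \left(G_j \, G_S^{-1}\right) c_S
\end{aligned}
$$

Here $\left(G_S^{-1}\right)_{i}$ is row $i$ of the decoding matrix and $G_j$ is row $j$ of the encoding matrix. The row applied to $c_S$ therefore depends on whether the erased index is a data index or a parity index.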
The calculations for parity units and data units are therefore different.
Because they are performed in two separate loops, the code requires every data
index to precede every parity index.
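
To illustrate why a single pass over `erasedIndexes` removes the ordering requirement, here is a minimal, self-contained sketch. It is not the code from this PR: the class `ErasedOrderSketch`, its method names, the 2+2 layout, and the toy encoding matrix are all hypothetical, and it uses arithmetic modulo the prime 257 in place of the GF(2^8) arithmetic a real coder uses. Each erased index picks its own decode row according to its type, so each output slot stays aligned with the index that requested it.

```java
public class ErasedOrderSketch {
    static final int P = 257; // toy prime field; an assumption, real RS coders use GF(2^8)
    static final int K = 2;   // number of data units in this toy layout
    // Hypothetical encoding matrix: identity rows for data units 0..1,
    // then parity rows for units 2..3. Any two rows are invertible mod P.
    static final int[][] G = {{1, 0}, {0, 1}, {1, 1}, {1, 2}};

    // Reduce into 0..P-1, also for negative inputs.
    static int mod(long x) {
        return (int) (((x % P) + P) % P);
    }

    // Brute-force multiplicative inverse mod P (fine for a tiny field).
    static int inverse(int a) {
        for (int x = 1; x < P; x++) {
            if (mod((long) a * x) == 1) {
                return x;
            }
        }
        throw new ArithmeticException("not invertible: " + a);
    }

    // Full stripe: every unit is its row of G times the data vector.
    static int[] encode(int[] data) {
        int[] units = new int[G.length];
        for (int i = 0; i < G.length; i++) {
            units[i] = mod((long) G[i][0] * data[0] + (long) G[i][1] * data[1]);
        }
        return units;
    }

    // Decoding matrix: inverse of the two rows of G that belong to the survivors.
    static int[][] invertSurvivorRows(int[] survivors) {
        int a = G[survivors[0]][0], b = G[survivors[0]][1];
        int c = G[survivors[1]][0], d = G[survivors[1]][1];
        int detInv = inverse(mod((long) a * d - (long) b * c));
        return new int[][] {
            {mod((long) d * detInv), mod(-(long) b * detInv)},
            {mod(-(long) c * detInv), mod((long) a * detInv)}
        };
    }

    // Recover the erased units in exactly the order erasedIndexes lists them.
    static int[] decode(int[] units, int[] survivors, int[] erasedIndexes) {
        int[][] inv = invertSurvivorRows(survivors);
        int[] out = new int[erasedIndexes.length];
        for (int i = 0; i < erasedIndexes.length; i++) {
            int e = erasedIndexes[i];
            int[] row;
            if (e < K) {
                // Data unit: use row e of the decoding matrix directly.
                row = inv[e];
            } else {
                // Parity unit: encoding row e times the decoding matrix.
                row = new int[] {
                    mod((long) G[e][0] * inv[0][0] + (long) G[e][1] * inv[1][0]),
                    mod((long) G[e][0] * inv[0][1] + (long) G[e][1] * inv[1][1])
                };
            }
            out[i] = mod((long) row[0] * units[survivors[0]]
                       + (long) row[1] * units[survivors[1]]);
        }
        return out;
    }
}
```

With data `{5, 7}` the stripe is `{5, 7, 12, 19}`. Calling `decode(units, new int[] {0, 3}, new int[] {2, 1})` returns `{12, 7}`, and passing `{1, 2}` instead returns `{7, 12}`, so a parity index ahead of a data index gives the same values in the requested order.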
### How was this patch tested?
The TestErasureCodingEncodeAndDecode unit test and the erasure_code_test
binary were executed on different machines. The test machines include those
with isa-l installed and those without isa-l installed.
### For code changes:
- Make erasedIndexes support arbitrary index distribution.
> EC: Fix calculation errors caused by special index order
> --------------------------------------------------------
>
> Key: HADOOP-19180
> URL: https://issues.apache.org/jira/browse/HADOOP-19180
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Chenyu Zheng
> Assignee: Chenyu Zheng
> Priority: Critical
> Labels: pull-request-available
>
> I found that when erasedIndexes lists a parity index before a data index,
> EC decoding produces wrong results.
> HDFS-15186 already described this problem, but did not fundamentally
> solve it.
> The cause is that the code assumes erasedIndexes lists all data indexes
> first, followed by the parity indexes. If a parity index is placed ahead of
> a data index, a calculation error occurs.