[
https://issues.apache.org/jira/browse/HDFS-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HDFS-11542:
---------------------------------
Target Version/s: 3.5.1
> Fix RawErasureCoderBenchmark decoding operation
> -----------------------------------------------
>
> Key: HDFS-11542
> URL: https://issues.apache.org/jira/browse/HDFS-11542
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Affects Versions: 3.0.0-alpha2
> Reporter: László Bence Nagy
> Priority: Minor
> Labels: test
>
> The decode operation in the *RawErasureCoderBenchmark.java* file has several
> problems. The decoding method is called like this:
> *decoder.decode(decodeInputs, ERASED_INDEXES, outputs);*.
> With an RS 6+3 configuration, a correct call would look like this:
> *decode([ d0, NULL, d2, d3, NULL, d5, p0, NULL, p2 ], [ 1, 4, 7 ],
> [ -, -, - ])*. Indexes 1, 4 and 7 are listed in the *ERASED_INDEXES* array,
> so the *decodeInputs* entries at those indexes are set to NULL, and all other
> data and parity packets are present in the array. The *outputs* array has
> length 3 and is where the d1, d4 and p1 packets should be reconstructed.
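
A minimal sketch of how such a *decodeInputs* array could be assembled for RS 6+3. The class and method names (DecodeInputsSketch, buildDecodeInputs) are hypothetical and not taken from RawErasureCoderBenchmark; the sketch only assumes the layout described above, with data units first and parity units after them.

```java
// Hypothetical helper, not part of RawErasureCoderBenchmark.
class DecodeInputsSketch {
  static final int NUM_DATA = 6;    // RS 6+3: six data units
  static final int NUM_PARITY = 3;  // and three parity units

  /** Concatenates data and parity units, then nulls the erased positions. */
  static byte[][] buildDecodeInputs(byte[][] data, byte[][] parity,
                                    int[] erasedIndexes) {
    byte[][] inputs = new byte[data.length + parity.length][];
    System.arraycopy(data, 0, inputs, 0, data.length);
    System.arraycopy(parity, 0, inputs, data.length, parity.length);
    for (int idx : erasedIndexes) {
      inputs[idx] = null; // an erased unit must be absent from the decoder input
    }
    return inputs;
  }

  public static void main(String[] args) {
    byte[][] data = new byte[NUM_DATA][4];
    byte[][] parity = new byte[NUM_PARITY][4];
    int[] erased = {1, 4, 7}; // d1, d4 and p1
    byte[][] inputs = buildDecodeInputs(data, parity, erased);
    for (int i = 0; i < inputs.length; i++) {
      System.out.println(i + ": " + (inputs[i] == null ? "NULL" : "present"));
    }
  }
}
```

The output buffers would then be a separate array with one entry per erased index.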
> Right now this example is actually called like this: *decode([ d0, d1, d2, d3,
> d4, d5, -, -, - ], [ 1, 4, 7 ], [ -, -, - ])*. This shows two main problems
> with the *decodeInputs* array. Firstly, the packets at the indexes listed in
> the *ERASED_INDEXES* array are not set to NULL. Secondly, the array does not
> contain any parity packets for decoding.
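
Both problems could be caught by a small sanity check before decode is called. The following is only a sketch under the assumption that "-" above means an absent (null) entry; the names DecodeInputsCheck and findProblem are hypothetical. It relies on the general property that an RS code with k data units needs at least k surviving units to decode.

```java
// Hypothetical sanity check, not part of RawErasureCoderBenchmark.
class DecodeInputsCheck {
  /**
   * Returns null when the inputs are consistent with the erased indexes,
   * otherwise a short description of the first problem found.
   */
  static String findProblem(Object[] inputs, int[] erasedIndexes,
                            int numDataUnits) {
    for (int idx : erasedIndexes) {
      if (inputs[idx] != null) {
        return "unit " + idx + " is listed as erased but is not null";
      }
    }
    int present = 0;
    for (Object unit : inputs) {
      if (unit != null) {
        present++;
      }
    }
    if (present < numDataUnits) {
      return "only " + present + " units present, need at least "
          + numDataUnits;
    }
    return null;
  }
}
```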
> The first problem is easy to solve: the values at the erased indexes need to
> be set to NULL. The second one is a little harder, because right now multiple
> rounds of encode operations run one after another, followed by multiple
> rounds of decode operations. Each encode should instead be paired with a
> decode, so that the freshly encoded parity packets can be passed to decode in
> the *decodeInputs* array. (Of course, encode and decode performance should
> still be measured separately.)
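
A rough sketch of the interleaved loop with separate timers. To stay self-contained it uses a toy single-parity XOR coder as a stand-in for a real raw erasure coder; it is not Reed-Solomon, and all names here (InterleavedBenchmarkSketch, run, and so on) are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Toy stand-in: one XOR parity unit instead of a real RS coder.
class InterleavedBenchmarkSketch {
  static void encode(byte[][] data, byte[] parity) {
    Arrays.fill(parity, (byte) 0);
    for (byte[] unit : data) {
      for (int i = 0; i < parity.length; i++) {
        parity[i] ^= unit[i];
      }
    }
  }

  /** Rebuilds the single null entry of inputs (data units, then parity). */
  static void decode(byte[][] inputs, byte[] output) {
    Arrays.fill(output, (byte) 0);
    for (byte[] unit : inputs) {
      if (unit == null) {
        continue; // the erased unit contributes nothing
      }
      for (int i = 0; i < output.length; i++) {
        output[i] ^= unit[i];
      }
    }
  }

  /** Returns {total encode nanos, total decode nanos}. */
  static long[] run(int rounds, int numData, int unitSize, int erasedIndex) {
    Random rnd = new Random(0);
    byte[][] data = new byte[numData][unitSize];
    byte[] parity = new byte[unitSize];
    byte[] output = new byte[unitSize];
    long encodeNanos = 0;
    long decodeNanos = 0;
    for (int r = 0; r < rounds; r++) {
      for (byte[] unit : data) {
        rnd.nextBytes(unit);
      }
      long t0 = System.nanoTime();
      encode(data, parity);
      encodeNanos += System.nanoTime() - t0;

      // Input preparation happens outside both timed regions.
      byte[][] inputs = new byte[numData + 1][];
      System.arraycopy(data, 0, inputs, 0, numData);
      inputs[numData] = parity; // parity produced by this round's encode
      byte[] expected = inputs[erasedIndex];
      inputs[erasedIndex] = null;

      long t1 = System.nanoTime();
      decode(inputs, output);
      decodeNanos += System.nanoTime() - t1;

      if (!Arrays.equals(expected, output)) {
        throw new IllegalStateException("decode mismatch in round " + r);
      }
    }
    return new long[] {encodeNanos, decodeNanos};
  }
}
```

Because the two timers accumulate independently, encode and decode throughput can still be reported separately even though the calls alternate.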
> There is one more problem in this file. Right now it works with RS 6+3 and
> the *ERASED_INDEXES* array is fixed to *[ 6, 7, 8 ]*, so only the three
> parity packets need to be reconstructed. This means that no real decode
> performance is measured, because no data packet needs to be reconstructed
> (even if the decode works properly). In effect, only new parity packets
> need to be encoded. The exact implementation depends on the underlying
> erasure coding plugin, but the point is that data packets should also be
> erased to measure real decode performance.
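
As an illustration, a benchmark could check that its erasure pattern touches at least one data unit. This is a sketch with hypothetical names (ErasedIndexesCheck, countErasedDataUnits), assuming data units occupy the lowest indexes as in the examples above.

```java
// Hypothetical check, not part of RawErasureCoderBenchmark.
class ErasedIndexesCheck {
  /** Counts how many erased indexes refer to data units (index < numDataUnits). */
  static int countErasedDataUnits(int[] erasedIndexes, int numDataUnits) {
    int count = 0;
    for (int idx : erasedIndexes) {
      if (idx < numDataUnits) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Fixed pattern from the current benchmark: parity only.
    System.out.println(countErasedDataUnits(new int[] {6, 7, 8}, 6)); // prints 0
    // Pattern from the example above: d1, d4 and p1.
    System.out.println(countErasedDataUnits(new int[] {1, 4, 7}, 6)); // prints 2
  }
}
```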
> In addition, more RS configurations (not just 6+3) could be measured so that
> they can be compared.
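
One possible shape for such a sweep is sketched below. The configurations 3+2 and 10+4 are chosen only as examples next to 6+3, and the helper pickErasedIndexes is hypothetical: it spreads the erased indexes evenly over all units so that index 0 (a data unit) is always erased.

```java
import java.util.Arrays;

// Hypothetical sweep over several RS configurations.
class ConfigSweepSketch {
  /** Spreads numErased indexes evenly over the numData + numParity units. */
  static int[] pickErasedIndexes(int numData, int numParity, int numErased) {
    if (numErased < 1 || numErased > numParity) {
      throw new IllegalArgumentException(
          "numErased must be between 1 and numParity");
    }
    int total = numData + numParity;
    int[] erased = new int[numErased];
    for (int i = 0; i < numErased; i++) {
      erased[i] = i * total / numErased; // strictly increasing, starts at 0
    }
    return erased;
  }

  public static void main(String[] args) {
    int[][] configs = {{6, 3}, {3, 2}, {10, 4}}; // {data units, parity units}
    for (int[] c : configs) {
      int[] erased = pickErasedIndexes(c[0], c[1], c[1]);
      System.out.println("RS " + c[0] + "+" + c[1]
          + " erased=" + Arrays.toString(erased));
      // A real benchmark would run its encode/decode rounds here.
    }
  }
}
```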