[ https://issues.apache.org/jira/browse/HDFS-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802507#comment-17802507 ]
Shilun Fan commented on HDFS-11542:
-----------------------------------

Bulk update: moved all 3.4.0 non-blocker issues; please move this back if it is a blocker. Retargeted to 3.5.0.

> Fix RawErasureCoderBenchmark decoding operation
> -----------------------------------------------
>
> Key: HDFS-11542
> URL: https://issues.apache.org/jira/browse/HDFS-11542
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Affects Versions: 3.0.0-alpha2
> Reporter: László Bence Nagy
> Priority: Minor
> Labels: test
>
> There are some issues with the decode operation in the *RawErasureCoderBenchmark.java* file. The decoding method is called like this: *decoder.decode(decodeInputs, ERASED_INDEXES, outputs);*.
>
> With the RS 6+3 configuration, a correct call would look like this: *decode([ d0, NULL, d2, d3, NULL, d5, p0, NULL, p2 ], [ 1, 4, 7 ], [ -, -, - ])*. Indexes 1, 4 and 7 are in the *ERASED_INDEXES* array, so the values at those indexes in the *decodeInputs* array are set to NULL; all other data and parity packets are present in the array. The *outputs* array has length 3, and the d1, d4 and p1 packets should be reconstructed into it. This would be the right solution.
>
> Right now this example is called like this: *decode([ d0, d1, d2, d3, d4, d5, -, -, - ], [ 1, 4, 7 ], [ -, -, - ])*. So there are two main problems with the *decodeInputs* array. First, the packets are not set to NULL where they should be according to the *ERASED_INDEXES* array. Second, the array does not contain any parity packets for decoding.
>
> The first problem is easy to solve: the values at the proper indexes need to be set to NULL. The second is a little harder, because right now multiple rounds of encode operations are run one after another, and then multiple decode operations are likewise called one by one. Encode and decode calls should instead be paired, one after another, so that the encoded parity packets can be passed to decode in the *decodeInputs* array. (Of course, their performance should still be measured separately.)
>
> Moreover, there is one more problem in this file. Right now it works with RS 6+3, and the *ERASED_INDEXES* array is fixed to *[ 6, 7, 8 ]*, so only the three parity packets need to be reconstructed. This means that no real decode performance is measured, because no data packet needs to be reconstructed (even if decode works properly). In effect, only new parity packets need to be encoded. The exact implementation depends on the underlying erasure coding plugin, but the point is that data packets should also be erased to measure real decode performance.
>
> In addition, more RS configurations (not just 6+3) could be measured as well, so that they can be compared.
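
For illustration only, the input layout the description asks for could be sketched roughly as below. This is not the benchmark's code and does not use any Hadoop classes: `DecodeInputsSketch`, `buildDecodeInputs` and `erasesDataUnit` are hypothetical names, `byte[]` stands in for whatever buffer type the benchmark uses, and the parity units are assumed to come from an encode call made just before the decode.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Hadoop: prepares the inputs for one decode call. */
final class DecodeInputsSketch {

    /**
     * Lays out the data units followed by the parity units, then sets every
     * erased position to null so the decoder has to reconstruct it.
     * Assumption: byte[] is a stand-in for the benchmark's real buffer type.
     */
    static byte[][] buildDecodeInputs(byte[][] data, byte[][] parity, int[] erasedIndexes) {
        byte[][] inputs = new byte[data.length + parity.length][];
        System.arraycopy(data, 0, inputs, 0, data.length);
        System.arraycopy(parity, 0, inputs, data.length, parity.length);
        for (int erased : erasedIndexes) {
            inputs[erased] = null; // erased unit: must not be visible to the decoder
        }
        return inputs;
    }

    /** Returns true if at least one erased index refers to a data unit rather than a parity unit. */
    static boolean erasesDataUnit(int numDataUnits, int[] erasedIndexes) {
        return Arrays.stream(erasedIndexes).anyMatch(i -> i < numDataUnits);
    }

    public static void main(String[] args) {
        int numData = 6, numParity = 3, unitSize = 4;
        byte[][] data = new byte[numData][unitSize];
        // In a real benchmark these would be the outputs of the preceding encode call.
        byte[][] parity = new byte[numParity][unitSize];
        int[] erased = {1, 4, 7};

        byte[][] inputs = buildDecodeInputs(data, parity, erased);
        byte[][] outputs = new byte[erased.length][unitSize]; // one output per erased unit

        StringBuilder layout = new StringBuilder();
        for (int i = 0; i < inputs.length; i++) {
            String name = i < numData ? "d" + i : "p" + (i - numData);
            layout.append(inputs[i] == null ? "NULL" : name);
            if (i + 1 < inputs.length) {
                layout.append(", ");
            }
        }
        System.out.println("decodeInputs = [ " + layout + " ]");
        System.out.println("outputs.length = " + outputs.length);
        System.out.println("erases a data unit: " + erasesDataUnit(numData, erased));
    }
}
```

With erased indexes 1, 4 and 7 the first line printed is `decodeInputs = [ d0, NULL, d2, d3, NULL, d5, p0, NULL, p2 ]`, which matches the correct call shown in the description. A check like `erasesDataUnit` returns false for `[ 6, 7, 8 ]`, which is one way to flag a configuration that would not exercise real decoding.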