[
https://issues.apache.org/jira/browse/HADOOP-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539254#comment-14539254
]
Yi Liu commented on HADOOP-11938:
---------------------------------
In *XORRawDecoder.java*
{code}
int dataLen = inputs[0].remaining();
int erasedIdx = erasedIndexes[0];

// Process the inputs.
int iPos, oPos, iIdx, oIdx;
oPos = output.position();
for (int i = 0; i < inputs.length; i++) {
  // Skip the erased location.
  if (i == erasedIdx) {
    continue;
  }

  iPos = inputs[i].position();
  for (iIdx = iPos, oIdx = oPos;
       iIdx < iPos + dataLen; iIdx++, oIdx++) {
    output.put(oIdx, (byte) (output.get(oIdx) ^ inputs[i].get(iIdx)));
  }
}
{code}
{{dataLen/iPos/oPos}} are unnecessary; we can use {{buffer.limit()}} and
{{buffer.position()}} directly instead (see the sketch after the next block).
{code}
@Override
protected void doDecode(ByteBuffer[] inputs, int[] erasedIndexes,
    ByteBuffer[] outputs) {
  ByteBuffer output = outputs[0];
  resetOutputBuffer(output);

  int dataLen = inputs[0].remaining();
  int erasedIdx = erasedIndexes[0];

  // Process the inputs.
  int iPos, oPos, iIdx, oIdx;
  oPos = output.position();
  for (int i = 0; i < inputs.length; i++) {
    // Skip the erased location.
    if (i == erasedIdx) {
      continue;
    }

    iPos = inputs[i].position();
    for (iIdx = iPos, oIdx = oPos;
         iIdx < iPos + dataLen; iIdx++, oIdx++) {
      output.put(oIdx, (byte) (output.get(oIdx) ^ inputs[i].get(iIdx)));
    }
  }
}
{code}
I wonder whether this works; I see it only decodes for *outputs\[0\]*.
Do we ever test this? If not, we should add more tests in this patch.
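For illustration, here is a minimal sketch of the simplification suggested
above, deriving everything from {{position()}} and {{limit()}} (an assumed
shape of the change, not the actual patch):
{code}
// Hedged sketch: dataLen/iPos/oPos are dropped; the loop bounds come
// straight from each buffer's position() and limit(). Absolute get/put
// calls leave the buffers' positions untouched.
@Override
protected void doDecode(ByteBuffer[] inputs, int[] erasedIndexes,
    ByteBuffer[] outputs) {
  ByteBuffer output = outputs[0];
  resetOutputBuffer(output);

  int erasedIdx = erasedIndexes[0];
  for (int i = 0; i < inputs.length; i++) {
    // Skip the erased location.
    if (i == erasedIdx) {
      continue;
    }
    for (int iIdx = inputs[i].position(), oIdx = output.position();
         iIdx < inputs[i].limit(); iIdx++, oIdx++) {
      output.put(oIdx, (byte) (output.get(oIdx) ^ inputs[i].get(iIdx)));
    }
  }
}
{code}
This assumes the caller guarantees that every input's {{remaining()}} matches
the output's {{remaining()}}.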
In *XORRawEncoder.java*
The same comments apply as in *XORRawDecoder.java*.
In *GaloisField.java*
{code}
+ ByteBuffer p, prev, after;
+ int pos1, idx1, pos2, idx2;
{code}
Besides {{idx1/idx2}}, the others are unnecessary; removing them makes the
code clearer. The same applies in some other places: if a value is only used
once or twice, we don't need to declare a separate variable for it.
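For example, a hypothetical fragment in the inlined style (the helper and
names here are illustrative, not from the patch):
{code}
// Hypothetical helper: idx1 is kept because it is reused, but a local
// like pos1 is not worth declaring for a single use of position().
static byte xorAt(ByteBuffer p, ByteBuffer prev, int idx1) {
  return (byte) (p.get(p.position() + idx1)
      ^ prev.get(prev.position() + idx1));
}
{code}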
*For the tests, I want to see more tests:*
1) The length of inputs/outputs is not equal to the chunk size; we should
still be able to decode.
2) Some negative tests, where we catch the expected exceptions (both cases
are sketched below).
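A self-contained sketch of the kind of coverage meant here, using a plain XOR
reference rather than the real coder classes (the class and checks below are
hypothetical, not the patch's test code):
{code}
import java.nio.ByteBuffer;

public class XorDecodeSketch {

  // Reference XOR recovery: the erased chunk equals the XOR of the others.
  static void xorDecode(ByteBuffer[] inputs, ByteBuffer output) {
    if (inputs.length == 0 || inputs[0].remaining() != output.remaining()) {
      throw new IllegalArgumentException("Mismatched buffer lengths");
    }
    int dataLen = inputs[0].remaining();
    int oPos = output.position();
    for (ByteBuffer in : inputs) {
      int iPos = in.position();
      for (int k = 0; k < dataLen; k++) {
        output.put(oPos + k, (byte) (output.get(oPos + k) ^ in.get(iPos + k)));
      }
    }
  }

  public static void main(String[] args) {
    // 1) Length not equal to the chunk size: 7 readable bytes inside
    //    16-byte buffers, and the windows do not start at position 0.
    ByteBuffer a = ByteBuffer.allocate(16);
    ByteBuffer b = ByteBuffer.allocate(16);
    ByteBuffer out = ByteBuffer.allocate(16);
    a.position(3).limit(10);
    b.position(5).limit(12);
    out.position(1).limit(8);
    xorDecode(new ByteBuffer[]{a, b}, out);

    // 2) Negative case: mismatched lengths must raise the expected exception.
    ByteBuffer shortOut = ByteBuffer.allocate(3);
    try {
      xorDecode(new ByteBuffer[]{a, b}, shortOut);
      throw new AssertionError("Expected IllegalArgumentException");
    } catch (IllegalArgumentException expected) {
      // expected
    }
  }
}
{code}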
> Fix ByteBuffer version encode/decode API of raw erasure coder
> -------------------------------------------------------------
>
> Key: HADOOP-11938
> URL: https://issues.apache.org/jira/browse/HADOOP-11938
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: io
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Attachments: HADOOP-11938-HDFS-7285-v1.patch,
> HADOOP-11938-HDFS-7285-workaround.patch
>
>
> While investigating a test failure in {{TestRecoverStripedFile}}, an issue
> was found in the raw erasure coder, caused by an optimization in the code
> below: it assumes the heap buffer backed by the bytes array, available for
> reading or writing, always starts at zero and takes the whole space.
> {code}
>   protected static byte[][] toArrays(ByteBuffer[] buffers) {
>     byte[][] bytesArr = new byte[buffers.length][];
>     ByteBuffer buffer;
>     for (int i = 0; i < buffers.length; i++) {
>       buffer = buffers[i];
>       if (buffer == null) {
>         bytesArr[i] = null;
>         continue;
>       }
>       if (buffer.hasArray()) {
>         bytesArr[i] = buffer.array();
>       } else {
>         throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
>             "expecting heap buffer");
>       }
>     }
>     return bytesArr;
>   }
> {code}
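For context, a hedged sketch of a position/offset-aware variant of the above
(one possible shape of a fix, not necessarily the patch's):
{code}
// Sketch: honor position(), remaining() and arrayOffset() by copying the
// readable window, instead of handing back the whole backing array.
protected static byte[][] toArrays(ByteBuffer[] buffers) {
  byte[][] bytesArr = new byte[buffers.length][];
  for (int i = 0; i < buffers.length; i++) {
    ByteBuffer buffer = buffers[i];
    if (buffer == null) {
      continue; // bytesArr[i] stays null
    }
    if (!buffer.hasArray()) {
      throw new IllegalArgumentException("Invalid ByteBuffer passed, " +
          "expecting heap buffer");
    }
    byte[] window = new byte[buffer.remaining()];
    System.arraycopy(buffer.array(),
        buffer.arrayOffset() + buffer.position(),
        window, 0, buffer.remaining());
    bytesArr[i] = window;
  }
  return bytesArr;
}
{code}
Copying does lose write-through for output buffers, so the real fix may
prefer passing offsets and lengths down instead.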
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)