[
https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512109#comment-14512109
]
Zhe Zhang commented on HADOOP-11847:
------------------------------------
Thanks Kai for the updated patch. A few quick comments before I finish
reviewing the whole patch:
# The below can be combined into a single {{if (adjustedByteArrayInputsParameter == null)}} check:
{code}
boolean isFirstTime = (adjustedByteArrayInputsParameter == null);
if (isFirstTime) {
{code}
# Based on the Java Language [Specification|http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.12.5] (§4.12.5), we can assume zero initial values for newly allocated arrays and don't need to zero-fill them again:
{code}
// These are temp buffers for bad inputs.
byteArrayBuffersForInput = new byte[numBadUnitsAtMost][];
for (int i = 0; i < byteArrayBuffersForInput.length; ++i) {
byteArrayBuffersForInput[i] = new byte[getChunkSize()];
}
...
// Ensure only ZERO bytes are read from
for (int i = 0; i < byteArrayBuffersForInput.length; ++i) {
System.arraycopy(ZERO_BYTES, 0, byteArrayBuffersForInput[i],
0, ZERO_BYTES.length);
}
{code}
# The names {{ensureWhenXXX}} are not clear enough; "ensure" should be followed by an object stating what is being ensured.
# Does a "bad" unit mean an erased unit? If so, why are both
{{byteArrayBuffersForInput}} and {{byteArrayBuffersForOutput}} created with
size {{numBadUnitsAtMost}}? Also, {{numBadUnitsAtMost}} could be renamed to
{{maxErasedUnits}}.
# {{erasedIndexes}} -> {{erasedIndices}}
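The simplifications in comments 1 and 2 can be sketched as below. This is a minimal illustration, not the patch itself; the field and size names ({{adjustedByteArrayInputs}}, {{numBadUnitsAtMost}}, {{chunkSize}}) are assumed here for the example.

```java
// Sketch of the simplifications suggested in comments 1 and 2 above.
public class ZeroInitSketch {
    private static byte[][] adjustedByteArrayInputs;
    private static final int numBadUnitsAtMost = 3;
    private static final int chunkSize = 16;

    static void ensureBuffers() {
        // Comment 1: fold the isFirstTime flag into the null check directly.
        if (adjustedByteArrayInputs == null) {
            // Comment 2: per JLS 4.12.5, byte array elements are
            // zero-initialized on allocation, so no explicit zero-fill
            // (e.g. System.arraycopy from a ZERO_BYTES buffer) is needed.
            adjustedByteArrayInputs = new byte[numBadUnitsAtMost][];
            for (int i = 0; i < adjustedByteArrayInputs.length; ++i) {
                adjustedByteArrayInputs[i] = new byte[chunkSize];
            }
        }
    }

    public static void main(String[] args) {
        ensureBuffers();
        // Verify the JLS guarantee: every buffer is already all zeros.
        for (byte[] buf : adjustedByteArrayInputs) {
            for (byte b : buf) {
                if (b != 0) {
                    throw new AssertionError("expected zero-initialized buffer");
                }
            }
        }
        System.out.println("all buffers zero-initialized");
    }
}
```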
> Enhance raw coder allowing to read least required inputs in decoding
> --------------------------------------------------------------------
>
> Key: HADOOP-11847
> URL: https://issues.apache.org/jira/browse/HADOOP-11847
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: io
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Attachments: HADOOP-11847-v1.patch, HADOOP-11847-v2.patch
>
>
> This is to enhance the raw erasure coder to allow reading only the least
> required inputs while decoding. It will also refine and document the relevant
> APIs for better understanding and usage. Reading only the least required
> inputs may add computing overhead, but will possibly perform better overall
> since less network traffic and disk IO are involved.
> This was already planned, but I was just reminded of it by [~zhz]'s question
> raised in HDFS-7678, also copied here:
> bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2
> is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should
> I construct the inputs to RawErasureDecoder#decode?
> With this work, hopefully the answer to the above question will be obvious.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)