[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506111#comment-14506111 ]

Zhe Zhang commented on HADOOP-11847:
------------------------------------

Thanks Kai for the patch. Please find my review below:
# We try to decode all null slots in the input arrays. I'm not sure whether 
this causes unnecessary computation.
# Could you explain this change? Shouldn't the first argument be 
{{numDataUnits}}?
{code}
-      xorRawDecoder.initialize(getNumDataUnits(), 1, getChunkSize());
+      xorRawDecoder.initialize(getNumDataUnits() + getNumParityUnits() - 1,
+          1, getChunkSize());
{code}
# {{checkParameters}} goes through the input arrays once, and computing 
{{badCount}} makes another pass. Can we do it in a single pass and just assert 
{{badCount + erasedIndexes.length == numDataUnits}}?
# {{ensureWhenUseXXX}} needs some Javadoc. Maybe also replace {{// Lazy on 
demand}} with a more informative comment?
# These variable names look confusing: {{decodingDirectBufferInputs}} vs. 
{{decodingDirectBuffersForInput}}, and {{decodingDirectBufferOutputs}} vs. 
{{decodingDirectBuffersForOutput}}
# Is {{decodingByteArrayBuffersForInput}} always filled with zero bytes? I 
don't see where it's filled with actual data.
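
To illustrate the {{checkParameters}} point, here is a rough sketch of the 
single-pass idea; the class, method, and parameter names are hypothetical and 
the exact invariant would need to match the patch's semantics:
{code}
// Illustrative single-pass check: count the null (erased) slots while
// validating the inputs, instead of traversing the arrays twice.
public class SinglePassCheck {
  /**
   * Counts null slots in inputs and verifies the count matches the
   * erasures the caller claims to recover.
   */
  static int checkAndCountErased(Object[] inputs, int[] erasedIndexes) {
    int badCount = 0;
    for (Object in : inputs) {
      if (in == null) {
        badCount++;  // a null slot marks an erased unit
      }
    }
    // Hypothetical invariant: null slots must match erasedIndexes.
    if (badCount != erasedIndexes.length) {
      throw new IllegalArgumentException("Found " + badCount
          + " null inputs but erasedIndexes.length = " + erasedIndexes.length);
    }
    return badCount;
  }
}
{code}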

> Enhance raw coder allowing to read least required inputs in decoding
> --------------------------------------------------------------------
>
>                 Key: HADOOP-11847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11847
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11847-v1.patch
>
>
> This is to enhance the raw erasure coder to allow reading only the least 
> required inputs while decoding. It will also refine and document the relevant 
> APIs for better understanding and usage. When using the least required inputs, 
> it may add computing overhead but will possibly outperform overall since less 
> network traffic and disk I/O are involved.
> This was already planned, but I was just reminded of it by [~zhz]'s question 
> raised in HDFS-7678, also copied here:
> bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 
> is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should 
> I construct the inputs to RawErasureDecoder#decode?
> With this work, hopefully the answer to the above question will be obvious.
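
As a sketch of the intended answer to the quoted (6+3) question: the inputs 
array would have one slot per unit, with null marking a missing (or 
deliberately unread) unit. The helper below is hypothetical and omits the 
actual buffer contents and decode call:
{code}
// Sketch: build decode inputs for a (6+3) schema where block #2 is
// missing and blocks 0, 1, 3, 4, 5, 8 are read. Slots follow unit
// index order (data units first, then parity units).
public class DecodeInputsSketch {
  static byte[][] buildInputs(int numData, int numParity,
                              int[] availableIndexes, int chunkSize) {
    byte[][] inputs = new byte[numData + numParity][];
    for (int idx : availableIndexes) {
      inputs[idx] = new byte[chunkSize];  // would hold the unit's real data
    }
    return inputs;  // null slots mark erased or not-read units
  }
}
{code}
Under this reading, {{inputs[2]}} stays null for the missing block, 
{{erasedIndexes}} would be {{[2]}}, and the decoder recovers the block from 
the six non-null units.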



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
