[ 
https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516181#comment-14516181
 ] 

Kai Zheng commented on HADOOP-11847:
------------------------------------

Hi [~zhz],

Thanks for your good points.
bq. The below can be combined as ...
I thought it would be good to have the variable {{isFirstTime}} to indicate 
that this is the first time execution reaches that point. Sure, I'm OK with 
combining them if you insist. :)
bq. we can assume zero initial values and don't need to zero-fill arrays again
I agree; the code was too cautious there. Actually I'm going to remove even 
more code than this: if any error occurs, an exception is thrown; otherwise 
every buffer position is overwritten with good bytes, so zeroing the buffers 
isn't needed at all. I wrote that code too defensively. Avoiding the overhead 
makes sense for performance.
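
To make that concrete, here is a minimal sketch of the kind of defensive 
zeroing that can be dropped ({{decodeWithRedundantZeroing}} and {{doDecode}} 
are illustrative names, not the actual patch code):

{code:java}
import java.util.Arrays;

public class ZeroingSketch {
  // Illustrative only: a decode-style routine either throws on error or
  // overwrites every byte of the outputs, so the pre-fill below is
  // redundant and can simply be removed.
  static void decodeWithRedundantZeroing(byte[][] inputs, byte[][] outputs) {
    for (byte[] output : outputs) {
      Arrays.fill(output, (byte) 0); // safe to delete per the reasoning above
    }
    doDecode(inputs, outputs); // hypothetical decode that fully fills outputs
  }

  static void doDecode(byte[][] inputs, byte[][] outputs) {
    // stand-in for the real decode computation
  }
}
{code}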
bq. The names ensureWhenXXX are not clear enough ...
I got it this time, agreed!
bq. why are both byteArrayBuffersForInput and byteArrayBuffersForOutput created 
with size numBadUnitsAtMost?
Good question. I need to comment on this clearly in the code. For both input 
and output, in addition to the valid buffers passed in by the caller, we need 
to provide extra buffers for internal use. For input, the caller should 
provide at least {{numDataUnits}} non-NULL buffers; for output, the caller 
should provide at least one and at most {{numParityUnits}} buffers. The 
remaining buffers are borrowed from {{byteArrayBuffersForInput}} or 
{{byteArrayBuffersForOutput}} respectively, at most {{numParityUnits}} of 
them in each case.
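
Roughly, the borrowing could look like the sketch below. It is purely 
illustrative: {{ensureBuffers}}, {{chunkSize}} and the pool layout are my 
assumptions here, not the patch itself.

{code:java}
import java.util.Arrays;

public class BufferBorrowingSketch {
  // Illustrative sizes for an RS(6,3) coder; chunkSize is an assumption.
  static final int numDataUnits = 6;
  static final int numParityUnits = 3;
  static final int chunkSize = 64 * 1024;

  // At most numParityUnits units can be erased, so at most numParityUnits
  // extra buffers are ever borrowed, for inputs or for outputs.
  static final byte[][] byteArrayBuffersForInput =
      new byte[numParityUnits][chunkSize];
  static final byte[][] byteArrayBuffersForOutput =
      new byte[numParityUnits][chunkSize];

  // Fill each slot the caller left null but the internal computation
  // needs, borrowing from the given pool.
  static byte[][] ensureBuffers(byte[][] fromCaller, byte[][] pool) {
    byte[][] result = Arrays.copyOf(fromCaller, fromCaller.length);
    int borrowed = 0;
    for (int i = 0; i < result.length; i++) {
      if (result[i] == null) {
        result[i] = pool[borrowed++]; // at most numParityUnits borrows
      }
    }
    return result;
  }
}
{code}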

> Enhance raw coder allowing to read least required inputs in decoding
> --------------------------------------------------------------------
>
>                 Key: HADOOP-11847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11847
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>         Attachments: HADOOP-11847-v1.patch, HADOOP-11847-v2.patch
>
>
> This is to enhance the raw erasure coder to allow reading only the least 
> required inputs while decoding. It will also refine and document the relevant 
> APIs for better understanding and usage. Using the least required inputs may 
> add computation overhead, but will possibly outperform overall since less 
> network traffic and disk I/O are involved.
> This is something we planned to do, but I was just reminded of it by 
> [~zhz]'s question raised in HDFS-7678, also copied here:
> bq. Kai Zheng I have a question about decoding: in a (6+3) schema, if block 
> #2 is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how 
> should I construct the inputs to RawErasureDecoder#decode?
> With this work, hopefully the answer to the above question will be obvious.
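
For illustration, here is one way the inputs for that (6+3) case might be 
constructed once this lands. This is a hedged sketch, not the committed API 
usage: it assumes the byte[][] overload of {{RawErasureDecoder#decode}}, unit 
indexes 0-5 for data and 6-8 for parity, and hypothetical {{readUnit}}, 
{{decoder}} and {{CHUNK_SIZE}} already in scope.

{code:java}
// Hedged sketch for a 6+3 schema; readUnit, decoder and CHUNK_SIZE are
// hypothetical stand-ins, not names from the patch.
byte[][] inputs = new byte[9][];
inputs[0] = readUnit(0);  // data units 0, 1, 3, 4, 5 that were read
inputs[1] = readUnit(1);
inputs[2] = null;         // data unit 2: the missing block to repair
inputs[3] = readUnit(3);
inputs[4] = readUnit(4);
inputs[5] = readUnit(5);
inputs[6] = null;         // parity units 6 and 7 are not read at all
inputs[7] = null;
inputs[8] = readUnit(8);  // parity unit 8 (block #8) that was read

int[] erasedIndexes = new int[] {2};         // only unit 2 needs recovering
byte[][] outputs = new byte[1][CHUNK_SIZE];  // buffer for recovered unit 2

decoder.decode(inputs, erasedIndexes, outputs);
{code}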



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
