[ https://issues.apache.org/jira/browse/HADOOP-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555737#comment-14555737 ]

Kai Zheng commented on HADOOP-11847:
------------------------------------

bq. If the first element is not null, it will return. It will have loop?
I mean that when the first element is not null, the method returns and never 
reaches the {{for}} block at all. I'm not sure this trivial optimization makes 
much sense, so let me remove it.
bq.  If the buffers is not enough, then we allocate new and add it to the 
shared pool, it's typically behavior.
OK, let me do it the way you suggested. It makes sense to save some memory, 
particularly when callers use a large chunk size. A minimal sketch of the 
allocate-on-demand pooling idea follows after these replies.
bq. ensureBytesArrayBuffer and ensureDirectBuffers need to be renamed and 
rewritten per above comments.
In the new update I will get rid of the two functions and use more lightweight 
ones.
bq. You call resetBuffer: parityNum + erasedIndexes, is that true? 
Yes, unnecessary resetBuffer calls are involved. Per our offline discussion and 
your suggestions, we can ensure that only the necessary calls are made. Thanks 
a lot!
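
To make the shared-pool point above concrete, here is a minimal sketch of the 
allocate-on-demand behavior being discussed. It is illustrative only, not the 
patch itself; the class and method names ({{SharedBufferPool}}, {{get}}, 
{{release}}) are made up, and the chunk size and direct/heap choice are 
assumptions.

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only -- not the actual patch. Shows the "allocate new and
// add it to the shared pool" behavior: get() reuses an idle buffer when
// one exists, otherwise allocates a fresh one of the configured chunk size.
class SharedBufferPool {
  private final Deque<ByteBuffer> idle = new ArrayDeque<>();
  private final int chunkSize;
  private final boolean direct;

  SharedBufferPool(int chunkSize, boolean direct) {
    this.chunkSize = chunkSize;
    this.direct = direct;
  }

  synchronized ByteBuffer get() {
    ByteBuffer buf = idle.pollFirst();
    if (buf == null) {
      // Pool exhausted: allocate a new buffer; it joins the pool on release().
      buf = direct ? ByteBuffer.allocateDirect(chunkSize)
                   : ByteBuffer.allocate(chunkSize);
    }
    buf.clear();
    return buf;
  }

  synchronized void release(ByteBuffer buf) {
    idle.addFirst(buf);
  }
}
{code}

The idea is simply that a caller with a large chunk size pays the allocation 
cost only when the pool is actually exhausted, and every allocated buffer stays 
reusable afterwards.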

An updated patch is ready but is pending on HDFS-8382. I will post it once 
HDFS-8382 is in. 

> Enhance raw coder allowing to read least required inputs in decoding
> --------------------------------------------------------------------
>
>                 Key: HADOOP-11847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11847
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: io
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-11847-HDFS-7285-v3.patch, 
> HADOOP-11847-HDFS-7285-v4.patch, HADOOP-11847-HDFS-7285-v5.patch, 
> HADOOP-11847-HDFS-7285-v6.patch, HADOOP-11847-v1.patch, HADOOP-11847-v2.patch
>
>
> This is to enhance the raw erasure coder to allow reading only the least 
> required inputs while decoding. It will also refine and document the relevant 
> APIs for better understanding and usage. Reading only the least required 
> inputs may add computation overhead, but it will possibly outperform overall 
> since less network traffic and disk IO are involved.
> This was already planned, but I was just reminded of it by [~zhz]'s question 
> raised in HDFS-7678, also copied here:
> bq.Kai Zheng I have a question about decoding: in a (6+3) schema, if block #2 
> is missing, and I want to repair it with blocks 0, 1, 3, 4, 5, 8, how should 
> I construct the inputs to RawErasureDecoder#decode?
> With this work, hopefully the answer to the above question will be obvious.
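
To make the (6+3) question above concrete, here is a minimal sketch of one 
possible input layout, assuming a positional convention where the inputs array 
has dataNum + parityNum slots and {{null}} marks a unit that is erased or 
deliberately not read. This is only an illustration of the convention this JIRA 
is documenting, not the final API contract; the class and method names are made 
up.

{code:java}
import java.nio.ByteBuffer;

// Sketch only: (6, 3) schema, slots 0..5 are data units, 6..8 are parity units.
// Block #2 is missing; blocks 0, 1, 3, 4, 5 and 8 are the units actually read.
public class DecodeInputLayoutSketch {
  static ByteBuffer[] layoutInputs(ByteBuffer d0, ByteBuffer d1, ByteBuffer d3,
                                   ByteBuffer d4, ByteBuffer d5, ByteBuffer p8) {
    ByteBuffer[] inputs = new ByteBuffer[9];
    inputs[0] = d0;
    inputs[1] = d1;
    // inputs[2] stays null: data unit #2 is the erased unit being repaired.
    inputs[3] = d3;
    inputs[4] = d4;
    inputs[5] = d5;
    // inputs[6] and inputs[7] stay null: those parity units are not read at all.
    inputs[8] = p8;
    return inputs;
  }
}
{code}

With that layout, a call along the lines of 
{{decoder.decode(inputs, new int[]{2}, outputs)}} (where {{outputs}} holds one 
buffer for the reconstructed block #2) would express exactly the repair 
described in the question; treat the call shape as an assumption until the 
refined API documentation lands.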


