[
https://issues.apache.org/jira/browse/HDFS-8481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564053#comment-14564053
]
Walter Su commented on HDFS-8481:
---------------------------------
This is the user's logic when calling pread. The {{buf}} is reused until the entire
file has been read.
{code}
byte[] buf = new byte[4096];
int readLen;
while ((readLen = in.read(buf)) != -1) {
  // process readLen bytes from buf
}
{code}
Assume we have a 768MB file (128MB * 6), which contains exactly 1 block group. We
lost one block, so we have to decode until the entire 768MB of data has been read.
{code}
byte[][] decodeInputs = new byte[dataBlkNum + parityBlkNum]
    [(int) alignedStripe.getSpanInBlock()];
{code}
For every {{alignedStripe}} being read, we allocate a new {{decodeInputs}}. Every
time the user calls pread, we create multiple new {{alignedStripe}}s. Every time
the user calls stateful read, we create 1~3 new {{alignedStripe}}s.
This means that by the time the entire 768MB of data has been read, we have allocated
128MB*9 of byte[][] {{decodeInputs}} garbage waiting for GC.
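To make the arithmetic concrete, here is a minimal sketch of the total allocation (the RS(6,3) constants and 128MB block size are assumptions for illustration; the real values come from the file's erasure coding policy):
{code}
public class DecodeInputsGarbage {
  public static void main(String[] args) {
    final long blockSize = 128L * 1024 * 1024; // 128MB per block (assumed)
    final int dataBlkNum = 6;                  // RS(6,3) schema (assumed)
    final int parityBlkNum = 3;

    // Each alignedStripe allocates (dataBlkNum + parityBlkNum) buffers of
    // spanInBlock bytes. Summed over all stripes in the block group, the
    // spans cover one full block, so the total allocation is:
    long totalGarbage = (dataBlkNum + parityBlkNum) * blockSize;

    System.out.println("decodeInputs garbage: "
        + totalGarbage / (1024 * 1024) + "MB"); // prints 1152MB = 128MB * 9
  }
}
{code}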
We cannot depend on {{DFSStripedInputStream}} to keep the {{decodeInputs}} object
and reuse it, because every {{SpanInBlock}} is different.
I'm not sure if I've made this clear. If so, it's an issue, right? (Not related to
this jira.)
bq. we need more abstraction than the util.
I'm +1 for this idea. I think we can resolve the {{decodeInputs}} issue in that
abstraction.
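As a rough illustration of what that abstraction might do (just a sketch under assumptions, not the actual HDFS-7285 API): allocate the buffers once at the maximum span and hand out {{ByteBuffer}} views per stripe, so a varying {{SpanInBlock}} no longer forces a fresh allocation.
{code}
import java.nio.ByteBuffer;

// Hypothetical helper; the class name and API are illustrative only.
class ReusableDecodeInputs {
  private final ByteBuffer[] buffers;

  ReusableDecodeInputs(int numBlocks, int maxSpanInBlock) {
    buffers = new ByteBuffer[numBlocks];
    for (int i = 0; i < numBlocks; i++) {
      // Allocate once at the maximum span; reuse for every stripe.
      buffers[i] = ByteBuffer.allocate(maxSpanInBlock);
    }
  }

  /** Return the shared buffers, limited to this stripe's span. */
  ByteBuffer[] forStripe(int spanInBlock) {
    for (ByteBuffer buf : buffers) {
      buf.clear();              // reset position/limit from the last stripe
      buf.limit(spanInBlock);   // expose only this stripe's span
    }
    return buffers;
  }
}
{code}
The trade-off would be pinning numBlocks * maxSpanInBlock bytes for the lifetime of the stream instead of producing per-stripe garbage.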
> Erasure coding: remove workarounds in client side stripped blocks recovering
> ----------------------------------------------------------------------------
>
> Key: HDFS-8481
> URL: https://issues.apache.org/jira/browse/HDFS-8481
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-8481-HDFS-7285.00.patch,
> HDFS-8481-HDFS-7285.01.patch, HDFS-8481-HDFS-7285.02.patch
>
>
> After HADOOP-11847 and related fixes, we should be able to properly calculate
> decoded contents.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)