[ 
https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7678:
----------------------------
    Attachment: HDFS-7678-HDFS-7285.006.patch

Thanks Andrew for the review!

bq. When fetching recovery work, we exclude blocks that still have an in-flight 
read. This means we might sometimes error out when we need additional data from 
the in-flight block.
This is a very good catch. I took a closer look at it, with some additional 
test cases. I think the logic in the 005 patch is functionally correct. 
Basically, we always _hope_ that all in-flight read tasks both succeed and 
cover the maximum read portion. If the next returned request turns out 
otherwise, that event triggers {{scheduleRecoveryReads}}. Note that the loop 
continues until {{futures}} is empty. That said, the suggested change is a 
performance optimization to speed up recovery. I added {{actualReadPortions}} 
to keep track of the read portion actually issued for each index. Using it, we 
can more accurately avoid some in-flight reads, but not all.
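
For readers following the thread without the patch open, the loop described 
above can be sketched roughly as follows. This is only an illustration under 
stated assumptions, not code from the patch: the class {{RecoveryLoopSketch}}, 
the {{BlockReader}} interface, and all method bodies are hypothetical, and only 
the names {{futures}}, {{actualReadPortions}} and {{scheduleRecoveryReads}} come 
from the discussion. The sketch models failed reads only (not successful reads 
that cover less than the maximum portion), and a recovery read simply re-reads 
the full portion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: class, interface, and method bodies are assumptions,
// not code from the patch.
final class RecoveryLoopSketch {
  /** Hypothetical reader: returns bytes read, or throws for a bad block. */
  interface BlockReader {
    int read(int index, int length) throws Exception;
  }

  final int[] actualReadPortions; // portion actually issued per block index
  int readsIssued = 0;            // total read tasks submitted
  private final boolean[] failed;
  private final BlockReader reader;
  private final Map<Future<Integer>, Integer> futures = new HashMap<>();

  RecoveryLoopSketch(int numBlocks, BlockReader reader) {
    this.actualReadPortions = new int[numBlocks];
    this.failed = new boolean[numBlocks];
    this.reader = reader;
  }

  private void submit(CompletionService<Integer> service, int index, int length) {
    futures.put(service.submit(() -> reader.read(index, length)), index);
    actualReadPortions[index] = length;
    readsIssued++;
  }

  /** Re-issues reads only for healthy indices not yet covering maxPortion. */
  private void scheduleRecoveryReads(CompletionService<Integer> service, int maxPortion) {
    for (int i = 0; i < actualReadPortions.length; i++) {
      if (!failed[i] && actualReadPortions[i] < maxPortion) {
        submit(service, i, maxPortion);
      }
    }
  }

  void run(int[] initialPortions, int maxPortion) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    CompletionService<Integer> service = new ExecutorCompletionService<>(pool);
    try {
      for (int i = 0; i < initialPortions.length; i++) {
        if (initialPortions[i] > 0) {
          submit(service, i, initialPortions[i]);
        }
      }
      // Keep draining until no read is in flight.
      while (!futures.isEmpty()) {
        Future<Integer> done = service.take();
        int index = futures.remove(done);
        try {
          done.get();
        } catch (ExecutionException e) {
          // A failed read is the event that triggers recovery reads.
          failed[index] = true;
          scheduleRecoveryReads(service, maxPortion);
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for reads", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // Index 1 is "missing"; indices 2 and 3 are not read initially.
    RecoveryLoopSketch sketch = new RecoveryLoopSketch(4, (index, length) -> {
      if (index == 1) {
        throw new java.io.IOException("simulated missing block");
      }
      return length;
    });
    sketch.run(new int[] {10, 100, 0, 0}, 100);
    System.out.println("reads issued: " + sketch.readsIssued);
  }
}
```

In this run the failure of index 1 leads to three recovery reads (indices 0, 2 
and 3), because their issued portions were below the maximum. An index whose 
issued portion already covers the maximum is skipped, which is the kind of 
saving the per-index bookkeeping is meant to allow.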

bq. The test logic where we kill a DN doesn't look quite right, since we need 
to make sure the killed DN has the expected missing block.
Actually, since we are manually injecting simulated blocks, the block->DN 
mapping is fixed. I did extend the test quite a bit to cover more misaligned 
cases:
{code}
    int delta = 10;
    int done = 0;
    // read a small delta, shouldn't trigger decode
    // |cell_0 |
    // |10     |
    done += in.read(0, readBuffer, 0, delta);
    assertEquals(delta, done);
    // both head and tail cells are partial
    // |c_0      |c_1    |c_2 |c_3 |c_4      |c_5         |
    // |256K - 10|missing|256K|256K|256K - 10|not in range|
    done += in.read(delta, readBuffer, delta,
        CELLSIZE * (DATA_BLK_NUM - 1) - 2 * delta);
    assertEquals(CELLSIZE * (DATA_BLK_NUM - 1) - delta, done);
    // read the rest
    done += in.read(done, readBuffer, done, readSize - done);
{code}
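
To make the arithmetic in the comments above easier to check, here is a small 
standalone sketch that splits a read range into per-cell byte counts within a 
single stripe. It is not part of the test: the class {{CellRangeSketch}} and 
the method {{bytesPerCell}} are hypothetical, and the values CELLSIZE = 256K 
and DATA_BLK_NUM = 6 are assumptions inferred from the "256K" and c_0..c_5 
annotations in the snippet.

```java
// Illustrative only: helper names and constants are assumptions.
final class CellRangeSketch {
  /**
   * Returns how many bytes of the read [offset, offset + length) fall into
   * each cell of one stripe made of numCells cells of cellSize bytes.
   */
  static int[] bytesPerCell(long offset, long length, int cellSize, int numCells) {
    int[] bytes = new int[numCells];
    long end = offset + length;
    for (int c = 0; c < numCells; c++) {
      long cellStart = (long) c * cellSize;
      long cellEnd = cellStart + cellSize;
      // Overlap of the read range with this cell; negative means no overlap.
      long overlap = Math.min(end, cellEnd) - Math.max(offset, cellStart);
      bytes[c] = (int) Math.max(0L, overlap);
    }
    return bytes;
  }

  public static void main(String[] args) {
    int cellSize = 256 * 1024; // assumed value of CELLSIZE
    int dataBlkNum = 6;        // assumed value of DATA_BLK_NUM
    int delta = 10;
    // Second read in the test: starts at delta, spans all but the last cell.
    long length = (long) cellSize * (dataBlkNum - 1) - 2L * delta;
    int[] bytes = bytesPerCell(delta, length, cellSize, dataBlkNum);
    for (int c = 0; c < bytes.length; c++) {
      System.out.println("c_" + c + ": " + bytes[c]);
    }
  }
}
```

Under those assumptions, the second read covers 256K - 10 bytes of c_0, all of 
c_1 through c_3, 256K - 10 bytes of c_4, and nothing in c_5, which matches the 
layout in the comment.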

Please let me know if the new changes look OK. Thanks!

> Erasure coding: DFSInputStream with decode functionality
> --------------------------------------------------------
>
>                 Key: HDFS-7678
>                 URL: https://issues.apache.org/jira/browse/HDFS-7678
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Li Bo
>            Assignee: Zhe Zhang
>         Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, 
> HDFS-7678-HDFS-7285.003.patch, HDFS-7678-HDFS-7285.004.patch, 
> HDFS-7678-HDFS-7285.005.patch, HDFS-7678-HDFS-7285.006.patch, 
> HDFS-7678.000.patch, HDFS-7678.001.patch
>
>
> A block group reader reads data from a BlockGroup, whether it is in striping 
> layout or contiguous layout. Corrupt blocks can be known before reading 
> (reported by the NameNode) or discovered during reading. The block group 
> reader needs to do decoding work when some blocks are found to be corrupt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
