[ 
https://issues.apache.org/jira/browse/HDFS-15730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17252600#comment-17252600
 ] 

Jinglun commented on HDFS-15730:
--------------------------------

Hi [~ayushtkn], thanks for your comments!
{quote}In case we delete the excess index, say b0, and we lose the only 
remaining b0 as well, we might lose the chance of reconstruction as well?
{quote}
It won't happen. The method BlockManager#processExtraRedundancyBlock() 
builds the collection nonExcess, and as the snippet below shows, corrupt 
nodes are excluded from it.
{code:java}
if (!isExcess(cur, block)) {
  if (cur.isInService()) {
    // exclude corrupt replicas
    if (corruptNodes == null || !corruptNodes.contains(cur)) {
      // corrupt storages are excluded here!
      nonExcess.add(storage);
    }
  }
}
{code}
Then it calls BlockManager#chooseExcessRedundancies(), and all the excess 
storages are chosen from nonExcess. Suppose we have 2 copies of b0 and one is 
corrupt. The corrupt copy is excluded from nonExcess, so the remaining b0 is no 
longer excess and won't be deleted in BlockManager#chooseExcessRedundancies().
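
To make the exclusion concrete, here is a minimal toy sketch (not the actual 
Hadoop code; the class, method, and replica names are hypothetical) of how 
filtering corrupt replicas out of nonExcess protects the remaining healthy copy:
{code:java}
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Toy model of processExtraRedundancyBlock(): a corrupt replica never makes it
// into nonExcess, so it can never be chosen as excess and deleted.
public class NonExcessFilter {
    // Returns the replicas eligible for excess deletion (simplified shape).
    static List<String> buildNonExcess(List<String> replicas,
                                       Collection<String> corruptNodes) {
        List<String> nonExcess = new ArrayList<>();
        for (String cur : replicas) {
            // exclude corrupt replicas, mirroring the snippet above
            if (corruptNodes == null || !corruptNodes.contains(cur)) {
                nonExcess.add(cur);
            }
        }
        return nonExcess;
    }

    public static void main(String[] args) {
        // Two copies of b0: one on a corrupt node, one healthy.
        List<String> replicas = List.of("b0@dn1", "b0@dn2");
        List<String> corrupt = List.of("b0@dn1");
        // Only the healthy b0 lands in nonExcess; with one copy left there is
        // no excess to delete, so the healthy copy survives.
        System.out.println(buildNonExcess(replicas, corrupt)); // [b0@dn2]
    }
}
{code}
Because excess candidates are drawn only from nonExcess, deleting the corrupt 
copy's slot can never touch the last healthy copy.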

> Erasure Coding: Fix unit test bug of 
> TestAddOverReplicatedStripedBlocks.testProcessOverReplicatedAndCorruptStripedBlock.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15730
>                 URL: https://issues.apache.org/jira/browse/HDFS-15730
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Minor
>         Attachments: HDFS-15730.001.patch
>
>
> I'm working on EC replication and found a bug in the test case: 
> TestAddOverReplicatedStripedBlocks#testProcessOverReplicatedAndCorruptStripedBlock.
> The test adds 2 redundant internal blocks and then checks the block indices. 
> It claims that 'the redundant internal blocks will not be deleted before the 
> corrupted block gets reconstructed.'
> But actually the redundant blocks can be deleted even while a corrupted block 
> exists. The test only passes because it runs fast enough to check the block 
> indices before the redundant block is deleted and the deletion is reported to 
> the NameNode.
> The patch is both a fix and an explanation of the bug.
>  
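
The race described above can be sketched with a toy model (hypothetical names, 
not Hadoop APIs): the excess copy is deleted asynchronously, so a check of the 
indices that runs immediately still sees it, while a check that waits for the 
deletion does not:
{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

public class ExcessDeletionRace {
    // Count how many internal blocks carry the given index.
    static long countIndex(List<Integer> indices, int idx) {
        return indices.stream().filter(i -> i == idx).count();
    }

    public static void main(String[] args) throws Exception {
        // Striped group with two redundant copies of b0.
        List<Integer> indices =
            new CopyOnWriteArrayList<>(Arrays.asList(0, 0, 1, 2, 3));
        CountDownLatch deleted = new CountDownLatch(1);

        // Deletion happens asynchronously, like a DataNode acting on a
        // NameNode command and reporting back later.
        new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) {}
            indices.remove(Integer.valueOf(0)); // delete one excess b0
            deleted.countDown();
        }).start();

        // An immediate check still sees both copies -- the flaky assertion.
        System.out.println("before deletion: " + countIndex(indices, 0));

        // Waiting for the deletion gives the stable answer.
        deleted.await();
        System.out.println("after deletion: " + countIndex(indices, 0));
    }
}
{code}
A robust test must wait for the deletion to be reported before asserting on the 
indices, which is the essence of the fix.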



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
