[ 
https://issues.apache.org/jira/browse/HDFS-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593177#comment-16593177
 ] 

Xiao Chen commented on HDFS-13846:
----------------------------------

Thanks Kitti for finding and fixing this, and Zsolt for reviewing!

This is a good find, and the patch looks pretty good overall. A few test 
comments:
- We can take out the magic number 2 (dataBlockNum) here and make it a 
constant calculated from {{liveReplicas}}.
{code}
when(blockInfo.getRealDataBlockNum()).thenReturn((short)2);
{code}
- The assertion part of {{testIncrementAndDecrementSafeBlockCount}} and 
{{testIncrementAndDecrementStripedSafeBlockCount}} can be refactored into a 
shared method.

> Safe blocks counter is not decremented correctly if the block is striped
> ------------------------------------------------------------------------
>
>                 Key: HDFS-13846
>                 URL: https://issues.apache.org/jira/browse/HDFS-13846
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.0
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Major
>         Attachments: HDFS-13846.001.patch
>
>
> In the BlockManagerSafeMode class, the "safe blocks" counter is incremented if 
> the number of nodes containing the block equals the number of data units 
> specified by the erasure coding policy, which looks like this in the code:
> {code:java}
> final int safe = storedBlock.isStriped() ?
>     ((BlockInfoStriped)storedBlock).getRealDataBlockNum() : safeReplication;
> if (storageNum == safe) {
>   this.blockSafe++;
> {code}
> But when it is decremented, the code does not check whether the block is 
> striped. If the block is complete, it just compares the number of nodes 
> containing the block with 0 (safeReplication - 1), which is not correct.
> {code:java}
> if (storedBlock.isComplete() &&
>         blockManager.countNodes(b).liveReplicas() == safeReplication - 1) {
>       this.blockSafe--;
>       assert blockSafe >= 0;
>       checkSafeMode();
>     }
> {code}
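
To illustrate the asymmetry described above, here is a minimal, self-contained sketch of a counter whose decrement path uses the same threshold as its increment path. It is not the attached patch and not the real BlockManagerSafeMode code: the class {{SafeBlockCounterSketch}}, the helper {{safeThreshold}}, and the method parameters are all hypothetical names introduced for illustration, and the sketch assumes the intended fix is simply to derive the decrement threshold from the striped data block count as well.

```java
// Hypothetical, simplified model of the "safe blocks" counter.
// This is NOT the actual BlockManagerSafeMode implementation.
public class SafeBlockCounterSketch {
    private final int safeReplication;
    private int blockSafe = 0;

    public SafeBlockCounterSketch(int safeReplication) {
        this.safeReplication = safeReplication;
    }

    // Single source of truth for the threshold, used by both paths.
    // For striped blocks it is the real data block count; otherwise it
    // is the configured safe replication.
    int safeThreshold(boolean striped, int realDataBlockNum) {
        return striped ? realDataBlockNum : safeReplication;
    }

    // Mirrors the increment check quoted in the issue description.
    void increment(boolean striped, int realDataBlockNum, int storageNum) {
        if (storageNum == safeThreshold(striped, realDataBlockNum)) {
            blockSafe++;
        }
    }

    // Assumed fix: compare against (threshold - 1) instead of
    // (safeReplication - 1), so striped blocks are decremented too.
    void decrement(boolean striped, int realDataBlockNum, int liveReplicas) {
        if (liveReplicas == safeThreshold(striped, realDataBlockNum) - 1) {
            blockSafe--;
            if (blockSafe < 0) {
                throw new IllegalStateException("blockSafe went negative");
            }
        }
    }

    int getBlockSafe() {
        return blockSafe;
    }
}
```

With safeReplication set to 1 and a striped block that has 6 data blocks, the block becomes safe at 6 storages and stops being safe when live replicas drop to 5. Under the original comparison against safeReplication - 1 (that is, 0), the decrement would not fire at 5.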


