[jira] [Commented] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission

Ayush Saxena (Jira) Tue, 29 Oct 2019 05:40:35 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961968#comment-16961968
 ]


Ayush Saxena commented on HDFS-14920:
-------------------------------------

If I go by 

{code:java}
 But note we exclude duplicated internal block replicas
   * for calculating {@link NumberReplicas#liveReplicas}.
{code}

* For decommissioning also, Then its fine, but you need to add a line here too, 
If a node is Live and decommissioning....(Similar you added for the Decom enum) 
explaining that.
*Change to Lambdas for tests.
* Change in setting {{decommissioningBitSet.set(blockIndex);}} as I posted 
above, if possible. As I feel decommissioningBitSet should contain the bits 
actually decomissioning
Apart seems fair enough.


> Erasure Coding: Decommission may hang If one or more datanodes are out of 
> service during decommission  
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14920
>                 URL: https://issues.apache.org/jira/browse/HDFS-14920
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec
>    Affects Versions: 3.0.3, 3.2.1, 3.1.3
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: HDFS-14920.001.patch, HDFS-14920.002.patch, 
> HDFS-14920.003.patch
>
>
> Decommission test hangs in our clusters.
> Have seen the messages as follow
> {quote}
> 2019-10-22 15:58:51,514 TRACE 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Block 
> blk_-9223372035600425840_372987973 numExpected=9, numLive=5
> 2019-10-22 15:58:51,514 INFO BlockStateChange: Block: 
> blk_-9223372035600425840_372987973, Expected Replicas: 9, live replicas: 5, 
> corrupt replicas: 0, decommissioned replicas: 0, decommissioning replicas: 4, 
> maintenance replicas: 0, live entering maintenance replicas: 0, excess 
> replicas: 0, Is Open File: false, Datanodes having this block: 
> 10.255.43.57:50010 10.255.53.12:50010 10.255.63.12:50010 10.255.62.39:50010 
> 10.255.37.36:50010 10.255.33.15:50010 10.255.69.29:50010 10.255.51.13:50010 
> 10.255.64.15:50010 , Current Datanode: 10.255.69.29:50010, Is current 
> datanode decommissioning: true, Is current datanode entering maintenance: 
> false
> 2019-10-22 15:58:51,514 DEBUG 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager: Node 
> 10.255.69.29:50010 still has 1 blocks to replicate before it is a candidate 
> to finish Decommission In Progress
> {quote}
> After digging the source code and cluster log,  guess it happens as follow 
> steps.
> # Storage strategy is RS-6-3-1024k.
> # EC block b consists of b0, b1, b2, b3, b4, b5, b6, b7, b8, b0 is from 
> datanode dn0, b1 is from datanode dn1, ...etc
> # At the beginning dn0 is in decommission progress, b0 is replicated 
> successfully, and dn0 is staill in decommission progress.
> # Later b1, b2, b3 in decommission progress, and dn4 containing b4 is out of 
> service, so need to reconstruct, and create ErasureCodingWork to do it, in 
> the ErasureCodingWork, additionalReplRequired is 4
> # Because hasAllInternalBlocks is false, Will call 
> ErasureCodingWork#addTaskToDatanode -> 
> DatanodeDescriptor#addBlockToBeErasureCoded, and send 
> BlockECReconstructionInfo task to Datanode
> # DataNode can not reconstruction the block because targets is 4, greater 
> than 3( parity number).
> There is a problem as follow, from BlockManager.java#scheduleReconstruction
> {code}
>       // should reconstruct all the internal blocks before scheduling
>       // replication task for decommissioning node(s).
>       if (additionalReplRequired - numReplicas.decommissioning() -
>           numReplicas.liveEnteringMaintenanceReplicas() > 0) {
>         additionalReplRequired = additionalReplRequired -
>             numReplicas.decommissioning() -
>             numReplicas.liveEnteringMaintenanceReplicas();
>       }
> {code}
> Should reconstruction firstly and then replicate for decommissioning. Because 
> numReplicas.decommissioning() is 4, and additionalReplRequired is 4, that's 
> wrong,
> numReplicas.decommissioning() should be 3, it should exclude live replica. 
> If so, additionalReplRequired will be 1, reconstruction will schedule as 
> expected. After that, decommission goes on.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (HDFS-14920) Erasure Coding: Decommission may hang If one or more datanodes are out of service during decommission

Reply via email to