[jira] [Commented] (HDFS-14465) When the Block expected replications is larger than the number of DataNodes, entering maintenance will never exit.

Yicong Cai (JIRA) Tue, 07 May 2019 02:17:24 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-14465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834534#comment-16834534
 ]


Yicong Cai commented on HDFS-14465:
-----------------------------------

[^HDFS-14465.02.patch] fixes the checkstyle issues.

I ran the hadoop.hdfs.web.TestWebHdfsTimeouts test case separately and it 
works fine.

> When the Block expected replications is larger than the number of DataNodes, 
> entering maintenance will never exit.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14465
>                 URL: https://issues.apache.org/jira/browse/HDFS-14465
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.9.2
>            Reporter: Yicong Cai
>            Priority: Major
>         Attachments: HDFS-14465.01.patch, HDFS-14465.02.patch
>
>
> Scenario:
> A small HDFS cluster has 5 DataNodes. One of them needs maintenance, so it is 
> added to the maintenance list, and 
> dfs.namenode.maintenance.replication.min is set to 1.
> When the nodes are refreshed, the NameNode starts checking whether the blocks 
> on that node need additional replicas.
> The replication factor of a MapReduce job file is 10 by default. 
> isNeededReplicationForMaintenance evaluates to false and 
> isSufficientlyReplicated evaluates to false, so the block of the job 
> file needs more replicas.
> When a replica is to be added, the cluster has only 5 DataNodes and every 
> node already holds a replica of the block, so chooseTargetInOrder throws a 
> NotEnoughReplicasException. The replication cannot be increased, and 
> the node never leaves the Entering Maintenance state.
> As a result, a small standalone cluster cannot use 
> maintenance mode.
>  
> {panel:title=chooseTarget exception log}
> 2019-05-03 23:42:31,008 [31545331] - WARN  
> [ReplicationMonitor:BlockPlacementPolicyDefault@431] - Failed to place enough 
> replicas, still in need of 1 to reach 5 (unavailableStorages=[], 
> storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], 
> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For 
> more information, please enable DEBUG log level on 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and 
> org.apache.hadoop.net.NetworkTopology
> {panel}
>  
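
The stall described above can be illustrated with a small toy model. This is 
only a sketch under simplifying assumptions and is not Hadoop code: the class 
MaintenanceStallSketch, its methods, and the node names dn1..dn5 are 
hypothetical, and the "sufficiently replicated" check is reduced to comparing 
live replicas with the expected replication. It shows that with an expected 
replication of 10 and only 4 live DataNodes that all hold a replica, the block 
is never sufficient and no target node is left to receive a new replica.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the maintenance stall; hypothetical names, not Hadoop code. */
public class MaintenanceStallSketch {

    /** Live nodes that could still take a new replica (they do not hold one yet). */
    static List<String> candidateTargets(List<String> liveNodes, List<String> replicaHolders) {
        List<String> targets = new ArrayList<>(liveNodes);
        targets.removeAll(replicaHolders);
        return targets;
    }

    /** Counts the replicas that sit on live nodes. */
    static int liveReplicas(List<String> liveNodes, List<String> replicaHolders) {
        int count = 0;
        for (String holder : replicaHolders) {
            if (liveNodes.contains(holder)) {
                count++;
            }
        }
        return count;
    }

    /**
     * Simplified assumption: the block is stuck when it has fewer live replicas
     * than expected and no live node is left to place another replica on.
     */
    static boolean isStuck(List<String> liveNodes, List<String> replicaHolders,
                           int expectedReplication) {
        boolean sufficient = liveReplicas(liveNodes, replicaHolders) >= expectedReplication;
        return !sufficient && candidateTargets(liveNodes, replicaHolders).isEmpty();
    }

    public static void main(String[] args) {
        // 5 DataNodes in total; dn5 is entering maintenance, so 4 remain live.
        List<String> liveNodes = Arrays.asList("dn1", "dn2", "dn3", "dn4");
        // Every node in the cluster already holds a replica of the job file block.
        List<String> replicaHolders = Arrays.asList("dn1", "dn2", "dn3", "dn4", "dn5");

        int expectedReplication = 10; // default for MapReduce job files, per the report

        System.out.println("candidate targets: " + candidateTargets(liveNodes, replicaHolders));
        System.out.println("stuck: " + isStuck(liveNodes, replicaHolders, expectedReplication));
    }
}
```

In this model the block stays stuck for any expected replication above the 
number of live nodes, which matches the symptom in the report: the node never 
leaves the Entering Maintenance state.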



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

