Yicong Cai created HDFS-14465:
---------------------------------

             Summary: When the Block expected replications is larger than the number of DataNodes, entering maintenance will never exit.
                 Key: HDFS-14465
                 URL: https://issues.apache.org/jira/browse/HDFS-14465
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.9.2
            Reporter: Yicong Cai

Scenario:

A small HDFS cluster has 5 DataNodes. One of them needs maintenance, so it is added to the maintenance list, and dfs.namenode.maintenance.replication.min is set to 1.

When the nodes are refreshed, the NameNode starts checking whether the blocks on that node need additional replicas. The replication factor of MapReduce job files is 10 by default. For such a block, isNeededReplicationForMaintenance returns false and isSufficientlyReplicated also returns false, so the NameNode decides that the block of the job file needs another replica.

When the NameNode tries to add that replica, the cluster has only 5 DataNodes and every node already holds a replica of the block, so chooseTargetInOrder throws a NotEnoughReplicasException. The replica count can never be increased, and the node never leaves the Entering Maintenance state.

As a result, a small standalone cluster cannot use maintenance mode.

{panel:title=chooseTarget exception log}
2019-05-03 23:42:31,008 [31545331] - WARN [ReplicationMonitor:BlockPlacementPolicyDefault@431] - Failed to place enough replicas, still in need of 1 to reach 5 (unavailableStorages=[], storagePolicy=BlockStoragePolicy\{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
{panel}
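
To make the stall easier to follow, here is a minimal, self-contained Java sketch of the arithmetic described in this report. It is not the Hadoop implementation: the class MaintenanceStallSketch and the helper placementCandidates are hypothetical, and the two check methods only borrow the names isNeededReplicationForMaintenance and isSufficientlyReplicated, with bodies simplified to plain integer comparisons. It assumes 4 in-service DataNodes (5 minus the one entering maintenance) and that a node can hold at most one replica of a block.

```java
public class MaintenanceStallSketch {

    // Simplified stand-in (assumption): the maintenance-specific check only
    // asks whether live replicas fall below the configured minimum
    // (dfs.namenode.maintenance.replication.min).
    static boolean isNeededReplicationForMaintenance(int liveReplicas, int minMaintenanceReplication) {
        return liveReplicas < minMaintenanceReplication;
    }

    // Simplified stand-in (assumption): the block counts as sufficiently
    // replicated only when live replicas reach the expected replication.
    static boolean isSufficientlyReplicated(int liveReplicas, int expectedReplication) {
        return liveReplicas >= expectedReplication;
    }

    // Hypothetical helper: in-service nodes that do not yet hold a replica
    // and could therefore accept a new one.
    static int placementCandidates(int inServiceNodes, int liveReplicas) {
        return Math.max(0, inServiceNodes - liveReplicas);
    }

    // The node stays in Entering Maintenance when the block is judged
    // under-replicated but no node is left to receive another replica.
    static boolean stallsMaintenance(int inServiceNodes, int liveReplicas, int expectedReplication) {
        return !isSufficientlyReplicated(liveReplicas, expectedReplication)
                && placementCandidates(inServiceNodes, liveReplicas) == 0;
    }

    public static void main(String[] args) {
        int inServiceNodes = 4;            // 5 DataNodes, one entering maintenance
        int minMaintenanceReplication = 1; // value used in the report

        // Job file block: expected replication 10, one replica on every in-service node.
        System.out.println(isNeededReplicationForMaintenance(4, minMaintenanceReplication));
        System.out.println(isSufficientlyReplicated(4, 10));
        System.out.println(stallsMaintenance(inServiceNodes, 4, 10));

        // Ordinary block with expected replication 3 and 2 live replicas:
        // two in-service nodes are still free, so re-replication can proceed.
        System.out.println(stallsMaintenance(inServiceNodes, 2, 3));
    }
}
```

For the job file block the sketch prints false, false, true: the maintenance minimum is met, the block is still judged under-replicated, and no node is left to take a new replica. For the block with expected replication 3 it prints false, because two in-service nodes can still accept a replica.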