[ https://issues.apache.org/jira/browse/HDFS-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150485#comment-15150485 ]
Oleg Danilov commented on HDFS-5970:
------------------------------------

We just "reproduced" this issue accidentally using Hadoop 2.3.0:

...
2016-02-16 11:21:37,217 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.5.68.40:1004
2016-02-16 11:21:37,217 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* removeDeadDatanode: lost heartbeat from 10.5.68.45:1004
2016-02-16 11:21:37,217 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.5.68.45:1004
2016-02-16 11:21:37,218 FATAL org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception.
java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:507)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:455)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:278)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:212)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:117)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3309)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3277)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1283)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1190)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3250)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3204)
        at java.lang.Thread.run(Thread.java:745)
2016-02-16 11:21:37,246 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-02-16 11:21:37,260 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

Unfortunately, this causes the NameNode to shut down.

> callers of NetworkTopology's chooseRandom method to expect null return value
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-5970
>                 URL: https://issues.apache.org/jira/browse/HDFS-5970
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.0.0
>            Reporter: Yongjun Zhang
>            Priority: Minor
>
> Class NetworkTopology's method
>    public Node chooseRandom(String scope)
> calls
>    private Node chooseRandom(String scope, String excludedScope)
> which may return null value.
> Callers of this method such as BlockPlacementPolicyDefault etc need to be
> aware that.
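
To illustrate the kind of defensive check the issue description asks for, here is a minimal, self-contained sketch. It does not use Hadoop's classes: `ToyTopology`, its `chooseRandom(String)`, and the caller `chooseTarget` are hypothetical stand-ins invented for this example. The only behavior they share with the report is that the lookup may return null once no node is left in the scope. It is not the actual fix for HDFS-5970.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;

public class ChooseRandomNullCheck {

    /** Hypothetical stand-in for a topology; NOT Hadoop's NetworkTopology. */
    static class ToyTopology {
        private final List<String> nodes = new ArrayList<>();
        private final Random random = new Random();

        void add(String path) { nodes.add(path); }

        void remove(String path) { nodes.remove(path); }

        /** Returns a random node under the given scope, or null if none is left. */
        String chooseRandom(String scope) {
            List<String> candidates = new ArrayList<>();
            for (String node : nodes) {
                if (node.startsWith(scope + "/")) {
                    candidates.add(node);
                }
            }
            if (candidates.isEmpty()) {
                return null; // the case callers must be prepared for
            }
            return candidates.get(random.nextInt(candidates.size()));
        }
    }

    /** Hypothetical caller: wraps the nullable result instead of dereferencing it. */
    static Optional<String> chooseTarget(ToyTopology topology, String scope) {
        String chosen = topology.chooseRandom(scope);
        return Optional.ofNullable(chosen);
    }

    public static void main(String[] args) {
        ToyTopology topology = new ToyTopology();
        topology.add("/default-rack/10.5.68.40:1004");
        topology.add("/default-rack/10.5.68.45:1004");
        System.out.println(chooseTarget(topology, "/default-rack").isPresent());

        // Remove every node in the scope, as when datanodes lose their heartbeat.
        topology.remove("/default-rack/10.5.68.40:1004");
        topology.remove("/default-rack/10.5.68.45:1004");
        System.out.println(chooseTarget(topology, "/default-rack").isPresent());
    }
}
```

With a check like this, a caller can skip the block or retry later when no target is available, rather than letting a NullPointerException propagate out of the calling thread.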