[
https://issues.apache.org/jira/browse/HDFS-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150485#comment-15150485
]
Oleg Danilov commented on HDFS-5970:
------------------------------------
We just "reproduced" this issue accidentally using Hadoop 2.3.0:
...
2016-02-16 11:21:37,217 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.5.68.40:1004
2016-02-16 11:21:37,217 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* removeDeadDatanode: lost heartbeat from 10.5.68.45:1004
2016-02-16 11:21:37,217 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.5.68.45:1004
2016-02-16 11:21:37,218 FATAL org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception.
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:507)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:455)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:278)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:212)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:117)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3309)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3277)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1283)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1190)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3250)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3204)
    at java.lang.Thread.run(Thread.java:745)
2016-02-16 11:21:37,246 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-02-16 11:21:37,260 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
Unfortunately, the exception in the ReplicationMonitor thread causes the NameNode to shut down.
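For illustration only, here is a minimal sketch of the kind of null handling the issue asks callers to do. It does not use the Hadoop classes: ToyTopology, NullAwareCaller and their methods are hypothetical stand-ins, and the only thing taken from the issue is the contract that chooseRandom may return null when no node is available (for example, after the last datanode in scope has been removed).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;

/** Hypothetical stand-in for a topology; NOT Hadoop's NetworkTopology. */
final class ToyTopology {
    private final List<String> nodes = new ArrayList<>();
    private final Random random = new Random();

    void add(String node) {
        nodes.add(node);
    }

    void remove(String node) {
        nodes.remove(node);
    }

    /** Returns a random node, or null when no node is left (the contract described in the issue). */
    String chooseRandom() {
        if (nodes.isEmpty()) {
            return null;
        }
        return nodes.get(random.nextInt(nodes.size()));
    }
}

/** Hypothetical caller that handles the null case instead of dereferencing it. */
final class NullAwareCaller {
    /** Wraps the nullable result so the "no node" case has to be handled explicitly. */
    static Optional<String> chooseTarget(ToyTopology topology) {
        return Optional.ofNullable(topology.chooseRandom());
    }

    public static void main(String[] args) {
        ToyTopology topology = new ToyTopology();
        topology.add("/default-rack/10.5.68.40:1004");
        System.out.println(chooseTarget(topology).orElse("no target available"));

        // Simulate the node being removed right before a target is chosen.
        topology.remove("/default-rack/10.5.68.40:1004");
        System.out.println(chooseTarget(topology).orElse("no target available"));
    }
}
```

With a check like this, an empty topology leads to a "no target" result that the caller can skip or retry, rather than an uncaught NullPointerException in a monitor thread.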
> callers of NetworkTopology's chooseRandom method to expect null return value
> ----------------------------------------------------------------------------
>
> Key: HDFS-5970
> URL: https://issues.apache.org/jira/browse/HDFS-5970
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 3.0.0
> Reporter: Yongjun Zhang
> Priority: Minor
>
> Class NetworkTopology's method
> public Node chooseRandom(String scope)
> calls
> private Node chooseRandom(String scope, String excludedScope)
> which may return a null value.
> Callers of this method, such as BlockPlacementPolicyDefault, need to be
> aware of that.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)