[ https://issues.apache.org/jira/browse/HADOOP-4840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657242#action_12657242 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-4840:
------------------------------------------------
FSNamesystem.countNodes(..) is called in many places including:
- FSNamesystem.addStoredBlock(Block, DatanodeDescriptor, DatanodeDescriptor)
- FSNamesystem.checkReplicationFactor(INodeFile)
- FSNamesystem.decrementSafeBlockCount(Block)
- FSNamesystem.getBlockLocationsInternal(String, INodeFile, long, long, int, boolean)
- FSNamesystem.invalidateBlock(Block, DatanodeInfo)
- FSNamesystem.isReplicationInProgress(DatanodeDescriptor)
- FSNamesystem.markBlockAsCorrupt(Block, DatanodeInfo)
- FSNamesystem.processMisReplicatedBlocks()
- FSNamesystem.processPendingReplications()
- FSNamesystem.updateNeededReplications(Block, int, int)
However, some of them, e.g. getBlockLocationsInternal, call countNodes(..)
without holding the FSNamesystem lock. This may cause an NPE at
runtime.
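
To illustrate the locking concern, here is a minimal sketch. It is not the
FSNamesystem code and not the attached patch. The class NamesystemSketch, the
map blockToNodes, and the methods addReplica, removeBlock and getReplicaCount
are all hypothetical names. The sketch assumes the counting method must only
run while the caller holds the object's monitor, and it checks that with
Thread.holdsLock so an unlocked call fails fast instead of racing with a
concurrent removal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a namesystem guarded by a single lock (its own monitor).
class NamesystemSketch {
    // Hypothetical block-to-datanode map; every access is meant to happen under the lock.
    private final Map<String, List<String>> blockToNodes = new HashMap<>();

    synchronized void addReplica(String blockId, String node) {
        blockToNodes.computeIfAbsent(blockId, k -> new ArrayList<>()).add(node);
    }

    synchronized void removeBlock(String blockId) {
        blockToNodes.remove(blockId);
    }

    // Precondition (assumed for this sketch): the caller already holds the lock.
    int countNodes(String blockId) {
        if (!Thread.holdsLock(this)) {
            throw new IllegalStateException("caller must hold the namesystem lock");
        }
        List<String> nodes = blockToNodes.get(blockId);
        // The block may already have been removed, so guard against null.
        return nodes == null ? 0 : nodes.size();
    }

    // A caller that takes the lock before counting, so the lookup and the
    // count cannot interleave with removeBlock on another thread.
    int getReplicaCount(String blockId) {
        synchronized (this) {
            return countNodes(blockId);
        }
    }
}
```

With this shape, a call path that forgets to take the lock is reported
immediately by the precondition check rather than showing up as an
intermittent NPE in a test run.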
> TestNodeCount sometimes fails with NullPointerException
> -------------------------------------------------------
>
> Key: HADOOP-4840
> URL: https://issues.apache.org/jira/browse/HADOOP-4840
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.3
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.18.3
>
> Attachments: nodeCountNPE.patch, nodeCountNPE1.patch
>
>
> Testcase: testNodeCount took 9.628 sec
> Caused an ERROR
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.countNodes(FSNamesystem.java:3523)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.countNodes(FSNamesystem.java:3543)
> at org.apache.hadoop.hdfs.server.namenode.TestNodeCount.testNodeCount(TestNodeCount.java:64)