[ https://issues.apache.org/jira/browse/HADOOP-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678873#action_12678873 ]
Hairong Kuang commented on HADOOP-5399:
---------------------------------------
It turns out this bug is caused by HADOOP-5384. Simulated datanodes send block
reports to the NN that contain a block with an invalid generation stamp,
GenerationStamp.WILDCARD_STAMP. The NN finds that the block does not belong to
any file and marks it as invalid. ReplicationMonitor then schedules the block
for deletion on its datanode by adding it to the invalidateSet of its
DatanodeDescriptor, which is a TreeSet. Adding the block to the invalidateSet
triggers a call to Block#compareTo, which throws IllegalStateException on a
wildcard generation stamp. ReplicationMonitor calls System.exit to shut down
the NN when it catches a RuntimeException, so the NN crashes.

A simple solution to the problem is for block report processing to filter out
blocks with a wildcard generation stamp.
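
The failure path and the proposed filter can be illustrated with a minimal,
self-contained sketch. This is not the actual HDFS code: the Block class here
is a simplified stand-in, filterReport and addThrows are hypothetical helper
names, and the WILDCARD_STAMP value of 1 is taken from the log message quoted
in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class WildcardStampSketch {
    // Assumption: mirrors GenerationStamp.WILDCARD_STAMP; the quoted log shows "(=1)".
    static final long WILDCARD_STAMP = 1;

    /** Simplified stand-in for Block: ordered by id, then by generation stamp. */
    static final class Block implements Comparable<Block> {
        final long blockId;
        final long generationStamp;

        Block(long blockId, long generationStamp) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
        }

        @Override
        public int compareTo(Block other) {
            // Refuse to order blocks that carry the wildcard stamp.
            if (generationStamp == WILDCARD_STAMP
                    || other.generationStamp == WILDCARD_STAMP) {
                throw new IllegalStateException(
                    "generationStamp == GenerationStamp.WILDCARD_STAMP");
            }
            int c = Long.compare(blockId, other.blockId);
            return c != 0 ? c : Long.compare(generationStamp, other.generationStamp);
        }
    }

    /** Hypothetical filter: drop reported blocks that carry the wildcard stamp. */
    static List<Block> filterReport(List<Block> reported) {
        List<Block> kept = new ArrayList<>();
        for (Block b : reported) {
            if (b.generationStamp != WILDCARD_STAMP) {
                kept.add(b);
            }
        }
        return kept;
    }

    /** Returns true if adding the block to the sorted set throws IllegalStateException. */
    static boolean addThrows(TreeSet<Block> set, Block b) {
        try {
            set.add(b); // TreeSet calls compareTo to find the insert position
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Block valid = new Block(447, 1000);
        Block wildcard = new Block(448, WILDCARD_STAMP);

        // Stand-in for the invalidateSet, which the comment says is a TreeSet.
        TreeSet<Block> invalidateSet = new TreeSet<>();
        invalidateSet.add(valid);
        System.out.println("unfiltered add throws: " + addThrows(invalidateSet, wildcard));

        // With the filter applied first, only the valid block is left to process.
        List<Block> kept = filterReport(Arrays.asList(valid, wildcard));
        System.out.println("blocks kept after filtering: " + kept.size());
    }
}
```

In the sketch the exception is caught locally; in the scenario described
above it reaches ReplicationMonitor as a RuntimeException, which is what leads
to the shutdown. Filtering at report time keeps such blocks out of any sorted
set in the first place.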
> Simulated datanodes crashes NameNode
> ------------------------------------
>
> Key: HADOOP-5399
> URL: https://issues.apache.org/jira/browse/HADOOP-5399
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Reporter: Hairong Kuang
> Fix For: 0.21.0
>
>
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_448_1 on XX size 10 does not belong to any file.
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_448 is added to invalidSet of XX
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_447_1 on XX size 10 does not belong to any file.
> INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_447 is added to invalidSet of XX
> WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicationMonitor thread received Runtime exception. java.lang.IllegalStateException: generationStamp (=1) == GenerationStamp.WILDCARD_STAMP
> INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at YY
> ************************************************************/