[
https://issues.apache.org/jira/browse/HDFS-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946763#comment-13946763
]
Kihwal Lee commented on HDFS-6148:
----------------------------------
This happens when only the NN restarts and an incremental block report is
received after the node registers but before its storage is added, i.e. a
queued incremental block report is processed before the first heartbeat. In
this case, {{BlockInfoUnderConstruction#addReplicaIfNotPresent()}} is called
from {{addStoredBlockUnderConstruction()}} with a null {{StorageInfo}}, because
{{node.getStorageInfo(storageID)}} returns null for a storage that has not been
added yet. As a result, the {{BlockInfoUnderConstruction}} ends up with one
{{ReplicaUnderConstruction}} whose expectedLocation is null. This is apparent
from the log message emitted while processing such an incremental block
report.
{noformat}
WARN BlockStateChange: BLOCK* addStoredBlock: Redundant addStoredBlock request received for blk_1089713407_xxxxx{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[null|FINALIZED]]} on 1.2.3.4:1004 size 0
{noformat}
After this, fsck will fail with an NPE, and the LeaseManager will also crash
with an NPE.
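The sequence above can be reproduced in miniature. The following is only a
sketch under stated assumptions: every class and method name in it
(NullStorageSketch, Storage, Replica, Node, BlockUC, addReplicaUnchecked,
addReplicaGuarded) is hypothetical and merely mimics the shape of the code
paths named above. It is not the Hadoop source, and the null check shown is
one possible guard, not necessarily the fix for this issue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for the NN-side structures; not Hadoop classes.
class NullStorageSketch {

    static final class Storage {
        final String id;
        boolean alive = true;
        Storage(String id) { this.id = id; }
    }

    // Mimics a replica that remembers the storage it is expected on.
    static final class Replica {
        final Storage expectedLocation;
        Replica(Storage expectedLocation) { this.expectedLocation = expectedLocation; }
        // Dereferences expectedLocation, so a null location throws an NPE here.
        boolean isAlive() { return expectedLocation.alive; }
    }

    // Mimics a registered datanode whose storages arrive with the first heartbeat.
    static final class Node {
        final Map<String, Storage> storages = new HashMap<>();
        Storage getStorageInfo(String storageID) { return storages.get(storageID); }
    }

    // Mimics an under-construction block tracking its replicas.
    static final class BlockUC {
        final List<Replica> replicas = new ArrayList<>();

        // Records whatever it is given, including a null storage.
        void addReplicaUnchecked(Storage storage) {
            replicas.add(new Replica(storage));
        }

        // Assumed guard: refuse to record a replica for an unknown storage.
        boolean addReplicaGuarded(Storage storage) {
            if (storage == null) {
                return false;
            }
            replicas.add(new Replica(storage));
            return true;
        }

        // Walks the replicas the way a recovery or fsck pass might.
        int countAlive() {
            int n = 0;
            for (Replica r : replicas) {
                if (r.isAlive()) {
                    n++;
                }
            }
            return n;
        }
    }

    public static void main(String[] args) {
        Node node = new Node();                        // registered, no storage added yet
        Storage missing = node.getStorageInfo("DS-1"); // null: storage not known

        BlockUC unchecked = new BlockUC();
        unchecked.addReplicaUnchecked(missing);        // replica with null location
        try {
            unchecked.countAlive();
        } catch (NullPointerException e) {
            System.out.println("unchecked path: NPE in isAlive()");
        }

        BlockUC guarded = new BlockUC();
        boolean added = guarded.addReplicaGuarded(missing);
        System.out.println("guarded path: added=" + added
            + ", alive=" + guarded.countAlive());
    }
}
```

In the sketch, the failure surfaces later than its cause: the bad replica is
recorded during block report processing, but the NPE is thrown only when
something iterates the replicas and calls isAlive().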
> LeaseManager crashes while initiating block recovery
> ----------------------------------------------------
>
> Key: HDFS-6148
> URL: https://issues.apache.org/jira/browse/HDFS-6148
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.3.0
> Reporter: Kihwal Lee
> Priority: Blocker
>
> While running branch-2.4, the LeaseManager crashed with an NPE. This does not
> always happen on block recovery.
> {panel}
> Exception in thread "org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@5d66b728" java.lang.NullPointerException
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction$ReplicaUnderConstruction.isAlive(BlockInfoUnderConstruction.java:121)
> at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.initializeBlockRecovery(BlockInfoUnderConstruction.java:286)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3746)
> at org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:474)
> at org.apache.hadoop.hdfs.server.namenode.LeaseManager.access$900(LeaseManager.java:68)
> at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:411)
> at java.lang.Thread.run(Thread.java:722)
> {panel}