[
https://issues.apache.org/jira/browse/HDFS-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoqiao He reopened HDFS-14498:
--------------------------------
> LeaseManager can loop forever on the file for which create has failed
> ----------------------------------------------------------------------
>
> Key: HDFS-14498
> URL: https://issues.apache.org/jira/browse/HDFS-14498
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.9.0
> Reporter: Sergey Shelukhin
> Assignee: Stephen O'Donnell
> Priority: Major
> Fix For: 3.2.2, 2.10.1, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HDFS-14498.001.patch, HDFS-14498.002.patch
>
>
> The logs from the file creation are long gone because of the endless lease
> logging, but the create presumably failed; the client that was trying to
> write this file is definitely long dead.
> The version includes HDFS-4882.
> We get this log pattern repeating infinitely:
> {noformat}
> 2019-05-16 14:00:16,893 INFO
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f]
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease. Holder:
> DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1] has expired hard
> limit
> 2019-05-16 14:00:16,893 INFO
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f]
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering [Lease.
> Holder: DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1], src=<snip>
> 2019-05-16 14:00:16,893 WARN
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f]
> org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.internalReleaseLease:
> Failed to release lease for file <snip>. Committed blocks are waiting to be
> minimally replicated. Try again later.
> 2019-05-16 14:00:16,893 WARN
> [org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@b27557f]
> org.apache.hadoop.hdfs.server.namenode.LeaseManager: Cannot release the path
> <snip> in the lease [Lease. Holder: DFSClient_NONMAPREDUCE_-20898906_61,
> pending creates: 1]. It will be retried.
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR*
> NameSystem.internalReleaseLease: Failed to release lease for file <snip>.
> Committed blocks are waiting to be minimally replicated. Try again later.
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3357)
> at
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:573)
> at
> org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:509)
> at java.lang.Thread.run(Thread.java:745)
> $ grep -c "Recovering.*DFSClient_NONMAPREDUCE_-20898906_61, pending creates: 1" hdfs_nn*
> hdfs_nn.log:1068035
> hdfs_nn.log.2019-05-16-14:1516179
> hdfs_nn.log.2019-05-16-15:1538350
> {noformat}
> Aside from an actual bug fix, it might make sense to make LeaseManager log
> less, in case there are more bugs like this.
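
One way the logging suggestion above might look is a small per-key throttle that the monitor consults before writing a message. This is only a minimal sketch, not the change made for HDFS-14498: the class name ThrottledLog, the method tryLog, and the choice of the lease holder string as the key are all hypothetical and are not part of Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not a Hadoop class): allows at most one log line per
 * key within a minimum interval and counts what was suppressed in between.
 */
final class ThrottledLog {
    private final long minIntervalMillis;
    // Timestamp of the last message actually emitted for each key.
    private final Map<String, Long> lastLogged = new HashMap<>();
    // Number of messages skipped for each key since that timestamp.
    private final Map<String, Long> suppressed = new HashMap<>();

    ThrottledLog(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Returns -1 if the caller should stay quiet. Otherwise returns the
     * number of messages suppressed for this key since the last emitted one
     * (0 or more), which the caller can append to its log line.
     */
    synchronized long tryLog(String key, long nowMillis) {
        Long last = lastLogged.get(key);
        if (last != null && nowMillis - last < minIntervalMillis) {
            suppressed.merge(key, 1L, Long::sum);
            return -1;
        }
        lastLogged.put(key, nowMillis);
        Long skipped = suppressed.remove(key);
        return skipped == null ? 0 : skipped;
    }
}
```

Under these assumptions, a caller would pass the lease holder (for example DFSClient_NONMAPREDUCE_-20898906_61) as the key and a millisecond clock reading as nowMillis, log only when the result is not -1, and include the returned count so the repeated failures remain visible without a line per retry.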