[ https://issues.apache.org/jira/browse/HDFS-15292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129676#comment-17129676 ]
Ayush Saxena commented on HDFS-15292:
-------------------------------------

It is already mentioned in the code that such a situation can lead to an infinite loop in the lease manager.
{code:java}
// Cannot close file right now, since some blocks
// are not yet minimally replicated.
// This may potentially cause infinite loop in lease recovery
// if there are no valid replicas on data-nodes.
String message = "DIR* NameSystem.internalReleaseLease: " +
    "Failed to release lease for file " + src +
    ". Committed blocks are waiting to be mi
{code}
If this is a frequent occurrence, you shouldn't allow files to close with committed blocks in the first place: dfs.namenode.file.close.num-committed-allowed shouldn't be set.

> Infinite loop in Lease Manager due to replica is missing in dn
> --------------------------------------------------------------
>
>                 Key: HDFS-15292
>                 URL: https://issues.apache.org/jira/browse/HDFS-15292
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>   Affects Versions: 3.1.3
>            Reporter: Aaron Guo
>            Priority: Major
>
> In our production environment, we found that the number of files under
> construction keeps growing, and the lease manager is trying to release the
> lease in an infinite loop:
> {code:java}
> 2020-04-18 23:10:57,816 WARN namenode.LeaseManager (LeaseManager.java:checkLeases(589)) - Cannot release the path /user/hadoop/myTestFile.txt in the lease [Lease. Holder: go-hdfs-7VVGF3sGvHZcsZZC, pending creates: 1]. It will be retried.
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: DIR* NameSystem.internalReleaseLease: Failed to release lease for file /user/hadoop/myTestFile.txt. Committed blocks are waiting to be minimally replicated. Try again later.
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3391)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:586)
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:524)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> This is because the last block of this file cannot meet the minimum
> required replication of 1, so an AlreadyBeingCreatedException is thrown and
> the release is retried forever.
> This infinite loop also causes another issue: the lease manager always
> tries to release the first lease before going on to the next one, so no
> lease is ever released.
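
The starvation described in the quoted report (one lease that can never be released blocks every lease behind it) can be illustrated with a small, self-contained sketch. This is not the HDFS LeaseManager code: the class LeaseStarvationSketch, its Lease type, the checkLeases signature, the maxAttempts bound and the path /user/hadoop/other.txt are all hypothetical, chosen only to show the retry-the-head pattern.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class LeaseStarvationSketch {

    /** Hypothetical stand-in for a lease; NOT the HDFS Lease class. */
    static final class Lease {
        final String path;
        final boolean lastBlockHasReplica;

        Lease(String path, boolean lastBlockHasReplica) {
            this.path = path;
            this.lastBlockHasReplica = lastBlockHasReplica;
        }
    }

    /**
     * Simplified model of "always retry the first expired lease".
     * Returns the paths whose leases were released within maxAttempts tries.
     * maxAttempts is an assumption of this sketch so that the loop terminates.
     */
    static List<String> checkLeases(Deque<Lease> expired, int maxAttempts) {
        List<String> released = new ArrayList<>();
        int attempts = 0;
        while (!expired.isEmpty() && attempts < maxAttempts) {
            attempts++;
            Lease head = expired.peekFirst();
            if (head.lastBlockHasReplica) {
                // Release succeeded: drop the lease and move on to the next one.
                expired.pollFirst();
                released.add(head.path);
            }
            // Otherwise the release failed; the lease stays at the head
            // and is retried on the next iteration.
        }
        return released;
    }

    public static void main(String[] args) {
        Deque<Lease> expired = new ArrayDeque<>();
        // The last block of this file has no valid replica, so it can never be released.
        expired.add(new Lease("/user/hadoop/myTestFile.txt", false));
        // Hypothetical second file that could be released immediately.
        expired.add(new Lease("/user/hadoop/other.txt", true));

        List<String> released = checkLeases(expired, 1000);
        // prints: released=[], still pending=2
        System.out.println("released=" + released + ", still pending=" + expired.size());
    }
}
```

In this model the second lease is never reached even though it could be released at once, which matches the symptom in the report. A real fix would have to decide what to do with the stuck lease, and this sketch does not attempt that.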