snodawn created HDFS-15544:
------------------------------
Summary: Standby namenode EditLogTailerThread shouldn't acquire a
lock interruptibly when tailing edits
Key: HDFS-15544
URL: https://issues.apache.org/jira/browse/HDFS-15544
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 3.3.0
Reporter: snodawn
In my experience, the active namenode sometimes holds the write lock for a
long time in rollEditLog:
{code:java}
Longest write-lock held at 2020-08-27 12:59:30,773+0800 for 66067ms via
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:283)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:258)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1610)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4667)
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292)
org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146){code}
This happens because the standby namenode may not call triggerActiveLogRoll()
at the interval set in dfs.ha.log-roll.period after its last checkpoint, which
can leave the active namenode with a very large editlog to roll.
When tailing edits, the standby namenode's EditLogTailerThread acquires the
same lock that the checkpoint thread holds, but the checkpoint thread may take
a lot of time to save the fsimage file (4 minutes in my experience), so
triggerActiveLogRoll() in EditLogTailerThread is not called at the interval
set in dfs.ha.log-roll.period.
I propose that EditLogTailerThread shouldn't acquire the lock with
cpLockInterruptibly(); tryLock() is enough.
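
To illustrate the proposal, here is a minimal standalone sketch of the
non-blocking pattern. It is not the actual Hadoop code: TailerLockSketch,
tailOnce() and demo() are hypothetical names, and a plain
java.util.concurrent ReentrantLock stands in for the checkpoint lock. The point
is only that a tailer iteration using tryLock() returns immediately while a
checkpoint holds the lock, so the loop stays free to do other periodic work
such as triggering a log roll.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class TailerLockSketch {
    // Hypothetical stand-in for the checkpoint lock shared by the
    // checkpoint thread and the edit log tailer thread.
    private final ReentrantLock cpLock = new ReentrantLock();
    private int editsTailed = 0;

    /** One tailer iteration: true if edits were tailed, false if skipped. */
    public boolean tailOnce() {
        // tryLock() never blocks: if a checkpoint holds the lock, skip this
        // round and let the caller retry on its next iteration.
        if (!cpLock.tryLock()) {
            return false;
        }
        try {
            editsTailed++; // placeholder for the real tailing work
            return true;
        } finally {
            cpLock.unlock();
        }
    }

    public int getEditsTailed() {
        return editsTailed;
    }

    /** Returns {tailed during checkpoint, tailed after checkpoint}. */
    public static boolean[] demo() {
        TailerLockSketch s = new TailerLockSketch();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        // Simulated checkpoint thread: holds the lock until told to release.
        Thread checkpointer = new Thread(() -> {
            s.cpLock.lock();
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                s.cpLock.unlock();
            }
        });
        checkpointer.start();
        try {
            held.await();                           // checkpoint now owns the lock
            boolean duringCheckpoint = s.tailOnce(); // returns at once, no waiting
            release.countDown();
            checkpointer.join();                    // checkpoint finished
            boolean afterCheckpoint = s.tailOnce();
            return new boolean[] {duringCheckpoint, afterCheckpoint};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("tailed during checkpoint: " + r[0]);
        System.out.println("tailed after checkpoint:  " + r[1]);
    }
}
```

The trade-off in this sketch is that a skipped iteration tails no edits, so the
standby lags until the checkpoint finishes. That is no worse than blocking on
the lock for the same duration, and the thread is not stuck in the meantime.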