[
https://issues.apache.org/jira/browse/HDFS-15217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245749#comment-17245749
]
Toshihiro Suzuki commented on HDFS-15217:
-----------------------------------------
{quote}
The ops information can be retrieved from the AuditLog. Isn't the AuditLog
enough to see the ops?
Was there a concern regarding a possible deadLock? Then, why not using debug
instead of adding that overhead to hot production code?
{quote}
I don't think the AuditLog is enough to identify the ops when there is a huge
number of them. As mentioned in the Description, I ran into an issue where the
write lock was held for a long time, but I was not able to identify which
operation was responsible from the NN log and the AuditLog. That's why I
thought we needed this change and added more information to the longest
write/read lock held log.
> Add more information to longest write/read lock held log
> --------------------------------------------------------
>
> Key: HDFS-15217
> URL: https://issues.apache.org/jira/browse/HDFS-15217
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Toshihiro Suzuki
> Assignee: Toshihiro Suzuki
> Priority: Major
> Fix For: 3.4.0
>
>
> Currently, we can see the stack trace in the longest write/read lock held
> log, but sometimes we need more information, for example, a target path of
> deletion:
> {code:java}
> 2020-03-10 21:51:21,116 [main] INFO namenode.FSNamesystem
> (FSNamesystemLock.java:writeUnlock(276)) - Number of suppressed
> write-lock reports: 0
> Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
> ...
> {code}
> Adding more information (opName, path, etc.) to the log is useful for
> troubleshooting.
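
To illustrate the idea of attaching the operation name and target path to a lock-held report, here is a minimal sketch. It is not the actual HDFS-15217 patch or the FSNamesystemLock code: the class `LockHoldTracker`, its method signatures, the threshold parameter, and the report format are all hypothetical names chosen for this example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch only; not the real FSNamesystemLock implementation. */
class LockHoldTracker {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private final long thresholdMs;
  // Written and read only while the write lock is held.
  private long acquiredAtNanos;

  LockHoldTracker(long thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  void writeLock() {
    lock.writeLock().lock();
    // Start timing only on the outermost acquisition of a reentrant lock.
    if (lock.getWriteHoldCount() == 1) {
      acquiredAtNanos = System.nanoTime();
    }
  }

  /**
   * Releases the write lock. Returns the report line when the outermost
   * hold lasted at least thresholdMs, otherwise null.
   * opName and path are supplied by the caller (assumed parameters).
   */
  String writeUnlock(String opName, String path) {
    boolean outermost = lock.getWriteHoldCount() == 1;
    long heldMs =
        TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - acquiredAtNanos);
    lock.writeLock().unlock();
    if (!outermost || heldMs < thresholdMs) {
      return null;
    }
    // Format the message after releasing the lock, so that building the
    // report does not lengthen the critical section.
    String report = String.format(
        "Write-lock held for %dms by op=%s path=%s", heldMs, opName, path);
    System.out.println(report);
    return report;
  }
}
```

A caller would wrap an operation as `tracker.writeLock(); try { ... } finally { tracker.writeUnlock("delete", src); }`, so that a slow hold is reported together with the operation and path that caused it, without needing to correlate the stack trace against the AuditLog.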