[
https://issues.apache.org/jira/browse/HDFS-13946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16727221#comment-16727221
]
Yiqun Lin commented on HDFS-13946:
----------------------------------
Thanks for the review [~elgoiri]. Attach the same patch of v07 patch to
re-trigger building.
> Log longest FSN write/read lock held stack trace
> ------------------------------------------------
>
> Key: HDFS-13946
> URL: https://issues.apache.org/jira/browse/HDFS-13946
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.1.1
> Reporter: Yiqun Lin
> Assignee: Yiqun Lin
> Priority: Minor
> Attachments: HDFS-13946.001.patch, HDFS-13946.002.patch,
> HDFS-13946.003.patch, HDFS-13946.004.patch, HDFS-13946.005.patch,
> HDFS-13946.006.patch, HDFS-13946.007.patch, HDFS-13946.008.patch
>
>
> FSN write/read lock log statement only prints longest lock held interval not
> its stack trace during suppress warning interval. Only current thread is
> printed, but it looks not so useful. Once NN is slowing down, the most
> important thing we take care is that which operation holds longest time of
> the lock.
> Following is log printed based on current logic.
> {noformat}
> 2018-09-30 13:56:06,700 INFO [IPC Server handler 119 on 8020]
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock
> held for 11 ms via
> java.lang.Thread.getStackTrace(Thread.java:1589)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1688)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4281)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4247)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4183)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4167)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:848)org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
> java.security.AccessController.doPrivileged(Native Method)
> javax.security.auth.Subject.doAs(Subject.java:415)
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)
> Number of suppressed write-lock reports: 14
> Longest write-lock held interval: 70
> {noformat}
> Also it will be good for the trouble shooting.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]