Akira AJISAKA created YARN-4563:
-----------------------------------

             Summary: ContainerMetrics deadlocks
                 Key: YARN-4563
                 URL: https://issues.apache.org/jira/browse/YARN-4563
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
         Environment: HDP 2.3.2 (Hadoop 2.7.1 + patches)
            Reporter: Akira AJISAKA
            Priority: Blocker

In one of our environments, the webapps of some NodeManagers are not working. I found a deadlock in the stack trace.

{noformat}
Found one Java-level deadlock:
=============================
"1193752357@qtp-907815246-22238":
  waiting to lock monitor 0x0000000005e20a18 (object 0x00000000f6afa048, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
  which is held by "2107307914@qtp-907815246-19994"
"2107307914@qtp-907815246-19994":
  waiting to lock monitor 0x0000000001a000a8 (object 0x00000000d4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
  which is held by "Timer for 'NodeManager' metrics system"
"Timer for 'NodeManager' metrics system":
  waiting to lock monitor 0x00000000027ade88 (object 0x00000000f6582df0, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
  which is held by "1530638165@qtp-907815246-19992"
"1530638165@qtp-907815246-19992":
  waiting to lock monitor 0x0000000001a000a8 (object 0x00000000d4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
  which is held by "Timer for 'NodeManager' metrics system"
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
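
For readers unfamiliar with this failure mode: in the dump, the metrics timer thread holds the MetricsSystemImpl monitor while waiting for a ContainerMetrics monitor, and a qtp thread holds that ContainerMetrics monitor while waiting for MetricsSystemImpl. The sketch below reproduces that lock-order cycle in isolation. It is not the Hadoop code: the class `LockOrderDemo`, its methods, the thread names, and the two plain `Object` locks are hypothetical stand-ins, and the cycle is forced with a latch only so that it forms reliably.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderDemo {

    // Starts a daemon thread that takes `first`, waits until both threads
    // hold their first lock, then tries to take `second`.
    static void lockInOrder(String name, Object first, Object second,
                            CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Not reached: the other thread holds `second`
                    // and is waiting for `first`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit although the thread is stuck
        t.start();
    }

    // Provokes the cycle and returns the ids of the deadlocked threads,
    // or null if none were detected within roughly five seconds.
    static long[] provokeAndDetect() throws InterruptedException {
        // Hypothetical stand-ins for the two monitors named in the dump.
        Object containerMetrics = new Object();
        Object metricsSystem = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Opposite acquisition orders: this is what creates the cycle.
        lockInOrder("webapp-like", containerMetrics, metricsSystem, bothHoldFirst);
        lockInOrder("timer-like", metricsSystem, containerMetrics, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids;
            }
            Thread.sleep(50);
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = provokeAndDetect();
        System.out.println(ids == null
                ? "no deadlock detected"
                : ids.length + " threads deadlocked");
    }
}
```

`ThreadMXBean.findDeadlockedThreads()` is a programmatic counterpart to the "Found one Java-level deadlock" section of a thread dump. The usual remedies are to acquire the two monitors in one consistent order everywhere, or to avoid calling into the other object while holding a monitor; which of these fits ContainerMetrics is not established by this report.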