[jira] [Updated] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-4563:

Description: 
In one of our environments, the web app of some NodeManagers is not working. I 
found a deadlock in the thread dump.
{noformat}
Found one Java-level deadlock:
=============================
"1193752357@qtp-907815246-22238":
  waiting to lock monitor 0x05e20a18 (object 0xf6afa048, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
  which is held by "2107307914@qtp-907815246-19994"
"2107307914@qtp-907815246-19994":
  waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
  which is held by "Timer for 'NodeManager' metrics system"
"Timer for 'NodeManager' metrics system":
  waiting to lock monitor 0x027ade88 (object 0xf6582df0, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
  which is held by "1530638165@qtp-907815246-19992"
"1530638165@qtp-907815246-19992":
  waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
  which is held by "Timer for 'NodeManager' metrics system"
{noformat}

  was:
In one of our environments, the web app of some NodeManagers is not working. I 
found a deadlock in the stack trace.
{noformat}
Found one Java-level deadlock:
=============================
"1193752357@qtp-907815246-22238":
  waiting to lock monitor 0x05e20a18 (object 0xf6afa048, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
  which is held by "2107307914@qtp-907815246-19994"
"2107307914@qtp-907815246-19994":
  waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
  which is held by "Timer for 'NodeManager' metrics system"
"Timer for 'NodeManager' metrics system":
  waiting to lock monitor 0x027ade88 (object 0xf6582df0, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
  which is held by "1530638165@qtp-907815246-19992"
"1530638165@qtp-907815246-19992":
  waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
  which is held by "Timer for 'NodeManager' metrics system"
{noformat}


> ContainerMetrics deadlocks
> --
>
> Key: YARN-4563
> URL: https://issues.apache.org/jira/browse/YARN-4563
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Environment: HDP 2.3.2 (Hadoop 2.7.1 + patches)
> Reporter: Akira AJISAKA
> Priority: Blocker



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-4563:

Attachment: jstack.log

Attaching the stack traces of the threads involved in the deadlock.
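
For readers who want to reproduce this kind of report, here is a minimal sketch of how two threads that take two monitors in opposite order end up in a Java-level deadlock, and how the JVM can be asked to report it. It is not the NodeManager code: the class name DeadlockSketch, the lock objects containerMetrics and metricsSystem, and the thread names are hypothetical stand-ins for the objects named in the thread dump. Only the java.lang.management calls are real JDK APIs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {
    // Hypothetical stand-ins for the two monitors in the dump; not the real Hadoop objects.
    static final Object containerMetrics = new Object();
    static final Object metricsSystem = new Object();

    /** Starts two daemon threads that take the two monitors in opposite order. */
    static void provokeDeadlock() {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon("web-request", containerMetrics, metricsSystem, bothHoldFirstLock);
        startDaemon("metrics-timer", metricsSystem, containerMetrics, bothHoldFirstLock);
    }

    static void startDaemon(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first monitor
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) { // blocks forever: the other thread holds this monitor
                    System.out.println(name + " acquired both monitors");
                }
            }
        }, name);
        t.setDaemon(true); // let the JVM exit even though the threads stay blocked
        t.start();
    }

    /** Polls the JVM for monitor deadlocks; returns the number of threads found in a cycle. */
    static int countDeadlockedThreads(long timeoutMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock is found
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    System.out.println(info.getThreadName() + " waits for "
                            + info.getLockName() + " held by " + info.getLockOwnerName());
                }
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    public static void main(String[] args) throws InterruptedException {
        provokeDeadlock();
        System.out.println("deadlocked threads: " + countDeadlockedThreads(5000));
    }
}
```

Running jstack against a process in this state should print a "Found one Java-level deadlock" section of the same shape as the one in the description. The usual remedies are to acquire the monitors in one fixed order everywhere, or to avoid calling into the other component while holding a monitor; which of these applies to ContainerMetrics is not something this sketch can say.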



[jira] [Updated] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4563:

Attachment: 0001-YARN-4563.patch
