[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126383#comment-15126383 ]

ASF GitHub Bot commented on YARN-4563:
--------------------------------------

GitHub user steveloughran opened a pull request:

    https://github.com/apache/hadoop/pull/72

    YARN-4563 Attempt to document YARN security, including HADOOP_TOKEN_FILE_LOCATION propagation

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/steveloughran/hadoop HADOOP-12649-security/YARN-4653-yarn

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/72.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #72

commit 73baa11ff74201faa56ce1fc18941bdad43263fe
Author: Steve Loughran
Date:   2016-01-28T20:04:14Z

    YARN-4653 document YARN security: first pass

commit 778f623f7c436a975a1020d8a1eea55b67a630bf
Author: Steve Loughran
Date:   2016-01-29T20:04:53Z

    YARN-4653 document YARN security: more, though more is needed

commit 6b4ce5fa7ed6a83e471d14994bd26aa01bf37552
Author: Steve Loughran
Date:   2016-02-01T15:20:35Z

    YARN-4653 document YARN security with instructions on propagating oozie credentials

> ContainerMetrics deadlocks
> --------------------------
>
>                 Key: YARN-4563
>                 URL: https://issues.apache.org/jira/browse/YARN-4563
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>         Environment: HDP 2.3.2 (Hadoop 2.7.1 + patches)
>            Reporter: Akira AJISAKA
>            Priority: Blocker
>         Attachments: 0001-YARN-4563.patch, jstack.log
>
>
> In one of our environments, the web apps of some NodeManagers are not working. I found
> a deadlock in the thread dump.
> {noformat}
> Found one Java-level deadlock:
> =============================
> "1193752357@qtp-907815246-22238":
>   waiting to lock monitor 0x05e20a18 (object 0xf6afa048, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
>   which is held by "2107307914@qtp-907815246-19994"
> "2107307914@qtp-907815246-19994":
>   waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
>   which is held by "Timer for 'NodeManager' metrics system"
> "Timer for 'NodeManager' metrics system":
>   waiting to lock monitor 0x027ade88 (object 0xf6582df0, a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
>   which is held by "1530638165@qtp-907815246-19992"
> "1530638165@qtp-907815246-19992":
>   waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
>   which is held by "Timer for 'NodeManager' metrics system"
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
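The thread dump quoted above has the shape of a lock-order inversion: the timer thread holds the MetricsSystemImpl monitor and waits for a ContainerMetrics monitor, while a webapp thread holds that ContainerMetrics monitor and waits for MetricsSystemImpl. The following is only a reduced two-thread sketch of that pattern, not the NodeManager code. The class name, the two lock objects, and the thread names are hypothetical stand-ins; the detection call is the standard java.lang.management API, which reports the same kind of monitor cycle that jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; it does not use any Hadoop types.
final class LockOrderDeadlockDemo {
    // Stand-ins (assumptions) for the two kinds of monitor seen in the dump.
    private static final Object METRICS_SYSTEM_LOCK = new Object();
    private static final Object CONTAINER_METRICS_LOCK = new Object();

    /**
     * Starts two daemon threads that take the two locks in opposite order,
     * then polls the JVM for monitor deadlocks.
     * Returns the number of deadlocked threads found, or 0 on timeout.
     */
    static int provokeAndDetect() {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread timer = new Thread(
            () -> lockInOrder(METRICS_SYSTEM_LOCK, CONTAINER_METRICS_LOCK, bothHoldFirstLock),
            "demo-timer");
        Thread webapp = new Thread(
            () -> lockInOrder(CONTAINER_METRICS_LOCK, METRICS_SYSTEM_LOCK, bothHoldFirstLock),
            "demo-webapp");
        // Daemon threads so the stuck pair does not keep the JVM alive.
        timer.setDaemon(true);
        webapp.setDaemon(true);
        timer.start();
        webapp.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                // Wait until the other thread also holds its first lock.
                latch.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Never reached: each thread blocks on the lock the other holds.
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + provokeAndDetect());
    }
}
```

In the sketch, either making both threads acquire the locks in the same order or not holding the first lock while calling into the second object removes the cycle. Whether the attached patch takes one of these routes is not stated in this thread.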
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15093329#comment-15093329 ]

Akira AJISAKA commented on YARN-4563:
-------------------------------------

Thanks [~rohithsharma] for your comment.

bq. Branch-2.6 might need to port this issue.

ContainerMetrics doesn't exist in branch-2.6, so there's no need to backport this issue.
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089046#comment-15089046 ]

Naganarasimha G R commented on YARN-4563:
-----------------------------------------

Sorry, I assigned it by mistake ...
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089084#comment-15089084 ]

Naganarasimha G R commented on YARN-4563:
-----------------------------------------

It seems the code in the stack trace predates YARN-3619, so you mean to say YARN-3619 fixes this issue?
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089042#comment-15089042 ]

Akira AJISAKA commented on YARN-4563:
-------------------------------------

Seems to be related to YARN-3619.
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089119#comment-15089119 ]

Rohith Sharma K S commented on YARN-4563:
-----------------------------------------

Based on its commit history, YARN-3619 was fixed in branch-2 and branch-2.7. Branch-2.6 might need to port this issue.
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089113#comment-15089113 ]

Rohith Sharma K S commented on YARN-4563:
-----------------------------------------

IIUC, this issue should not occur in trunk because {{ContainerMetrics#unregisterContainerMetrics}} is synchronized on the class object and {{ContainerMetrics#getMetrics}} is synchronized on "this". To be more precise, though, I updated the patch.
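The comment above relies on a general Java rule: a static synchronized method locks the Class object, while an instance synchronized method locks "this", so the two never contend for the same monitor. The following is a minimal sketch of that rule only. The class and method names are hypothetical and it is not the ContainerMetrics code; Thread.holdsLock is the standard JDK call used to observe which monitor the current thread owns.

```java
// Hypothetical demo class; it does not use any Hadoop types.
final class MonitorIdentityDemo {

    /** Static synchronized: the monitor taken is MonitorIdentityDemo.class. */
    static synchronized boolean[] fromStaticMethod(MonitorIdentityDemo instance) {
        return new boolean[] {
            Thread.holdsLock(MonitorIdentityDemo.class), // class monitor
            Thread.holdsLock(instance)                   // instance monitor
        };
    }

    /** Instance synchronized: the monitor taken is "this". */
    synchronized boolean[] fromInstanceMethod() {
        return new boolean[] {
            Thread.holdsLock(MonitorIdentityDemo.class), // class monitor
            Thread.holdsLock(this)                       // instance monitor
        };
    }

    public static void main(String[] args) {
        MonitorIdentityDemo demo = new MonitorIdentityDemo();
        boolean[] inStatic = fromStaticMethod(demo);
        boolean[] inInstance = demo.fromInstanceMethod();
        System.out.println("static method:   class=" + inStatic[0] + " this=" + inStatic[1]);
        System.out.println("instance method: class=" + inInstance[0] + " this=" + inInstance[1]);
    }
}
```

Each method holds exactly one of the two monitors, which is the basis of the argument that the two ContainerMetrics methods named in the comment cannot block each other directly. It says nothing about other monitors, such as the one on the metrics system, that either method may take while running.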
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089116#comment-15089116 ]

Akira AJISAKA commented on YARN-4563:
-------------------------------------

Hi [~rohithsharma], thank you for your comment. Do you think this issue still exists in branch-2.7/2.8?
[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks
[ https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089090#comment-15089090 ]

Akira AJISAKA commented on YARN-4563:
-------------------------------------

bq. you mean to say YARN-3619 fixes this issue?

I meant that YARN-3619 may have fixed this issue. Perhaps the bug still exists even after YARN-3619.