[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-02-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126383#comment-15126383
 ] 

ASF GitHub Bot commented on YARN-4563:
--

GitHub user steveloughran opened a pull request:

https://github.com/apache/hadoop/pull/72

YARN-4563

Attempt to document YARN security, including HADOOP_TOKEN_FILE_LOCATION 
propagation

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/steveloughran/hadoop HADOOP-12649-security/YARN-4653-yarn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hadoop/pull/72.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #72


commit 73baa11ff74201faa56ce1fc18941bdad43263fe
Author: Steve Loughran 
Date:   2016-01-28T20:04:14Z

YARN-4653 document YARN security: first pass

commit 778f623f7c436a975a1020d8a1eea55b67a630bf
Author: Steve Loughran 
Date:   2016-01-29T20:04:53Z

YARN-4653 document YARN security: more, though more is needed

commit 6b4ce5fa7ed6a83e471d14994bd26aa01bf37552
Author: Steve Loughran 
Date:   2016-02-01T15:20:35Z

YARN-4653 document YARN security with instructions on propagating oozie 
credentials




> ContainerMetrics deadlocks
> --
>
> Key: YARN-4563
> URL: https://issues.apache.org/jira/browse/YARN-4563
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
> Environment: HDP 2.3.2 (Hadoop 2.7.1 + patches)
>Reporter: Akira AJISAKA
>Priority: Blocker
> Attachments: 0001-YARN-4563.patch, jstack.log
>
>
> In one of our environments, some NodeManagers' webapps are not working. I found 
> a deadlock in the thread dump.
> {noformat}
> Found one Java-level deadlock:
> =
> "1193752357@qtp-907815246-22238":
>   waiting to lock monitor 0x05e20a18 (object 0xf6afa048, a 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
>   which is held by "2107307914@qtp-907815246-19994"
> "2107307914@qtp-907815246-19994":
>   waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
>   which is held by "Timer for 'NodeManager' metrics system"
> "Timer for 'NodeManager' metrics system":
>   waiting to lock monitor 0x027ade88 (object 0xf6582df0, a 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics),
>   which is held by "1530638165@qtp-907815246-19992"
> "1530638165@qtp-907815246-19992":
>   waiting to lock monitor 0x01a000a8 (object 0xd4f1e1f8, a 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl),
>   which is held by "Timer for 'NodeManager' metrics system"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-11 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15093329#comment-15093329
 ] 

Akira AJISAKA commented on YARN-4563:
-

Thanks [~rohithsharma] for your comment.
bq. Branch-2.6 might need to port this issue.
ContainerMetrics doesn't exist in branch-2.6, so there's no need to backport 
this issue.



[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089046#comment-15089046
 ] 

Naganarasimha G R commented on YARN-4563:
-

Sorry, I assigned it by mistake ... 



[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089084#comment-15089084
 ] 

Naganarasimha G R commented on YARN-4563:
-

It seems the code in the stack trace predates YARN-3619, so do you mean to 
say that YARN-3619 fixes this issue?



[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089042#comment-15089042
 ] 

Akira AJISAKA commented on YARN-4563:
-

Seems to be related to YARN-3619.



[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089119#comment-15089119
 ] 

Rohith Sharma K S commented on YARN-4563:
-

Based on its commit history, YARN-3619 was committed to branch-2 and 
branch-2.7.
Branch-2.6 might need to port this issue.



[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089113#comment-15089113
 ] 

Rohith Sharma K S commented on YARN-4563:
-

IIUC, this issue should not occur in trunk because 
{{ContainerMetrics#unregisterContainerMetrics}} is synchronized on the class object 
and {{ContainerMetrics#getMetrics}} is synchronized on "this".  
To be more precise, though, I updated the patch.
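
As a rough illustration of the class-object vs. "this" distinction, here is a minimal sketch using a hypothetical {{LockDemo}} class (an assumption for illustration, not the actual ContainerMetrics code). It only shows that a static synchronized method and an instance synchronized method take different monitors, so they do not contend with each other on the same lock.

```java
public class LockDemo {
    // Static synchronized: the monitor taken is LockDemo.class.
    static synchronized boolean staticHoldsClassLock() {
        return Thread.holdsLock(LockDemo.class);
    }

    // Instance synchronized: the monitor taken is this instance.
    synchronized boolean instanceHoldsThisLock() {
        return Thread.holdsLock(this);
    }

    // Inside an instance synchronized method the class monitor is NOT held.
    synchronized boolean instanceHoldsClassLock() {
        return Thread.holdsLock(LockDemo.class);
    }

    public static void main(String[] args) {
        LockDemo d = new LockDemo();
        System.out.println(staticHoldsClassLock());
        System.out.println(d.instanceHoldsThisLock());
        System.out.println(d.instanceHoldsClassLock());
    }
}
```

Two different monitors alone do not rule out a deadlock; what matters is whether any code path takes them (together with the MetricsSystemImpl monitor) in conflicting orders.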



[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089116#comment-15089116
 ] 

Akira AJISAKA commented on YARN-4563:
-

Hi [~rohithsharma], thank you for your comment. Do you think this issue still 
exists in branch-2.7/2.8?



[jira] [Commented] (YARN-4563) ContainerMetrics deadlocks

2016-01-08 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15089090#comment-15089090
 ] 

Akira AJISAKA commented on YARN-4563:
-

bq. you mean to say YARN-3619 fixes this issue?
I meant that YARN-3619 may have fixed this issue. Perhaps the bug still exists even 
after YARN-3619.
