[jira] [Commented] (YARN-5350) Ensure LocalScheduler does not lose the sort order of allocatable nodes returned by the RM
[ https://issues.apache.org/jira/browse/YARN-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385407#comment-15385407 ] Hadoop QA commented on YARN-5350: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 22 unchanged - 1 fixed = 22 total (was 23) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 19s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 4s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.TestDirectoryCollection | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818962/YARN-5350.003.patch | | JIRA Issue | YARN-5350 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 7040699c7987 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8f0d3d6 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/12377/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12377/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12377/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12377/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated.
[jira] [Commented] (YARN-5340) Race condition in RollingLevelDBTimelineStore#getAndSetStartTime()
[ https://issues.apache.org/jira/browse/YARN-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385396#comment-15385396 ] Hadoop QA commented on YARN-5340: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 11s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 56s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818915/YARN-5340-trunk.002.patch | | JIRA Issue | YARN-5340 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f2206a99ca13 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8f0d3d6 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12379/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12379/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Race condition in RollingLevelDBTimelineStore#getAndSetStartTime() > -- > > Key: YARN-5340 > URL: https://issues.apache.org/jira/browse/YARN-5340 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter:
[jira] [Commented] (YARN-4464) default value of yarn.resourcemanager.state-store.max-completed-applications should be lowered.
[ https://issues.apache.org/jira/browse/YARN-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385382#comment-15385382 ] Naganarasimha G R commented on YARN-4464: - +1 for Option 3 > default value of yarn.resourcemanager.state-store.max-completed-applications > should be lowered. > -- > > Key: YARN-4464 > URL: https://issues.apache.org/jira/browse/YARN-4464 > Project: Hadoop YARN > Issue Type: Wish > Components: resourcemanager >Reporter: KWON BYUNGCHANG >Assignee: Daniel Templeton >Priority: Blocker > Attachments: YARN-4464.001.patch, YARN-4464.002.patch, > YARN-4464.003.patch, YARN-4464.004.patch > > > my cluster has 120 nodes. > I configured the RM Restart feature. > {code} > yarn.resourcemanager.recovery.enabled=true > yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore > yarn.resourcemanager.fs.state-store.uri=/system/yarn/rmstore > {code} > unfortunately, I did not configure > {{yarn.resourcemanager.state-store.max-completed-applications}}, > so that property took its default value of 10,000. > I restarted the RM after changing another configuration setting. > I expected the RM to restart immediately, but the recovery process was very > slow; I waited about 20 minutes before realizing that > {{yarn.resourcemanager.state-store.max-completed-applications}} was missing. > Its default value is very large. > We need to change it to a lower value or document a notice on the [RM Restart > page|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5403) yarn top command does not execute correctly
[ https://issues.apache.org/jira/browse/YARN-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385347#comment-15385347 ] Bibin A Chundatt commented on YARN-5403: Duplicate of YARN-4232. > yarn top command does not execute correctly > - > > Key: YARN-5403 > URL: https://issues.apache.org/jira/browse/YARN-5403 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.2 >Reporter: gu-chi > Attachments: YARN-5403.patch > > > when I execute {{yarn top}}, I always get the exception below: > {quote} > 16/07/19 19:55:12 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:747) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:443) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:421) > YARN top - 19:55:13, up 17001d, 11:55, 0 active users, queue(s): root > {quote} > Looking into it, the function {{getRMStartTime}} hardcodes HTTP > no matter what the {{yarn.http.policy}} setting is; it should consider using > HTTPS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
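To make the suggested fix concrete: {{YarnConfiguration.useHttps}} already encapsulates the {{yarn.http.policy}} check, so the scheme and web-app address can be derived from configuration instead of hardcoded. A minimal sketch along those lines (an illustration only, not the contents of YARN-5403.patch):

{code}
import java.net.URL;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch only: derive http vs https from yarn.http.policy rather than
// hardcoding "http://" as TopCLI#getRMStartTime currently does.
public class RmSchemeSketch {
  public static URL rmClusterInfoUrl(Configuration conf) throws Exception {
    boolean https = YarnConfiguration.useHttps(conf);
    String scheme = https ? "https://" : "http://";
    // Pick the matching web-app address key for the chosen scheme.
    String address = https
        ? conf.get(YarnConfiguration.RM_WEBAPP_HTTPS_ADDRESS,
            YarnConfiguration.DEFAULT_RM_WEBAPP_HTTPS_ADDRESS)
        : conf.get(YarnConfiguration.RM_WEBAPP_ADDRESS,
            YarnConfiguration.DEFAULT_RM_WEBAPP_ADDRESS);
    return new URL(scheme + address + "/ws/v1/cluster/info");
  }
}
{code}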
[jira] [Commented] (YARN-4464) default value of yarn.resourcemanager.state-store.max-completed-applications should be lowered.
[ https://issues.apache.org/jira/browse/YARN-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385320#comment-15385320 ] Jian He commented on YARN-4464: --- I vote for 3), which solves the slowness problem and preserves the behavior to some extent. > default value of yarn.resourcemanager.state-store.max-completed-applications > should be lowered. > -- > > Key: YARN-4464 > URL: https://issues.apache.org/jira/browse/YARN-4464 > Project: Hadoop YARN > Issue Type: Wish > Components: resourcemanager >Reporter: KWON BYUNGCHANG >Assignee: Daniel Templeton >Priority: Blocker > Attachments: YARN-4464.001.patch, YARN-4464.002.patch, > YARN-4464.003.patch, YARN-4464.004.patch > > > my cluster has 120 nodes. > I configured the RM Restart feature. > {code} > yarn.resourcemanager.recovery.enabled=true > yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore > yarn.resourcemanager.fs.state-store.uri=/system/yarn/rmstore > {code} > unfortunately, I did not configure > {{yarn.resourcemanager.state-store.max-completed-applications}}, > so that property took its default value of 10,000. > I restarted the RM after changing another configuration setting. > I expected the RM to restart immediately, but the recovery process was very > slow; I waited about 20 minutes before realizing that > {{yarn.resourcemanager.state-store.max-completed-applications}} was missing. > Its default value is very large. > We need to change it to a lower value or document a notice on the [RM Restart > page|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385256#comment-15385256 ] Tao Jie commented on YARN-4997: --- Fix for review. > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4997) Update fair scheduler to use pluggable auth provider
[ https://issues.apache.org/jira/browse/YARN-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4997: -- Attachment: YARN-4997-001.patch > Update fair scheduler to use pluggable auth provider > > > Key: YARN-4997 > URL: https://issues.apache.org/jira/browse/YARN-4997 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Tao Jie > Attachments: YARN-4997-001.patch > > > Now that YARN-3100 has made the authorization pluggable, it should be > supported by the fair scheduler. YARN-3100 only updated the capacity > scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5390) Federation Subcluster Resolver
[ https://issues.apache.org/jira/browse/YARN-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-5390: - Assignee: Ellen Hui > Federation Subcluster Resolver > -- > > Key: YARN-5390 > URL: https://issues.apache.org/jira/browse/YARN-5390 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Ellen Hui > > This JIRA tracks the effort to create a mechanism to resolve node/rack resource > names to sub-cluster identifiers. This is needed by the federation policies > in YARN-5323, YARN-5324, YARN-5325 to operate correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4464) default value of yarn.resourcemanager.state-store.max-completed-applications should be lowered.
[ https://issues.apache.org/jira/browse/YARN-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385192#comment-15385192 ] Karthik Kambatla commented on YARN-4464: I know there is no right answer here. We should have picked a better default to begin with. IAC, my preference would be whatever least astonishes the admins/users. Options sorted by least astonishment: # Don't change anything. Keep it at 10,000 and deal with recovery slowness etc. # Change it to 0. When people try out Hadoop 3 and fail over, they immediately realize they don't see any completed applications. However, they will all likely have to change it. # Change it to 1,000. People will realize it later, but most users might never run into any issues. By the way, one other change we should make is to limit {{rm.store.max-completed-apps}} to {{rm.max-completed-apps}}. > default value of yarn.resourcemanager.state-store.max-completed-applications > should be lowered. > -- > > Key: YARN-4464 > URL: https://issues.apache.org/jira/browse/YARN-4464 > Project: Hadoop YARN > Issue Type: Wish > Components: resourcemanager >Reporter: KWON BYUNGCHANG >Assignee: Daniel Templeton >Priority: Blocker > Attachments: YARN-4464.001.patch, YARN-4464.002.patch, > YARN-4464.003.patch, YARN-4464.004.patch > > > my cluster has 120 nodes. > I configured the RM Restart feature. > {code} > yarn.resourcemanager.recovery.enabled=true > yarn.resourcemanager.store.class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore > yarn.resourcemanager.fs.state-store.uri=/system/yarn/rmstore > {code} > unfortunately, I did not configure > {{yarn.resourcemanager.state-store.max-completed-applications}}, > so that property took its default value of 10,000. > I restarted the RM after changing another configuration setting. > I expected the RM to restart immediately, but the recovery process was very > slow; I waited about 20 minutes before realizing that > {{yarn.resourcemanager.state-store.max-completed-applications}} was missing. > Its default value is very large. > We need to change it to a lower value or document a notice on the [RM Restart > page|http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html]. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
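For concreteness, option 3 combined with the cap suggested at the end of the comment above would amount to configuration along these lines, in the same format as the snippet in the issue description (illustrative values only; the default finally committed on this JIRA may differ):

{code}
# Illustrative values for option 3; not necessarily the values committed.
yarn.resourcemanager.state-store.max-completed-applications=1000
# Keep the store limit no larger than the in-memory limit, per the
# suggestion above (the property below defaults to 10000):
yarn.resourcemanager.max-completed-applications=1000
{code}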
[jira] [Updated] (YARN-3664) Federation PolicyStore APIs
[ https://issues.apache.org/jira/browse/YARN-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-3664: - Attachment: YARN-3664-YARN-2915-v1.patch Updated patch (v1) that incorporates [~leftnoteasy]'s [feedback|https://issues.apache.org/jira/browse/YARN-3662?focusedCommentId=15375947]: * Included {{FederationPolicyStore}} API class. I had missed it in previous patch, good catch. * Renamed record classes and updated Javadoc (with help from [~curino]) to make it more understandable. For more context, kindly refer to [~curino]'s [summary|https://issues.apache.org/jira/browse/YARN-5323?focusedCommentId=15380907] and associated policy patches in YARN-5323. > Federation PolicyStore APIs > --- > > Key: YARN-3664 > URL: https://issues.apache.org/jira/browse/YARN-3664 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Attachments: YARN-3664-YARN-2915-v0.patch, > YARN-3664-YARN-2915-v1.patch > > > The federation Policy Store contains information about the capacity > allocations made by users, their mapping to sub-clusters and the policies > that each of the components (Router, AMRMPRoxy, RMs) should enforce -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5203) Return ResourceRequest JAXB object in ResourceManager Cluster Applications REST API
[ https://issues.apache.org/jira/browse/YARN-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385137#comment-15385137 ] Ellen Hui commented on YARN-5203: - Hi [~sunilg], thanks for the comment. I tested this more thoroughly and [~subru] is right; keeping the old resourceRequests element causes an UnmarshalException in the Federation Router, which was the original symptom of this bug. > Return ResourceRequest JAXB object in ResourceManager Cluster Applications > REST API > --- > > Key: YARN-5203 > URL: https://issues.apache.org/jira/browse/YARN-5203 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Subru Krishnan >Assignee: Ellen Hui > Attachments: YARN-5203.v0.patch, YARN-5203.v1.patch > > > The ResourceManager Cluster Applications REST API returns {{ResourceRequest}} > as String rather than a JAXB object. This prevents downstream tools like > Federation Router (YARN-3659) that depend on the REST API to unmarshall the > {{AppInfo}}. This JIRA proposes updating {{AppInfo}} to return a JAXB version > of the {{ResourceRequest}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5340) Race condition in RollingLevelDBTimelineStore#getAndSetStartTime()
[ https://issues.apache.org/jira/browse/YARN-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385116#comment-15385116 ] Hadoop QA commented on YARN-5340: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 52s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 9s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 8s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818915/YARN-5340-trunk.002.patch | | JIRA Issue | YARN-5340 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f13f570de975 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / dc065dd | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12376/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12376/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Race condition in RollingLevelDBTimelineStore#getAndSetStartTime() > -- > > Key: YARN-5340 > URL: https://issues.apache.org/jira/browse/YARN-5340 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter:
[jira] [Commented] (YARN-5203) Return ResourceRequest JAXB object in ResourceManager Cluster Applications REST API
[ https://issues.apache.org/jira/browse/YARN-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385099#comment-15385099 ] Subru Krishnan commented on YARN-5203: -- [~sunilg], thanks for taking a look and bringing up the important point on compatibility. IIUC, unfortunately we cannot continue to have the raw RRs, as JAXB will not be able to unmarshal directly to the {{AppInfo}} object. Today it works because existing clients like the UI deserialize directly to _String_, which should continue to work even if we use a JAXB object for RRs. [~ellenfkh], can you kindly do a quick validation, as you have a setup where you have already been testing extensively. Thanks! > Return ResourceRequest JAXB object in ResourceManager Cluster Applications > REST API > --- > > Key: YARN-5203 > URL: https://issues.apache.org/jira/browse/YARN-5203 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Subru Krishnan >Assignee: Ellen Hui > Attachments: YARN-5203.v0.patch, YARN-5203.v1.patch > > > The ResourceManager Cluster Applications REST API returns {{ResourceRequest}} > as String rather than a JAXB object. This prevents downstream tools like > Federation Router (YARN-3659) that depend on the REST API to unmarshall the > {{AppInfo}}. This JIRA proposes updating {{AppInfo}} to return a JAXB version > of the {{ResourceRequest}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
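A minimal sketch of the JAXB shape being discussed (the {{ResourceRequestInfo}} bean below is a hypothetical stand-in, not the attached patch): JAXB can marshal and unmarshal annotated beans as nested elements, whereas a pre-rendered String cannot be unmarshalled back into structured fields by downstream clients such as the Federation Router.

{code}
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Sketch: a JAXB-friendly view of the app's resource requests.
@XmlRootElement(name = "app")
@XmlAccessorType(XmlAccessType.FIELD)
public class AppInfoSketch {
  // Hypothetical bean mirroring a few ResourceRequest fields.
  @XmlAccessorType(XmlAccessType.FIELD)
  public static class ResourceRequestInfo {
    public int priority;
    public String resourceName;
    public int numContainers;
    public long memory;
    public int vCores;
  }

  // Serialized as nested XML/JSON elements rather than one opaque string,
  // so clients can unmarshal it back into structured objects.
  public List<ResourceRequestInfo> resourceRequests = new ArrayList<>();
}
{code}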
[jira] [Updated] (YARN-5391) FederationPolicy implementations (tying together RouterFederationPolicy and AMRMProxyFederationPolicy)
[ https://issues.apache.org/jira/browse/YARN-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-5391: --- Attachment: YARN-5391.02.patch > FederationPolicy implementations (tying together RouterFederationPolicy and > AMRMProxyFederationPolicy) > --- > > Key: YARN-5391 > URL: https://issues.apache.org/jira/browse/YARN-5391 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-5391.01.patch, YARN-5391.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5324) Stateless router policies implementation
[ https://issues.apache.org/jira/browse/YARN-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-5324: --- Attachment: YARN-5324.02.patch > Stateless router policies implementation > > > Key: YARN-5324 > URL: https://issues.apache.org/jira/browse/YARN-5324 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-5324.01.patch, YARN-5324.02.patch > > > These are policies at the Router that do not require maintaining state across > choices (e.g., weighted random). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5323) Policies APIs (for Router and AMRMProxy policies)
[ https://issues.apache.org/jira/browse/YARN-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-5323: --- Attachment: YARN-5323.03.patch > Policies APIs (for Router and AMRMProxy policies) > - > > Key: YARN-5323 > URL: https://issues.apache.org/jira/browse/YARN-5323 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-5323.01.patch, YARN-5323.02.patch, > YARN-5323.03.patch > > > This JIRA tracks APIs for the policies that will guide the Router and > AMRMProxy decisions on where to forward the job submission/query requests as > well as ResourceRequests. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5325) Stateless AMRMProxy policies implementation
[ https://issues.apache.org/jira/browse/YARN-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-5325: --- Attachment: YARN-5325.02.patch > Stateless AMRMProxy policies implementation > > > Key: YARN-5325 > URL: https://issues.apache.org/jira/browse/YARN-5325 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-5325.01.patch, YARN-5325.02.patch > > > This JIRA tracks policies in the AMRMProxy that decide how to forward > ResourceRequests, without maintaining substantial state across decisions > (e.g., broadcast). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5350) Ensure LocalScheduler does not lose the sort order of allocatable nodes returned by the RM
[ https://issues.apache.org/jira/browse/YARN-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5350: -- Attachment: YARN-5350.003.patch Thanks for the review, [~subru]. Updating the testcase with your suggestion. > Ensure LocalScheduler does not lose the sort order of allocatable nodes > returned by the RM > -- > > Key: YARN-5350 > URL: https://issues.apache.org/jira/browse/YARN-5350 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0 > > Attachments: YARN-5350.001.patch, YARN-5350.002.patch, > YARN-5350.003.patch > > > The LocalScheduler receives an ordered list of nodes from the RM with each > allocate call. This list, which is used by the LocalScheduler to allocate > OPPORTUNISTIC containers, is sorted on the nodes' free capacity (queue length > / wait time). > Unfortunately, the LocalScheduler stores this list in a HashMap, thereby > losing the sort order. > The trivial fix would be to replace the HashMap with a LinkedHashMap, which > retains the insertion order. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5392) Replace use of Priority in the Scheduling infrastructure with an opaque SchedulerKey
[ https://issues.apache.org/jira/browse/YARN-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385055#comment-15385055 ] Subru Krishnan commented on YARN-5392: -- Thanks [~asuresh] for working on this. I just have one question: should we have the {{SchedulerKey}} in addition to {{Priority}}? I feel {{Priority}} should be accessible directly as before outside of the scheduler layers, and the notion of {{SchedulerKey}} should be confined to the scheduler (ideally it should be transparent to other RM entities/services). An extreme example: at some point in the future we could decide not to use {{Priority}} in the {{SchedulerKey}} at all. Overall the patch LGTM. Since I have been working very closely with you, [~kasha]/[~leftnoteasy], can you guys take a look? > Replace use of Priority in the Scheduling infrastructure with an opaque > SchedulerKey > --- > > Key: YARN-5392 > URL: https://issues.apache.org/jira/browse/YARN-5392 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5392.001.patch, YARN-5392.002.patch, > YARN-5392.003.patch > > > Based on discussions in YARN-4888, this jira proposes to replace the use of > {{Priority}} in the Scheduler infrastructure (Scheduler, Queues, SchedulerApp > / Node etc.) with a more opaque and extensible {{SchedulerKey}}. > Note: Even though {{SchedulerKey}} will be used by the internal scheduling > infrastructure, it will not be exposed to the Client or the AM. The > SchedulerKey is meant to be an internal construct that is derived from > attributes of the ResourceRequest / ApplicationSubmissionContext / Scheduler > Configuration etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
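For readers following along, a sketch of the kind of opaque key under discussion (an illustration, not the attached patch): the key is constructed inside the scheduler and, for now, derives its ordering from {{Priority}} alone, so other attributes could later be folded in without touching the public {{Priority}} API.

{code}
// Illustrative sketch only: an opaque, comparable key the scheduler sorts
// on. Callers outside the scheduler never see or construct it.
public final class SchedulerKeySketch
    implements Comparable<SchedulerKeySketch> {
  private final int priority;

  private SchedulerKeySketch(int priority) {
    this.priority = priority;
  }

  // Derived from scheduler-visible attributes; today just the priority.
  public static SchedulerKeySketch create(int priority) {
    return new SchedulerKeySketch(priority);
  }

  @Override
  public int compareTo(SchedulerKeySketch other) {
    // In YARN a lower priority value is "more important".
    return Integer.compare(this.priority, other.priority);
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof SchedulerKeySketch
        && ((SchedulerKeySketch) o).priority == priority;
  }

  @Override
  public int hashCode() {
    return priority;
  }
}
{code}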
[jira] [Commented] (YARN-5350) Ensure LocalScheduler does not lose the sort order of allocatable nodes returned by the RM
[ https://issues.apache.org/jira/browse/YARN-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385047#comment-15385047 ] Subru Krishnan commented on YARN-5350: -- [~asuresh], +1 for the fix. Some minor feedback on the test: * it would be good to add an assertion on the total number of OPPORTUNISTIC containers. * add another check to ensure the sort order is maintained by doing a second round of allocation and/or requesting multiple OPPORTUNISTIC containers. Thanks. > Ensure LocalScheduler does not lose the sort order of allocatable nodes > returned by the RM > -- > > Key: YARN-5350 > URL: https://issues.apache.org/jira/browse/YARN-5350 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0 > > Attachments: YARN-5350.001.patch, YARN-5350.002.patch > > > The LocalScheduler receives an ordered list of nodes from the RM with each > allocate call. This list, which is used by the LocalScheduler to allocate > OPPORTUNISTIC containers, is sorted on the nodes' free capacity (queue length > / wait time). > Unfortunately, the LocalScheduler stores this list in a HashMap, thereby > losing the sort order. > The trivial fix would be to replace the HashMap with a LinkedHashMap, which > retains the insertion order. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
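The trivial fix in the description is easy to see in isolation. A self-contained demonstration (generic Java, not the patch itself) of how {{HashMap}} makes no iteration-order guarantee while {{LinkedHashMap}} iterates in insertion order, preserving the RM's sort:

{code}
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class NodeOrderDemo {
  public static void main(String[] args) {
    // Nodes as sorted by the RM on free capacity (best first).
    String[] sortedNodes = {"node3:queue=0", "node1:queue=2", "node2:queue=5"};

    Map<String, String> hash = new HashMap<>();         // order not guaranteed
    Map<String, String> linked = new LinkedHashMap<>(); // insertion order kept
    for (String n : sortedNodes) {
      String host = n.split(":")[0];
      hash.put(host, n);
      linked.put(host, n);
    }
    System.out.println("HashMap iteration:       " + hash.keySet());
    System.out.println("LinkedHashMap iteration: " + linked.keySet());
  }
}
{code}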
[jira] [Updated] (YARN-3477) TimelineClientImpl swallows exceptions
[ https://issues.apache.org/jira/browse/YARN-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3477: Attachment: YARN-3477-trunk.003.patch [~ste...@apache.org] I rebased your patch to the latest trunk. Here's the rebased version. > TimelineClientImpl swallows exceptions > -- > > Key: YARN-3477 > URL: https://issues.apache.org/jira/browse/YARN-3477 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.6.0, 2.7.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-3477-001.patch, YARN-3477-002.patch, > YARN-3477-trunk.003.patch > > > If the timeline client fails more than the retry count, the original exception > is not thrown. Instead, a generic runtime exception is raised saying "retries > run out". > # the failing exception should be rethrown, ideally via > NetUtils.wrapException, to include the URL of the failing endpoint > # otherwise, the raised RTE should (a) state that URL and (b) set the > original fault as the inner cause -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
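A sketch of the rethrow pattern item 1 asks for, using the real {{NetUtils.wrapException}} API but a hypothetical retry helper rather than the TimelineClientImpl code: the last failure is remembered and rethrown with the endpoint details attached, instead of being swallowed behind a generic "retries run out" exception.

{code}
import java.io.IOException;
import org.apache.hadoop.net.NetUtils;

public class RetryRethrowSketch {
  interface Op<T> {
    T run() throws IOException;
  }

  // Hypothetical helper: retry an operation, and on exhaustion rethrow the
  // original fault wrapped with the destination host/port for diagnosis.
  static <T> T retry(String host, int port, int maxRetries, Op<T> op)
      throws IOException {
    IOException lastFailure = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return op.run();
      } catch (IOException e) {
        lastFailure = e; // remember the original fault instead of dropping it
      }
    }
    // Wrap adds endpoint details and keeps the original fault as the cause.
    throw NetUtils.wrapException(host, port, "localhost", 0, lastFailure);
  }
}
{code}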
[jira] [Commented] (YARN-5394) Correct the wrong file name when mounting /etc/passwd to Docker Container
[ https://issues.apache.org/jira/browse/YARN-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384999#comment-15384999 ] Sidharta Seethana commented on YARN-5394: - /cc [~zyluo], [~vvasudev] As discussed in YARN-5360, I don't think it's a good idea to mount /etc/passwd into a container without any way to disable it. At a minimum, we should add a (cluster-wide?) mechanism to control this (and it should be disabled by default, IMO). > Correct the wrong file name when mounting /etc/passwd to Docker Container > - > > Key: YARN-5394 > URL: https://issues.apache.org/jira/browse/YARN-5394 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > Attachments: YARN-5394-branch-2.8.001.patch > > > The current LCE (DockerLinuxContainerRuntime) mounts /etc/passwd into the > container, but it uses the wrong file name "/etc/password" inside the container. > {panel} > .addMountLocation("/etc/passwd", "/etc/password:ro"); > {panel} > This causes the LCE to fail to launch the Docker container if the Docker image > doesn't create the same user name and UID in it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
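Putting the two points together, the filename fix plus an opt-in gate could look roughly like the fragment below. The configuration key is hypothetical (not an existing YARN property), and {{conf}}/{{runCommand}} are assumed to be the runtime's Hadoop Configuration and the Docker run-command builder quoted in the panel above:

{code}
// Sketch only: fix the target path typo and make the mount opt-in,
// defaulting to disabled as suggested in the comment above.
boolean mountPasswd = conf.getBoolean(
    "yarn.nodemanager.runtime.linux.docker.mount-etc-passwd", false);
if (mountPasswd) {
  // was: .addMountLocation("/etc/passwd", "/etc/password:ro");
  runCommand.addMountLocation("/etc/passwd", "/etc/passwd:ro");
}
{code}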
[jira] [Updated] (YARN-3662) Federation Membership State APIs
[ https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-3662: - Attachment: YARN-3662-YARN-2915-v3.01.patch Reattaching patch after rebasing _YARN-2915_ branch to pull in HADOOP-13342 > Federation Membership State APIs > > > Key: YARN-3662 > URL: https://issues.apache.org/jira/browse/YARN-3662 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Attachments: YARN-3662-YARN-2915-v1.1.patch, > YARN-3662-YARN-2915-v1.patch, YARN-3662-YARN-2915-v2.patch, > YARN-3662-YARN-2915-v3.01.patch, YARN-3662-YARN-2915-v3.patch > > > The Federation Application State encapsulates the information about the > active RM of each sub-cluster that is participating in Federation. The > information includes addresses for ClientRM, ApplicationMaster and Admin > services along with the sub_cluster _capability_ which is currently defined > by *ClusterMetricsInfo*. Please refer to the design doc in parent JIRA for > further details. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5340) Race condition in RollingLevelDBTimelineStore#getAndSetStartTime()
[ https://issues.apache.org/jira/browse/YARN-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384964#comment-15384964 ] Vinod Kumar Vavilapalli commented on YARN-5340: --- Tx for the update, [~gtCarrera9]. This definitely looks slightly better than the previous one. I'll check this in if Jenkins says okay. > Race condition in RollingLevelDBTimelineStore#getAndSetStartTime() > -- > > Key: YARN-5340 > URL: https://issues.apache.org/jira/browse/YARN-5340 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Li Lu >Priority: Critical > Attachments: YARN-5340-trunk.001.patch, YARN-5340-trunk.002.patch > > > App Name/User/RPC Port/AM Host info is missing from ATS web service or YARN > CLI's app info > {code} > RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn --config > /tmp/hadoopConf application -status application_1467931619679_0001 > Application Report : > Application-Id : application_1467931619679_0001 > Application-Name : null > Application-Type : null > User : null > Queue : null > Application Priority : null > Start-Time : 0 > Finish-Time : 1467931672057 > Progress : 100% > State : FINISHED > Final-State : SUCCEEDED > Tracking-URL : N/A > RPC Port : -1 > AM Host : N/A > Aggregate Resource Allocation : 290014 MB-seconds, 74 vcore-seconds > Log Aggregation Status : N/A > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
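The attached patch is not reproduced here, but the bug class named in the title is a check-then-act race: two writers both see "no start time yet" and the later write clobbers the earlier one. A generic illustration of closing such a race with an atomic map operation (a hypothetical in-memory cache, not the RollingLevelDBTimelineStore code):

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class StartTimeCacheSketch {
  private final ConcurrentMap<String, Long> startTimes =
      new ConcurrentHashMap<>();

  // Atomically install the candidate only if no value exists, and always
  // return the winning value, so concurrent callers agree on one start time.
  public long getAndSetStartTime(String entityId, long candidate) {
    return startTimes.computeIfAbsent(entityId, id -> candidate);
  }
}
{code}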
[jira] [Updated] (YARN-5092) TestRMDelegationTokens fails intermittently
[ https://issues.apache.org/jira/browse/YARN-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5092: - Attachment: YARN-5092.002.patch Thanks for the review, Rohith! Good catch on the rm1.stop() suggestion. Clearing the queue metrics wasn't causing the default metrics exception; that was the failure to stop. Clearing the queue metrics fixed the class cast exception, but I went ahead with your suggestion to remove the scheduler setting as another way to fix it, since it seemed unrelated to the test. Also added the setLoginUser change in the setup method. Tested both orderings of the tests. > TestRMDelegationTokens fails intermittently > > > Key: YARN-5092 > URL: https://issues.apache.org/jira/browse/YARN-5092 > Project: Hadoop YARN > Issue Type: Test > Components: test >Affects Versions: 2.7.2 >Reporter: Rohith Sharma K S >Assignee: Jason Lowe > Attachments: YARN-5092.001.patch, YARN-5092.002.patch > > > In build > [link|https://builds.apache.org/job/PreCommit-YARN-Build/11476/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_101.txt] > , TestRMDelegationTokens fails for 2 test cases > # TestRMDelegationTokens.testRMDTMasterKeyStateOnRollingMasterKey > # TestRMDelegationTokens.testRemoveExpiredMasterKeyInRMStateStore -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5203) Return ResourceRequest JAXB object in ResourceManager Cluster Applications REST API
[ https://issues.apache.org/jira/browse/YARN-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ellen Hui updated YARN-5203: Attachment: YARN-5203.v1.patch Add ExecutionType, raw ResourceRequests for backwards compatibility. > Return ResourceRequest JAXB object in ResourceManager Cluster Applications > REST API > --- > > Key: YARN-5203 > URL: https://issues.apache.org/jira/browse/YARN-5203 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Subru Krishnan >Assignee: Ellen Hui > Attachments: YARN-5203.v0.patch, YARN-5203.v1.patch > > > The ResourceManager Cluster Applications REST API returns {{ResourceRequest}} > as String rather than a JAXB object. This prevents downstream tools like > Federation Router (YARN-3659) that depend on the REST API to unmarshall the > {{AppInfo}}. This JIRA proposes updating {{AppInfo}} to return a JAXB version > of the {{ResourceRequest}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384882#comment-15384882 ] Hadoop QA commented on YARN-679: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 22 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 35s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 8m 35s {color} | {color:red} root generated 8 new + 709 unchanged - 0 fixed = 717 total (was 709) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s {color} | {color:red} hadoop-common-project/hadoop-common: The patch generated 42 new + 119 unchanged - 34 fixed = 161 total (was 153) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 73 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 54s {color} | {color:red} hadoop-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 5s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.http.TestHttpServerLifecycle | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818866/YARN-679-010.patch | | JIRA Issue | YARN-679 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 78053710b77c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cda0a28 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/12374/artifact/patchprocess/diff-compile-javac-root.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12374/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/12374/artifact/patchprocess/whitespace-eol.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/12374/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12374/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12374/testReport/ | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12374/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated.
[jira] [Commented] (YARN-5352) Allow container-executor to use private /tmp
[ https://issues.apache.org/jira/browse/YARN-5352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384856#comment-15384856 ] Hadoop QA commented on YARN-5352: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 8s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 52s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 36m 27s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.TestDirectoryCollection | | | hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818857/YARN-5352-v0.patch | | JIRA Issue | YARN-5352 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux 4ec1f4873ecf 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cda0a28 | | Default Java | 1.8.0_91 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/12375/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12375/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12375/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12375/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Allow container-executor to use private /tmp > - > > Key: YARN-5352 > URL: https://issues.apache.org/jira/browse/YARN-5352 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-5352-v0.patch > > > It's very common for user code to create things in /tmp. Yes, applications > have means to specify alternate tmp directories but doing so is opt-in and > therefore doesn't happen in many cases. At a minimum, linux can use private > namespaces to create a private /tmp for each container so that it's using the > same space allocated to containers and it's automatically cleaned up as part > of container clean-up.
[jira] [Updated] (YARN-5137) Make DiskChecker pluggable in NodeManager
[ https://issues.apache.org/jira/browse/YARN-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5137: --- Attachment: YARN-5137.003.patch > Make DiskChecker pluggable in NodeManager > - > > Key: YARN-5137 > URL: https://issues.apache.org/jira/browse/YARN-5137 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Ray Chiang >Assignee: Yufei Gu > Labels: supportability > Attachments: YARN-5137.001.patch, YARN-5137.002.patch, > YARN-5137.003.patch > > > It would be nice to have the option for a DiskChecker that has more > sophisticated checking capabilities. In order to do this, we would first > need DiskChecker to be pluggable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5181) ClusterNodeTracker: add method to get list of nodes matching a specific resourceName
[ https://issues.apache.org/jira/browse/YARN-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384804#comment-15384804 ] Hudson commented on YARN-5181: -- SUCCESS: Integrated in Hadoop-trunk-Commit #10119 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10119/]) YARN-5181. ClusterNodeTracker: add method to get list of nodes matching (arun suresh: rev cda0a280ddd0c77af93d236fc80478c16bbe809a) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ClusterNodeTracker.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestClusterNodeTracker.java > ClusterNodeTracker: add method to get list of nodes matching a specific > resourceName > > > Key: YARN-5181 > URL: https://issues.apache.org/jira/browse/YARN-5181 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Fix For: 2.9.0 > > Attachments: yarn-5181-1.patch, yarn-5181-2.patch, yarn-5181-3.patch > > > ClusterNodeTracker should have a method to return the list of nodes matching > a particular resourceName. This is so we could identify what all nodes a > particular ResourceRequest is interested in, which in turn is useful in > YARN-5139 (global scheduler) and YARN-4752 (FairScheduler preemption > overhaul). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
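To make the committed idea concrete, here is a minimal, self-contained sketch of resolving a resourceName (ANY, a rack, or a single host) to the matching nodes. The class shape and field names below are illustrative assumptions, not the actual ClusterNodeTracker internals:

{code}
// Hypothetical sketch of a nodes-by-resourceName lookup. Names are
// illustrative, not the committed ClusterNodeTracker API.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NodeTrackerSketch<N> {
  private final Map<String, N> nodesByHost = new HashMap<>();
  private final Map<String, List<N>> nodesByRack = new HashMap<>();

  public List<N> getNodesByResourceName(String resourceName) {
    List<N> result = new ArrayList<>();
    if ("*".equals(resourceName)) {                   // ResourceRequest.ANY
      result.addAll(nodesByHost.values());
    } else if (nodesByRack.containsKey(resourceName)) {
      result.addAll(nodesByRack.get(resourceName));   // rack-local nodes
    } else if (nodesByHost.containsKey(resourceName)) {
      result.add(nodesByHost.get(resourceName));      // node-local match
    }
    return result;                                    // empty if no match
  }
}
{code}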
[jira] [Updated] (YARN-5340) Race condition in RollingLevelDBTimelineStore#getAndSetStartTime()
[ https://issues.apache.org/jira/browse/YARN-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5340: Attachment: YARN-5340-trunk.002.patch Thanks for the review [~djp]! I've shrunk the size of the critical section. Now we only synchronize globally when there is a cache miss and we have to check and update the timestamp in leveldb. > Race condition in RollingLevelDBTimelineStore#getAndSetStartTime() > -- > > Key: YARN-5340 > URL: https://issues.apache.org/jira/browse/YARN-5340 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Li Lu >Priority: Critical > Attachments: YARN-5340-trunk.001.patch, YARN-5340-trunk.002.patch > > > App Name/User/RPC Port/AM Host info is missing from ATS web service or YARN > CLI's app info > {code} > RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn --config > /tmp/hadoopConf application -status application_1467931619679_0001 > Application Report : > Application-Id : application_1467931619679_0001 > Application-Name : null > Application-Type : null > User : null > Queue : null > Application Priority : null > Start-Time : 0 > Finish-Time : 1467931672057 > Progress : 100% > State : FINISHED > Final-State : SUCCEEDED > Tracking-URL : N/A > RPC Port : -1 > AM Host : N/A > Aggregate Resource Allocation : 290014 MB-seconds, 74 vcore-seconds > Log Aggregation Status : N/A > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
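As a quick illustration of the cache-miss-only synchronization described above (a minimal, self-contained sketch, not the RollingLevelDBTimelineStore patch itself; the field and method names are assumptions), the pattern is a lock-free cache read on the common path, with the global lock taken only once per missing entity:

{code}
// Sketch: synchronize only on a cache miss; re-check under the lock so
// concurrent writers for the same entity agree on one start time.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class StartTimeCacheSketch {
  private final ConcurrentMap<String, Long> startTimeWriteCache =
      new ConcurrentHashMap<>();
  private final Object lock = new Object();

  long getAndSetStartTime(String entityId, long candidateStartTime) {
    Long cached = startTimeWriteCache.get(entityId);  // lock-free fast path
    if (cached != null) {
      return cached;
    }
    synchronized (lock) {                             // only on cache miss
      cached = startTimeWriteCache.get(entityId);     // re-check under lock
      if (cached != null) {
        return cached;
      }
      long stored = readOrWriteLevelDb(entityId, candidateStartTime);
      startTimeWriteCache.put(entityId, stored);
      return stored;
    }
  }

  // Stand-in for the leveldb check-and-write; returns the winning start time.
  private long readOrWriteLevelDb(String entityId, long candidate) {
    return candidate;
  }
}
{code}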
[jira] [Commented] (YARN-5360) Decouple host user and Docker container user
[ https://issues.apache.org/jira/browse/YARN-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384786#comment-15384786 ] Sidharta Seethana commented on YARN-5360: - [~templedf], running a container as root does in fact have security implications (there are other things to consider in conjunction with this - capabilities, selinux and so on). There are (at least) a couple of reasons why --user is enforced currently: 1) the YARN security model requires the launched process to run as the designated user 2) Log aggregation/local permissions etc. - some of these things would stop working if the generated logs have ownership that is different from what YARN expects. These are also the reasons that need to be considered for YARN-4266 > Decouple host user and Docker container user > > > Key: YARN-5360 > URL: https://issues.apache.org/jira/browse/YARN-5360 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > > There is *a dependency between the job-submitting user and the user in the Docker > image* in LCE currently. For instance, in order to run the Docker container > as the yarn user, we can choose to set > "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user" to yarn > and leave > "yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users" at its > default (true). Then LCE will choose yarn (UID maybe 1001) as the user > running jobs. > LCE will mount the generated launch_container.sh (owned by the running job > user) and /etc/passwd (*currently the code mounts to the container's > /etc/password, which I think is a mistake*) into the Docker container and > utilizes the "docker run --user=" option to get it done internally. > Mounting /etc/passwd into the container is not a good choice because it overrides > the original users defined in the Docker image. As far as I know, since Docker v1.8 > (or maybe earlier), the Docker run command "--user=" option accepts a UID, and > *when passing a UID, the user does not have to exist in the container*. So we > could use the UID instead of the user name to construct the Docker run command to > eliminate the dependency of creating the same user in the Docker image. This > gives LCE the ability to launch any Docker container safely regardless of what > users are in it. > But this is not enough to decouple the host user and the Docker container user. The > final solution we are searching for is focused on allowing users to run > their Docker images flexibly without involving dependencies on YARN, while making > sure the container won't bring in security risks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
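To illustrate the UID-based approach proposed in the description, here is a hedged, self-contained sketch (not LCE's actual code; the class, image name, and UID resolution via the POSIX {{id -u}} command are illustrative assumptions):

{code}
// Sketch: resolve the submitting user to a numeric UID on the host and
// pass it to "docker run --user=", so the user need not exist in the image
// and no /etc/passwd mount is required.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

public class DockerUserSketch {
  static String resolveUid(String user) throws IOException {
    // Shells out to the POSIX "id -u <user>" command to get the UID.
    Process p = new ProcessBuilder("id", "-u", user).start();
    try (BufferedReader r = new BufferedReader(
        new InputStreamReader(p.getInputStream()))) {
      return r.readLine().trim();
    }
  }

  public static void main(String[] args) throws IOException {
    String uid = resolveUid("yarn");          // e.g. "1001"
    List<String> cmd = Arrays.asList(
        "docker", "run", "--user=" + uid,
        "some-image", "launch_container.sh"); // hypothetical image/script
    System.out.println(cmd);
  }
}
{code}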
[jira] [Commented] (YARN-3662) Federation Membership State APIs
[ https://issues.apache.org/jira/browse/YARN-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384783#comment-15384783 ] Hadoop QA commented on YARN-3662: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 2s {color} | {color:red} Docker failed to build yetus/hadoop:e2f6409. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818722/YARN-3662-YARN-2915-v3.patch | | JIRA Issue | YARN-3662 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12373/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Federation Membership State APIs > > > Key: YARN-3662 > URL: https://issues.apache.org/jira/browse/YARN-3662 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Attachments: YARN-3662-YARN-2915-v1.1.patch, > YARN-3662-YARN-2915-v1.patch, YARN-3662-YARN-2915-v2.patch, > YARN-3662-YARN-2915-v3.patch > > > The Federation Application State encapsulates the information about the > active RM of each sub-cluster that is participating in Federation. The > information includes addresses for ClientRM, ApplicationMaster and Admin > services along with the sub_cluster _capability_ which is currently defined > by *ClusterMetricsInfo*. Please refer to the design doc in parent JIRA for > further details. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5382) RM does not audit log kill request for active applications
[ https://issues.apache.org/jira/browse/YARN-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384758#comment-15384758 ] Vrushali C commented on YARN-5382: -- I see, thanks, sounds good. Will do that. > RM does not audit log kill request for active applications > -- > > Key: YARN-5382 > URL: https://issues.apache.org/jira/browse/YARN-5382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Vrushali C > Attachments: YARN-5382-branch-2.7.01.patch, > YARN-5382-branch-2.7.02.patch > > > ClientRMService will audit a kill request but only if it either fails to > issue the kill or if the kill is sent to an already finished application. It > does not create a log entry when the application is active which is arguably > the most important case to audit. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5404) Add the ability to split reverse zone subnets
Shane Kumpf created YARN-5404: - Summary: Add the ability to split reverse zone subnets Key: YARN-5404 URL: https://issues.apache.org/jira/browse/YARN-5404 Project: Hadoop YARN Issue Type: Sub-task Reporter: Shane Kumpf Assignee: Shane Kumpf In some environments, the entire container subnet may not be used exclusively by containers (i.e. the YARN nodemanager host IPs may also be part of the larger subnet). As a result, the reverse lookup zones created by the YARN Registry DNS server may not match those created on the forwarders. For example:
Network: 172.27.0.0
Subnet: 255.255.248.0
Hosts: 0.27.172.in-addr.arpa 1.27.172.in-addr.arpa 2.27.172.in-addr.arpa 3.27.172.in-addr.arpa
Containers: 4.27.172.in-addr.arpa 5.27.172.in-addr.arpa 6.27.172.in-addr.arpa 7.27.172.in-addr.arpa
Since the total IP count is greater than 256, YARN Registry DNS only allows for creating: 27.172.in-addr.arpa
Provide configuration to further subdivide the subnets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
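As a small illustration of the proposed subdivision (a sketch only, using the /21 network from the example above; this is not the Registry DNS code), a subnet wider than /24 can be split into one in-addr.arpa zone per /24:

{code}
// Sketch: enumerate the per-/24 reverse zones for a wider network, so
// host and container ranges can be assigned to different zones.
public class ReverseZoneSketch {
  public static void main(String[] args) {
    int[] network = {172, 27, 0, 0};
    int prefixLen = 21;                        // netmask 255.255.248.0
    int slash24Count = 1 << (24 - prefixLen);  // 8 zones for a /21
    for (int i = 0; i < slash24Count; i++) {
      int thirdOctet = network[2] + i;
      // Prints 0.27.172.in-addr.arpa through 7.27.172.in-addr.arpa
      System.out.printf("%d.%d.%d.in-addr.arpa%n",
          thirdOctet, network[1], network[0]);
    }
  }
}
{code}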
[jira] [Commented] (YARN-5360) Decouple host user and Docker container user
[ https://issues.apache.org/jira/browse/YARN-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384777#comment-15384777 ] Sidharta Seethana commented on YARN-5360: - [~zyluo], {quote} I think this is inconsistent with Docker's motto to "build, ship and run". There is no point of using Docker if the user has to use every image as a base to add the correct user. {quote} While that may be Docker's motto, the objective of YARN-3611, in my opinion, has never been to use docker for docker's sake - we needed to adapt it to the YARN/hadoop world - hadoop security, log aggregation, localization - all of these need to work. > Decouple host user and Docker container user > > > Key: YARN-5360 > URL: https://issues.apache.org/jira/browse/YARN-5360 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > > There is *a dependency between the job-submitting user and the user in the Docker > image* in LCE currently. For instance, in order to run the Docker container > as the yarn user, we can choose to set > "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user" to yarn > and leave > "yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users" at its > default (true). Then LCE will choose yarn (UID maybe 1001) as the user > running jobs. > LCE will mount the generated launch_container.sh (owned by the running job > user) and /etc/passwd (*currently the code mounts to the container's > /etc/password, which I think is a mistake*) into the Docker container and > utilizes the "docker run --user=" option to get it done internally. > Mounting /etc/passwd into the container is not a good choice because it overrides > the original users defined in the Docker image. As far as I know, since Docker v1.8 > (or maybe earlier), the Docker run command "--user=" option accepts a UID, and > *when passing a UID, the user does not have to exist in the container*. So we > could use the UID instead of the user name to construct the Docker run command to > eliminate the dependency of creating the same user in the Docker image. This > gives LCE the ability to launch any Docker container safely regardless of what > users are in it. > But this is not enough to decouple the host user and the Docker container user. The > final solution we are searching for is focused on allowing users to run > their Docker images flexibly without involving dependencies on YARN, while making > sure the container won't bring in security risks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5382) RM does not audit log kill request for active applications
[ https://issues.apache.org/jira/browse/YARN-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384722#comment-15384722 ] Jason Lowe commented on YARN-5382: -- bq. Will update the patch to include auditing of killing of active apps only. Actually I think we should go with Jian's suggestion. Auditing active apps could still generate duplicate events if the event dispatch is delayed, and Jian's suggestion means we'll only log it once when the app transitions from active to starting the kill processing. We will need to enhance the kill event to include the requesting user and remote IP address so it can be audit logged properly within the RMAppImpl transition. > RM does not audit log kill request for active applications > -- > > Key: YARN-5382 > URL: https://issues.apache.org/jira/browse/YARN-5382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Vrushali C > Attachments: YARN-5382-branch-2.7.01.patch, > YARN-5382-branch-2.7.02.patch > > > ClientRMService will audit a kill request but only if it either fails to > issue the kill or if the kill is sent to an already finished application. It > does not create a log entry when the application is active which is arguably > the most important case to audit. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
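To make the proposed enhancement concrete, here is a hypothetical sketch of a kill event carrying the caller information for the audit log. All class and field names below are illustrative assumptions, not the actual RM event types:

{code}
// Sketch: a kill event that carries the requesting user and remote IP,
// so the KILLING transition can audit exactly once with real caller info
// instead of relying on an RPC context that is no longer present.
import java.net.InetAddress;

public class AppKillEventSketch {
  private final String applicationId;
  private final String diagnostics;
  private final String callerUser;     // requesting user, for the audit log
  private final InetAddress callerIp;  // remote IP, for the audit log

  public AppKillEventSketch(String applicationId, String diagnostics,
      String callerUser, InetAddress callerIp) {
    this.applicationId = applicationId;
    this.diagnostics = diagnostics;
    this.callerUser = callerUser;
    this.callerIp = callerIp;
  }

  // Illustrative audit line built from the explicitly supplied caller bits.
  public String auditLine() {
    return String.format("USER=%s IP=%s OPERATION=Kill Application Request "
        + "APPID=%s DIAGNOSTICS=%s", callerUser,
        callerIp == null ? "N/A" : callerIp.getHostAddress(),
        applicationId, diagnostics);
  }
}
{code}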
[jira] [Comment Edited] (YARN-4091) Improvement: Introduce more debug/diagnostics information to detail out scheduler activity
[ https://issues.apache.org/jira/browse/YARN-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374343#comment-15374343 ] Wangda Tan edited comment on YARN-4091 at 7/19/16 6:33 PM: --- Hi all, Given "YARN-4091.preliminary.1.patch" I uploaded above, here are some brief descriptions of the newly added classes and the test REST API. Newly Added Classes: ActivityManager: - A class to store node or application allocations. It mainly contains operations for allocation start, add, update and finish. NodeAllocation: - It contains allocation information for one allocation in a node heartbeat. Detailed allocation activities are first stored in "AllocationActivity" as operations, then transformed to a tree structure. The tree structure starts from the root queue and ends in a leaf queue, application, or container allocation. AllocationActivity: - It records an activity operation in allocation, which can be classified as a queue, application or container activity. Other information includes state, diagnostic, and priority. ActivityNode: - It represents a tree node in the "NodeAllocation" tree structure. Each node may represent a queue, application or container in allocation activity. A node may have child nodes if allocation successfully proceeds to the next level. ActivityDiagnosticConstant: - Collection of diagnostics. ActivityState: - Collection of activity operation states. AllocationState: - Collection of allocation final states. AllocationActivityType: - Collection of types for activity operation. AppAllocation: - It contains allocation information for one application within a period of time. Each application allocation may have several allocation attempts. ActivitiesInfo: - DAO object to display node allocation activity. NodeAllocationInfo: - DAO object to display each node allocation in a node heartbeat. ActivityNodeInfo: - DAO object to display node information in the allocation tree. It corresponds to the "ActivityNode" class. AppActivitiesInfo: - DAO object to display application activity. AppAllocationInfo: - DAO object to display detailed application allocation information. Test REST API: - Look at the next node's activities (by default): http://localhost:18088/ws/v1/cluster/scheduler/activities - Only look at a specific node: http://localhost:18088/ws/v1/cluster/scheduler/activities?nodeId=node-87:75 OR without port number http://localhost:18088/ws/v1/cluster/scheduler/activities?nodeId=node-87 - Look at activities for a specific application within a period of time (3s by default): http://localhost:18088/ws/v1/cluster/scheduler/app-activities?appId=application_1468198570845_0022, http://localhost:18088/ws/v1/cluster/scheduler/app-activities?appId=application_1468198570845_0022=5.2 Test class: - TestRMWebServicesCapacitySched.java org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched#testActivityJSON org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched#testAppActivityJSON Thanks for review. Please feel free to put forward any suggestions for improvements.
[jira] [Commented] (YARN-5382) RM does not audit log kill request for active applications
[ https://issues.apache.org/jira/browse/YARN-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384627#comment-15384627 ] Vrushali C commented on YARN-5382: -- Thanks [~jlowe] and [~jianhe]! Will update the patch to include auditing of killing of active apps only. > RM does not audit log kill request for active applications > -- > > Key: YARN-5382 > URL: https://issues.apache.org/jira/browse/YARN-5382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Vrushali C > Attachments: YARN-5382-branch-2.7.01.patch, > YARN-5382-branch-2.7.02.patch > > > ClientRMService will audit a kill request but only if it either fails to > issue the kill or if the kill is sent to an already finished application. It > does not create a log entry when the application is active which is arguably > the most important case to audit. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5264) Use FSQueue to store queue-specific information
[ https://issues.apache.org/jira/browse/YARN-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5264: --- Attachment: YARN-5264.001.patch > Use FSQueue to store queue-specific information > --- > > Key: YARN-5264 > URL: https://issues.apache.org/jira/browse/YARN-5264 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5264.001.patch > > > Use FSQueue to store queue-specific information instead of querying > AllocationConfiguration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5392) Replace use of Priority in the Scheduling infrastructure with an opaque SchedulerKey
[ https://issues.apache.org/jira/browse/YARN-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384596#comment-15384596 ] Arun Suresh commented on YARN-5392: --- ping [~kasha], [~subru]... wondering if you might be able to give this a quick look. > Replace use of Priority in the Scheduling infrastructure with an opaque > SchedulerKey > --- > > Key: YARN-5392 > URL: https://issues.apache.org/jira/browse/YARN-5392 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5392.001.patch, YARN-5392.002.patch, > YARN-5392.003.patch > > > Based on discussions in YARN-4888, this jira proposes to replace the use of > {{Priority}} in the Scheduler infrastructure (Scheduler, Queues, SchedulerApp > / Node etc.) with a more opaque and extensible {{SchedulerKey}}. > Note: Even though {{SchedulerKey}} will be used by the internal scheduling > infrastructure, it will not be exposed to the Client or the AM. The > SchedulerKey is meant to be an internal construct that is derived from > attributes of the ResourceRequest / ApplicationSubmissionContext / Scheduler > Configuration etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
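For context, here is a rough, hypothetical sketch of the "opaque key" idea (the real patch defines its own SchedulerKey type; this illustration assumes priority is the sole ingredient today, with room to add more):

{code}
// Sketch: scheduler internals compare opaque keys without knowing they
// are (currently) derived from Priority. Names are illustrative only.
public final class SchedulerKeySketch
    implements Comparable<SchedulerKeySketch> {
  private final int priority;  // more attributes could be folded in later

  private SchedulerKeySketch(int priority) {
    this.priority = priority;
  }

  public static SchedulerKeySketch create(int priority) {
    return new SchedulerKeySketch(priority);
  }

  @Override
  public int compareTo(SchedulerKeySketch other) {
    return Integer.compare(other.priority, this.priority); // higher first
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof SchedulerKeySketch
        && ((SchedulerKeySketch) o).priority == priority;
  }

  @Override
  public int hashCode() {
    return priority;
  }
}
{code}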
[jira] [Updated] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-679: Attachment: YARN-679-010.patch Patch 010 addresses complaints from checkstyle *as far as I consider necessary.* Specifically, sometimes it is better if lines do go beyond 80 chars, and you can be a bit less rigorous in test code than in production about accessibility of fields. The complaints about IrqHandler using forbidden classes are valid; it's intended to be a single place for this. Ultimately, other uses in the Hadoop code could be replaced with this. > add an entry point that can start any Yarn service > -- > > Key: YARN-679 > URL: https://issues.apache.org/jira/browse/YARN-679 > Project: Hadoop YARN > Issue Type: New Feature > Components: api >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-679-001.patch, YARN-679-002.patch, > YARN-679-002.patch, YARN-679-003.patch, YARN-679-004.patch, > YARN-679-005.patch, YARN-679-006.patch, YARN-679-007.patch, > YARN-679-008.patch, YARN-679-009.patch, YARN-679-010.patch, > org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf > > Time Spent: 72h > Remaining Estimate: 0h > > There's no need to write separate .main classes for every Yarn service, given > that the startup mechanism should be identical: create, init, start, wait for > stopped - with an interrupt handler to trigger a clean shutdown on a control-c > interrupt. > Provide one that takes any classname, and a list of config files/options -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
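A minimal sketch of the entry-point idea described in this issue, assuming only the public org.apache.hadoop.service.Service lifecycle (the patch itself adds much richer configuration and argument handling; this is not the patch's ServiceLauncher):

{code}
// Sketch: instantiate any Service by classname, init/start it, and hook
// a clean shutdown on control-c via a JVM shutdown hook.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.Service;

public class ServiceLauncherSketch {
  public static void main(String[] args) throws Exception {
    String classname = args[0];
    final Service service =
        (Service) Class.forName(classname).newInstance();
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        service.stop();            // triggered on SIGINT/SIGTERM
      }
    });
    service.init(new Configuration());
    service.start();
    service.waitForServiceToStop(Long.MAX_VALUE); // block until stopped
  }
}
{code}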
[jira] [Commented] (YARN-5401) yarn application kill does not let mapreduce jobs show up in jobhistory
[ https://issues.apache.org/jira/browse/YARN-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384529#comment-15384529 ] Jason Lowe commented on YARN-5401: -- Yes, if an application framework provides a kill command then that should be preferred over the yarn kill approach. The MapReduce framework kill will automatically fall back to the yarn kill if the application master is unresponsive or if the job fails to enter the killed state within a configurable amount of time (controlled via yarn.app.mapreduce.am.hard-kill-timeout-ms). > yarn application kill does not let mapreduce jobs show up in jobhistory > --- > > Key: YARN-5401 > URL: https://issues.apache.org/jira/browse/YARN-5401 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Environment: centos 6.6 > apache hadoop 2.6.4 >Reporter: Nikhil Mulley > > Hi, > It's been found in our cluster running apache hadoop 2.6.4 that while the > mapreduce jobs that are killed with the 'hadoop job -kill' command do end up having > the job and its counters reported to the jobhistory server, when 'yarn application > -kill' is used on a mapreduce application the job does not show up in the jobhistory > server interface. > Is this intentional? If so, any particular reasons? > It would be better to have mapreduce application history reported on > jobhistory irrespective of whether the kill is performed using the yarn application > cli or the hadoop job cli. > thanks, > Nikhil -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5342) Improve non-exclusive node partition resource allocation in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384522#comment-15384522 ] Wangda Tan commented on YARN-5342: -- [~Naganarasimha], bq. Because in next NonExclusive mode allocation for the node of this partition might skip this application for which reset happened but might allocate to another application but still that partition might have pending resource requests. IIUC, we now do allocation twice for a shareable node partition: the first is for exclusive allocation and the second is for shareable allocation. This already implicitly confirms that the non-exclusive allocation is safe. Please let me know if I missed anything. I want to check this patch in as soon as possible for 2.8 and do more comprehensive work in follow-up JIRAs. Thanks, > Improve non-exclusive node partition resource allocation in Capacity Scheduler > -- > > Key: YARN-5342 > URL: https://issues.apache.org/jira/browse/YARN-5342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: YARN-5342.1.patch, YARN-5342.2.patch > > > In the previous implementation, one non-exclusive container allocation is > possible when the missed-opportunity >= #cluster-nodes. And > missed-opportunity will be reset when a container is allocated to any node. > This will slow down the frequency of container allocation on a non-exclusive > node partition: *When a non-exclusive partition=x has idle resources, we can > only allocate one container for this app in every > X = #nodemanagers * heartbeat-interval secs for the whole cluster.* > In this JIRA, I propose a fix to reset missed-opportunity only if we have >0 > pending resources for the non-exclusive partition OR we get an allocation from > the default partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
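A hedged sketch of the reset rule proposed in the description (the structure and names below are illustrative, not the CapacityScheduler code): reset the missed-opportunity counter only when progress is still possible elsewhere, so non-exclusive nodes stay eligible instead of being throttled to one allocation per heartbeat sweep.

{code}
// Sketch of the proposed reset rule for the missed-opportunity counter.
public class MissedOpportunitySketch {
  private int missedOpportunities;

  void onAllocation(boolean fromDefaultPartition,
      long pendingOnNonExclusivePartition) {
    if (fromDefaultPartition || pendingOnNonExclusivePartition > 0) {
      missedOpportunities = 0;   // safe to reset: progress still possible
    }
    // otherwise keep the counter so non-exclusive nodes stay eligible
  }

  void onMiss() {
    missedOpportunities++;       // incremented once per skipped node
  }

  boolean canUseNonExclusiveNode(int clusterNodeCount) {
    return missedOpportunities >= clusterNodeCount;
  }
}
{code}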
[jira] [Commented] (YARN-5401) yarn application kill does not let mapreduce jobs show up in jobhistory
[ https://issues.apache.org/jira/browse/YARN-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384516#comment-15384516 ] Nikhil Mulley commented on YARN-5401: - So, should apps always use app-specific methods to kill their jobs, and never use yarn kill unless really necessary? (Like always using kill (TERM) unless kill -9 becomes necessary.) > yarn application kill does not let mapreduce jobs show up in jobhistory > --- > > Key: YARN-5401 > URL: https://issues.apache.org/jira/browse/YARN-5401 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Environment: centos 6.6 > apache hadoop 2.6.4 >Reporter: Nikhil Mulley > > Hi, > It's been found in our cluster running apache hadoop 2.6.4 that while the > mapreduce jobs that are killed with the 'hadoop job -kill' command do end up having > the job and its counters reported to the jobhistory server, when 'yarn application > -kill' is used on a mapreduce application the job does not show up in the jobhistory > server interface. > Is this intentional? If so, any particular reasons? > It would be better to have mapreduce application history reported on > jobhistory irrespective of whether the kill is performed using the yarn application > cli or the hadoop job cli. > thanks, > Nikhil -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3477) TimelineClientImpl swallows exceptions
[ https://issues.apache.org/jira/browse/YARN-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384510#comment-15384510 ] Li Lu commented on YARN-3477: - Sure. Let me find some time to finish this. Thanks [~ste...@apache.org]! > TimelineClientImpl swallows exceptions > -- > > Key: YARN-3477 > URL: https://issues.apache.org/jira/browse/YARN-3477 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.6.0, 2.7.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-3477-001.patch, YARN-3477-002.patch > > > If the timeline client fails more than the retry count, the original exception is > not thrown. Instead, some runtime exception is raised saying "retries run out" > # The failing exception should be rethrown, ideally via > NetUtils.wrapException, to include the URL of the failing endpoint > # Otherwise, the raised RTE should (a) state that URL and (b) set the > original fault as the inner cause -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
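A sketch of the requested rethrow behavior, assuming NetUtils.wrapException's (destHost, destPort, localHost, localPort, exception) signature; the retry loop and callable type below are simplified illustrations, not the TimelineClientImpl code:

{code}
// Sketch: when retries run out, rethrow the original fault wrapped with
// the failing endpoint's address instead of a bare RuntimeException.
import java.io.IOException;
import java.net.InetAddress;
import org.apache.hadoop.net.NetUtils;

public class RetryRethrowSketch {
  interface IoCall<T> {
    T call() throws IOException;
  }

  static <T> T runWithRetries(int maxRetries, String host, int port,
      IoCall<T> op) throws IOException {
    IOException lastFailure = null;
    for (int i = 0; i <= maxRetries; i++) {
      try {
        return op.call();
      } catch (IOException e) {
        lastFailure = e;           // remember the original fault
      }
    }
    // Annotate with source/destination details and rethrow the real cause.
    throw NetUtils.wrapException(host, port,
        InetAddress.getLocalHost().getHostName(), 0, lastFailure);
  }
}
{code}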
[jira] [Commented] (YARN-5352) Allow container-executor to use private /tmp
[ https://issues.apache.org/jira/browse/YARN-5352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384496#comment-15384496 ] Nathan Roberts commented on YARN-5352: -- This patch doesn't address localization. The thinking was that localization doesn't run application code, so while it may create files in /tmp (e.g. hsperf*), I wouldn't expect that to be a significant problem. I can look into addressing localization as well if folks think it's important. > Allow container-executor to use private /tmp > - > > Key: YARN-5352 > URL: https://issues.apache.org/jira/browse/YARN-5352 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-5352-v0.patch > > > It's very common for user code to create things in /tmp. Yes, applications > have means to specify alternate tmp directories but doing so is opt-in and > therefore doesn't happen in many cases. At a minimum, linux can use private > namespaces to create a private /tmp for each container so that it's using the > same space allocated to containers and it's automatically cleaned up as part > of container clean-up. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5352) Allow container-executor to use private /tmp
[ https://issues.apache.org/jira/browse/YARN-5352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated YARN-5352: - Attachment: YARN-5352-v0.patch Patch that uses linux private namespaces and bind mounts to achieve a private /tmp. > Allow container-executor to use private /tmp > - > > Key: YARN-5352 > URL: https://issues.apache.org/jira/browse/YARN-5352 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-5352-v0.patch > > > It's very common for user code to create things in /tmp. Yes, applications > have means to specify alternate tmp directories but doing so is opt-in and > therefore doesn't happen in many cases. At a minimum, linux can use private > namespaces to create a private /tmp for each container so that it's using the > same space allocated to containers and it's automatically cleaned up as part > of container clean-up. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5340) Race condition in RollingLevelDBTimelineStore#getAndSetStartTime()
[ https://issues.apache.org/jira/browse/YARN-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384489#comment-15384489 ] Junping Du commented on YARN-5340: -- I think another side effect of the current patch is that it forces access to {{startTimeWriteCache.get(entity);}} to acquire a lock first, which affects every put-entity operation. One way to make the lock finer-grained is to take the lock only when {{startTimeWriteCache.get(entity);}} doesn't get anything. > Race condition in RollingLevelDBTimelineStore#getAndSetStartTime() > -- > > Key: YARN-5340 > URL: https://issues.apache.org/jira/browse/YARN-5340 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Li Lu >Priority: Critical > Attachments: YARN-5340-trunk.001.patch > > > App Name/User/RPC Port/AM Host info is missing from ATS web service or YARN > CLI's app info > {code} > RUNNING: /usr/hdp/current/hadoop-yarn-client/bin/yarn --config > /tmp/hadoopConf application -status application_1467931619679_0001 > Application Report : > Application-Id : application_1467931619679_0001 > Application-Name : null > Application-Type : null > User : null > Queue : null > Application Priority : null > Start-Time : 0 > Finish-Time : 1467931672057 > Progress : 100% > State : FINISHED > Final-State : SUCCEEDED > Tracking-URL : N/A > RPC Port : -1 > AM Host : N/A > Aggregate Resource Allocation : 290014 MB-seconds, 74 vcore-seconds > Log Aggregation Status : N/A > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5382) RM does not audit log kill request for active applications
[ https://issues.apache.org/jira/browse/YARN-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384373#comment-15384373 ] Jason Lowe commented on YARN-5382: -- I like the general idea, but I'm not sure a literal move of the audit log to the transition will work. The audit log will try to log the remote IP of the caller, but at the AppKilledTransition we're no longer in an RPC context so there's no remote caller. The basic information is actually in the kill message after YARN-5053, but not in a way that we can pull apart and pass as individual pieces of information to the audit logger (e.g.: user, remote IP, etc.). We could extend the kill event to optionally contain those bits then extend the audit logger API so we can manually specify the user and remote IP rather than have the audit logger always assume it can get them on its own. > RM does not audit log kill request for active applications > -- > > Key: YARN-5382 > URL: https://issues.apache.org/jira/browse/YARN-5382 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Vrushali C > Attachments: YARN-5382-branch-2.7.01.patch, > YARN-5382-branch-2.7.02.patch > > > ClientRMService will audit a kill request but only if it either fails to > issue the kill or if the kill is sent to an already finished application. It > does not create a log entry when the application is active which is arguably > the most important case to audit. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5401) yarn application kill does not let mapreduce jobs show up in jobhistory
[ https://issues.apache.org/jira/browse/YARN-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384330#comment-15384330 ] Jason Lowe commented on YARN-5401: -- This is effectively a duplicate of YARN-2261. MapReduce history requires the MapReduce ApplicationMaster to generate the history when it completes. hadoop job -kill or mapred job -kill accomplishes the kill by having the client connect to the MapReduce ApplicationMaster for the job and ask it to kill the job. Since this goes through the ApplicationMaster, it allows the history to be generated properly. When the kill is done via YARN, the ApplicationMaster is not involved. The ResourceManager kills the AM without the AM's knowledge. This is similar to kill vs. kill -9 (i.e.: SIGTERM vs SIGKILL) in POSIX. The former allows the application to perform cleanup tasks on the way down, while the latter mercilessly kills the process without any chance for cleanup. Since YARN does not allow the application to specify a cleanup task to be performed when the app dies, the MapReduce framework doesn't get a chance to finish generating the history for the job. > yarn application kill does not let mapreduce jobs show up in jobhistory > --- > > Key: YARN-5401 > URL: https://issues.apache.org/jira/browse/YARN-5401 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Environment: centos 6.6 > apache hadoop 2.6.4 >Reporter: Nikhil Mulley > > Hi, > It's been found in our cluster running apache hadoop 2.6.4 that while the > mapreduce jobs that are killed with the 'hadoop job -kill' command do end up having > the job and its counters reported to the jobhistory server, when 'yarn application > -kill' is used on a mapreduce application the job does not show up in the jobhistory > server interface. > Is this intentional? If so, any particular reasons? > It would be better to have mapreduce application history reported on > jobhistory irrespective of whether the kill is performed using the yarn application > cli or the hadoop job cli. > thanks, > Nikhil -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5213) Fix a bug in LogCLIHelpers which cause TestLogsCLI#testFetchApplictionLogs fails intermittently
[ https://issues.apache.org/jira/browse/YARN-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384289#comment-15384289 ] Hudson commented on YARN-5213: -- SUCCESS: Integrated in Hadoop-trunk-Commit #10118 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10118/]) YARN-5213. Fix a bug in LogCLIHelpers which cause (junping_du: rev dc2f4b6ac8a6f8848457046cf9e1362d8f48495d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestLogsCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java > Fix a bug in LogCLIHelpers which cause TestLogsCLI#testFetchApplictionLogs > fails intermittently > --- > > Key: YARN-5213 > URL: https://issues.apache.org/jira/browse/YARN-5213 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Xuan Gong > Fix For: 2.9.0 > > Attachments: YARN-5213.1.patch, YARN-5213.2.patch, YARN-5213.patch > > > TestLogsCLI fails intermittently on build > [link|https://builds.apache.org/job/PreCommit-YARN-Build/11910/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt] > {noformat} > Running org.apache.hadoop.yarn.client.cli.TestLogsCLI > Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.708 sec > <<< FAILURE! - in org.apache.hadoop.yarn.client.cli.TestLogsCLI > testFetchApplictionLogs(org.apache.hadoop.yarn.client.cli.TestLogsCLI) Time > elapsed: 0.176 sec <<< FAILURE! > org.junit.ComparisonFailure: expected:<[Hello]> but was:<[=]> > at org.junit.Assert.assertEquals(Assert.java:115) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.yarn.client.cli.TestLogsCLI.testFetchApplictionLogs(TestLogsCLI.java:389) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5213) Fix a bug in LogCLIHelpers which cause TestLogsCLI#testFetchApplictionLogs fails intermittently
[ https://issues.apache.org/jira/browse/YARN-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5213: - Summary: Fix a bug in LogCLIHelpers which cause TestLogsCLI#testFetchApplictionLogs fails intermittently (was: TestLogsCLI#testFetchApplictionLogs fails intermittently) > Fix a bug in LogCLIHelpers which cause TestLogsCLI#testFetchApplictionLogs > fails intermittently > --- > > Key: YARN-5213 > URL: https://issues.apache.org/jira/browse/YARN-5213 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Xuan Gong > Attachments: YARN-5213.1.patch, YARN-5213.2.patch, YARN-5213.patch > > > TestLogsCLI fails intermittently on build > [link|https://builds.apache.org/job/PreCommit-YARN-Build/11910/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt] > {noformat} > Running org.apache.hadoop.yarn.client.cli.TestLogsCLI > Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.708 sec > <<< FAILURE! - in org.apache.hadoop.yarn.client.cli.TestLogsCLI > testFetchApplictionLogs(org.apache.hadoop.yarn.client.cli.TestLogsCLI) Time > elapsed: 0.176 sec <<< FAILURE! > org.junit.ComparisonFailure: expected:<[Hello]> but was:<[=]> > at org.junit.Assert.assertEquals(Assert.java:115) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.yarn.client.cli.TestLogsCLI.testFetchApplictionLogs(TestLogsCLI.java:389) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5403) yarn top command does not execute correctly
[ https://issues.apache.org/jira/browse/YARN-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384148#comment-15384148 ] Hadoop QA commented on YARN-5403: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 5 new + 152 unchanged - 0 fixed = 157 total (was 152) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 16s {color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 21m 12s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.api.impl.TestYarnClient | | | hadoop.yarn.client.cli.TestLogsCLI | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818813/YARN-5403.patch | | JIRA Issue | YARN-5403 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 7ac530d3bc47 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fe20494 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12372/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/12372/artifact/patchprocess/whitespace-eol.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/12372/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12372/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12372/testReport/ | | modules | C:
[jira] [Updated] (YARN-5403) yarn top command does not execute correctly
[ https://issues.apache.org/jira/browse/YARN-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gu-chi updated YARN-5403: - Attachment: YARN-5403.patch > yarn top command does not execute correctly > - > > Key: YARN-5403 > URL: https://issues.apache.org/jira/browse/YARN-5403 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.2 >Reporter: gu-chi > Attachments: YARN-5403.patch > > > when executing {{yarn top}}, I always get an exception as below: > {quote} > 16/07/19 19:55:12 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:747) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:443) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:421) > YARN top - 19:55:13, up 17001d, 11:55, 0 active users, queue(s): root > {quote} > As I looked into it, the function {{getRMStartTime}} hardcodes HTTP > no matter what the {{yarn.http.policy}} setting is; it should consider > using HTTPS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
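A sketch of the fix direction described in the report, assuming the existing YarnConfiguration.useHttps(conf) helper; the URL construction below is simplified for illustration and is not the attached patch:

{code}
// Sketch: derive the scheme from yarn.http.policy rather than
// hardcoding "http://" when building the RM web service URL.
import java.net.URL;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmUrlSketch {
  static URL rmClusterInfoUrl(Configuration conf, String rmAddress)
      throws Exception {
    String scheme =
        YarnConfiguration.useHttps(conf) ? "https://" : "http://";
    return new URL(scheme + rmAddress + "/ws/v1/cluster/info");
  }
}
{code}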
[jira] [Updated] (YARN-5403) yarn top command does not execute correctly
[ https://issues.apache.org/jira/browse/YARN-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gu-chi updated YARN-5403: - Attachment: (was: YARN-5403.patch) > yarn top command does not execute correctly > - > > Key: YARN-5403 > URL: https://issues.apache.org/jira/browse/YARN-5403 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.2 >Reporter: gu-chi > > when executing {{yarn top}}, I always get an exception as below: > {quote} > 16/07/19 19:55:12 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:747) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:443) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:421) > YARN top - 19:55:13, up 17001d, 11:55, 0 active users, queue(s): root > {quote} > As I looked into it, the function {{getRMStartTime}} hardcodes HTTP > no matter what the {{yarn.http.policy}} setting is; it should consider > using HTTPS -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5403) yarn top command does not execute correctly
[ https://issues.apache.org/jira/browse/YARN-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384066#comment-15384066 ] Hadoop QA commented on YARN-5403: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} YARN-5403 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818806/YARN-5403.patch | | JIRA Issue | YARN-5403 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12371/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > yarn top command does not execute correctly > - > > Key: YARN-5403 > URL: https://issues.apache.org/jira/browse/YARN-5403 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.2 >Reporter: gu-chi > Attachments: YARN-5403.patch > > > When I execute {{yarn top}}, I always get the exception below: > {quote} > 16/07/19 19:55:12 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:747) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:443) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:421) > YARN top - 19:55:13, up 17001d, 11:55, 0 active users, queue(s): root > {quote} > Looking into it, the function {{getRMStartTime}} hardcodes HTTP > no matter what the {{yarn.http.policy}} setting is; it should use > HTTPS when that policy requires it -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5403) yarn top command does not execute correctly
[ https://issues.apache.org/jira/browse/YARN-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gu-chi updated YARN-5403: - Attachment: YARN-5403.patch > yarn top command does not execute correctly > - > > Key: YARN-5403 > URL: https://issues.apache.org/jira/browse/YARN-5403 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.2 >Reporter: gu-chi > Attachments: YARN-5403.patch > > > When I execute {{yarn top}}, I always get the exception below: > {quote} > 16/07/19 19:55:12 ERROR cli.TopCLI: Could not fetch RM start time > java.net.ConnectException: Connection refused > at java.net.PlainSocketImpl.socketConnect(Native Method) > at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) > at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > at java.net.Socket.connect(Socket.java:589) > at java.net.Socket.connect(Socket.java:538) > at sun.net.NetworkClient.doConnect(NetworkClient.java:180) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) > at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) > at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) > at sun.net.www.http.HttpClient.New(HttpClient.java:308) > at sun.net.www.http.HttpClient.New(HttpClient.java:326) > at > sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) > at > sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) > at > sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) > at > org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:747) > at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:443) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:421) > YARN top - 19:55:13, up 17001d, 11:55, 0 active users, queue(s): root > {quote} > Looking into it, the function {{getRMStartTime}} hardcodes HTTP > no matter what the {{yarn.http.policy}} setting is; it should use > HTTPS when that policy requires it -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5403) yarn top command does not execute correctly
gu-chi created YARN-5403: Summary: yarn top command does not execute correctly Key: YARN-5403 URL: https://issues.apache.org/jira/browse/YARN-5403 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.2 Reporter: gu-chi When I execute {{yarn top}}, I always get the exception below: {quote} 16/07/19 19:55:12 ERROR cli.TopCLI: Could not fetch RM start time java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589) at java.net.Socket.connect(Socket.java:538) at sun.net.NetworkClient.doConnect(NetworkClient.java:180) at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) at sun.net.www.http.HttpClient.New(HttpClient.java:308) at sun.net.www.http.HttpClient.New(HttpClient.java:326) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) at org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:747) at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:443) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:421) YARN top - 19:55:13, up 17001d, 11:55, 0 active users, queue(s): root {quote} Looking into it, the function {{getRMStartTime}} hardcodes HTTP no matter what the {{yarn.http.policy}} setting is; it should use HTTPS when that policy requires it -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
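For illustration, here is a minimal sketch of the direction the report suggests: derive the scheme from {{yarn.http.policy}} instead of hardcoding {{http://}}. The class, method, and argument names are assumptions for this sketch, not the actual {{TopCLI}} code or patch; only the config key and its values are taken from YARN.
{code:java}
// Sketch only: choose http/https from yarn.http.policy rather than a
// hardcoded "http://" prefix. RmUrlSketch, rmInfoUrl and rmWebAddress
// are illustrative names, not part of the actual TopCLI patch.
import java.net.URL;
import org.apache.hadoop.conf.Configuration;

public class RmUrlSketch {
  static URL rmInfoUrl(Configuration conf, String rmWebAddress) throws Exception {
    String policy = conf.get("yarn.http.policy", "HTTP_ONLY");
    String scheme = "HTTPS_ONLY".equals(policy) ? "https://" : "http://";
    // e.g. https://<rm-host>:<https-port>/ws/v1/cluster/info under HTTPS_ONLY
    return new URL(scheme + rmWebAddress + "/ws/v1/cluster/info");
  }
}
{code}
With {{yarn.http.policy=HTTPS_ONLY}}, the RM web endpoint only answers over TLS, which is exactly why the hardcoded-HTTP fetch above fails with Connection refused.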
[jira] [Commented] (YARN-3477) TimelineClientImpl swallows exceptions
[ https://issues.apache.org/jira/browse/YARN-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384024#comment-15384024 ] Steve Loughran commented on YARN-3477: -- It would have been really nice if this patch had been reviewed and committed while it still compiled against the code as it was; it'll now be a significant piece of work to merge in. I don't have the time this week, and am off on vacation from Friday. Can you look at the core changes (the exception logging, wrapping, and {{InterruptedIOException}} rethrowing) and replicate them in your ongoing work? That's all that matters. I'll watch your JIRA and review it. > TimelineClientImpl swallows exceptions > -- > > Key: YARN-3477 > URL: https://issues.apache.org/jira/browse/YARN-3477 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.6.0, 2.7.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-3477-001.patch, YARN-3477-002.patch > > > If the timeline client fails more than the retry count, the original exception is > not thrown. Instead a runtime exception is raised saying "retries run out" > # the failing exception should be rethrown, ideally via > NetUtils.wrapException to include the URL of the failing endpoint > # Otherwise, the raised RTE should (a) state that URL and (b) set the > original fault as the inner cause -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
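To make the requested behaviour concrete, a hedged sketch of such a retry loop follows. All names here are illustrative rather than the patch itself, and the issue additionally suggests routing the wrapping through {{NetUtils.wrapException}} to add host/port detail, which is not shown.
{code:java}
// Illustrative retry wrapper, not the actual TimelineClientImpl code:
// rethrow InterruptedIOException immediately, and surface the failing
// endpoint plus the original fault instead of a bare "retries run out".
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.URI;
import java.util.concurrent.Callable;

public class RetrySketch {
  static <T> T withRetries(URI endpoint, int maxRetries, Callable<T> op)
      throws IOException {
    IOException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return op.call();
      } catch (InterruptedIOException iioe) {
        throw iioe;                 // never swallow interrupts
      } catch (IOException ioe) {
        last = ioe;                 // remember the real failure and retry
      } catch (Exception e) {
        throw new IOException("Failure talking to " + endpoint, e);
      }
    }
    // state the URL and keep the original fault as the inner cause
    throw new IOException("Gave up after " + maxRetries
        + " retries against " + endpoint, last);
  }
}
{code}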
[jira] [Commented] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384016#comment-15384016 ] Hadoop QA commented on YARN-5309: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 21s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 35s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 0 new + 21 unchanged - 2 fixed = 21 total (was 23) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 15s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 34s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 36m 33s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:5af2af1 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818795/YARN-5309.branch-2.8.001.patch | | JIRA Issue | YARN-5309 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle | | uname | Linux 2623ade54897
[jira] [Commented] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384009#comment-15384009 ] Hadoop QA commented on YARN-679: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 1s {color} | {color:green} The patch appears to include 22 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 17s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 7m 17s {color} | {color:red} root generated 8 new + 709 unchanged - 0 fixed = 717 total (was 709) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s {color} | {color:red} hadoop-common-project/hadoop-common: The patch generated 143 new + 119 unchanged - 34 fixed = 262 total (was 153) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 73 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 17s {color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 29s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818791/YARN-679-009.patch | | JIRA Issue | YARN-679 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 1190dfe4fb46 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fe20494 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/12368/artifact/patchprocess/diff-compile-javac-root.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12368/artifact/patchprocess/diff-checkstyle-hadoop-common-project_hadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/12368/artifact/patchprocess/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12368/testReport/ | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12368/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > add an entry point that can start any Yarn service >
[jira] [Commented] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384008#comment-15384008 ] Hadoop QA commented on YARN-5309: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 2s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 0 new + 21 unchanged - 2 fixed = 21 total (was 23) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 5s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 23s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_101. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 37m 44s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:5af2af1 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818795/YARN-5309.branch-2.8.001.patch | | JIRA Issue | YARN-5309 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle | | uname | Linux cdbcdc5c7f80
[jira] [Updated] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-5309: -- Attachment: (was: YARN-5309.branch-2.8.001.patch) > Fix SSLFactory truststore reloader thread leak in TimelineClientImpl > > > Key: YARN-5309 > URL: https://issues.apache.org/jira/browse/YARN-5309 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver, yarn >Affects Versions: 2.7.1 >Reporter: Thomas Friedrich >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-5309.001.patch, YARN-5309.002.patch, > YARN-5309.003.patch, YARN-5309.004.patch, YARN-5309.005.patch, > YARN-5309.branch-2.7.3.001.patch, YARN-5309.branch-2.8.001.patch > > > We found an issue similar to HADOOP-11368 in TimelineClientImpl. The class > creates an instance of SSLFactory in newSslConnConfigurator and subsequently > creates the ReloadingX509TrustManager instance which in turn starts a trust > store reloader thread. > However, the SSLFactory is never destroyed and hence the trust store reloader > threads are not killed. > This problem was observed by a customer who had SSL enabled in Hadoop and > submitted many queries against the HiveServer2. After a few days, the HS2 > instance crashed and from the Java dump we could see many (over 13000) > threads like this: > "Truststore reloader thread" #126 daemon prio=5 os_prio=0 > tid=0x7f680d2e3000 nid=0x98fd waiting on > condition [0x7f67e482c000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run > (ReloadingX509TrustManager.java:225) > at java.lang.Thread.run(Thread.java:745) > HiveServer2 uses the JobClient to submit a job: > Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at > line 89 in > ReloadingX509TrustManager)) > owns: Object (id=464) > owns: Object (id=465) > owns: Object (id=466) > owns: ServiceLoader (id=210) > ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 > FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209 > SSLFactory.init() line: 131 > TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 > TimelineClientImpl.newConnConfigurator(Configuration) line: 507 > TimelineClientImpl.serviceInit(Configuration) line: 269 > TimelineClientImpl(AbstractService).init(Configuration) line: 163 > YarnClientImpl.serviceInit(Configuration) line: 169 > YarnClientImpl(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.serviceInit(Configuration) line: 102 > ResourceMgrDelegate(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.<init>(YarnConfiguration) line: 96 > YARNRunner.<init>(Configuration) line: 112 > YarnClientProtocolProvider.create(Configuration) line: 34 > Cluster.initialize(InetSocketAddress, Configuration) line: 95 > Cluster.<init>(InetSocketAddress, Configuration) line: 82 > Cluster.<init>(Configuration) line: 75 > JobClient.init(JobConf) line: 475 > JobClient.<init>(JobConf) line: 454 > MapRedTask(ExecDriver).execute(DriverContext) line: 401 > MapRedTask.execute(DriverContext) line: 137 > MapRedTask(Task).executeTask() line: 160 > TaskRunner.runSequential() line: 88 > Driver.launchTask(Task, String, boolean, String, int, > DriverContext) line: 1653 > Driver.execute() line: 1412 > For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl > is created.
But because the HS2 process stays up for days, the previous trust > store reloader threads are still hanging around in the HS2 process and > eventually consume all the available resources. > It seems like a fix similar to the one in HADOOP-11368 is needed in TimelineClientImpl, > but it doesn't have a destroy method to begin with. > One option to avoid this problem is to disable the yarn timeline service > (yarn.timeline-service.enabled=false). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-5309: -- Attachment: YARN-5309.branch-2.8.001.patch > Fix SSLFactory truststore reloader thread leak in TimelineClientImpl > > > Key: YARN-5309 > URL: https://issues.apache.org/jira/browse/YARN-5309 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver, yarn >Affects Versions: 2.7.1 >Reporter: Thomas Friedrich >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-5309.001.patch, YARN-5309.002.patch, > YARN-5309.003.patch, YARN-5309.004.patch, YARN-5309.005.patch, > YARN-5309.branch-2.7.3.001.patch, YARN-5309.branch-2.8.001.patch > > > We found an issue similar to HADOOP-11368 in TimelineClientImpl. The class > creates an instance of SSLFactory in newSslConnConfigurator and subsequently > creates the ReloadingX509TrustManager instance which in turn starts a trust > store reloader thread. > However, the SSLFactory is never destroyed and hence the trust store reloader > threads are not killed. > This problem was observed by a customer who had SSL enabled in Hadoop and > submitted many queries against the HiveServer2. After a few days, the HS2 > instance crashed and from the Java dump we could see many (over 13000) > threads like this: > "Truststore reloader thread" #126 daemon prio=5 os_prio=0 > tid=0x7f680d2e3000 nid=0x98fd waiting on > condition [0x7f67e482c000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run > (ReloadingX509TrustManager.java:225) > at java.lang.Thread.run(Thread.java:745) > HiveServer2 uses the JobClient to submit a job: > Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at > line 89 in > ReloadingX509TrustManager)) > owns: Object (id=464) > owns: Object (id=465) > owns: Object (id=466) > owns: ServiceLoader (id=210) > ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 > FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209 > SSLFactory.init() line: 131 > TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 > TimelineClientImpl.newConnConfigurator(Configuration) line: 507 > TimelineClientImpl.serviceInit(Configuration) line: 269 > TimelineClientImpl(AbstractService).init(Configuration) line: 163 > YarnClientImpl.serviceInit(Configuration) line: 169 > YarnClientImpl(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.serviceInit(Configuration) line: 102 > ResourceMgrDelegate(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.<init>(YarnConfiguration) line: 96 > YARNRunner.<init>(Configuration) line: 112 > YarnClientProtocolProvider.create(Configuration) line: 34 > Cluster.initialize(InetSocketAddress, Configuration) line: 95 > Cluster.<init>(InetSocketAddress, Configuration) line: 82 > Cluster.<init>(Configuration) line: 75 > JobClient.init(JobConf) line: 475 > JobClient.<init>(JobConf) line: 454 > MapRedTask(ExecDriver).execute(DriverContext) line: 401 > MapRedTask.execute(DriverContext) line: 137 > MapRedTask(Task).executeTask() line: 160 > TaskRunner.runSequential() line: 88 > Driver.launchTask(Task, String, boolean, String, int, > DriverContext) line: 1653 > Driver.execute() line: 1412 > For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl > is created.
But because the HS2 process stays up for days, the previous trust > store reloader threads are still hanging around in the HS2 process and > eventually consume all the available resources. > It seems like a fix similar to the one in HADOOP-11368 is needed in TimelineClientImpl, > but it doesn't have a destroy method to begin with. > One option to avoid this problem is to disable the yarn timeline service > (yarn.timeline-service.enabled=false). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-5309: -- Attachment: YARN-5309.branch-2.8.001.patch YARN-5309.branch-2.7.3.001.patch Hello [~vvasudev], I have attached patches for branch-2.7.3 and branch-2.8. Please check. Thanks for all the help. > Fix SSLFactory truststore reloader thread leak in TimelineClientImpl > > > Key: YARN-5309 > URL: https://issues.apache.org/jira/browse/YARN-5309 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver, yarn >Affects Versions: 2.7.1 >Reporter: Thomas Friedrich >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-5309.001.patch, YARN-5309.002.patch, > YARN-5309.003.patch, YARN-5309.004.patch, YARN-5309.005.patch, > YARN-5309.branch-2.7.3.001.patch, YARN-5309.branch-2.8.001.patch > > > We found an issue similar to HADOOP-11368 in TimelineClientImpl. The class > creates an instance of SSLFactory in newSslConnConfigurator and subsequently > creates the ReloadingX509TrustManager instance which in turn starts a trust > store reloader thread. > However, the SSLFactory is never destroyed and hence the trust store reloader > threads are not killed. > This problem was observed by a customer who had SSL enabled in Hadoop and > submitted many queries against the HiveServer2. After a few days, the HS2 > instance crashed and from the Java dump we could see many (over 13000) > threads like this: > "Truststore reloader thread" #126 daemon prio=5 os_prio=0 > tid=0x7f680d2e3000 nid=0x98fd waiting on > condition [0x7f67e482c000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run > (ReloadingX509TrustManager.java:225) > at java.lang.Thread.run(Thread.java:745) > HiveServer2 uses the JobClient to submit a job: > Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at > line 89 in > ReloadingX509TrustManager)) > owns: Object (id=464) > owns: Object (id=465) > owns: Object (id=466) > owns: ServiceLoader (id=210) > ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 > FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209 > SSLFactory.init() line: 131 > TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 > TimelineClientImpl.newConnConfigurator(Configuration) line: 507 > TimelineClientImpl.serviceInit(Configuration) line: 269 > TimelineClientImpl(AbstractService).init(Configuration) line: 163 > YarnClientImpl.serviceInit(Configuration) line: 169 > YarnClientImpl(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.serviceInit(Configuration) line: 102 > ResourceMgrDelegate(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.<init>(YarnConfiguration) line: 96 > YARNRunner.<init>(Configuration) line: 112 > YarnClientProtocolProvider.create(Configuration) line: 34 > Cluster.initialize(InetSocketAddress, Configuration) line: 95 > Cluster.<init>(InetSocketAddress, Configuration) line: 82 > Cluster.<init>(Configuration) line: 75 > JobClient.init(JobConf) line: 475 > JobClient.<init>(JobConf) line: 454 > MapRedTask(ExecDriver).execute(DriverContext) line: 401 > MapRedTask.execute(DriverContext) line: 137 > MapRedTask(Task).executeTask() line: 160 > TaskRunner.runSequential() line: 88 > Driver.launchTask(Task, String, boolean, String, int, > DriverContext) line: 1653 > Driver.execute() line: 1412 > For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl > is created.
But because the HS2 process stays up for days, the previous trust > store reloader threads are still hanging around in the HS2 process and > eventually consume all the available resources. > It seems like a fix similar to the one in HADOOP-11368 is needed in TimelineClientImpl, > but it doesn't have a destroy method to begin with. > One option to avoid this problem is to disable the yarn timeline service > (yarn.timeline-service.enabled=false). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
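The shape of the fix, as a minimal sketch: keep a reference to the {{SSLFactory}} and destroy it when the owning service stops, which terminates the reloader thread. The class and field names below are illustrative, assuming the patch follows this pattern; they are not the committed code.
{code:java}
// Minimal sketch: pair SSLFactory.init() with SSLFactory.destroy() in the
// service lifecycle so the truststore reloader thread cannot leak.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.ssl.SSLFactory;
import org.apache.hadoop.service.AbstractService;

public class SslAwareClient extends AbstractService {
  private SSLFactory sslFactory;

  public SslAwareClient() { super("SslAwareClient"); }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    sslFactory = new SSLFactory(SSLFactory.Mode.CLIENT, conf);
    sslFactory.init();          // starts the truststore reloader thread
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStop() throws Exception {
    if (sslFactory != null) {
      sslFactory.destroy();     // stops the reloader thread, fixing the leak
    }
    super.serviceStop();
  }
}
{code}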
[jira] [Commented] (YARN-4996) Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or better yet parameterized
[ https://issues.apache.org/jira/browse/YARN-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383951#comment-15383951 ] Hudson commented on YARN-4996: -- SUCCESS: Integrated in Hadoop-trunk-Commit #10117 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10117/]) YARN-4996. Make TestNMReconnect.testCompareRMNodeAfterReconnect() (varunsaxena: rev fe20494a728836c974a4cfa062e1802583fdc934) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/ParameterizedSchedulerTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resourcetracker/TestNMReconnect.java > Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or > better yet parameterized > -- > > Key: YARN-4996 > URL: https://issues.apache.org/jira/browse/YARN-4996 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, test >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Kai Sasaki >Priority: Minor > Labels: newbie > Fix For: 2.9.0 > > Attachments: YARN-4996.01.patch, YARN-4996.02.patch, > YARN-4996.03.patch, YARN-4996.04.patch, YARN-4996.05.patch, > YARN-4996.06.patch, YARN-4996.07.patch, YARN-4996.08.patch > > > The test tests only the capacity scheduler. It should also test fair > scheduler. At a bare minimum, it should use the default scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
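For readers unfamiliar with the pattern, a scheduler-parameterized test looks roughly like the JUnit 4 sketch below. The scheduler class names and the {{yarn.resourcemanager.scheduler.class}} key are real, but the skeleton is illustrative rather than the committed {{ParameterizedSchedulerTestBase}} change.
{code:java}
// Illustrative JUnit 4 skeleton: run the same test once per scheduler.
import java.util.Arrays;
import java.util.Collection;
import org.apache.hadoop.conf.Configuration;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;

@RunWith(Parameterized.class)
public class TestNMReconnectSketch {
  @Parameterized.Parameters
  public static Collection<Object[]> schedulers() {
    return Arrays.asList(new Object[][] {
        {"org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler"},
        {"org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler"}});
  }

  private final String schedulerClass;

  public TestNMReconnectSketch(String schedulerClass) {
    this.schedulerClass = schedulerClass;
  }

  @Test
  public void reconnectBehavesTheSameUnderEachScheduler() {
    Configuration conf = new Configuration();
    conf.set("yarn.resourcemanager.scheduler.class", schedulerClass);
    // ...bring up the RM test fixture with conf and run the reconnect
    // assertions; elided in this sketch.
  }
}
{code}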
[jira] [Updated] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-679: Attachment: YARN-679-009.patch Patch 009, synced up with trunk. HADOOP-13179 broke things, as it made the building of common options private/static rather than subclassable, which I needed for some things. I fixed this by making it protected and synchronized on {{OptionBuilder}}, which is what everything should do. [~aw] I've edited references to the PR in the JIRA. Will patches now take, or is there some secret DB I need to alter? > add an entry point that can start any Yarn service > -- > > Key: YARN-679 > URL: https://issues.apache.org/jira/browse/YARN-679 > Project: Hadoop YARN > Issue Type: New Feature > Components: api >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-679-001.patch, YARN-679-002.patch, > YARN-679-002.patch, YARN-679-003.patch, YARN-679-004.patch, > YARN-679-005.patch, YARN-679-006.patch, YARN-679-007.patch, > YARN-679-008.patch, YARN-679-009.patch, org.apache.hadoop.servic...mon > 3.0.0-SNAPSHOT API).pdf > > Time Spent: 72h > Remaining Estimate: 0h > > There's no need to write separate .main classes for every Yarn service, given > that the startup mechanism should be identical: create, init, start, wait for > stopped, with an interrupt handler to trigger a clean shutdown on a control-c > interrupt. > Provide one that takes any classname and a list of config files/options -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
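As a rough sketch of the idea (reflection plus a shutdown hook); the real patch also handles config files, options, and exit codes, so everything below is a much-simplified illustration, not the attached code:
{code:java}
// Illustrative generic launcher: instantiate any Service subclass by
// classname, init/start it, register a Ctrl-C hook, and wait for stop.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.Service;

public final class ServiceLauncherSketch {
  public static void main(String[] args) throws Exception {
    String classname = args[0];
    Configuration conf = new Configuration();
    final Service service = (Service) Class.forName(classname).newInstance();
    // clean shutdown on control-c / SIGTERM
    Runtime.getRuntime().addShutdownHook(new Thread(service::stop));
    service.init(conf);
    service.start();
    service.waitForServiceToStop(0);   // block until the service terminates
  }
}
{code}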
[jira] [Commented] (YARN-4996) Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or better yet parameterized
[ https://issues.apache.org/jira/browse/YARN-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383932#comment-15383932 ] Kai Sasaki commented on YARN-4996: -- [~varun_saxena] Thank you so much! > Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or > better yet parameterized > -- > > Key: YARN-4996 > URL: https://issues.apache.org/jira/browse/YARN-4996 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, test >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Kai Sasaki >Priority: Minor > Labels: newbie > Fix For: 2.9.0 > > Attachments: YARN-4996.01.patch, YARN-4996.02.patch, > YARN-4996.03.patch, YARN-4996.04.patch, YARN-4996.05.patch, > YARN-4996.06.patch, YARN-4996.07.patch, YARN-4996.08.patch > > > The test tests only the capacity scheduler. It should also test fair > scheduler. At a bare minimum, it should use the default scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4996) Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or better yet parameterized
[ https://issues.apache.org/jira/browse/YARN-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383927#comment-15383927 ] Varun Saxena commented on YARN-4996: Committed the latest patch to trunk, branch-2. Thanks [~lewuathe] for your contribution and [~templedf] for the reviews. > Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or > better yet parameterized > -- > > Key: YARN-4996 > URL: https://issues.apache.org/jira/browse/YARN-4996 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager, test >Affects Versions: 2.8.0 >Reporter: Daniel Templeton >Assignee: Kai Sasaki >Priority: Minor > Labels: newbie > Fix For: 2.9.0 > > Attachments: YARN-4996.01.patch, YARN-4996.02.patch, > YARN-4996.03.patch, YARN-4996.04.patch, YARN-4996.05.patch, > YARN-4996.06.patch, YARN-4996.07.patch, YARN-4996.08.patch > > > The test tests only the capacity scheduler. It should also test fair > scheduler. At a bare minimum, it should use the default scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383755#comment-15383755 ] Varun Vasudev commented on YARN-5309: - [~cheersyang] - the patch doesn't apply cleanly on branch-2.7. Can you please add a patch for branch-2.7 if you need this to go into 2.7.3? Thanks! > Fix SSLFactory truststore reloader thread leak in TimelineClientImpl > > > Key: YARN-5309 > URL: https://issues.apache.org/jira/browse/YARN-5309 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver, yarn >Affects Versions: 2.7.1 >Reporter: Thomas Friedrich >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-5309.001.patch, YARN-5309.002.patch, > YARN-5309.003.patch, YARN-5309.004.patch, YARN-5309.005.patch > > > We found an issue similar to HADOOP-11368 in TimelineClientImpl. The class > creates an instance of SSLFactory in newSslConnConfigurator and subsequently > creates the ReloadingX509TrustManager instance which in turn starts a trust > store reloader thread. > However, the SSLFactory is never destroyed and hence the trust store reloader > threads are not killed. > This problem was observed by a customer who had SSL enabled in Hadoop and > submitted many queries against the HiveServer2. After a few days, the HS2 > instance crashed and from the Java dump we could see many (over 13000) > threads like this: > "Truststore reloader thread" #126 daemon prio=5 os_prio=0 > tid=0x7f680d2e3000 nid=0x98fd waiting on > condition [0x7f67e482c000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run > (ReloadingX509TrustManager.java:225) > at java.lang.Thread.run(Thread.java:745) > HiveServer2 uses the JobClient to submit a job: > Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at > line 89 in > ReloadingX509TrustManager)) > owns: Object (id=464) > owns: Object (id=465) > owns: Object (id=466) > owns: ServiceLoader (id=210) > ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 > FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209 > SSLFactory.init() line: 131 > TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 > TimelineClientImpl.newConnConfigurator(Configuration) line: 507 > TimelineClientImpl.serviceInit(Configuration) line: 269 > TimelineClientImpl(AbstractService).init(Configuration) line: 163 > YarnClientImpl.serviceInit(Configuration) line: 169 > YarnClientImpl(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.serviceInit(Configuration) line: 102 > ResourceMgrDelegate(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.<init>(YarnConfiguration) line: 96 > YARNRunner.<init>(Configuration) line: 112 > YarnClientProtocolProvider.create(Configuration) line: 34 > Cluster.initialize(InetSocketAddress, Configuration) line: 95 > Cluster.<init>(InetSocketAddress, Configuration) line: 82 > Cluster.<init>(Configuration) line: 75 > JobClient.init(JobConf) line: 475 > JobClient.<init>(JobConf) line: 454 > MapRedTask(ExecDriver).execute(DriverContext) line: 401 > MapRedTask.execute(DriverContext) line: 137 > MapRedTask(Task).executeTask() line: 160 > TaskRunner.runSequential() line: 88 > Driver.launchTask(Task, String, boolean, String, int, > DriverContext) line: 1653 > Driver.execute() line: 1412 > For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl > is created.
But because the HS2 process stays up for days, the previous trust > store reloader threads are still hanging around in the HS2 process and > eventually consume all the available resources. > It seems like a fix similar to the one in HADOOP-11368 is needed in TimelineClientImpl, > but it doesn't have a destroy method to begin with. > One option to avoid this problem is to disable the yarn timeline service > (yarn.timeline-service.enabled=false). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5309) Fix SSLFactory truststore reloader thread leak in TimelineClientImpl
[ https://issues.apache.org/jira/browse/YARN-5309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-5309: Summary: Fix SSLFactory truststore reloader thread leak in TimelineClientImpl (was: SSLFactory truststore reloader thread leak in TimelineClientImpl) > Fix SSLFactory truststore reloader thread leak in TimelineClientImpl > > > Key: YARN-5309 > URL: https://issues.apache.org/jira/browse/YARN-5309 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver, yarn >Affects Versions: 2.7.1 >Reporter: Thomas Friedrich >Assignee: Weiwei Yang >Priority: Blocker > Attachments: YARN-5309.001.patch, YARN-5309.002.patch, > YARN-5309.003.patch, YARN-5309.004.patch, YARN-5309.005.patch > > > We found an issue similar to HADOOP-11368 in TimelineClientImpl. The class > creates an instance of SSLFactory in newSslConnConfigurator and subsequently > creates the ReloadingX509TrustManager instance which in turn starts a trust > store reloader thread. > However, the SSLFactory is never destroyed and hence the trust store reloader > threads are not killed. > This problem was observed by a customer who had SSL enabled in Hadoop and > submitted many queries against the HiveServer2. After a few days, the HS2 > instance crashed and from the Java dump we could see many (over 13000) > threads like this: > "Truststore reloader thread" #126 daemon prio=5 os_prio=0 > tid=0x7f680d2e3000 nid=0x98fd waiting on > condition [0x7f67e482c000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run > (ReloadingX509TrustManager.java:225) > at java.lang.Thread.run(Thread.java:745) > HiveServer2 uses the JobClient to submit a job: > Thread [HiveServer2-Background-Pool: Thread-188] (Suspended (breakpoint at > line 89 in > ReloadingX509TrustManager)) > owns: Object (id=464) > owns: Object (id=465) > owns: Object (id=466) > owns: ServiceLoader (id=210) > ReloadingX509TrustManager.<init>(String, String, String, long) line: 89 > FileBasedKeyStoresFactory.init(SSLFactory$Mode) line: 209 > SSLFactory.init() line: 131 > TimelineClientImpl.newSslConnConfigurator(int, Configuration) line: 532 > TimelineClientImpl.newConnConfigurator(Configuration) line: 507 > TimelineClientImpl.serviceInit(Configuration) line: 269 > TimelineClientImpl(AbstractService).init(Configuration) line: 163 > YarnClientImpl.serviceInit(Configuration) line: 169 > YarnClientImpl(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.serviceInit(Configuration) line: 102 > ResourceMgrDelegate(AbstractService).init(Configuration) line: 163 > ResourceMgrDelegate.<init>(YarnConfiguration) line: 96 > YARNRunner.<init>(Configuration) line: 112 > YarnClientProtocolProvider.create(Configuration) line: 34 > Cluster.initialize(InetSocketAddress, Configuration) line: 95 > Cluster.<init>(InetSocketAddress, Configuration) line: 82 > Cluster.<init>(Configuration) line: 75 > JobClient.init(JobConf) line: 475 > JobClient.<init>(JobConf) line: 454 > MapRedTask(ExecDriver).execute(DriverContext) line: 401 > MapRedTask.execute(DriverContext) line: 137 > MapRedTask(Task).executeTask() line: 160 > TaskRunner.runSequential() line: 88 > Driver.launchTask(Task, String, boolean, String, int, > DriverContext) line: 1653 > Driver.execute() line: 1412 > For every job, a new instance of JobClient/YarnClientImpl/TimelineClientImpl > is created.
But because the HS2 process stays up for days, the previous trust > store reloader threads are still hanging around in the HS2 process and > eventually consume all the available resources. > It seems like a fix similar to the one in HADOOP-11368 is needed in TimelineClientImpl, > but it doesn't have a destroy method to begin with. > One option to avoid this problem is to disable the yarn timeline service > (yarn.timeline-service.enabled=false). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4996) Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or better yet parameterized
[ https://issues.apache.org/jira/browse/YARN-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383632#comment-15383632 ] Hadoop QA commented on YARN-4996: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 10 unchanged - 1 fixed = 10 total (was 11) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 36m 44s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 53m 20s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12818742/YARN-4996.08.patch | | JIRA Issue | YARN-4996 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 0e8ad5850526 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 92fe2db | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12367/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12367/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Make TestNMReconnect.testCompareRMNodeAfterReconnect() scheduler agnostic, or > better yet parameterized > -- > > Key: YARN-4996 > URL: https://issues.apache.org/jira/browse/YARN-4996 > Project: Hadoop YARN > Issue Type: