[jira] [Commented] (YARN-6555) Store application flow context in NM state store for work-preserving restart

2017-05-25 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025849#comment-16025849
 ] 

Rohith Sharma K S commented on YARN-6555:
-

cool.. thank you :-)

> Store application flow context in NM state store for work-preserving restart
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Fix For: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch, 
> YARN-6555.003.patch
>
>
> If timeline service v2 is enabled and the NM is restarted with recovery enabled, 
> the NM fails to start with the error "flow context can't be null".
> This happens because the flow context did not exist before, but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if the flow context existed before: since we are 
> not persisting it / reading it during 
> ContainerManagerImpl#recoverApplication, it does not get passed in to 
> ApplicationImpl.
> Full stack trace:
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}
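To make the idea of the fix concrete, below is a small, self-contained sketch (a toy model with hypothetical class names, not the actual NodeManager classes or the committed patch): persist the flow context together with the application record in the state store and rebuild it during recovery, so the recovered application never sees a null flow context. The fallback for records written before timeline service v2 was enabled is an illustrative assumption.

{code}
import java.util.HashMap;
import java.util.Map;

public class FlowContextRecoverySketch {

  // Stand-in for ApplicationImpl.FlowContext (flow name, version, run id).
  static final class FlowContext {
    final String flowName;
    final String flowVersion;
    final long flowRunId;

    FlowContext(String flowName, String flowVersion, long flowRunId) {
      this.flowName = flowName;
      this.flowVersion = flowVersion;
      this.flowRunId = flowRunId;
    }

    @Override
    public String toString() {
      return flowName + "/" + flowVersion + "/" + flowRunId;
    }
  }

  // Stand-in for the NM state store: appId -> persisted flow context.
  private final Map<String, FlowContext> stateStore = new HashMap<>();

  void storeApplication(String appId, FlowContext flowContext) {
    // Persist the flow context with the application record; records written
    // before timeline service v2 was enabled may not have one.
    if (flowContext != null) {
      stateStore.put(appId, flowContext);
    }
  }

  FlowContext recoverApplication(String appId) {
    FlowContext recovered = stateStore.get(appId);
    if (recovered == null) {
      // Assumed fallback: synthesize a default context instead of failing
      // with "flow context cannot be null" during recovery.
      recovered = new FlowContext(appId, "1", 0L);
    }
    return recovered;
  }

  public static void main(String[] args) {
    FlowContextRecoverySketch nm = new FlowContextRecoverySketch();
    nm.storeApplication("application_1493848312000_0001",
        new FlowContext("my_flow", "1", 42L));
    System.out.println(nm.recoverApplication("application_1493848312000_0001"));
    System.out.println(nm.recoverApplication("application_1493848312000_0002"));
  }
}
{code}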






[jira] [Commented] (YARN-6555) Store application flow context in NM state store for work-preserving restart

2017-05-25 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025843#comment-16025843
 ] 

Haibo Chen commented on YARN-6555:
--

Yes, I already committed this into trunk







[jira] [Commented] (YARN-6555) Store application flow context in NM state store for work-preserving restart

2017-05-25 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025840#comment-16025840
 ] 

Rohith Sharma K S commented on YARN-6555:
-

[~haibo.chen] could we merge this into trunk?







[jira] [Commented] (YARN-6555) Store application flow context in NM state store for work-preserving restart

2017-05-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025813#comment-16025813
 ] 

Hudson commented on YARN-6555:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11786 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11786/])
YARN-6555. Store application flow context in NM state store for 
work-preserving restart (haibochen: rev 47474fffac085e0e5ea46336bf80ccd0677017a3)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/proto/yarn_server_nodemanager_recovery.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManagerRecovery.java








[jira] [Commented] (YARN-6654) RollingLevelDBTimelineStore introduce minor backwards compatible change

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025811#comment-16025811
 ] 

Hadoop QA commented on YARN-6654:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
26s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6654 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869995/YARN-6654.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d507ff7f8d80 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 47474ff |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16028/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16028/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RollingLevelDBTimelineStore introduce minor backwards compatible change
> ---
>
> Key: YARN-6654
> URL: https://issues.apache.org/jira/browse/YARN-6654
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>

[jira] [Updated] (YARN-6555) Store application flow context in NM state store for work-preserving restart

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6555:
-
Fix Version/s: YARN-5355-branch-2
   YARN-5355







[jira] [Commented] (YARN-5531) UnmanagedAM pool manager for federating application across clusters

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025798#comment-16025798
 ] 

Hadoop QA commented on YARN-5531:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
58s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
56s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
29s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
33s{color} | {color:green} YARN-2915 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
5s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in 
YARN-2915 has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
51s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in YARN-2915 has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
1s{color} | {color:green} YARN-2915 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 48 unchanged - 1 fixed = 48 total (was 49) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
14s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
45s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 43m 44s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-5531 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869974/YARN-5531-YARN-2915.v14.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| 

[jira] [Updated] (YARN-6555) Store application flow context in NM state store for work-preserving restart

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6555:
-
Fix Version/s: 3.0.0-alpha3







[jira] [Updated] (YARN-6555) Store application flow context in NM state store for work-preserving restart

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6555:
-
Summary: Store application flow context in NM state store for 
work-preserving restart  (was: Enable flow context read (& corresponding write) 
for recovering application with NM restart )







[jira] [Commented] (YARN-6111) Rumen input doesn't work in SLS

2017-05-25 Thread YuJie Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025782#comment-16025782
 ] 

YuJie Huang commented on YARN-6111:
---

OK, thank you very much! Does SLS work correctly with the latest Hadoop version?

> Rumen input doesn't work in SLS
> --
>
> Key: YARN-6111
> URL: https://issues.apache.org/jira/browse/YARN-6111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
> Environment: ubuntu14.0.4 os
>Reporter: YuJie Huang
>Assignee: Yufei Gu
>  Labels: test
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6111.001.patch
>
>
> Hi guys,
> I am trying to learn how to use SLS.
> I would like to get the file realtimetrack.json, but it only 
> contains "[]" at the end of a simulation. This is the command I use to 
> run the instance:
> HADOOP_HOME $ bin/slsrun.sh --input-rumen=sample-data/2jobsmin-rumen-jh.json 
> --output-dir=sample-data 
> All other files, including metrics, appear to be properly populated. I can 
> also trace it on the web at http://localhost:10001/simulate
> Can someone help?
> Thanks






[jira] [Updated] (YARN-6654) RollingLevelDBTimelineStore introduce minor backwards compatible change

2017-05-25 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-6654:
--
Attachment: YARN-6654.1.patch

> RollingLevelDBTimelineStore introduce minor backwards compatible change
> ---
>
> Key: YARN-6654
> URL: https://issues.apache.org/jira/browse/YARN-6654
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Blocker
> Attachments: YARN-6654.1.patch
>
>
> There is a minor backwards-compatibility issue introduced while upgrading the 
> fst library from 2.24 to 2.50.
> {code}
> Exception in thread "main" java.io.IOException: java.lang.RuntimeException: 
> unable to find class for code 83
>   at 
> org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:243)
>   at 
> org.nustaq.serialization.FSTConfiguration.asObject(FSTConfiguration.java:1125)
>   at org.nustaq.serialization.FSTNoJackson.main(FSTNoJackson.java:31)
> Caused by: java.lang.RuntimeException: unable to find class for code 83
>   at 
> org.nustaq.serialization.FSTClazzNameRegistry.decodeClass(FSTClazzNameRegistry.java:180)
>   at 
> org.nustaq.serialization.coders.FSTStreamDecoder.readClass(FSTStreamDecoder.java:472)
>   at 
> org.nustaq.serialization.FSTObjectInput.readClass(FSTObjectInput.java:933)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:343)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:327)
>   at 
> org.nustaq.serialization.serializers.FSTArrayListSerializer.instantiate(FSTArrayListSerializer.java:63)
>   at 
> org.nustaq.serialization.FSTObjectInput.instantiateAndReadWithSer(FSTObjectInput.java:497)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:366)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:327)
>   at 
> org.nustaq.serialization.serializers.FSTMapSerializer.instantiate(FSTMapSerializer.java:78)
>   at 
> org.nustaq.serialization.FSTObjectInput.instantiateAndReadWithSer(FSTObjectInput.java:497)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:366)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:327)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:307)
>   at 
> org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:241)
> {code}






[jira] [Commented] (YARN-6649) RollingLevelDBTimelineServer throws RuntimeException if object decoding ever fails runtime exception

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025779#comment-16025779
 ] 

Hadoop QA commented on YARN-6649:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
11s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6649 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869983/YARN-6649.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2ec80658f3e2 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2b5ad48 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16027/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16027/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RollingLevelDBTimelineServer throws RuntimeException if object decoding ever 
> fails runtime exception
> 
>
> Key: YARN-6649
> URL: https://issues.apache.org/jira/browse/YARN-6649
> Project: Hadoop YARN
>  

[jira] [Created] (YARN-6654) RollingLevelDBTimelineStore introduce minor backwards compatible change

2017-05-25 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created YARN-6654:
-

 Summary: RollingLevelDBTimelineStore introduce minor backwards 
compatible change
 Key: YARN-6654
 URL: https://issues.apache.org/jira/browse/YARN-6654
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Blocker


There is a minor backwards-compatibility issue introduced while upgrading the 
fst library from 2.24 to 2.50.
{code}
Exception in thread "main" java.io.IOException: java.lang.RuntimeException: 
unable to find class for code 83
at 
org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:243)
at 
org.nustaq.serialization.FSTConfiguration.asObject(FSTConfiguration.java:1125)
at org.nustaq.serialization.FSTNoJackson.main(FSTNoJackson.java:31)
Caused by: java.lang.RuntimeException: unable to find class for code 83
at 
org.nustaq.serialization.FSTClazzNameRegistry.decodeClass(FSTClazzNameRegistry.java:180)
at 
org.nustaq.serialization.coders.FSTStreamDecoder.readClass(FSTStreamDecoder.java:472)
at 
org.nustaq.serialization.FSTObjectInput.readClass(FSTObjectInput.java:933)
at 
org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:343)
at 
org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:327)
at 
org.nustaq.serialization.serializers.FSTArrayListSerializer.instantiate(FSTArrayListSerializer.java:63)
at 
org.nustaq.serialization.FSTObjectInput.instantiateAndReadWithSer(FSTObjectInput.java:497)
at 
org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:366)
at 
org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:327)
at 
org.nustaq.serialization.serializers.FSTMapSerializer.instantiate(FSTMapSerializer.java:78)
at 
org.nustaq.serialization.FSTObjectInput.instantiateAndReadWithSer(FSTObjectInput.java:497)
at 
org.nustaq.serialization.FSTObjectInput.readObjectWithHeader(FSTObjectInput.java:366)
at 
org.nustaq.serialization.FSTObjectInput.readObjectInternal(FSTObjectInput.java:327)
at 
org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:307)
at 
org.nustaq.serialization.FSTObjectInput.readObject(FSTObjectInput.java:241)
{code}
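For reference, a minimal round-trip sketch using FSTConfiguration.createDefaultConfiguration() and the asObject() call from the trace, plus its asByteArray() counterpart; the cross-version scenario is described in comments only, since reproducing it needs bytes produced by the older library:

{code}
import java.util.ArrayList;
import java.util.Arrays;
import org.nustaq.serialization.FSTConfiguration;

public class FstRoundTrip {
  public static void main(String[] args) {
    // Same-version round trip: encode and decode with one FSTConfiguration.
    FSTConfiguration conf = FSTConfiguration.createDefaultConfiguration();
    byte[] bytes = conf.asByteArray(
        new ArrayList<Object>(Arrays.asList("entity", 42)));
    Object back = conf.asObject(bytes);
    System.out.println(back);
    // If 'bytes' had instead been written by fst 2.24 and decoded here with
    // 2.50, the asObject() call can fail with "unable to find class for
    // code ...", as in the stack trace above.
  }
}
{code}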






[jira] [Comment Edited] (YARN-6630) Container worker dir could not recover when NM restart

2017-05-25 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025741#comment-16025741
 ] 

Feng Yuan edited comment on YARN-6630 at 5/26/17 3:11 AM:
--

Hi, wy, {code}ContainerRetryPolicy{code} is configurable; for example, if you are 
using the DistributedShell app you can set it with the 
*--container_retry_policy* parameter.
IMO, {code}yarn.nodemanager.recovery.enabled=true{code} and 
{code}ContainerRetryPolicy=NEVER_RETRY{code} are not contradictory.
I think ContainerRetryPolicy was created to let the app control which containers 
should be retried and which should not.
For example, the ApplicationMaster can set this when assembling the ContainerLaunchContext.



was (Author: feng yuan):
Hi, wy {code}ContainerRetryPolicy{code} is configuarable,for example if you are 
using DistributeShell app you can set this by 
parameter:*--container_retry_policy*.
IMO,{code}yarn.nodemanager.recovery.enabled=true{code} and 
{code}ContainerRetryPolicy= NEVER_RETRY{code} is not ambivalent.
I think ContainerRetryPolicy is create to let app control which container 
should retry which not.


> Container worker dir could not recover when NM restart
> --
>
> Key: YARN-6630
> URL: https://issues.apache.org/jira/browse/YARN-6630
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yang Wang
> Attachments: YARN-6630.001.patch
>
>
> When yarn.nodemanager.recovery.enabled is true and ContainerRetryPolicy is 
> NEVER_RETRY, the container work dir will not be saved in the NM state store. 
> {code:title=ContainerLaunch.java}
> ...
>   private void recordContainerWorkDir(ContainerId containerId,
>   String workDir) throws IOException{
> container.setWorkDir(workDir);
> if (container.isRetryContextSet()) {
>   context.getNMStateStore().storeContainerWorkDir(containerId, workDir);
> }
>   }
> {code}
> Then, after the NM restarts, container.workDir is null, which may cause other exceptions.
> {code:title=ContainerImpl.java}
>   static class ResourceLocalizedWhileRunningTransition
>   extends ContainerTransition {
> ...
>   String linkFile = new Path(container.workDir, link).toString();
> ...
> {code}
> {code}
> java.lang.IllegalArgumentException: Can not create a Path from a null string
> at org.apache.hadoop.fs.Path.checkPathArg(Path.java:159)
> at org.apache.hadoop.fs.Path.<init>(Path.java:175)
> at org.apache.hadoop.fs.Path.<init>(Path.java:110)
> ... ...
> {code}
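One possible direction for a fix, sketched against the recordContainerWorkDir snippet quoted above (a sketch only, not necessarily what the attached patch does), is to persist the work dir unconditionally instead of only when a retry context is set:

{code}
  // Sketch: drop the isRetryContextSet() guard so the work dir is always
  // saved in the NM state store and can be recovered after a
  // work-preserving restart.
  private void recordContainerWorkDir(ContainerId containerId,
      String workDir) throws IOException {
    container.setWorkDir(workDir);
    context.getNMStateStore().storeContainerWorkDir(containerId, workDir);
  }
{code}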






[jira] [Commented] (YARN-6630) Container worker dir could not recover when NM restart

2017-05-25 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025741#comment-16025741
 ] 

Feng Yuan commented on YARN-6630:
-

Hi, wy, {code}ContainerRetryPolicy{code} is configurable; for example, if you are 
using the DistributedShell app you can set it with the 
*--container_retry_policy* parameter.
IMO, {code}yarn.nodemanager.recovery.enabled=true{code} and 
{code}ContainerRetryPolicy=NEVER_RETRY{code} are not contradictory.
I think ContainerRetryPolicy was created to let the app control which containers 
should be retried and which should not.








[jira] [Updated] (YARN-6649) RollingLevelDBTimelineServer throws RuntimeException if object decoding ever fails runtime exception

2017-05-25 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-6649:
--
Attachment: YARN-6649.1.patch

> RollingLevelDBTimelineServer throws RuntimeException if object decoding ever 
> fails runtime exception
> 
>
> Key: YARN-6649
> URL: https://issues.apache.org/jira/browse/YARN-6649
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Critical
> Attachments: YARN-6649.1.patch
>
>







[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-05-25 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025713#comment-16025713
 ] 

Sunil G commented on YARN-5892:
---

Thanks [~eepayne]

bq.which then multiplies the value of userLimitResource by the appropriate 
user's weight before returning it
I think I am fine here, since we multiply by the user's real weight, which 
brings the UL to the correct value. Thanks for explaining in detail. I also 
have one more doubt now.

{code}
Resource userSpecificUserLimit =
Resources.multiplyAndNormalizeUp(resourceCalculator,
userLimitResource, weight, lQueue.getMinimumAllocation());
{code}
I think we could use multiplyAndNormalizeDown here. I have two reasons for this.
1) Ideally we allow at least one (extra) container for a user even when the UL 
is lower, so it might be fine to use multiplyAndNormalizeDown given we are not 
breaking a valid use case.
We do a > check here, not >=:
{code:title=LeafQueue#canAssignToUser}
  if (Resources.greaterThan(resourceCalculator, clusterResource,
  user.getUsed(nodePartition), limit)) {
...
{code}
2) weight_user1=0.1, weight_user2=0.1. Now consider that userLimitResource is 
somehow 10GB and minimumAllocation is 4GB. In this case, both user1 and user2 
will get a UL of 4GB, which lets each user get 2 containers. I assume we have 
queue elasticity and the other queue has some spare resources. In this case, I 
feel we do not need to award a user 2 containers, correct?
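To make the arithmetic concrete (toy numbers from point 2; the rounding semantics are assumed to normalize to the 4GB minimum allocation step):
{code}
userLimitResource = 10GB, weight = 0.1, minimumAllocation = 4GB

multiplyAndNormalizeUp:   roundUp(10GB * 0.1, 4GB)   = 4GB
  -> the ">" check in canAssignToUser only trips above 4GB used, so a user
     can hold two 4GB containers.
multiplyAndNormalizeDown: roundDown(10GB * 0.1, 4GB) = 0GB
  -> the user still gets a first container (0 used is not > 0), but no
     second one.
{code}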

Please correct me if I am wrong.


bq.I think the code is within locks everywhere it is used.
Yes, I checked the code in detail. We are fine here; the code below does not 
take a lock, which confused me, but its caller holds the correct lock.
{{UsersManager.addUser(String userName, User user)}}


> Capacity Scheduler: Support user-specific minimum user limit percent
> 
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
>   
> 
> yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent
> 25
>   
>   
> 
> yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent
> 75
>   
> {code}






[jira] [Updated] (YARN-6646) Modifier 'static' is redundant for inner enums less

2017-05-25 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated YARN-6646:

Description: A Java enumeration type is a static constant, implicitly 
modified with static final, so the 'static' modifier is redundant for inner 
enums. I suggest deleting the 'static' modifier.

> Modifier 'static' is redundant for inner enums less
> ---
>
> Key: YARN-6646
> URL: https://issues.apache.org/jira/browse/YARN-6646
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: YARN-6646.001.patch
>
>
> A Java enumeration type is a static constant, implicitly modified with static 
> final, so the 'static' modifier is redundant for inner enums. I suggest 
> deleting the 'static' modifier.
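A tiny illustration of the point (hypothetical class, not taken from the patch):
{code}
public class Outer {
  // A nested enum is implicitly static, so the commented-out form below is
  // equivalent and the 'static' keyword adds nothing:
  // static enum State { INIT, RUNNING, DONE }
  enum State { INIT, RUNNING, DONE }
}
{code}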






[jira] [Updated] (YARN-5531) UnmanagedAM pool manager for federating application across clusters

2017-05-25 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5531:
---
Attachment: YARN-5531-YARN-2915.v14.patch

> UnmanagedAM pool manager for federating application across clusters
> ---
>
> Key: YARN-5531
> URL: https://issues.apache.org/jira/browse/YARN-5531
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Botong Huang
> Attachments: YARN-5531-YARN-2915.v10.patch, 
> YARN-5531-YARN-2915.v11.patch, YARN-5531-YARN-2915.v12.patch, 
> YARN-5531-YARN-2915.v13.patch, YARN-5531-YARN-2915.v14.patch, 
> YARN-5531-YARN-2915.v1.patch, YARN-5531-YARN-2915.v2.patch, 
> YARN-5531-YARN-2915.v3.patch, YARN-5531-YARN-2915.v4.patch, 
> YARN-5531-YARN-2915.v5.patch, YARN-5531-YARN-2915.v6.patch, 
> YARN-5531-YARN-2915.v7.patch, YARN-5531-YARN-2915.v8.patch, 
> YARN-5531-YARN-2915.v9.patch
>
>
> One of the main tenets of YARN Federation is to *transparently* scale 
> applications across multiple clusters. This is achieved by running UAMs on 
> behalf of the application on other clusters. This JIRA tracks the addition of 
> an UnmanagedAM pool manager for federating applications across clusters, which 
> will be used by the FederationInterceptor (YARN-3666) that is part of the 
> AMRMProxy pipeline introduced in YARN-2884.






[jira] [Commented] (YARN-6528) Add JMX metrics for Plan Follower and Agent Placement and Plan Operations

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025606#comment-16025606
 ] 

Hadoop QA commented on YARN-6528:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 29 new + 448 unchanged - 2 fixed = 477 total (was 450) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m  9s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.planning.TestSimpleCapacityReplanner
 |
|   | hadoop.yarn.server.resourcemanager.reservation.TestInMemoryPlan |
|   | 
hadoop.yarn.server.resourcemanager.reservation.planning.TestGreedyReservationAgent
 |
|   | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy 
|
|   | 
hadoop.yarn.server.resourcemanager.reservation.planning.TestAlignedPlanner |
|   | hadoop.yarn.server.resourcemanager.reservation.TestNoOverCommitPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6528 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869954/YARN-6528.v005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux f3a11d9e60fb 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2b5ad48 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/16025/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 

[jira] [Commented] (YARN-5531) UnmanagedAM pool manager for federating application across clusters

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025603#comment-16025603
 ] 

Hadoop QA commented on YARN-5531:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
15s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
18s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
27s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
29s{color} | {color:green} YARN-2915 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
5s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in 
YARN-2915 has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
51s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in YARN-2915 has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} YARN-2915 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 49 unchanged - 1 fixed = 49 total (was 50) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
22s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-common 
generated 1 new + 162 unchanged - 0 fixed = 163 total (was 162) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
15s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
56s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 56s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-5531 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869947/YARN-5531-YARN-2915.v13.patch
 |
| Optional Tests |  

[jira] [Updated] (YARN-6653) Retrieve CPU and MEMORY metrics for applications in a flow run

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6653:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-3368

> Retrieve CPU and MEMORY metrics for applications in a flow run
> --
>
> Key: YARN-6653
> URL: https://issues.apache.org/jira/browse/YARN-6653
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>
> Similarly to YARN-6651, 
> 'metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' can be added 
> to the web ui query fired by a user listing all applications in a flow run.  
> CPU and MEMORY can be retrieved this way.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6653) Retrieve CPU and MEMORY metrics for applications in a flow run

2017-05-25 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6653:


 Summary: Retrieve CPU and MEMORY metrics for applications in a 
flow run
 Key: YARN-6653
 URL: https://issues.apache.org/jira/browse/YARN-6653
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0-alpha2
Reporter: Haibo Chen


Similarly to YARN-6651, 
'metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' can be added 
to the web ui query fired by a user listing all applications in a flow run.  
CPU and MEMORY can be retrieved this way.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6643) TestRMFailover fails rarely due to port conflict

2017-05-25 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025582#comment-16025582
 ] 

Robert Kanter commented on YARN-6643:
-

Thanks Jason!

> TestRMFailover fails rarely due to port conflict
> 
>
> Key: YARN-6643
> URL: https://issues.apache.org/jira/browse/YARN-6643
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.9.0, 3.0.0-alpha3, 2.8.2
>
> Attachments: YARN-6643.001.patch
>
>
> We've seen various tests in {{TestRMFailover}} fail very rarely with a 
> message like "org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.io.IOException: ResourceManager failed to start. Final state is 
> STOPPED".  
> After some digging, it turns out that it's due to a port conflict with the 
> embedded ZooKeeper in the tests.  The embedded ZooKeeper uses 
> {{ServerSocketUtil#getPort}} to choose a free port, but the RMs are 
> configured to use 10000 + the default port and 20000 + the default port 
> (e.g. the default port for the RM is 8032, so you'd use 18032 and 28032).
> When I was able to reproduce this, I saw that ZooKeeper was using port 18033, 
> which is 10000 + 8033, the default RM Admin port.  It results in an error 
> like this, causing the RM to be unable to start, and hence the original error 
> message in the test failure:
> {noformat}
> 2017-05-24 01:16:52,735 INFO  service.AbstractService 
> (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:171)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:158)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1147)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$2.run(MiniYARNCluster.java:310)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:720)
> at org.apache.hadoop.ipc.Server.bind(Server.java:482)
> at org.apache.hadoop.ipc.Server$Listener.(Server.java:688)
> at org.apache.hadoop.ipc.Server.(Server.java:2376)
> at org.apache.hadoop.ipc.RPC$Server.(RPC.java:1042)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
> at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:887)
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:169)
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
> ... 9 more
> Caused by: java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind0(Native Method)
> at sun.nio.ch.Net.bind(Net.java:444)
> at sun.nio.ch.Net.bind(Net.java:436)
> at 
> 
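
For illustration, a minimal sketch of the underlying idea, assuming only JDK 
classes (this is not the committed HATestUtil change): let the OS hand out a 
port that is actually free instead of deriving one from a fixed 10000/20000 
offset.

{code}
// Probes the kernel for a currently free ephemeral port.  A port obtained
// this way cannot already be bound by another service, unlike a port
// computed as "10000 + default port".
import java.io.IOException;
import java.net.ServerSocket;

public class FreePortProbe {

  static int findFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      socket.setReuseAddress(true);
      return socket.getLocalPort();
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("free port: " + findFreePort());
  }
}
{code}

There is still a small window between closing the probe socket and binding the 
real server, so tests typically retry on BindException as well.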

[jira] [Commented] (YARN-6246) Identifying starved apps does not need the scheduler writelock

2017-05-25 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025578#comment-16025578
 ] 

Daniel Templeton commented on YARN-6246:


LGTM.  Fix your javadoc error and the checkstyle issue, and I'm happy.

> Identifying starved apps does not need the scheduler writelock
> --
>
> Key: YARN-6246
> URL: https://issues.apache.org/jira/browse/YARN-6246
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: YARN-6246.001.patch, YARN-6246.002.patch, 
> YARN-6246.003.patch, YARN-6246.004.patch
>
>
> Currently, the starvation checks are done holding the scheduler writelock. We 
> are probably better off doing this outside. 
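
As a rough sketch of the pattern (a made-up class, not the FairScheduler code): 
the starvation scan only reads state, so it can run under a read lock while 
mutations keep taking the write lock.

{code}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StarvationChecker {

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<Long> appUsages;

  StarvationChecker(List<Long> appUsages) {
    this.appUsages = appUsages;
  }

  // Read-only scan: a read lock is sufficient, so concurrent readers do not
  // block each other and the write lock stays free for scheduler updates.
  boolean hasStarvedApp(long fairShareThreshold) {
    lock.readLock().lock();
    try {
      for (long usage : appUsages) {
        if (usage < fairShareThreshold) {
          return true;
        }
      }
      return false;
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}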



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6652) Merge flow info and flow runs

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6652:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-3368

> Merge flow info and flow runs
> -
>
> Key: YARN-6652
> URL: https://issues.apache.org/jira/browse/YARN-6652
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>
> If a user clicks on a flow from the flow activity page, Flow Run and Flow 
> Info are shown separately. Usually, users want to go to individual flow runs. 
> With the current workflow, the user needs to click on Flow Run because 
> Flow Info is selected by default. 
> Given that Flow Info does not have much information, it'd be a nice 
> improvement if we could show flow info and flow runs together, that is, one 
> section at the top containing the flow info and another section at the bottom 
> containing the flow runs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6652) Merge flow info and flow runs

2017-05-25 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6652:


 Summary: Merge flow info and flow runs
 Key: YARN-6652
 URL: https://issues.apache.org/jira/browse/YARN-6652
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn-ui-v2
Affects Versions: 3.0.0-alpha2
Reporter: Haibo Chen


If a user clicks on a flow from the flow activity page, Flow Run and Flow Info 
are shown separately. Usually, users want to go to individual flow runs. With 
the current workflow, the user needs to click on Flow Run because Flow Info is 
selected by default. 

Given that Flow Info does not have much information, it'd be a nice improvement 
if we could show flow info and flow runs together, that is, one section at the 
top containing the flow info and another section at the bottom containing the 
flow runs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6651) Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to retrieve CPU and memory

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6651:
-
Issue Type: Sub-task  (was: Task)
Parent: YARN-3368

> Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to 
> retrieve CPU and memory 
> --
>
> Key: YARN-6651
> URL: https://issues.apache.org/jira/browse/YARN-6651
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>
> When you click on Flow Activity => {a flow} => flow runs, the web server 
> sends a REST query to the ATSv2 TimelineReaderServer, but it does not include 
> the query param 'metricstoretrieve' to get any metrics back.
> Instead, we should add 
> '?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the 
> query to get CPU and MEMORY back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6651) Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to retrieve CPU and memory

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6651:
-
Description: 
When you click on Flow Activity => \{a flow\} => flow runs, the web server 
sends a REST query to the ATSv2 TimelineReaderServer, but it does not include 
the query param 'metricstoretrieve' to get any metrics back.

Instead, we should add 
'?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the query 
to get CPU and MEMORY back.

  was:
When you click on Flow Activity => {a flow} => flow runs, the web server sends 
a REST query to the ATSv2 TimelineReaderServer, but it does not include the 
query param 'metricstoretrieve' to get any metrics back.

Instead, we should add 
'?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the query 
to get CPU and MEMORY back.


> Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to 
> retrieve CPU and memory 
> --
>
> Key: YARN-6651
> URL: https://issues.apache.org/jira/browse/YARN-6651
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>
> When you click on Flow Activity => \{a flow\} => flow runs, the web server 
> sends a REST query to the ATSv2 TimelineReaderServer, but it does not include 
> the query param 'metricstoretrieve' to get any metrics back.
> Instead, we should add 
> '?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the 
> query to get CPU and MEMORY back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6651) Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to retrieve CPU and memory

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6651:
-
Issue Type: Task  (was: Bug)

> Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to 
> retrieve CPU and memory 
> --
>
> Key: YARN-6651
> URL: https://issues.apache.org/jira/browse/YARN-6651
> Project: Hadoop YARN
>  Issue Type: Task
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>
> When you click on Flow Activity => {a flow} => flow runs, the web server 
> sends a REST query to the ATSv2 TimelineReaderServer, but it does not include 
> the query param 'metricstoretrieve' to get any metrics back.
> Instead, we should add 
> '?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the 
> query to get CPU and MEMORY back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6651) Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to retrieve CPU and memory

2017-05-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6651:
-
Summary: Flow Activity should specify 'metricstoretrieve' in its query to 
ATSv2 to retrieve CPU and memory   (was: Flow Activity page should specify 
'metricstoretrieve' in its query to ATSv2 to get back CPU and memory )

> Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to 
> retrieve CPU and memory 
> --
>
> Key: YARN-6651
> URL: https://issues.apache.org/jira/browse/YARN-6651
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>
> When you click on Flow Activity => {a flow} => flow runs, the web server 
> sends a REST query to the ATSv2 TimelineReaderServer, but it does not include 
> the query param 'metricstoretrieve' to get any metrics back.
> Instead, we should add 
> '?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the 
> query to get CPU and MEMORY back.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6651) Flow Activity page should specify 'metricstoretrieve' in its query to ATSv2 to get back CPU and memory

2017-05-25 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-6651:


 Summary: Flow Activity page should specify 'metricstoretrieve' in 
its query to ATSv2 to get back CPU and memory 
 Key: YARN-6651
 URL: https://issues.apache.org/jira/browse/YARN-6651
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0-alpha2
Reporter: Haibo Chen


When you click on Flow Activity => {a flow} => flow runs, the web server sends 
a REST query to the ATSv2 TimelineReaderServer, but it does not include the 
query param 'metricstoretrieve' to get any metrics back.

Instead, we should add 
'?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the query 
to get CPU and MEMORY back.
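
For illustration, a sketch of building such a query; the host, port, cluster, 
user and flow names are placeholders, and the exact REST path is an assumption 
about the reader's flow-runs endpoint rather than something taken from this 
issue.

{code}
// Appends the metricstoretrieve parameter so that ATSv2 returns CPU and
// memory metrics along with the flow runs.
public class FlowRunQuery {

  public static void main(String[] args) {
    String base = "http://timeline-reader.example.com:8188"
        + "/ws/v2/timeline/clusters/test-cluster/users/alice"
        + "/flows/my-flow/runs";
    String query = base
        + "?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY";
    System.out.println(query);
  }
}
{code}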



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025550#comment-16025550
 ] 

Hadoop QA commented on YARN-6634:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 25 new + 4 unchanged - 36 fixed = 29 total (was 40) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
19s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 145 new + 852 unchanged - 22 fixed = 997 total (was 874) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 14s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6634 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869943/YARN-6634.v3.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 75348e203b80 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2b5ad48 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/16021/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/16021/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 

[jira] [Updated] (YARN-6528) Add JMX metrics for Plan Follower and Agent Placement and Plan Operations

2017-05-25 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-6528:
--
Attachment: YARN-6528.v005.patch

YARN-6528.v005.patch fixes the findbugs error and the test failures. 

This patch has been tested on a single node cluster.

> Add JMX metrics for Plan Follower and Agent Placement and Plan Operations
> -
>
> Key: YARN-6528
> URL: https://issues.apache.org/jira/browse/YARN-6528
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Sean Po
>Assignee: Sean Po
> Attachments: YARN-6528.v001.patch, YARN-6528.v002.patch, 
> YARN-6528.v003.patch, YARN-6528.v004.patch, YARN-6528.v005.patch
>
>
> YARN-1051 introduced a ReservationSystem that enables the YARN RM to handle 
> time explicitly, i.e. users can now "reserve" capacity ahead of time, which is 
> then predictably allocated to them. In order to understand the performance of 
> Rayon in finer detail, YARN-6528 proposes to include JMX metrics in the Plan 
> Follower, Agent Placement and Plan Operations components of Rayon.
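
For illustration, a minimal sketch of exposing one such counter over JMX with 
only the JDK API; the MBean name and the attribute are made up, not the 
metrics added in the patch.

{code}
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class PlanFollowerMetrics implements PlanFollowerMetricsMBean {

  private final AtomicLong planSynchronizations = new AtomicLong();

  @Override
  public long getPlanSynchronizations() {
    return planSynchronizations.get();
  }

  void incrPlanSynchronizations() {
    planSynchronizations.incrementAndGet();
  }

  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    PlanFollowerMetrics metrics = new PlanFollowerMetrics();
    // The counter becomes visible in jconsole/jmc under this object name.
    server.registerMBean(metrics,
        new ObjectName("Hadoop:service=ResourceManager,name=PlanFollowerMetrics"));
    metrics.incrPlanSynchronizations();
  }
}

// Standard MBean convention: interface name = implementation name + "MBean".
interface PlanFollowerMetricsMBean {
  long getPlanSynchronizations();
}
{code}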



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6650) ContainerTokenIdentifier is re-encoded during token verification

2017-05-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025503#comment-16025503
 ] 

Jason Lowe commented on YARN-6650:
--

The decode then re-encode issue is not really specific to 
ContainerTokenIdentifier.  Any token that is re-encoded in such a way where 
unknown fields are either omitted or not guaranteed to be serialized in the 
same order as done by the token creator could be problematic for upgrade 
scenarios.
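
A rough sketch of that principle (the method and class names are made up, not 
the actual SecretManager code): the MAC must be computed over the identifier 
bytes exactly as they arrived over RPC, and decoding is done only to read 
fields such as the key ID.

{code}
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class TokenCheck {

  // identifierBytes: the serialized identifier as received over the wire.
  // password: the signature that accompanied the token.
  // secretKey: the master key looked up via the key ID inside the identifier.
  static boolean verify(byte[] identifierBytes, byte[] password, byte[] secretKey)
      throws Exception {
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(secretKey, "HmacSHA1"));
    // Sign the original bytes; never decode and re-serialize the identifier,
    // because field ordering can differ between RM and NM versions.
    byte[] expected = mac.doFinal(identifierBytes);
    return Arrays.equals(expected, password);
  }
}
{code}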

> ContainerTokenIdentifier is re-encoded during token verification
> 
>
> Key: YARN-6650
> URL: https://issues.apache.org/jira/browse/YARN-6650
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.8.0
>Reporter: Jason Lowe
>
> A ContainerTokenIdentifier is serialized into bytes and signed by the RM 
> secret key.  When the NM needs to verify the identifier, it is decoding the 
> bytes into a ContainerTokenIdentifier to get the key ID then re-encoding the 
> identifier into a byte buffer to hash it with the key.  This is fine as long 
> as the RM and NM both agree how a ContainerTokenIdentifier should be 
> serialized into bytes.
> However when the versions of the RM and NM are different and fields were 
> added to the identifier between those versions then the NM may end up 
> re-serializing the fields in a different order than the RM did, especially 
> when there were gaps in the protocol field IDs that were filled in between 
> the versions. If the fields are reordered during the re-encoding then the 
> bytes will not match the original stream that was signed and the token 
> verification will fail.
> The original token identifier bytes received via RPC need to be used by the 
> verification process, not the bytes generated by re-encoding the identifier.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6582) FSAppAttempt demand can be updated atomically in updateDemand()

2017-05-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025502#comment-16025502
 ] 

Hudson commented on YARN-6582:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11784 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11784/])
YARN-6582. FSAppAttempt demand can be updated atomically in (yufei: rev 
87590090c887829e874a7132be9cf8de061437d6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java


> FSAppAttempt demand can be updated atomically in updateDemand()
> ---
>
> Key: YARN-6582
> URL: https://issues.apache.org/jira/browse/YARN-6582
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6582.001.patch
>
>
> FSAppAttempt#updateDemand first sets demand to 0, and then adds up all the 
> outstanding requests. Instead, we could use another variable tmpDemand to 
> build the new value and atomically replace the demand.
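
A minimal sketch of that pattern with a simplified stand-in for the demand 
value (not the FSAppAttempt code itself):

{code}
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class DemandTracker {

  private final AtomicLong demand = new AtomicLong();

  // Accumulate into a local tmpDemand and publish once, so readers never
  // observe the transient zero of a "reset, then add up" sequence.
  void updateDemand(List<Long> outstandingRequests) {
    long tmpDemand = 0;
    for (long request : outstandingRequests) {
      tmpDemand += request;
    }
    demand.set(tmpDemand);
  }

  long getDemand() {
    return demand.get();
  }
}
{code}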



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6643) TestRMFailover fails rarely due to port conflict

2017-05-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025501#comment-16025501
 ] 

Hudson commented on YARN-6643:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11784 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11784/])
YARN-6643. TestRMFailover fails rarely due to port conflict. Contributed 
(jlowe: rev 3fd6a2da4e537423d1462238e10cc9e1f698d1c2)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/HATestUtil.java


> TestRMFailover fails rarely due to port conflict
> 
>
> Key: YARN-6643
> URL: https://issues.apache.org/jira/browse/YARN-6643
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.9.0, 3.0.0-alpha3, 2.8.2
>
> Attachments: YARN-6643.001.patch
>
>
> We've seen various tests in {{TestRMFailover}} fail very rarely with a 
> message like "org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.io.IOException: ResourceManager failed to start. Final state is 
> STOPPED".  
> After some digging, it turns out that it's due to a port conflict with the 
> embedded ZooKeeper in the tests.  The embedded ZooKeeper uses 
> {{ServerSocketUtil#getPort}} to choose a free port, but the RMs are 
> configured to use 10000 + the default port and 20000 + the default port 
> (e.g. the default port for the RM is 8032, so you'd use 18032 and 28032).
> When I was able to reproduce this, I saw that ZooKeeper was using port 18033, 
> which is 10000 + 8033, the default RM Admin port.  It results in an error 
> like this, causing the RM to be unable to start, and hence the original error 
> message in the test failure:
> {noformat}
> 2017-05-24 01:16:52,735 INFO  service.AbstractService 
> (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:171)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:158)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1147)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$2.run(MiniYARNCluster.java:310)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:720)
> at org.apache.hadoop.ipc.Server.bind(Server.java:482)
> at org.apache.hadoop.ipc.Server$Listener.(Server.java:688)
> at org.apache.hadoop.ipc.Server.(Server.java:2376)
> at org.apache.hadoop.ipc.RPC$Server.(RPC.java:1042)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
> at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:887)
> at 
> 

[jira] [Updated] (YARN-6648) Add FederationStateStore interfaces for Global Policy Generator

2017-05-25 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-6648:
---
Attachment: YARN-6648-YARN-2915.v1.patch

> Add FederationStateStore interfaces for Global Policy Generator
> ---
>
> Key: YARN-6648
> URL: https://issues.apache.org/jira/browse/YARN-6648
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-6648-YARN-2915.v1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5531) UnmanagedAM pool manager for federating application across clusters

2017-05-25 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5531:
---
Attachment: YARN-5531-YARN-2915.v13.patch

> UnmanagedAM pool manager for federating application across clusters
> ---
>
> Key: YARN-5531
> URL: https://issues.apache.org/jira/browse/YARN-5531
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Botong Huang
> Attachments: YARN-5531-YARN-2915.v10.patch, 
> YARN-5531-YARN-2915.v11.patch, YARN-5531-YARN-2915.v12.patch, 
> YARN-5531-YARN-2915.v13.patch, YARN-5531-YARN-2915.v1.patch, 
> YARN-5531-YARN-2915.v2.patch, YARN-5531-YARN-2915.v3.patch, 
> YARN-5531-YARN-2915.v4.patch, YARN-5531-YARN-2915.v5.patch, 
> YARN-5531-YARN-2915.v6.patch, YARN-5531-YARN-2915.v7.patch, 
> YARN-5531-YARN-2915.v8.patch, YARN-5531-YARN-2915.v9.patch
>
>
> One of the main tenets of YARN Federation is to *transparently* scale 
> applications across multiple clusters. This is achieved by running UAMs on 
> behalf of the application on other clusters. This JIRA tracks the addition of 
> an UnmanagedAM pool manager for federating applications across clusters, which 
> will be used by the FederationInterceptor (YARN-3666) that is part of the 
> AMRMProxy pipeline introduced in YARN-2884.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6641) Non-public resource localization on a bad disk causes subsequent containers failure

2017-05-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025489#comment-16025489
 ] 

Jason Lowe commented on YARN-6641:
--

Thanks for updating the patch!  One last thing I missed in the previous review: 
the new getDirsHandler method should be package-private like the other 
only-for-testing methods.


> Non-public resource localization on a bad disk causes subsequent containers 
> failure
> ---
>
> Key: YARN-6641
> URL: https://issues.apache.org/jira/browse/YARN-6641
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-6641.001.patch, YARN-6641.002.patch, 
> YARN-6641.003.patch
>
>
> YARN-3591 added the {{checkLocalResource}} method to the {{isResourcePresent()}} 
> call to allow checking an already localized resource against the list of 
> good/full directories.
> Since LocalResourcesTrackerImpl instantiations for app-level resources and 
> private resources do not use the new constructor, such resources that are on a 
> bad disk will never be checked against the good dirs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025487#comment-16025487
 ] 

Hadoop QA commented on YARN-2113:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  4m 
26s{color} | {color:red} Docker failed to build yetus/hadoop:5970e82. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-2113 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869941/YARN-2113.branch-2.8.0019.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16022/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add cross-user preemption within CapacityScheduler's leaf-queue
> ---
>
> Key: YARN-2113
> URL: https://issues.apache.org/jira/browse/YARN-2113
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Sunil G
> Fix For: 3.0.0-alpha3
>
> Attachments: IntraQueue Preemption-Impact Analysis.pdf, 
> TestNoIntraQueuePreemptionIfBelowUserLimitAndDifferentPrioritiesWithExtraUsers.txt,
>  YARN-2113.0001.patch, YARN-2113.0002.patch, YARN-2113.0003.patch, 
> YARN-2113.0004.patch, YARN-2113.0005.patch, YARN-2113.0006.patch, 
> YARN-2113.0007.patch, YARN-2113.0008.patch, YARN-2113.0009.patch, 
> YARN-2113.0010.patch, YARN-2113.0011.patch, YARN-2113.0012.patch, 
> YARN-2113.0013.patch, YARN-2113.0014.patch, YARN-2113.0015.patch, 
> YARN-2113.0016.patch, YARN-2113.0017.patch, YARN-2113.0018.patch, 
> YARN-2113.0019.patch, YARN-2113.apply.onto.0012.ericp.patch, 
> YARN-2113.branch-2.8.0019.patch, YARN-2113 Intra-QueuePreemption 
> Behavior.pdf, YARN-2113.v0.patch
>
>
> Preemption today only works across queues and moves around resources across 
> queues per demand and usage. We should also have user-level preemption within 
> a queue, to balance capacity across users in a predictable manner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6650) ContainerTokenIdentifier is re-encoded during token verification

2017-05-25 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6650:


 Summary: ContainerTokenIdentifier is re-encoded during token 
verification
 Key: YARN-6650
 URL: https://issues.apache.org/jira/browse/YARN-6650
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.8.0
Reporter: Jason Lowe


A ContainerTokenIdentifier is serialized into bytes and signed by the RM secret 
key.  When the NM needs to verify the identifier, it is decoding the bytes into 
a ContainerTokenIdentifier to get the key ID then re-encoding the identifier 
into a byte buffer to hash it with the key.  This is fine as long as the RM and 
NM both agree how a ContainerTokenIdentifier should be serialized into bytes.

However when the versions of the RM and NM are different and fields were added 
to the identifier between those versions then the NM may end up re-serializing 
the fields in a different order than the RM did, especially when there were 
gaps in the protocol field IDs that were filled in between the versions. If the 
fields are reordered during the re-encoding then the bytes will not match the 
original stream that was signed and the token verification will fail.

The original token identifier bytes received via RPC need to be used by the 
verification process, not the bytes generated by re-encoding the identifier.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-25 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025463#comment-16025463
 ] 

Giovanni Matteo Fumarola commented on YARN-6634:


Fixed the Yetus warnings. TestRMWebServicesAppsModification failed due to the 
difference between appId and appid. TestRMRestart is not related.

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch, YARN-6634.v1.patch, 
> YARN-6634.v2.patch, YARN-6634.v3.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface defined. 
> This makes it painful to build either clients or extension components like 
> Router (YARN-5412) that expose REST interfaces themselves. This JIRA proposes 
> adding an RM WebServices protocol similar to the one we have for RPC, i.e. 
> {{ApplicationClientProtocol}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-25 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-6634:
---
Attachment: YARN-6634.v3.patch

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch, YARN-6634.v1.patch, 
> YARN-6634.v2.patch, YARN-6634.v3.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface defined. 
> This makes it painful to build either clients or extension components like 
> Router (YARN-5412) that expose REST interfaces themselves. This JIRA proposes 
> adding an RM WebServices protocol similar to the one we have for RPC, i.e. 
> {{ApplicationClientProtocol}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6484) [Documentation] Documenting the YARN Federation feature

2017-05-25 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025456#comment-16025456
 ] 

Subru Krishnan commented on YARN-6484:
--

Thanks [~curino] for the solid documentation. I have a few minor comments:
  * Can you please add a sequence diagram for the *Job execution flow* as 
that'll make it much easier to understand.
  * It'll be good if we can also reuse the {{AMRMProxy}} internals diagram from 
our Hadoop summit talk.
  * We should call out that the _yarn.resourcemanager.cluster-id_ is the same 
as what's used for RM HA, i.e. we simply reuse the config.
  * I feel we should clarify that _yarn.resourcemanager.epoch_ is unique per 
sub-cluster (i.e. per _yarn.resourcemanager.cluster-id_), and that increments of 
1000 will provide practical safety from ContainerId clashes.
  * Nit: {{Federation.md}} has a whitespace at the end.

> [Documentation] Documenting the YARN Federation feature
> ---
>
> Key: YARN-6484
> URL: https://issues.apache.org/jira/browse/YARN-6484
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Affects Versions: YARN-2915
>Reporter: Subru Krishnan
>Assignee: Carlo Curino
> Attachments: YARN-6484-YARN-2915.v0.patch, 
> YARN-6484-YARN-2915.v1.patch, YARN-6484-YARN-2915.v2.patch
>
>
> We should document the high level design and configuration to enable YARN 
> Federation



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-25 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025452#comment-16025452
 ] 

Vrushali C commented on YARN-6555:
--

No more comments, patch looks good. 
+1.

[~haibo.chen] please feel free to commit it. 

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch, 
> YARN-6555.003.patch
>
>
> If timeline service v2 is enabled and NM is restarted with recovery enabled, 
> then NM fails to start and throws an error as  "flow context can't be null".
> This is happening because the flow context did not exist before but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if flow context existed before but since we are 
> not persisting it / reading it during 
> ContainerManagerImpl#recoverApplication, it does not get passed in to 
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue

2017-05-25 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-2113:
-
Attachment: YARN-2113.branch-2.8.0019.patch

[~sunilg], I took a shot at backporting YARN-2113.0019.patch to branch-2.8, 
decoupling it from YARN-5889. I am still running through the tests, but this 
seems to work fairly well so far.

> Add cross-user preemption within CapacityScheduler's leaf-queue
> ---
>
> Key: YARN-2113
> URL: https://issues.apache.org/jira/browse/YARN-2113
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Sunil G
> Fix For: 3.0.0-alpha3
>
> Attachments: IntraQueue Preemption-Impact Analysis.pdf, 
> TestNoIntraQueuePreemptionIfBelowUserLimitAndDifferentPrioritiesWithExtraUsers.txt,
>  YARN-2113.0001.patch, YARN-2113.0002.patch, YARN-2113.0003.patch, 
> YARN-2113.0004.patch, YARN-2113.0005.patch, YARN-2113.0006.patch, 
> YARN-2113.0007.patch, YARN-2113.0008.patch, YARN-2113.0009.patch, 
> YARN-2113.0010.patch, YARN-2113.0011.patch, YARN-2113.0012.patch, 
> YARN-2113.0013.patch, YARN-2113.0014.patch, YARN-2113.0015.patch, 
> YARN-2113.0016.patch, YARN-2113.0017.patch, YARN-2113.0018.patch, 
> YARN-2113.0019.patch, YARN-2113.apply.onto.0012.ericp.patch, 
> YARN-2113.branch-2.8.0019.patch, YARN-2113 Intra-QueuePreemption 
> Behavior.pdf, YARN-2113.v0.patch
>
>
> Preemption today only works across queues and moves around resources across 
> queues per demand and usage. We should also have user-level preemption within 
> a queue, to balance capacity across users in a predictable manner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6582) FSAppAttempt demand can be updated atomically in updateDemand()

2017-05-25 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025424#comment-16025424
 ] 

Yufei Gu commented on YARN-6582:


Committed to trunk and branch-2. Thanks [~kasha] for the patch. 

> FSAppAttempt demand can be updated atomically in updateDemand()
> ---
>
> Key: YARN-6582
> URL: https://issues.apache.org/jira/browse/YARN-6582
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6582.001.patch
>
>
> FSAppAttempt#updateDemand first sets demand to 0, and then adds up all the 
> outstanding requests. Instead, we could use another variable tmpDemand to 
> build the new value and atomically replace the demand.
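
As a rough illustration of the approach described above, here is a hedged sketch with made-up names (a simplified request list and an AtomicLong demand), not the actual FSAppAttempt code:

{code}
// Sketch only: build the new demand in a local variable and publish it once,
// instead of zeroing the shared field and adding into it while readers watch.
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class DemandSketch {
  private final AtomicLong demandMb = new AtomicLong();

  void updateDemand(List<Long> outstandingAsksMb) {
    long tmpDemand = 0;                 // local accumulator, never visible half-built
    for (long askMb : outstandingAsksMb) {
      tmpDemand += askMb;
    }
    demandMb.set(tmpDemand);            // single atomic replace of the old value
  }
}
{code}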



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-05-25 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025425#comment-16025425
 ] 

Eric Payne commented on YARN-5892:
--

Thank you for the reviews.
bq. allUsersTimesWeights will be less than 1. I think in this case UL value is 
higher

[~sunilg], I think this is similar to the question I answered 
[above|https://issues.apache.org/jira/browse/YARN-5892?focusedCommentId=15972782=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15972782],
 but I'll restate it here for the sake of clarity.

Having a combined sum of weights < 1 does not cause UL to be too large. This is 
because {{userLimitResource}} (the return value of {{computeUserLimit}}) is 
only ever used by {{getComputedResourceLimitFor\[Active|All\]Users}}, which 
then multiplies the value of {{userLimitResource}} by the appropriate user's 
weight before returning it. This will result in the correct value of userLimit 
for each specific user. When the sum of active user(s)'s weight(s) is < 1, then 
it is true that {{userLimitResource}} is larger than the actual number of 
resources used. However, {{userLimitResource}} is just an intermediate value.
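
To make that concrete, here is a small hedged arithmetic sketch (illustrative names and numbers only, and it assumes for the sake of the example that the intermediate value is inflated in proportion to the weight sum; it is not the CapacityScheduler implementation):

{code}
// Illustration only: the intermediate limit is scaled back down by the
// user's own weight before it is handed to that user.
public class UserLimitWeightExample {
  public static void main(String[] args) {
    double queueResources = 100.0;
    double sumActiveUsersTimesWeights = 0.5;   // one active user with weight 0.5
    // Assumed inflation of the intermediate value when the weight sum is < 1.
    double userLimitResource = queueResources / sumActiveUsersTimesWeights; // 200.0
    double janeWeight = 0.5;
    // The per-user limit that is actually used comes out correct.
    System.out.println(userLimitResource * janeWeight);                     // 100.0
  }
}
{code}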

bq. In UserManager, do we also need to lock while updating 
"activeUsersTimesWeights". 
Can you please clarify where you see it being read or written outside the lock? 
I think the code is within locks everywhere it is used.


bq. 1) Could you move CapacitySchedulerQueueManager#updateUserWeights to 
LeafQueue#setupQueueConfigs.

[~leftnoteasy], good optimization. I will make this change, do testing and 
await [~sunilg]'s response before submitting a new patch.


> Capacity Scheduler: Support user-specific minimum user limit percent
> 
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>     <value>25</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>     <value>75</value>
>   </property>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-25 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025390#comment-16025390
 ] 

Haibo Chen commented on YARN-6555:
--

Thanks [~rohithsharma] for the explanation, and updating the patch!  +1 on the 
latest patch. [~vrushalic] do you have any other comments?

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch, 
> YARN-6555.003.patch
>
>
> If timeline service v2 is enabled and NM is restarted with recovery enabled, 
> then NM fails to start and throws an error as  "flow context can't be null".
> This is happening because the flow context did not exist before but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if flow context existed before but since we are 
> not persisting it / reading it during 
> ContainerManagerImpl#recoverApplication, it does not get passed in to 
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6582) FSAppAttempt demand can be updated atomically in updateDemand()

2017-05-25 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025381#comment-16025381
 ] 

Yufei Gu commented on YARN-6582:


Thanks [~kasha] for working on this. The patch looks good to me. Both 
{{getSchedulerKeys()}} and {{getPendingAsk}} are fine after removing the write 
lock. +1.

> FSAppAttempt demand can be updated atomically in updateDemand()
> ---
>
> Key: YARN-6582
> URL: https://issues.apache.org/jira/browse/YARN-6582
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: YARN-6582.001.patch
>
>
> FSAppAttempt#updateDemand first sets demand to 0, and then adds up all the 
> outstanding requests. Instead, we could use another variable tmpDemand to 
> build the new value and atomically replace the demand.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6649) RollingLevelDBTimelineServer throws RuntimeException if object decoding ever fails runtime exception

2017-05-25 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created YARN-6649:
-

 Summary: RollingLevelDBTimelineServer throws RuntimeException if 
object decoding ever fails runtime exception
 Key: YARN-6649
 URL: https://issues.apache.org/jira/browse/YARN-6649
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Critical






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-4925) ContainerRequest in AMRMClient, application should be able to specify nodes/racks together with nodeLabelExpression

2017-05-25 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024866#comment-16024866
 ] 

Bibin A Chundatt edited comment on YARN-4925 at 5/25/17 8:39 PM:
-

We should also backport dependent YARN-4140


was (Author: bibinchundatt):
We should also port dependent YARN-4140

> ContainerRequest in AMRMClient, application should be able to specify 
> nodes/racks together with nodeLabelExpression
> ---
>
> Key: YARN-4925
> URL: https://issues.apache.org/jira/browse/YARN-4925
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: release-blocker
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: 0001-YARN-4925.patch, 0002-YARN-4925.patch, 
> YARN-4925-branch-2.7.001.patch
>
>
> Currently, with node labels, AMRMClient is not able to specify node labels 
> together with node/rack requests. For applications like Spark, NODE_LOCAL 
> requests cannot be made with a label expression.
> As per the check in  {{AMRMClientImpl#checkNodeLabelExpression}}
> {noformat}
> // Don't allow specify node label against ANY request
> if ((containerRequest.getRacks() != null && 
> (!containerRequest.getRacks().isEmpty()))
> || 
> (containerRequest.getNodes() != null && 
> (!containerRequest.getNodes().isEmpty()))) {
>   throw new InvalidContainerRequestException(
>   "Cannot specify node label with rack and node");
> }
> {noformat}
> In {{AppSchedulingInfo#updateResourceRequests}} we reset the labels to that of 
> OFF-SWITCH. 
> The above check is not required for the ContainerRequest ask. /cc [~wangda], 
> thank you for confirming.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6641) Non-public resource localization on a bad disk causes subsequent containers failure

2017-05-25 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025341#comment-16025341
 ] 

Kuhu Shukla commented on YARN-6641:
---

[~jlowe], request for some more comments. Thanks a lot!

> Non-public resource localization on a bad disk causes subsequent containers 
> failure
> ---
>
> Key: YARN-6641
> URL: https://issues.apache.org/jira/browse/YARN-6641
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-6641.001.patch, YARN-6641.002.patch, 
> YARN-6641.003.patch
>
>
> YARN-3591 added the {{checkLocalResource}} method to {{isResourcePresent()}} 
> call to allow checking an already localized resource against the list of 
> good/full directories.
> Since LocalResourcesTrackerImpl instantiations for app level resources and 
> private resources do not use the new constructor, such resources that are on 
> bad disk will never be checked against good dirs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6648) Add FederationStateStore interfaces for Global Policy Generator

2017-05-25 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-6648:
-
Issue Type: Sub-task  (was: Task)
Parent: YARN-5597

> Add FederationStateStore interfaces for Global Policy Generator
> ---
>
> Key: YARN-6648
> URL: https://issues.apache.org/jira/browse/YARN-6648
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025226#comment-16025226
 ] 

Hadoop QA commented on YARN-6555:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
42s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 14s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 65 unchanged - 0 fixed = 66 total (was 65) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
17s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6555 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869901/YARN-6555.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux ee85dfafbee4 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 29b7df9 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/16020/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/16020/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16020/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 

[jira] [Created] (YARN-6648) Add FederationStateStore interfaces for Global Policy Generator

2017-05-25 Thread Botong Huang (JIRA)
Botong Huang created YARN-6648:
--

 Summary: Add FederationStateStore interfaces for Global Policy 
Generator
 Key: YARN-6648
 URL: https://issues.apache.org/jira/browse/YARN-6648
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Botong Huang
Assignee: Botong Huang
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-25 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-6555:

Attachment: YARN-6555.003.patch

Updated the patch to address the review comment from Haibo.

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch, 
> YARN-6555.003.patch
>
>
> If timeline service v2 is enabled and NM is restarted with recovery enabled, 
> then NM fails to start and throws an error as  "flow context can't be null".
> This is happening because the flow context did not exist before but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if flow context existed before but since we are 
> not persisting it / reading it during 
> ContainerManagerImpl#recoverApplication, it does not get passed in to 
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6641) Non-public resource localization on a bad disk causes subsequent containers failure

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025057#comment-16025057
 ] 

Hadoop QA commented on YARN-6641:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
46s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 250 unchanged - 1 fixed = 250 total (was 251) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
17s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6641 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869891/YARN-6641.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux cec92e0cea29 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 
16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2e41f88 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/16019/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16019/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16019/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Non-public resource localization on a bad disk causes subsequent 

[jira] [Comment Edited] (YARN-6111) Rumen input doesn't work in SLS

2017-05-25 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024979#comment-16024979
 ] 

Yufei Gu edited comment on YARN-6111 at 5/25/17 5:18 PM:
-

[~yoyo], the patch file is a diff file, which describes the changes made to 
the repo. Try playing with the 'git' tool, especially how to generate a patch 
and how to apply one. 

SLS in Hadoop-2.7.3 may be broken in some way, please use the trunk instead. 
YARN-6608 tries to backport all recent SLS improvements from trunk to branch-2. 
You can try branch-2 after that.


was (Author: yufeigu):
[~yoyo], The patch file is a diff, which tells the different made for the repo. 
Try to play 'git', especially how to generate patch and apply patch. 

SLS in Hadoop-2.7.3 may be broken in some way, use the trunk instead. YARN-6608 
tries to backport all SLS improvements from trunk to branch-2. You can try 
branch-2 after that.

> Rumen input doesn't work in SLS
> --
>
> Key: YARN-6111
> URL: https://issues.apache.org/jira/browse/YARN-6111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
> Environment: ubuntu14.0.4 os
>Reporter: YuJie Huang
>Assignee: Yufei Gu
>  Labels: test
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6111.001.patch
>
>
> Hi guys,
> I am trying to learn the use of SLS.
> I would like to get the file realtimetrack.json, but it only 
> contains "[]" at the end of a simulation. This is the command I use to 
> run the instance:
> HADOOP_HOME $ bin/slsrun.sh --input-rumen=sample-data/2jobsmin-rumen-jh.json 
> --output-dir=sample-data 
> All other files, including metrics, appear to be properly populated. I can 
> also trace it on the web at http://localhost:10001/simulate
> Can someone help?
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6111) Rumen input doesn't work in SLS

2017-05-25 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024979#comment-16024979
 ] 

Yufei Gu commented on YARN-6111:


[~yoyo], The patch file is a diff, which tells the different made for the repo. 
Try to play 'git', especially how to generate patch and apply patch. 

SLS in Hadoop-2.7.3 may be broken in some way, use the trunk instead. YARN-6608 
tries to backport all SLS improvements from trunk to branch-2. You can try 
branch-2 after that.

> Rumen input doesn't work in SLS
> --
>
> Key: YARN-6111
> URL: https://issues.apache.org/jira/browse/YARN-6111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
> Environment: ubuntu14.0.4 os
>Reporter: YuJie Huang
>Assignee: Yufei Gu
>  Labels: test
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6111.001.patch
>
>
> Hi guys,
> I am trying to learn the use of SLS.
> I would like to get the file realtimetrack.json, but it only 
> contains "[]" at the end of a simulation. This is the command I use to 
> run the instance:
> HADOOP_HOME $ bin/slsrun.sh --input-rumen=sample-data/2jobsmin-rumen-jh.json 
> --output-dir=sample-data 
> All other files, including metrics, appear to be properly populated. I can 
> also trace it on the web at http://localhost:10001/simulate
> Can someone help?
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-6644) The demand of FSAppAttempt may be negative

2017-05-25 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu resolved YARN-6644.

Resolution: Duplicate

> The demand of FSAppAttempt may be negative 
> ---
>
> Key: YARN-6644
> URL: https://issues.apache.org/jira/browse/YARN-6644
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
> Environment: CentOS release 6.7 (Final)
>Reporter: JackZhou
> Fix For: 2.9.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6641) Non-public resource localization on a bad disk causes subsequent containers failure

2017-05-25 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated YARN-6641:
--
Attachment: YARN-6641.003.patch

Thanks [~jlowe] for the quick response. I have updated the patch.

> Non-public resource localization on a bad disk causes subsequent containers 
> failure
> ---
>
> Key: YARN-6641
> URL: https://issues.apache.org/jira/browse/YARN-6641
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-6641.001.patch, YARN-6641.002.patch, 
> YARN-6641.003.patch
>
>
> YARN-3591 added the {{checkLocalResource}} method to {{isResourcePresent()}} 
> call to allow checking an already localized resource against the list of 
> good/full directories.
> Since LocalResourcesTrackerImpl instantiations for app level resources and 
> private resources do not use the new constructor, such resources that are on 
> bad disk will never be checked against good dirs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6643) TestRMFailover fails rarely due to port conflict

2017-05-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024904#comment-16024904
 ] 

Jason Lowe commented on YARN-6643:
--

+1 lgtm.  The unit tests that failed don't even call the code that was changed. 
 I was able to reproduce one of the tests exiting early and filed YARN-6647.  
I'll commit this later today if there are no objections.

> TestRMFailover fails rarely due to port conflict
> 
>
> Key: YARN-6643
> URL: https://issues.apache.org/jira/browse/YARN-6643
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6643.001.patch
>
>
> We've seen various tests in {{TestRMFailover}} fail very rarely with a 
> message like "org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.io.IOException: ResourceManager failed to start. Final state is 
> STOPPED".  
> After some digging, it turns out that it's due to a port conflict with the 
> embedded ZooKeeper in the tests.  The embedded ZooKeeper uses 
> {{ServerSocketUtil#getPort}} to choose a free port, but the RMs are 
> configured to use ports with 1 and 2 prepended to the default port (e.g. the 
> default port for the RM is 8032, so you'd use 18032 and 28032).
> When I was able to reproduce this, I saw that ZooKeeper was using port 18033, 
> which is 1 + 8033, the default RM Admin port.  It results in an error 
> like this, causing the RM to be unable to start, and hence the original error 
> message in the test failure:
> {noformat}
> 2017-05-24 01:16:52,735 INFO  service.AbstractService 
> (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:171)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:158)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1147)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$2.run(MiniYARNCluster.java:310)
> Caused by: java.net.BindException: Problem binding to [0.0.0.0:18033] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:720)
> at org.apache.hadoop.ipc.Server.bind(Server.java:482)
> at org.apache.hadoop.ipc.Server$Listener.(Server.java:688)
> at org.apache.hadoop.ipc.Server.(Server.java:2376)
> at org.apache.hadoop.ipc.RPC$Server.(RPC.java:1042)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:535)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:510)
> at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:887)
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:169)
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
> ... 9 more
> Caused by: java.net.BindException: Address already in use
> at 

[jira] [Commented] (YARN-6647) ZKRMStateStore can crash during shutdown due to InterruptedException

2017-05-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024901#comment-16024901
 ] 

Jason Lowe commented on YARN-6647:
--

Sample test output showing the mishandling of InterruptedException and a forced 
exit of the RM as a result.  In this case it causes tests to error because the 
JVM exits without notifying the test framework.
{noformat}
2017-05-25 10:23:45,835 INFO  [Thread-50] zookeeper.JUnit4ZKTestRunner 
(JUnit4ZKTestRunner.java:evaluate(78)) - FINISHED TEST METHOD 
testKillAppWhenFailoverHappensAtNewState
2017-05-25 10:23:45,835 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: ResourceManager entered state 
STOPPED
2017-05-25 10:23:45,835 DEBUG [main] service.CompositeService 
(CompositeService.java:serviceStop(129)) - ResourceManager: stopping services, 
size=3
2017-05-25 10:23:45,835 DEBUG [main] service.CompositeService 
(CompositeService.java:stop(151)) - Stopping service #2: Service Dispatcher in 
state Dispatcher: STARTED
2017-05-25 10:23:45,835 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: Dispatcher entered state 
STOPPED
2017-05-25 10:23:45,835 INFO  
[org.apache.hadoop.util.JvmPauseMonitor$Monitor@233aac83] util.JvmPauseMonitor 
(JvmPauseMonitor.java:run(188)) - Starting JVM pause monitor
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService 
(CompositeService.java:stop(151)) - Stopping service #1: Service 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter in 
state org.apache.hadoop.yarn.server.res
ourcemanager.ahs.RMApplicationHistoryWriter: STARTED
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter 
entered state STOPPED
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService 
(CompositeService.java:serviceStop(129)) - 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter: 
stopping services, size=0
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService 
(CompositeService.java:stop(151)) - Stopping service #0: Service 
org.apache.hadoop.yarn.server.resourcemanager.AdminService in state 
org.apache.hadoop.yarn.server.resourcemanager.Admin
Service: STARTED
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: 
org.apache.hadoop.yarn.server.resourcemanager.AdminService entered state STOPPED
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService 
(CompositeService.java:serviceStop(129)) - 
org.apache.hadoop.yarn.server.resourcemanager.AdminService: stopping services, 
size=0
2017-05-25 10:23:45,836 INFO  [main] resourcemanager.ResourceManager 
(ResourceManager.java:transitionToStandby(1191)) - Already in standby state
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: ResourceManager entered state 
STOPPED
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService 
(CompositeService.java:serviceStop(129)) - ResourceManager: stopping services, 
size=3
2017-05-25 10:23:45,836 DEBUG [main] service.CompositeService 
(CompositeService.java:stop(151)) - Stopping service #2: Service 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter in 
state 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter: 
STARTED
2017-05-25 10:23:45,836 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter 
entered state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService 
(CompositeService.java:serviceStop(129)) - 
org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter: 
stopping services, size=0
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService 
(CompositeService.java:stop(151)) - Stopping service #1: Service 
org.apache.hadoop.yarn.server.resourcemanager.AdminService in state 
org.apache.hadoop.yarn.server.resourcemanager.AdminService: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: 
org.apache.hadoop.yarn.server.resourcemanager.AdminService entered state STOPPED
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService 
(CompositeService.java:serviceStop(129)) - 
org.apache.hadoop.yarn.server.resourcemanager.AdminService: stopping services, 
size=0
2017-05-25 10:23:45,837 DEBUG [main] service.CompositeService 
(CompositeService.java:stop(151)) - Stopping service #0: Service Dispatcher in 
state Dispatcher: STARTED
2017-05-25 10:23:45,837 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(452)) - Service: Dispatcher entered state 
STOPPED
2017-05-25 10:23:45,837 INFO  [main] 

[jira] [Created] (YARN-6647) ZKRMStateStore can crash during shutdown due to InterruptedException

2017-05-25 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6647:


 Summary: ZKRMStateStore can crash during shutdown due to 
InterruptedException
 Key: YARN-6647
 URL: https://issues.apache.org/jira/browse/YARN-6647
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Jason Lowe


Noticed some tests were failing due to the JVM shutting down early.  I was able 
to reproduce this occasionally with TestKillApplicationWithRMHA.  Stacktrace to 
follow.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4925) ContainerRequest in AMRMClient, application should be able to specify nodes/racks together with nodeLabelExpression

2017-05-25 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024866#comment-16024866
 ] 

Bibin A Chundatt commented on YARN-4925:


We should also port dependent YARN-4140

> ContainerRequest in AMRMClient, application should be able to specify 
> nodes/racks together with nodeLabelExpression
> ---
>
> Key: YARN-4925
> URL: https://issues.apache.org/jira/browse/YARN-4925
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: release-blocker
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: 0001-YARN-4925.patch, 0002-YARN-4925.patch, 
> YARN-4925-branch-2.7.001.patch
>
>
> Currently, with node labels, AMRMClient is not able to specify node labels 
> together with node/rack requests. For applications like Spark, NODE_LOCAL 
> requests cannot be made with a label expression.
> As per the check in  {{AMRMClientImpl#checkNodeLabelExpression}}
> {noformat}
> // Don't allow specify node label against ANY request
> if ((containerRequest.getRacks() != null && 
> (!containerRequest.getRacks().isEmpty()))
> || 
> (containerRequest.getNodes() != null && 
> (!containerRequest.getNodes().isEmpty()))) {
>   throw new InvalidContainerRequestException(
>   "Cannot specify node label with rack and node");
> }
> {noformat}
> In {{AppSchedulingInfo#updateResourceRequests}} we reset the labels to that of 
> OFF-SWITCH. 
> The above check is not required for the ContainerRequest ask. /cc [~wangda], 
> thank you for confirming.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-6390) Support service assembly

2017-05-25 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi resolved YARN-6390.
--
Resolution: Duplicate

This will be completed as part of YARN-6613.

> Support service assembly
> 
>
> Key: YARN-6390
> URL: https://issues.apache.org/jira/browse/YARN-6390
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Reporter: Jian He
>
> An assembly is a hierarchical app-of-apps. Say, an assembly could be a 
> combination of zookeeper + hbase + kafka. 
> This functionality was there in slider, need to re-implement this in the new 
> yarn-native-service framework.
> Also, the new yarn-native-service UI needs to account for the assembly 
> concept.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6641) Non-public resource localization on a bad disk causes subsequent containers failure

2017-05-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024818#comment-16024818
 ] 

Jason Lowe commented on YARN-6641:
--

Thanks for the patch!  Patch looks good overall.

At this point the only callers of the LocalResourcesTrackerImpl constructor 
that omits the directory handler are tests, and I think it would be better to 
simply remove this constructor and update the few places in the tests to 
explicitly pass null.  That way future maintainers won't be lulled into 
thinking it's OK to call the constructor without a handler, since it clearly 
needs a dir handler to properly deal with resources that get orphaned on bad 
disks.
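
A rough sketch of that suggestion, using hypothetical names rather than the real LocalResourcesTrackerImpl signatures:

{code}
// Sketch only: keep a single constructor that takes the dir handler, so tests
// that truly want no handler must pass null explicitly and visibly.
class TrackerSketch {
  interface DirsHandler { boolean isGoodDir(String path); }

  private final DirsHandler dirsHandler;   // null only by an explicit caller choice

  TrackerSketch(DirsHandler dirsHandler) {
    this.dirsHandler = dirsHandler;
  }

  // Simplified stand-in for the "is this localized resource still usable" check.
  boolean resourceOnGoodDir(String localPath) {
    return dirsHandler == null || dirsHandler.isGoodDir(localPath);
  }
}
{code}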


> Non-public resource localization on a bad disk causes subsequent containers 
> failure
> ---
>
> Key: YARN-6641
> URL: https://issues.apache.org/jira/browse/YARN-6641
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-6641.001.patch, YARN-6641.002.patch
>
>
> YARN-3591 added the {{checkLocalResource}} method to {{isResourcePresent()}} 
> call to allow checking an already localized resource against the list of 
> good/full directories.
> Since LocalResourcesTrackerImpl instantiations for app level resources and 
> private resources do not use the new constructor, such resources that are on 
> bad disk will never be checked against good dirs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6641) Non-public resource localization on a bad disk causes subsequent containers failure

2017-05-25 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024814#comment-16024814
 ] 

Kuhu Shukla commented on YARN-6641:
---

Minor checkstyle issues; I will fix them in upcoming patches. Requesting review of 
the approach and any concerns with this change. [~jlowe] / [~nroberts]. 

> Non-public resource localization on a bad disk causes subsequent containers 
> failure
> ---
>
> Key: YARN-6641
> URL: https://issues.apache.org/jira/browse/YARN-6641
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-6641.001.patch, YARN-6641.002.patch
>
>
> YARN-3591 added the {{checkLocalResource}} method to {{isResourcePresent()}} 
> call to allow checking an already localized resource against the list of 
> good/full directories.
> Since LocalResourcesTrackerImpl instantiations for app level resources and 
> private resources do not use the new constructor, such resources that are on 
> bad disk will never be checked against good dirs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6646) Modifier 'static' is redundant for inner enums less

2017-05-25 Thread ZhangBing Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024621#comment-16024621
 ] 

ZhangBing Lin commented on YARN-6646:
-

Looking through the logs, the unit test failure and the FindBugs warnings are not 
caused by this patch.


> Modifier 'static' is redundant for inner enums less
> ---
>
> Key: YARN-6646
> URL: https://issues.apache.org/jira/browse/YARN-6646
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: YARN-6646.001.patch
>
>
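
For context, the issue title refers to a basic Java rule: an enum declared inside a class is implicitly static, so the modifier adds nothing. A minimal illustration with made-up names:

{code}
// Both declarations below compile to exactly the same thing; "static" is redundant.
public class ContainerStateHolder {
  static enum StateWithRedundantModifier { NEW, RUNNING, DONE }

  enum State { NEW, RUNNING, DONE }   // preferred: same semantics, less noise
}
{code}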




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6646) Modifier 'static' is redundant for inner enums less

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024617#comment-16024617
 ] 

Hadoop QA commented on YARN-6646:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
55s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
36s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
50s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 5 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
40s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
58s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 53s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 147 unchanged - 14 fixed = 149 total (was 161) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
57s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
8s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
50s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 58s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
16s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
39s{color} | {color:green} hadoop-yarn-applications-distributedshell in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}159m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit 

[jira] [Commented] (YARN-6644) The demand of FSAppAttempt may be negative

2017-05-25 Thread JackZhou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024582#comment-16024582
 ] 

JackZhou commented on YARN-6644:


[~Feng Yuan] Thanks a lot.

> The demand of FSAppAttempt may be negative 
> ---
>
> Key: YARN-6644
> URL: https://issues.apache.org/jira/browse/YARN-6644
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
> Environment: CentOS release 6.7 (Final)
>Reporter: JackZhou
> Fix For: 2.9.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6644) The demand of FSAppAttempt may be negative

2017-05-25 Thread JackZhou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024581#comment-16024581
 ] 

JackZhou commented on YARN-6644:


Thank you, Yufei. I found that my problem is the same as YARN-6020, so my 
problem is solved. Thanks a lot.

> The demand of FSAppAttempt may be negative 
> ---
>
> Key: YARN-6644
> URL: https://issues.apache.org/jira/browse/YARN-6644
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
> Environment: CentOS release 6.7 (Final)
>Reporter: JackZhou
> Fix For: 2.9.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6644) The demand of FSAppAttempt may be negative

2017-05-25 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024538#comment-16024538
 ] 

Feng Yuan commented on YARN-6644:
-

Hi [~jack zhou], this is because, before 2.8, the Resource tracked in 
FSAppAttempt#demand stored memory as an int, so the accumulated demand can 
overflow Integer.MAX_VALUE and turn negative.
Check this issue: YARN-6020.
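
For illustration only, here is a minimal, self-contained sketch of that overflow 
(the numbers are made up and this is not the actual FSAppAttempt#updateDemand 
code):

{code}
// Hypothetical demo of the 32-bit overflow; not real scheduler code.
public class DemandOverflowDemo {
  public static void main(String[] args) {
    int demandMB = 0;                    // pre-2.8: memory demand kept in an int
    int containerMB = 8 * 1024;          // 8 GB per pending container
    for (int i = 0; i < 300_000; i++) {  // a large backlog of pending containers
      demandMB += containerMB;           // silently wraps past Integer.MAX_VALUE
    }
    System.out.println("int demand (MB):  " + demandMB);     // negative after overflow
    long safeDemandMB = (long) containerMB * 300_000L;       // 64-bit arithmetic
    System.out.println("long demand (MB): " + safeDemandMB); // stays positive
  }
}
{code}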

> The demand of FSAppAttempt may be negative 
> ---
>
> Key: YARN-6644
> URL: https://issues.apache.org/jira/browse/YARN-6644
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
> Environment: CentOS release 6.7 (Final)
>Reporter: JackZhou
> Fix For: 2.9.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6646) Modifier 'static' is redundant for inner enums

2017-05-25 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated YARN-6646:

Affects Version/s: 3.0.0-alpha3

> Modifier 'static' is redundant for inner enums
> ---
>
> Key: YARN-6646
> URL: https://issues.apache.org/jira/browse/YARN-6646
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: YARN-6646.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6141) ppc64le on Linux doesn't trigger __linux get_executable codepath

2017-05-25 Thread Sonia Garudi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024445#comment-16024445
 ] 

Sonia Garudi commented on YARN-6141:


Thanks [~ajisakaa] .

> ppc64le on Linux doesn't trigger __linux get_executable codepath
> 
>
> Key: YARN-6141
> URL: https://issues.apache.org/jira/browse/YARN-6141
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
> Environment: $ uname -a
> Linux f8eef0f055cf 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Sonia Garudi
>Assignee: Ayappan
>  Labels: ppc64le
> Fix For: 2.9.0, 2.8.1, 3.0.0-alpha3
>
> Attachments: YARN-6141.patch
>
>
> On ppc64le architecture, the build fails in the 'Hadoop YARN NodeManager' 
> project with the below error:
> Cannot safely determine executable path with a relative HADOOP_CONF_DIR on 
> this operating system.
> [WARNING]  #error Cannot safely determine executable path with a relative 
> HADOOP_CONF_DIR on this operating system.
> [WARNING]   ^
> [WARNING] make[2]: *** 
> [CMakeFiles/container.dir/main/native/container-executor/impl/get_executable.c.o]
>  Error 1
> [WARNING] make[2]: *** Waiting for unfinished jobs
> [WARNING] make[1]: *** [CMakeFiles/container.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> Cmake version used :
> $ /usr/bin/cmake --version
> cmake version 2.8.12.2



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6646) Modifier 'static' is redundant for inner enums

2017-05-25 Thread ZhangBing Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangBing Lin updated YARN-6646:

Attachment: YARN-6646.001.patch

> Modifier 'static' is redundant for inner enums
> ---
>
> Key: YARN-6646
> URL: https://issues.apache.org/jira/browse/YARN-6646
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Attachments: YARN-6646.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6141) ppc64le on Linux doesn't trigger __linux get_executable codepath

2017-05-25 Thread Ayappan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024434#comment-16024434
 ] 

Ayappan commented on YARN-6141:
---

Thanks [~ajisakaa]

> ppc64le on Linux doesn't trigger __linux get_executable codepath
> 
>
> Key: YARN-6141
> URL: https://issues.apache.org/jira/browse/YARN-6141
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
> Environment: $ uname -a
> Linux f8eef0f055cf 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Sonia Garudi
>Assignee: Ayappan
>  Labels: ppc64le
> Fix For: 2.9.0, 2.8.1, 3.0.0-alpha3
>
> Attachments: YARN-6141.patch
>
>
> On ppc64le architecture, the build fails in the 'Hadoop YARN NodeManager' 
> project with the below error:
> Cannot safely determine executable path with a relative HADOOP_CONF_DIR on 
> this operating system.
> [WARNING]  #error Cannot safely determine executable path with a relative 
> HADOOP_CONF_DIR on this operating system.
> [WARNING]   ^
> [WARNING] make[2]: *** 
> [CMakeFiles/container.dir/main/native/container-executor/impl/get_executable.c.o]
>  Error 1
> [WARNING] make[2]: *** Waiting for unfinished jobs
> [WARNING] make[1]: *** [CMakeFiles/container.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> Cmake version used :
> $ /usr/bin/cmake --version
> cmake version 2.8.12.2



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6646) Modifier 'static' is redundant for inner enums

2017-05-25 Thread ZhangBing Lin (JIRA)
ZhangBing Lin created YARN-6646:
---

 Summary: Modifier 'static' is redundant for inner enums
 Key: YARN-6646
 URL: https://issues.apache.org/jira/browse/YARN-6646
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: ZhangBing Lin
Assignee: ZhangBing Lin
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6141) ppc64le on Linux doesn't trigger __linux get_executable codepath

2017-05-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024396#comment-16024396
 ] 

Hudson commented on YARN-6141:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11779 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11779/])
YARN-6141. ppc64le on Linux doesn't trigger __linux get_executable (aajisaka: 
rev bc28da65fb1c67904aa3cefd7273cb7423521014)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/get_executable.c


> ppc64le on Linux doesn't trigger __linux get_executable codepath
> 
>
> Key: YARN-6141
> URL: https://issues.apache.org/jira/browse/YARN-6141
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
> Environment: $ uname -a
> Linux f8eef0f055cf 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Sonia Garudi
>Assignee: Ayappan
>  Labels: ppc64le
> Fix For: 2.9.0, 2.8.1, 3.0.0-alpha3
>
> Attachments: YARN-6141.patch
>
>
> On ppc64le architecture, the build fails in the 'Hadoop YARN NodeManager' 
> project with the below error:
> Cannot safely determine executable path with a relative HADOOP_CONF_DIR on 
> this operating system.
> [WARNING]  #error Cannot safely determine executable path with a relative 
> HADOOP_CONF_DIR on this operating system.
> [WARNING]   ^
> [WARNING] make[2]: *** 
> [CMakeFiles/container.dir/main/native/container-executor/impl/get_executable.c.o]
>  Error 1
> [WARNING] make[2]: *** Waiting for unfinished jobs
> [WARNING] make[1]: *** [CMakeFiles/container.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> Cmake version used :
> $ /usr/bin/cmake --version
> cmake version 2.8.12.2



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6141) ppc64le on Linux doesn't trigger __linux get_executable codepath

2017-05-25 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024373#comment-16024373
 ] 

Akira Ajisaka commented on YARN-6141:
-

LGTM, +1. Checking this in.

> ppc64le on Linux doesn't trigger __linux get_executable codepath
> 
>
> Key: YARN-6141
> URL: https://issues.apache.org/jira/browse/YARN-6141
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
> Environment: $ uname -a
> Linux f8eef0f055cf 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 
> 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>Reporter: Sonia Garudi
>Assignee: Ayappan
>  Labels: ppc64le
> Attachments: YARN-6141.patch
>
>
> On ppc64le architecture, the build fails in the 'Hadoop YARN NodeManager' 
> project with the below error:
> Cannot safely determine executable path with a relative HADOOP_CONF_DIR on 
> this operating system.
> [WARNING]  #error Cannot safely determine executable path with a relative 
> HADOOP_CONF_DIR on this operating system.
> [WARNING]   ^
> [WARNING] make[2]: *** 
> [CMakeFiles/container.dir/main/native/container-executor/impl/get_executable.c.o]
>  Error 1
> [WARNING] make[2]: *** Waiting for unfinished jobs
> [WARNING] make[1]: *** [CMakeFiles/container.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> Cmake version used :
> $ /usr/bin/cmake --version
> cmake version 2.8.12.2



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6111) Rumen input doesn't work in SLS

2017-05-25 Thread YuJie Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024368#comment-16024368
 ] 

YuJie Huang commented on YARN-6111:
---

There are only two jobs in the 2jobs2min-rumen-jh.json file in Hadoop-2.7.3, and 
its format is {job1} {job2} rather than [{job1},{job2}], but the 
realtimetrack.json file still contains only "[]".

> Rumen input doesn't work in SLS
> --
>
> Key: YARN-6111
> URL: https://issues.apache.org/jira/browse/YARN-6111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
> Environment: ubuntu14.0.4 os
>Reporter: YuJie Huang
>Assignee: Yufei Gu
>  Labels: test
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6111.001.patch
>
>
> Hi guys,
> I am trying to learn the use of SLS.
> I would like to get the file realtimetrack.json, but it only contains "[]" at 
> the end of a simulation. This is the command I use to 
> run the instance:
> HADOOP_HOME $ bin/slsrun.sh --input-rumen=sample-data/2jobsmin-rumen-jh.json 
> --output-dir=sample-data 
> All other files, including metrics, appear to be properly populated. I can 
> also trace it via the web UI at http://localhost:10001/simulate
> Can someone help?
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6577) Remove unused ContainerLocalization classes

2017-05-25 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-6577:

Fix Version/s: 2.9.0

> Remove unused ContainerLocalization classes
> ---
>
> Key: YARN-6577
> URL: https://issues.apache.org/jira/browse/YARN-6577
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: ZhangBing Lin
>Assignee: ZhangBing Lin
>Priority: Minor
> Fix For: 2.9.0, 2.8.1, 3.0.0-alpha3
>
> Attachments: YARN-6577.001.patch
>
>
> As of 2.7.3 and 3.0.0-alpha2, the ContainerLocalization interface and the 
> ContainerLocalizationImpl implementation class are no longer used; I recommend 
> removing this unused interface and implementation class.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6111) Rumen input doesn't work in SLS

2017-05-25 Thread YuJie Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024353#comment-16024353
 ] 

YuJie Huang commented on YARN-6111:
---

Yufei Gu, there are some strange symbols in your patch, like:
@@ -1,4 +1,4 @@
-[{
+{
   "priority" : "NORMAL",
   "jobID" : "job_1369942127770_1205",
   "user" : "jenkins",
@@ -5078,7 +5078,8 @@
   "clusterReduceMB" : -1,
   "jobMapMB" : 200,
   "jobReduceMB" : 200
-}, {
Is there something wrong with how I am opening the patch, or do I just need to 
remove these symbols (@, -, -5078, ...)?

> Rumen input doesn't work in SLS
> --
>
> Key: YARN-6111
> URL: https://issues.apache.org/jira/browse/YARN-6111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
> Environment: ubuntu14.0.4 os
>Reporter: YuJie Huang
>Assignee: Yufei Gu
>  Labels: test
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6111.001.patch
>
>
> Hi guys,
> I am trying to learn the use of SLS.
> I would like to get the file realtimetrack.json, but it only contains "[]" at 
> the end of a simulation. This is the command I use to 
> run the instance:
> HADOOP_HOME $ bin/slsrun.sh --input-rumen=sample-data/2jobsmin-rumen-jh.json 
> --output-dir=sample-data 
> All other files, including metrics, appear to be properly populated. I can 
> also trace it via the web UI at http://localhost:10001/simulate
> Can someone help?
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024312#comment-16024312
 ] 

Bingxue Qiu commented on YARN-6645:
---

Hi [~cheersyang], we backported YARN-1503 to Hadoop 2.8 in our clusters. For 
this exception, we create the nmPrivateDir in the writeScriptToNMPrivateDir 
method as shown below. Please feel free to give me suggestions. Thank you!

{code}
private File writeScriptToNMPrivateDir(String nmPrivateDir, String command)
    throws IOException {
  // Create the nmPrivate directory if it does not already exist.
  File file = new File(nmPrivateDir);
  if (!file.mkdirs()) {
    if (!file.exists()) {
      LOG.error("Failed to create nmPrivate dir " + file);
    }
  }

  // Write the command into a temporary script file under nmPrivateDir.
  File tmp = File.createTempFile("cmd_", "_tmp", new File(nmPrivateDir));
  Writer writer = new OutputStreamWriter(new FileOutputStream(tmp), "UTF-8");
  PrintWriter printWriter = new PrintWriter(writer);
  printWriter.print(command);
  printWriter.close();
  return tmp;
}
{code}
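
One possible refinement, just a sketch and not part of any actual patch 
(assuming Java 7+ for try-with-resources): close the writer even when writing 
the command fails, and fail fast when the directory cannot be created.

{code}
// Hypothetical variant of the method above; names follow the snippet, not upstream code.
private File writeScriptToNMPrivateDir(String nmPrivateDir, String command)
    throws IOException {
  File dir = new File(nmPrivateDir);
  if (!dir.mkdirs() && !dir.exists()) {
    // Surface the failure instead of only logging it and continuing.
    throw new IOException("Failed to create nmPrivate dir " + dir);
  }
  File tmp = File.createTempFile("cmd_", "_tmp", dir);
  // try-with-resources guarantees the stream is closed even if print() throws.
  try (PrintWriter printWriter = new PrintWriter(
      new OutputStreamWriter(new FileOutputStream(tmp), "UTF-8"))) {
    printWriter.print(command);
  }
  return tmp;
}
{code}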

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
> Attachments: error when creating symlink.png
>
>
> When creating a symlink after the resource is localized in our clusters, an 
> IOException is thrown because the nmPrivateDir doesn't exist. We add a patch 
> to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6643) TestRMFailover fails rarely due to port conflict

2017-05-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024307#comment-16024307
 ] 

Hadoop QA commented on YARN-6643:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 59s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | 
org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands |
|   | 
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA |
|   | org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6643 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869806/YARN-6643.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ab1e0ee1cf4c 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d049bd2 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/16017/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16017/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16017/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestRMFailover fails rarely due to 

[jira] [Updated] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6645:
--
Attachment: error when creating symlink.png

Added the error log produced when the symlink creation fails.

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
> Attachments: error when creating symlink.png
>
>
> When creating a symlink after the resource is localized in our clusters, an 
> IOException is thrown because the nmPrivateDir doesn't exist. We add a patch 
> to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024302#comment-16024302
 ] 

Bingxue Qiu commented on YARN-6645:
---

hi  [~cheersyang] , i will upload the logs and patch later, Thank you!

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
>
> When creating a symlink after the resource is localized in our clusters, an 
> IOException is thrown because the nmPrivateDir doesn't exist. We add a patch 
> to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6645:
--
Fix Version/s: 2.9.0

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
>
> When creating a symlink after the resource is localized in our clusters, an 
> IOException is thrown because the nmPrivateDir doesn't exist. We add a patch 
> to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6645:
--
Description: When creating a symlink after the resource is localized in our 
clusters, an IOException is thrown because the nmPrivateDir doesn't exist. We 
add a patch to fix it.

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
>
> When creating a symlink after the resource is localized in our clusters, an 
> IOException is thrown because the nmPrivateDir doesn't exist. We add a patch 
> to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-25 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-6555:

Labels: yarn-5355-merge-blocker  (was: )

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch
>
>
> If timeline service v2 is enabled and NM is restarted with recovery enabled, 
> then NM fails to start and throws an error as  "flow context can't be null".
> This is happening because the flow context did not exist before but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if flow context existed before but since we are 
> not persisting it / reading it during 
> ContainerManagerImpl#recoverApplication, it does not get passed in to 
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6323) Rolling upgrade/config change is broken on timeline v2.

2017-05-25 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024242#comment-16024242
 ] 

Rohith Sharma K S commented on YARN-6323:
-

Thanks Vrushali and Haibo for discussing the rolling upgrade. I have a specific 
scenario to discuss, apart from YARN-6555.
If the default context is taken on the NM side, then:
# The application is NOT submitted with tags, so default values are created by 
YARN.
## The RM creates a default FlowContext with the flow name set to the app name. 
On NM restart, we are creating the FlowContext with the appId, so there will be 
inconsistencies when entities are published during the rolling upgrade.
# Assume that the application is submitted with some tags.
## The RM recovers the application and starts publishing with the tags as the 
flow context. Again, the published entities are inconsistent.

How are we going to deal with the above cases?
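
For illustration, case 1 above amounts to roughly the following (a hypothetical, 
self-contained sketch; the names and the appId value are made up, and this is 
not the actual RM/NM defaulting code):

{code}
// Hypothetical illustration of the defaulting mismatch described in case 1.
public class FlowContextDefaultDemo {

  // RM side: with no flow tags supplied, the default flow name falls back to the app name.
  static String rmDefaultFlowName(String appName, String appId) {
    return (appName != null && !appName.isEmpty()) ? appName : appId;
  }

  // NM recovery side (as currently written): the recovered flow name falls back to the appId.
  static String nmRecoveredFlowName(String appId) {
    return appId;
  }

  public static void main(String[] args) {
    String appId = "application_1495700000000_0001";  // made-up id
    String appName = "wordcount";                     // app submitted without flow tags
    System.out.println("flow before NM restart: " + rmDefaultFlowName(appName, appId));
    System.out.println("flow after NM recovery: " + nmRecoveredFlowName(appId));
    // The two flow names differ, so entities for the same app end up under two flows.
  }
}
{code}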

> Rolling upgrade/config change is broken on timeline v2. 
> 
>
> Key: YARN-6323
> URL: https://issues.apache.org/jira/browse/YARN-6323
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Vrushali C
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6323.001.patch
>
>
> Found this issue when deploying on real clusters. If there are apps running 
> when we enable timeline v2 (with work preserving restart enabled), node 
> managers will fail to start due to missing app context data. We should 
> probably assign some default names to these "left over" apps. I believe it's 
> suboptimal to let users clean up the whole cluster before enabling timeline 
> v2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org