[jira] [Commented] (YARN-9646) DistributedShell tests failed to bind to a local host name

2019-07-16 Thread Ray Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886672#comment-16886672
 ] 

Ray Yang commented on YARN-9646:


[~haibochen] Thanks!

> DistributedShell tests failed to bind to a local host name
> --
>
> Key: YARN-9646
> URL: https://issues.apache.org/jira/browse/YARN-9646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.4
>Reporter: Ray Yang
>Assignee: Ray Yang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9646.00.patch
>
>
> When running the integration test 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell#testDSShellWithoutDomain
> at home, the following error happened:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
>  
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:327)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$400(MiniYARNCluster.java:99)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:447)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:278)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setupInternal(TestDistributedShell.java:91)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setup(TestDistributedShell.java:71)
> …
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.*ResourceTrackerService.serviceStart*(ResourceTrackerService.java:163)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:588)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:976)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1017)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1013)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1013)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1053)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:319)
> ... 31 more
> Caused by: java.net.BindException: Problem binding to 
> [ruyang-mn3.linkedin.biz:0]java.net.BindException: Can't assign requested 
> address; For more details see:  [http://wiki.apache.org/hadoop/BindException]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
> at org.apache.hadoop.ipc.Server.bind(Server.java:494)
> at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:715)
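
The root cause in traces like this is typically that the machine's hostname
(here ruyang-mn3.linkedin.biz) resolves to an address that is not assigned to
any local interface, so the RM's RPC server cannot bind to it. A standalone
sketch of that failure mode, outside YARN (the hostname argument is a
placeholder; this is diagnostic only, not part of the patch):

{code:java}
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindCheck {
  public static void main(String[] args) throws Exception {
    // Hostname under test; defaults to this machine's configured hostname.
    String host = args.length > 0 ? args[0]
        : InetAddress.getLocalHost().getHostName();
    // Step 1: does the name resolve at all?
    InetAddress addr = InetAddress.getByName(host);
    System.out.println(host + " resolves to " + addr.getHostAddress());
    // Step 2: can we bind to the resolved address? Port 0 asks for an
    // ephemeral port, mirroring the host:0 binding in the trace above.
    try (ServerSocket socket = new ServerSocket()) {
      socket.bind(new InetSocketAddress(addr, 0));
      System.out.println("Bound to " + socket.getLocalSocketAddress());
    }
    // If the resolved address is not assigned to a local interface (e.g. a
    // corporate DNS entry while on a home network), bind() fails with
    // java.net.BindException: Can't assign requested address.
  }
}
{code}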
> 

[jira] [Commented] (YARN-9682) Wrong log message when finalizing the upgrade

2019-07-16 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886647#comment-16886647
 ] 

Weiwei Yang commented on YARN-9682:
---

Pushed to trunk, cherry-picked to branch-3.2 and branch-3.1. Thanks for the 
contribution [~kyungwan nam].

> Wrong log message when finalizing the upgrade
> -
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}
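
The literal {} in the quoted line is the telltale output of an SLF4J-style
placeholder that was never bound to an argument. A minimal sketch of that bug
shape and its fix (PlaceholderDemo and serviceName are illustrative; the
actual one-line change in ServiceClient.java may simply reword the message):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PlaceholderDemo {
  private static final Logger LOG =
      LoggerFactory.getLogger(PlaceholderDemo.class);

  public static void main(String[] args) {
    String serviceName = "sleeper-service"; // hypothetical service name
    // Buggy: no argument supplied for the placeholder, so "{}" is printed
    // verbatim, exactly as in the log line quoted above.
    LOG.info("Finalize service {} upgrade");
    // Correct: the placeholder is substituted with the service name.
    LOG.info("Finalize service {} upgrade", serviceName);
  }
}
{code}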






[jira] [Updated] (YARN-9682) Wrong log message when finalizing the upgrade

2019-07-16 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-9682:
--
Fix Version/s: 3.1.3

> Wrong log message when finalizing the upgrade
> -
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}






[jira] [Updated] (YARN-9682) Wrong log message when finalizing the upgrade

2019-07-16 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-9682:
--
Fix Version/s: 3.2.1

> Wrong log message when finalizing the upgrade
> -
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}






[jira] [Commented] (YARN-9635) Nodes page displayed duplicate nodes

2019-07-16 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886643#comment-16886643
 ] 

Tao Yang commented on YARN-9635:


Thanks [~jiwq] for the patch and sorry for my late reply. 

Another related place is the description of that conf in NodeManager.md; it 
should be updated as well.

> Nodes page displayed duplicate nodes
> 
>
> Key: YARN-9635
> URL: https://issues.apache.org/jira/browse/YARN-9635
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.2.0
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: UI2-nodes.jpg, YARN-9635.001.patch
>
>
> Steps:
>  * shutdown nodes
>  * start nodes
> Nodes Page:
> !UI2-nodes.jpg!






[jira] [Commented] (YARN-9682) Wrong log message when finalizing the upgrade

2019-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886640#comment-16886640
 ] 

Hudson commented on YARN-9682:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16934 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16934/])
YARN-9682. Wrong log message when finalizing the upgrade. Contributed by 
kyungwan nam. (wwei: rev 85d9111a88f94a5e6833cd142272be2c5823e922)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/client/ServiceClient.java


> Wrong log message when finalizing the upgrade
> -
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}






[jira] [Commented] (YARN-9682) Wrong log message when finalizing the upgrade

2019-07-16 Thread kyungwan nam (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886641#comment-16886641
 ] 

kyungwan nam commented on YARN-9682:


[~cheersyang] thank you for your review and comment

> Wrong log message when finalizing the upgrade
> -
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}






[jira] [Commented] (YARN-9438) launchTime not written to state store for running applications

2019-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886636#comment-16886636
 ] 

Hadoop QA commented on YARN-9438:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 34s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 4 new + 259 unchanged - 0 fixed = 263 total (was 259) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 48s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}129m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9438 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12974895/YARN-9438.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 5e09c4b4a26c 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5672efa |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24402/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 

[jira] [Commented] (YARN-9635) Nodes page displayed duplicate nodes

2019-07-16 Thread Wanqiang Ji (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886635#comment-16886635
 ] 

Wanqiang Ji commented on YARN-9635:
---

Hi, [~Tao Yang] and [~sunilg]. Any thoughts?

> Nodes page displayed duplicate nodes
> 
>
> Key: YARN-9635
> URL: https://issues.apache.org/jira/browse/YARN-9635
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager
>Affects Versions: 3.2.0
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: UI2-nodes.jpg, YARN-9635.001.patch
>
>
> Steps:
>  * shutdown nodes
>  * start nodes
> Nodes Page:
> !UI2-nodes.jpg!






[jira] [Updated] (YARN-9682) Wrong log message when finalizing the upgrade

2019-07-16 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-9682:
--
Summary: Wrong log message when finalizing the upgrade  (was: wrong log 
message when finalize upgrade)

> Wrong log message when finalizing the upgrade
> -
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}






[jira] [Commented] (YARN-9682) wrong log message when finalize upgrade

2019-07-16 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886630#comment-16886630
 ] 

Weiwei Yang commented on YARN-9682:
---

+1, committing shortly.

> wrong log message when finalize upgrade
> ---
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}






[jira] [Commented] (YARN-9682) wrong log message when finalize upgrade

2019-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886614#comment-16886614
 ] 

Hadoop QA commented on YARN-9682:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
21s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9682 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12974896/YARN-9682.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 785d45540c86 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5915c90 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24403/testReport/ |
| Max. process+thread count | 754 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24403/console |
| Powered by 

[jira] [Assigned] (YARN-9682) wrong log message when finalize upgrade

2019-07-16 Thread kyungwan nam (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kyungwan nam reassigned YARN-9682:
--

  Assignee: kyungwan nam
Attachment: YARN-9682.001.patch

> wrong log message when finalize upgrade
> ---
>
> Key: YARN-9682
> URL: https://issues.apache.org/jira/browse/YARN-9682
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Trivial
> Attachments: YARN-9682.001.patch
>
>
> I've seen the following wrong message when running finalize-upgrade for a 
> yarn-service:
> {code:java}
> 2019-07-16 17:44:09,204 INFO  client.ServiceClient 
> (ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
> upgrade{code}






[jira] [Created] (YARN-9682) wrong log message when finalize upgrade

2019-07-16 Thread kyungwan nam (JIRA)
kyungwan nam created YARN-9682:
--

 Summary: wrong log message when finalize upgrade
 Key: YARN-9682
 URL: https://issues.apache.org/jira/browse/YARN-9682
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: kyungwan nam


I've seen the following wrong message when running finalize-upgrade for a yarn-service:
{code:java}
2019-07-16 17:44:09,204 INFO  client.ServiceClient 
(ServiceClient.java:actionStartAndGetId(1193)) - Finalize service {} 
upgrade{code}






[jira] [Commented] (YARN-9646) DistributedShell tests failed to bind to a local host name

2019-07-16 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886578#comment-16886578
 ] 

Haibo Chen commented on YARN-9646:
--

[~HappyRay] I have fixed the minor import issues along with my commit. It has 
now been merged to trunk. Thanks for your contribution!

> DistributedShell tests failed to bind to a local host name
> --
>
> Key: YARN-9646
> URL: https://issues.apache.org/jira/browse/YARN-9646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.4
>Reporter: Ray Yang
>Assignee: Ray Yang
>Priority: Major
> Attachments: YARN-9646.00.patch
>
>
> When running the integration test 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell#testDSShellWithoutDomain
> at home, the following error happened:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
>  
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:327)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$400(MiniYARNCluster.java:99)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:447)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:278)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setupInternal(TestDistributedShell.java:91)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setup(TestDistributedShell.java:71)
> …
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.*ResourceTrackerService.serviceStart*(ResourceTrackerService.java:163)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:588)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:976)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1017)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1013)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1013)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1053)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:319)
> ... 31 more
> Caused by: java.net.BindException: Problem binding to 
> [ruyang-mn3.linkedin.biz:0]java.net.BindException: Can't assign requested 
> address; For more details see:  [http://wiki.apache.org/hadoop/BindException]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
> at 

[jira] [Commented] (YARN-9646) DistributedShell tests failed to bind to a local host name

2019-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886579#comment-16886579
 ] 

Hudson commented on YARN-9646:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16933 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16933/])
YARN-9646. DistributedShell tests failed to bind to a local host name. 
(haibochen: rev 5915c902aa7a966202f896515aa689f2792467b1)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
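
The patch itself is not quoted in this thread, but a common way mini-cluster
tests sidestep unresolvable hostnames is to force loopback bindings. A sketch
under that assumption (these YarnConfiguration keys exist, though the actual
change in the two files above may differ):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch: make a test cluster bind to loopback so it never depends on the
// machine's hostname resolving to a locally assigned address.
public class LoopbackTestConf {
  public static Configuration forceLoopback(Configuration conf) {
    conf.set(YarnConfiguration.RM_HOSTNAME, "127.0.0.1");
    conf.set(YarnConfiguration.RM_BIND_HOST, "127.0.0.1");
    conf.set(YarnConfiguration.NM_BIND_HOST, "127.0.0.1");
    return conf;
  }
}
{code}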


> DistributedShell tests failed to bind to a local host name
> --
>
> Key: YARN-9646
> URL: https://issues.apache.org/jira/browse/YARN-9646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.4
>Reporter: Ray Yang
>Assignee: Ray Yang
>Priority: Major
> Attachments: YARN-9646.00.patch
>
>
> When running the integration test 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell#testDSShellWithoutDomain
> at home, the following error happened:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
>  
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:327)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$400(MiniYARNCluster.java:99)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:447)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:278)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setupInternal(TestDistributedShell.java:91)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setup(TestDistributedShell.java:71)
> …
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.*ResourceTrackerService.serviceStart*(ResourceTrackerService.java:163)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:588)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:976)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1017)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1013)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1013)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1053)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:319)
> ... 31 more
> Caused by: java.net.BindException: Problem binding to 
> [ruyang-mn3.linkedin.biz:0]java.net.BindException: Can't assign requested 
> address; For more details see:  [http://wiki.apache.org/hadoop/BindException]
> at 

[jira] [Updated] (YARN-9646) DistributedShell tests failed to bind to a local host name

2019-07-16 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-9646:
-
Summary: DistributedShell tests failed to bind to a local host name  (was: 
Yarn miniYarn cluster tests failed to bind to a local host name)

> DistributedShell tests failed to bind to a local host name
> --
>
> Key: YARN-9646
> URL: https://issues.apache.org/jira/browse/YARN-9646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.4
>Reporter: Ray Yang
>Assignee: Ray Yang
>Priority: Major
> Attachments: YARN-9646.00.patch
>
>
> When running the integration test 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell#testDSShellWithoutDomain
> at home, the following error happened:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
>  
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:327)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.access$400(MiniYARNCluster.java:99)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:447)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:278)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setupInternal(TestDistributedShell.java:91)
> at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.setup(TestDistributedShell.java:71)
> …
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: Problem binding to [ruyang-mn3.linkedin.biz:0] 
> java.net.BindException: Can't assign requested address; For more details see: 
>  [http://wiki.apache.org/hadoop/BindException]
> at 
> org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
> at 
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
> at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.*ResourceTrackerService.serviceStart*(ResourceTrackerService.java:163)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:588)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:976)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1017)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1013)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1013)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1053)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:319)
> ... 31 more
> Caused by: java.net.BindException: Problem binding to 
> [ruyang-mn3.linkedin.biz:0]java.net.BindException: Can't assign requested 
> address; For more details see:  [http://wiki.apache.org/hadoop/BindException]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
> at org.apache.hadoop.ipc.Server.bind(Server.java:494)
> at 

[jira] [Updated] (YARN-9438) launchTime not written to state store for running applications

2019-07-16 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9438:

Affects Version/s: 2.10.0

> launchTime not written to state store for running applications
> --
>
> Key: YARN-9438
> URL: https://issues.apache.org/jira/browse/YARN-9438
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.10.0
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9438.001.patch
>
>
> launchTime is only saved to the state store after an application finishes, so if 
> a restart happens, any running applications will have launchTime set to -1 
> (since this is the default timestamp of the recovery event).






[jira] [Commented] (YARN-9438) launchTime not written to state store for running applications

2019-07-16 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886574#comment-16886574
 ] 

Jonathan Hung commented on YARN-9438:
-

001 saves app's launchTime to state store when attempt is launched.

> launchTime not written to state store for running applications
> --
>
> Key: YARN-9438
> URL: https://issues.apache.org/jira/browse/YARN-9438
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9438.001.patch
>
>
> launchTime is only saved to the state store after an application finishes, so if 
> a restart happens, any running applications will have launchTime set to -1 
> (since this is the default timestamp of the recovery event).
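
A sketch of the idea in the 001 comment above: write launchTime through to
the store when the attempt launches, instead of only at completion (all names
are illustrative, not the actual RMStateStore API):

{code:java}
// Illustrative only; the real patch works against the RM's state-store classes.
class AppStateRecord {
  long launchTime = -1; // -1 is the recovery-event default noted above
}

interface StateStore {
  void updateApplication(String appId, AppStateRecord record);
}

class AttemptLauncher {
  private final StateStore store;

  AttemptLauncher(StateStore store) {
    this.store = store;
  }

  void onAttemptLaunched(String appId, AppStateRecord record) {
    // Persist the wall-clock launch time immediately, so an RM restart
    // recovers the real value rather than -1.
    record.launchTime = System.currentTimeMillis();
    store.updateApplication(appId, record);
  }
}
{code}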






[jira] [Updated] (YARN-9438) launchTime not written to state store for running applications

2019-07-16 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9438:

Attachment: YARN-9438.001.patch

> launchTime not written to state store for running applications
> --
>
> Key: YARN-9438
> URL: https://issues.apache.org/jira/browse/YARN-9438
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9438.001.patch
>
>
> launchTime is only saved to the state store after an application finishes, so if 
> a restart happens, any running applications will have launchTime set to -1 
> (since this is the default timestamp of the recovery event).






[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-07-16 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886414#comment-16886414
 ] 

Eric Payne commented on YARN-6492:
--

Thanks a lot for updating the patch, [~maniraj...@gmail.com]. Unfortunately, it 
didn't compile. It looks like patch .004 is missing the new 
{{PartitionQueueMetrics}} class.

bq. .004.patch doesn't introduce any new class for this resource vectors 
metrics, it just manages on its own with bit of extra logic. 
For the completed product, I think it will be necessary to utilize the existing 
{{QueueMetricsForCustomResources}} class rather than create a new set of logic.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in the default 
> partition or across all partitions. (After YARN-6467 it will be in the default 
> partition.)
> But having the partition metrics would be very useful.
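
For context on what a per-partition metrics class adds, a sketch of the
(queue, partition) lookup the discussion implies (names and fields are
hypothetical, not the .004 patch's PartitionQueueMetrics):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one metrics object per (queue, partition) pair,
// mirroring how QueueMetrics is cached per queue today.
class PartitionQueueMetricsSketch {
  private static final Map<String, PartitionQueueMetricsSketch> CACHE =
      new ConcurrentHashMap<>();

  final String queue;
  final String partition;
  long allocatedMB;
  long allocatedVCores;

  private PartitionQueueMetricsSketch(String queue, String partition) {
    this.queue = queue;
    this.partition = partition;
  }

  static PartitionQueueMetricsSketch forQueuePartition(String queue,
      String partition) {
    // "" is the default (no-label) partition in YARN's node-label model.
    return CACHE.computeIfAbsent(queue + "/" + partition,
        k -> new PartitionQueueMetricsSketch(queue, partition));
  }
}
{code}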






[jira] [Commented] (YARN-462) Project Parameter for Chargeback

2019-07-16 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886402#comment-16886402
 ] 

Eric Payne commented on YARN-462:
-

[~kthrapp], it has been several years. Is there still a need for this 
functionality?

> Project Parameter for Chargeback
> 
>
> Key: YARN-462
> URL: https://issues.apache.org/jira/browse/YARN-462
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Affects Versions: 0.23.6
>Reporter: Kendall Thrapp
>Priority: Major
>
> Problem Summary
> For the purpose of chargeback and better understanding of grid usage, we need 
> to be able to associate applications with "projects", e.g. "pipeline X", 
> "property Y".  This would allow us to aggregate on this property, thereby 
> helping us compute grid resource usage for the entire "project".  Currently, 
> for a given application, two things we know about it are the user that 
> submitted it and the queue it was submitted to.  Below, I'll explain why 
> neither of these is adequate for enterprise-level chargeback and 
> understanding resource allocation needs.
> Why Not Users?
> It's not individual users that are paying the bill -- it's projects.  When one 
> of our real users submits an application on a Hadoop grid, they're presumably 
> not usually doing it for themselves.  They're doing work for some project or 
> team effort, so it's that team or project that should be "charged" for all of 
> its users' applications.  Maintaining outside lists of associations between users 
> and projects is error-prone because it is time-sensitive and requires 
> continued ongoing maintenance.  New users join organizations, users leave and 
> users even change projects.  Furthermore, users may split their time between 
> multiple projects, making it ambiguous as to which of a user's projects a 
> given application should be charged.  Also, there can be headless users, 
> which can be even more difficult to link to a project and can be shared 
> between teams or projects.
> Why Not Queues?
> The purpose of queues is for scheduling.  Overloading the queues concept to 
> also mean who should be "charged" for an application can have a detrimental 
> effect on the primary purpose of queues.  It could be manageable in the case 
> of a very small number of projects sharing a cluster, but doesn't scale to 
> tens or hundreds of projects sharing a cluster.  If a given cluster is shared 
> between 50 projects, creating 50 separate queues will result in inefficient 
> use of the cluster resources.  Furthermore, a given project may desire more 
> than one queue for different types or priorities of applications.  
> Proposed Solution
> Rather than relying on external tools to infer through the user and/or queue 
> who to "charge" for a given application, I propose a straightforward approach 
> where that information is explicitly supplied when the application is 
> submitted, just like we do with queues.  Let's use a charge card analogy: 
> when you buy something online, you don't just say who you are and how to ship 
> it, you also specify how you're paying for it.  Similarly, when submitting an 
> application in YARN, you could explicitly specify to whom its resource usage 
> should be associated (a project, team, cost center, etc).
> This new configuration parameter should default to being optional, so that 
> organizations not interested in chargeback or project-level resource tracking 
> can happily continue on as if it wasn't there.  However, it should be 
> configurable at the cluster-level such that a given cluster could elect 
> to make it required, so that all applications would have an associated 
> project.  The value of this new parameter should be exposed via the Resource 
> Manager UI and Resource Manager REST API, so that users and tools can make 
> use of it for chargeback, utilization metrics, etc.
> I'm undecided on what to name the new parameter, as I like the flexibility in 
> the ways it could be used.  It is essentially just an additional party other 
> than user or queue that an application can be associated with, so its use is 
> not just limited to a chargeback scenario.  For example, an organization not 
> interested in chargeback could still use this parameter to communicate useful 
> information about an application (e.g. pipelineX.stageN) and aggregate like 
> applications.
> Enforcement
> Couldn't users just specify this information as a prefix for their job names? 
>  Yes, but the missing piece this provides is enforcement.  Ideally, I'd 
> like this parameter to work very much like how the queues work.  As already 
> happens with queues, it'd be ideal if a given user couldn't just specify any 
> old value for this parameter.  It could be configurable such that a given 
> user only has 
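
As a rough, partial approximation of this proposal using a mechanism YARN
already ships, application tags can carry a project identifier at submission
time; a sketch (the project: naming convention is an assumption, and tags
lack the enforcement the proposal asks for):

{code:java}
import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

public class ProjectTagging {
  // Attach a project identifier when building the submission context.
  // Tags surface in the RM UI and REST API, so external chargeback tooling
  // can aggregate by them; unlike the proposal, nothing validates the value.
  public static void tagWithProject(ApplicationSubmissionContext ctx,
      String project) {
    ctx.setApplicationTags(Collections.singleton("project:" + project));
  }
}
{code}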

[jira] [Commented] (YARN-3638) Yarn Resource Manager Scheduler page - show percentage of total cluster that each queue is using

2019-07-16 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886391#comment-16886391
 ] 

Eric Payne commented on YARN-3638:
--

Hi [~harisekhon],

bq. You could also extend this idea show the % of total cluster capacity that 
each job is consuming too.

I attached a screenshot of the Capacity Scheduler UI page so that we could all 
be on the same page. As you can see, the capacity scheduler page shows the 
absolute used capacity for each leaf queue, as well as the absolute capacity 
being used by each app. 

bq. I'd like "Absolute Used Capacity" to be shown to the right of the 
horizontal bars.

You are correct in that the information is not displayed to the right along 
with the used capacity. This is the one piece that is missing.

Is this feature still needed given that all of the information is there (albeit 
not quite as convenient)?

> Yarn Resource Manager Scheduler page - show percentage of total cluster that 
> each queue is using
> 
>
> Key: YARN-3638
> URL: https://issues.apache.org/jira/browse/YARN-3638
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Kuhu Shukla
>Priority: Minor
> Attachments: Capacity Scheduler Page Shows Percentages of Cluster.png
>
>
> Request to show % of total cluster resources each queue is currently 
> consuming for jobs on the Yarn Resource Manager Scheduler page.
> Currently the Yarn Resource Manager Scheduler page shows the % of total used 
> for root queue and the % of each given queue's configured capacity that is 
> used (often showing say 150% if the max capacity is greater than configured 
> capacity to allow bursting where there are free resources). This is fine, but 
> it would be good to additionally show the % of total cluster that each given 
> queue is consuming and not just the % of that queue's configured capacity.






[jira] [Updated] (YARN-3638) Yarn Resource Manager Scheduler page - show percentage of total cluster that each queue is using

2019-07-16 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-3638:
-
Attachment: Capacity Scheduler Page Shows Percentages of Cluster.png

> Yarn Resource Manager Scheduler page - show percentage of total cluster that 
> each queue is using
> 
>
> Key: YARN-3638
> URL: https://issues.apache.org/jira/browse/YARN-3638
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Kuhu Shukla
>Priority: Minor
> Attachments: Capacity Scheduler Page Shows Percentages of Cluster.png
>
>
> Request to show % of total cluster resources each queue is currently 
> consuming for jobs on the Yarn Resource Manager Scheduler page.
> Currently the Yarn Resource Manager Scheduler page shows the % of total used 
> for root queue and the % of each given queue's configured capacity that is 
> used (often showing say 150% if the max capacity is greater than configured 
> capacity to allow bursting where there are free resources). This is fine, but 
> it would be good to additionally show the % of total cluster that each given 
> queue is consuming and not just the % of that queue's configured capacity.






[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886326#comment-16886326
 ] 

Hadoop QA commented on YARN-6492:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 26s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 78 new + 154 unchanged - 4 fixed = 232 total (was 158) 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 16 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  3m 
37s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 27s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 42m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-6492 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12974848/YARN-6492.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 198e6f8a22d3 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c5e3ab5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/24401/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| compile | 

[jira] [Updated] (YARN-6492) Generate queue metrics for each partition

2019-07-16 Thread Manikandan R (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-6492:
---
Attachment: YARN-6492.004.patch

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in the default 
> partition or across all partitions (after YARN-6467 it will be the default 
> partition).
> But having the per-partition metrics would be very useful.






[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-07-16 Thread Manikandan R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886273#comment-16886273
 ] 

Manikandan R commented on YARN-6492:


[~taklwu] Thanks for trying out the patch.

I will go through your findings and incorporate fixes if needed. Can 
you please share more info, such as:
 # No. of nodes
 # Labels info
 # Node -> label mapping
 # Type of labels - exclusive or non-exclusive

[~eepayne] Some improvements were applied after the .003 patch as well, so I 
rebased on the same patch and am attaching .004.patch for your review. For now 
it needs a high-level review of the approaches taken to compute 
PartitionMetrics, since it has been a while; the rest can be taken further as we 
progress. Please refer to the earlier PartitionQueueMetrics_*.txt attachments to 
view the JMX output.

Notes:

1. PartitionQueueMetrics extends QueueMetrics and acts as a holder for 
partition queue metrics, with just a couple of methods to create objects.

2. The existing QueueMetrics methods take care of updating the partition 
information too, through the PartitionQueueMetrics object.

3. *Resource Vectors/Custom Resources Metrics:*

While we were working on this JIRA, based on suggestions, changes were made to 
extend QueueMetrics to resource vectors as well, for ease of development. But 
later, YARN-8842 was created and committed, and its approach is different from 
the one used in .004.patch.

.004.patch doesn't introduce any new class for these resource vector metrics; 
it just manages on its own with a bit of extra logic. However, the patch has to 
be polished a bit more to make it concrete. For example, as of now it handles 
even memory-mb and vcores, which can be removed as we progress. The code below 
in {{QueueMetrics}} shows the behaviour:
{code:java}
for (ResourceInformation ri : res.getResources()) {
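  // currently visits every resource, including memory-mb (index 0) and
  // vcores (index 1), which the existing metrics already cover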
}
{code}
Ideally, we should traverse from the 2nd index onwards so that only custom 
resource types are handled here. Likewise, we will need to make some more minor 
improvements for sure.

4. The patch needs to be made more robust: tighten the test cases, ensure the 
newly added partition flow, etc.

5. Once the high-level design flow in the patch has been settled, it can be 
applied to CSQueueMetrics as well.
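
To make the holder idea in notes 1 and 2 concrete, here is a minimal, 
self-contained sketch of the pattern. It is an illustration only: apart from 
the QueueMetrics/PartitionQueueMetrics names it mirrors, every identifier is 
hypothetical and simplified, not code from the patch.
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for the real QueueMetrics: the shared metric fields
// and update methods live here.
class QueueMetricsSketch {
  long allocatedMB;
  void incrAllocatedMB(long mb) { allocatedMB += mb; }
}

// Holder for per-partition queue metrics: no metric logic of its own, just
// a couple of methods to create/look up one object per (partition, queue).
class PartitionQueueMetricsSketch extends QueueMetricsSketch {
  private static final Map<String, PartitionQueueMetricsSketch> CACHE =
      new ConcurrentHashMap<>();

  static PartitionQueueMetricsSketch forPartition(String partition,
      String queue) {
    return CACHE.computeIfAbsent(partition + "/" + queue,
        k -> new PartitionQueueMetricsSketch());
  }
}
{code}
With that shape, the existing QueueMetrics update paths (note 2) only have to 
fan out each update to the matching per-partition holder alongside the current 
default-partition object.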

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in the default 
> partition or across all partitions (after YARN-6467 it will be the default 
> partition).
> But having the per-partition metrics would be very useful.






[jira] [Commented] (YARN-9647) Docker launch fails when local-dirs or log-dirs is unhealthy.

2019-07-16 Thread KWON BYUNGCHANG (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885995#comment-16885995
 ] 

KWON BYUNGCHANG commented on YARN-9647:
---

[~ebadger] Thanks for your comments.

The process of mounting volumes in YARN is as follows:

step1. validate the mountable points (docker.allowed.ro-mounts, 
docker.allowed.rw-mounts) configured by the yarn administrator in 
/etc/hadoop/conf/container-executor.cfg
step2. validate the mount points configured by the user
step3. validate that the mount points of step2 belong to the mountable points 
of step1

If /data2/ is unhealthy, there is no /data2/ in the mount point configuration 
(step2), because the nodemanager already knows /data2 is unhealthy.
The problem is that /data2 still exists in 
/etc/hadoop/conf/container-executor.cfg, because container-executor.cfg is a 
static configuration file, so docker launch fails in step1: container-executor 
cannot resolve the real path of /data2.
I simply modified step1 to ignore mountable paths that cannot be resolved, as 
sketched below.
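
The actual change is in the C container-executor (normalize_mounts() in 
docker-util.c); purely to illustrate the "ignore unresolvable mountable paths" 
idea, here is a small Java sketch in which all names are hypothetical:
{code:java}
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class MountableFilterSketch {
  // Keep only the configured mountable points whose real path resolves;
  // an entry on a failed disk (e.g. /data2) is skipped instead of failing
  // the whole container launch in step 1.
  static List<String> resolvableMountables(List<String> configured) {
    List<String> ok = new ArrayList<>();
    for (String m : configured) {
      try {
        ok.add(Paths.get(m).toRealPath().toString()); // analogous to realpath(3)
      } catch (IOException e) {
        // disk fault: ignore this mountable path; the user's requested
        // mounts (step 2) already exclude unhealthy dirs
      }
    }
    return ok;
  }
}
{code}
The design choice here is to treat an unresolvable mountable point as 
"currently unusable" rather than as a fatal configuration error, since the 
user-requested mounts are validated separately in step 2.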

> Docker launch fails when local-dirs or log-dirs is unhealthy.
> -
>
> Key: YARN-9647
> URL: https://issues.apache.org/jira/browse/YARN-9647
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Major
> Attachments: YARN-9647.001.patch, YARN-9647.002.patch
>
>
> my /etc/hadoop/conf/container-executor.cfg
> {code}
> [docker]
>docker.allowed.ro-mounts=/data1/hadoop/yarn/local,/data2/hadoop/yarn/local
>docker.allowed.rw-mounts=/data1/hadoop/yarn/local,/data2/hadoop/yarn/local
> {code}
> If /data2 is unhealthy, docker launch fails although the container can use 
> /data1 as its local-dir and log-dir.
> The error message is below:
> {code}
> [2019-06-25 14:55:26.168]Exception from container-launch. Container id: 
> container_e50_1561100493387_5185_01_000597 Exit code: 29 Exception message: 
> Launch container failed Shell error output: Could not determine real path of 
> mount '/data2/hadoop/yarn/local' Could not determine real path of mount 
> '/data2/hadoop/yarn/local' Unable to find permitted docker mounts on disk 
> Error constructing docker command, docker error code=16, error message='Mount 
> access error' Shell output: main : command provided 4 main : run as user is 
> magnum main : requested yarn user is magnum Creating script paths... Creating 
> local dirs... [2019-06-25 14:55:26.189]Container exited with a non-zero exit 
> code 29. [2019-06-25 14:55:26.192]Container exited with a non-zero exit code 
> 29. 
> {code}
> The root cause is that normalize_mounts() in docker-util.c returns -1 because 
> it cannot resolve the real path of /data2/hadoop/yarn/local (note that /data2 
> has a disk fault at this point).
> However, the disks behind the NM local dirs and log dirs can fail at any time;
> docker launch should succeed as long as there are available local dirs and 
> log dirs.






[jira] [Commented] (YARN-9681) AM resource limit is incorrect for queue

2019-07-16 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885987#comment-16885987
 ] 

Sunil Govindan commented on YARN-9681:
--

Hi [~gb.ana...@gmail.com]

Could you please share more details on the same?

> AM resource limit is incorrect for queue
> 
>
> Key: YARN-9681
> URL: https://issues.apache.org/jira/browse/YARN-9681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.1.2
>Reporter: ANANDA G B
>Priority: Major
> Attachments: After running job on queue1.png, Before running job on 
> queue1.png
>
>
> After running a job on Queue1 of Partition1, Queue1 of the 
> DEFAULT_PARTITION's 'Max Application Master Resources' is calculated 
> incorrectly. Please find the attachment.






[jira] [Commented] (YARN-9645) Fix Invalid event FINISHED_CONTAINERS_PULLED_BY_AM at NEW on NM restart

2019-07-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885937#comment-16885937
 ] 

Hudson commented on YARN-9645:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16924 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16924/])
YARN-9645. Fix Invalid event FINISHED_CONTAINERS_PULLED_BY_AM at NEW on 
(bibinchundatt: rev 7a93be0f6002ebb376c30f25a7d403e853c44280)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java


> Fix Invalid event FINISHED_CONTAINERS_PULLED_BY_AM at NEW on NM restart
> ---
>
> Key: YARN-9645
> URL: https://issues.apache.org/jira/browse/YARN-9645
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: krishna reddy
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9645-001.patch, YARN-9645-002.patch
>
>
> *Description:* While restarting NMs, the RM throws 
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> FINISHED_CONTAINERS_PULLED_BY_AM at NEW
> *Environment:*
> Server OS: Ubuntu
> No. of cluster nodes: 2 RMs / 4850 NMs
> 240 machines in total; each machine runs 21 docker containers (1 DN & 20 NMs)
> *Steps:*
> 1. Total number of containers in running state: ~53000
> 2. Restart the NMs and check the log
> {noformat}
> 2019-06-24 09:37:35,345 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application 
> with id 32744 submitted by user root
> 2019-06-24 09:37:35,346 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root 
> IP=255.255.19.245   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1561358926330_32744 
>   QUEUENAME=default
> 2019-06-24 09:37:35,345 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle 
> this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> FINISHED_CONTAINERS_PULLED_BY_AM at NEW
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:669)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.handle(RMNodeImpl.java:99)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1107)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventDispatcher.handle(ResourceManager.java:1091)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:221)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:143)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}






[jira] [Assigned] (YARN-9657) AbstractLivelinessMonitor add serviceName to PingChecker thread

2019-07-16 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-9657:
---

Assignee: Bilwa S T

> AbstractLivelinessMonitor add serviceName to PingChecker thread
> ---
>
> Key: YARN-9657
> URL: https://issues.apache.org/jira/browse/YARN-9657
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Minor
>







[jira] [Created] (YARN-9681) AM resource limit is incorrect for queue

2019-07-16 Thread ANANDA G B (JIRA)
ANANDA G B created YARN-9681:


 Summary: AM resource limit is incorrect for queue
 Key: YARN-9681
 URL: https://issues.apache.org/jira/browse/YARN-9681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.1.2
Reporter: ANANDA G B
 Attachments: After running job on queue1.png, Before running job on 
queue1.png

After running a job on Queue1 of Partition1, Queue1 of the DEFAULT_PARTITION's 
'Max Application Master Resources' is calculated incorrectly. Please find the 
attachment.


