[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices

2018-08-21 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588371#comment-16588371
 ] 

Tao Yang commented on YARN-8685:


[~cheersyang], thanks for your suggestions. Makes sense to me.

There is a ContainerInfo class in the hadoop-yarn-server-common module; the patch 
can share that class and add several fields such as 
allocationRequestId/version/allocationTags. Right?
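
For illustration, a minimal sketch (not the actual patch) of how the shared ContainerInfo 
DAO might look once the fields mentioned above are added; the class name and the 
surrounding fields here are hypothetical placeholders:

{code:java}
// Hypothetical sketch only: a JAXB-style DAO showing the three fields the
// comment above proposes to add (allocationRequestId, version, allocationTags).
// Everything else (class name, other fields) is illustrative, not the real class.
import java.util.HashSet;
import java.util.Set;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "container")
@XmlAccessorType(XmlAccessType.FIELD)
public class ContainerInfoSketch {
  private String containerId;
  private String state; // e.g. ALLOCATED / ACQUIRED / RUNNING

  // Fields proposed above so RM-side container reports carry scheduling info:
  private long allocationRequestId;
  private int version;
  private Set<String> allocationTags = new HashSet<>();

  public long getAllocationRequestId() { return allocationRequestId; }
  public int getVersion() { return version; }
  public Set<String> getAllocationTags() { return allocationTags; }
}
{code}

With that in place, the query proposed in the issue description would presumably look 
something like GET /ws/v1/cluster/nodes/{nodeid}?includeContainers=true against RMWebServices.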

> Add containers query support for nodes/node REST API in RMWebServices
> -
>
> Key: YARN-8685
> URL: https://issues.apache.org/jira/browse/YARN-8685
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: restapi
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8685.001.patch
>
>
> Currently we can only query running containers from the NM containers REST API, 
> but we can't get valid containers that are in the ALLOCATED/ACQUIRED state. We 
> need to get all containers allocated on specified nodes for debugging. I want 
> to add an "includeContainers" query param (default false) to the nodes/node 
> REST API in RMWebServices, so that we can get valid containers on nodes when 
> "includeContainers=true" is specified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8698) Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588372#comment-16588372
 ] 

genericqa commented on YARN-8698:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
35s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
in trunk has 4 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 12s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m 26s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8698 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936558/YARN-8698.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 01da4d8f9c08 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e557c6b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/21656/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine-warnings.html
 |
| checkstyle | 

[jira] [Commented] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588356#comment-16588356
 ] 

genericqa commented on YARN-8649:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 2 new + 234 unchanged - 0 fixed = 236 total (was 234) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
30s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 72m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8649 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936555/YARN-8649_2.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1b2539cf8401 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e557c6b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21655/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21655/testReport/ |
| Max. process+thread count | 302 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 

[jira] [Comment Edited] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread potato (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588326#comment-16588326
 ] 

potato edited comment on YARN-8649 at 8/22/18 3:20 AM:
---

Shutting down many nodes gracefully can cause massive NPEs, which disturb 
our log analysis tool. Fixing this issue may help us reduce the NPEs.


was (Author: potato):
Gracefully shutting down many nodes can cause massive NPEs, which disturb 
our log analysis tool.

> Similar as YARN-4355:NPE while processing localizer heartbeat
> -
>
> Key: YARN-8649
> URL: https://issues.apache.org/jira/browse/YARN-8649
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Attachments: YARN-8649.patch, YARN-8649_2.patch, 
> hadoop-hires-nodemanager-hadoop11.log
>
>
> I have noticed that a NodeManager was getting NPEs while tearing down. The 
> reason may be similar to YARN-4355, which was reported by [# Jason Lowe]. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread potato (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588326#comment-16588326
 ] 

potato commented on YARN-8649:
--

Gracefully shutting down many nodes can cause massive NPEs, which disturb 
our log analysis tool!

> Similar as YARN-4355:NPE while processing localizer heartbeat
> -
>
> Key: YARN-8649
> URL: https://issues.apache.org/jira/browse/YARN-8649
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Attachments: YARN-8649.patch, YARN-8649_2.patch, 
> hadoop-hires-nodemanager-hadoop11.log
>
>
> I have noticed that a NodeManager was getting NPEs while tearing down. The 
> reason may be similar to YARN-4355, which was reported by [# Jason Lowe]. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread lujie (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588306#comment-16588306
 ] 

lujie commented on YARN-8649:
-

Hi [~jlowe]:
 # In the new patch, I make "getPathForLocalization" return null if "rsrc == 
null" (there is also a log statement to indicate why null is returned).
 # In "processHeartbeat" and "addResource", which use the return value of 
"getPathForLocalization", I add a null check. The null check prevents an 
unnecessary download! A simplified sketch of this pattern is shown below.
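
A simplified, hypothetical sketch of that pattern (stand-in class and method 
shapes, not the actual NodeManager code):

{code:java}
// Hypothetical sketch of the null-check pattern described above; the class,
// logging, and path handling are simplified stand-ins for the real NM code.
import java.nio.file.Path;
import java.nio.file.Paths;

class LocalizerNullCheckSketch {

  /** Return null (and log why) when the tracked resource has disappeared. */
  Path getPathForLocalization(Object rsrc, String resourceKey) {
    if (rsrc == null) {
      System.out.println("Resource " + resourceKey
          + " is no longer tracked (e.g. NM shutting down); skipping localization");
      return null;
    }
    return Paths.get("/tmp/nm-local-dir", resourceKey);
  }

  /** Callers check for null so no unnecessary download is scheduled. */
  void processHeartbeat(Object rsrc, String resourceKey) {
    Path localPath = getPathForLocalization(rsrc, resourceKey);
    if (localPath == null) {
      return; // avoids the NPE and skips the pointless download
    }
    // ... otherwise continue building the download request with localPath ...
  }
}
{code}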

> Similar as YARN-4355:NPE while processing localizer heartbeat
> -
>
> Key: YARN-8649
> URL: https://issues.apache.org/jira/browse/YARN-8649
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Attachments: YARN-8649.patch, YARN-8649_2.patch, 
> hadoop-hires-nodemanager-hadoop11.log
>
>
> I have noticed that a NodeManager was getting NPEs while tearing down. The 
> reason may be similar to YARN-4355, which was reported by [# Jason Lowe]. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread lujie (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-8649:

Attachment: YARN-8649_2.patch

> Similar as YARN-4355:NPE while processing localizer heartbeat
> -
>
> Key: YARN-8649
> URL: https://issues.apache.org/jira/browse/YARN-8649
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Attachments: YARN-8649.patch, YARN-8649_2.patch, 
> hadoop-hires-nodemanager-hadoop11.log
>
>
> I have noticed that a NodeManager was getting NPEs while tearing down. The 
> reason may be similar to YARN-4355, which was reported by [# Jason Lowe]. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8015) Complete placement constraint support for Capacity Scheduler

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588279#comment-16588279
 ] 

Weiwei Yang commented on YARN-8015:
---

Hi [~sunilg], thanks.
{quote}I think we can commit this only to trunk alone
{quote}
I am OK with that. We can have full support for PC and node-attributes in the 
3.2 release line.

> Complete placement constraint support for Capacity Scheduler
> 
>
> Key: YARN-8015
> URL: https://issues.apache.org/jira/browse/YARN-8015
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
> Attachments: YARN-8015.001.patch, YARN-8015.002.patch, 
> YARN-8015.003.patch, YARN-8015.004.patch
>
>
> AppPlacementAllocator currently only supports intra-app anti-affinity 
> placement constraints; once YARN-8002 and YARN-8013 are resolved, it needs to 
> support inter-app constraints too. This may also require some refactoring of 
> the existing code logic. Use this JIRA to track that work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8015) Complete placement constraint support for Capacity Scheduler

2018-08-21 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588274#comment-16588274
 ] 

Sunil Govindan commented on YARN-8015:
--

Thanks [~cheersyang]. Looks fine to me.

Committing shortly. I think we can commit this only to trunk alone, correct?

> Complete placement constraint support for Capacity Scheduler
> 
>
> Key: YARN-8015
> URL: https://issues.apache.org/jira/browse/YARN-8015
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
> Attachments: YARN-8015.001.patch, YARN-8015.002.patch, 
> YARN-8015.003.patch, YARN-8015.004.patch
>
>
> AppPlacementAllocator currently only supports intra-app anti-affinity 
> placement constraints; once YARN-8002 and YARN-8013 are resolved, it needs to 
> support inter-app constraints too. This may also require some refactoring of 
> the existing code logic. Use this JIRA to track that work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8698) Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-21 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8698:
---
Description: 
When a standalone submarine TF job is submitted, the following error occurs:

INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
 INFO:tensorflow:Done calling model_fn.
 INFO:tensorflow:Create CheckpointSaverHook.
 hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
 me=(NULL)) error:
 (unable to get root cause for java.lang.NoClassDefFoundError)
 (unable to get stack trace for java.lang.NoClassDefFoundError)
 hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
 me=(NULL)) error:
 (unable to get root cause for java.lang.NoClassDefFoundError)
 (unable to get stack trace for java.lang.NoClassDefFoundError)

 

This error may be related to the Hadoop classpath.

Hadoop env variables of launch_container.sh are as follows:

export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
 export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}

 

run-PRIMARY_WORKER.sh looks like this:

export HADOOP_YARN_HOME=
 export HADOOP_HDFS_HOME=/hadoop-3.1.0
 export HADOOP_CONF_DIR=$WORK_DIR

 

  

  was:
When a standalone submarine TF job is submitted, the following error occurs:

INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
 INFO:tensorflow:Done calling model_fn.
 INFO:tensorflow:Create CheckpointSaverHook.
 hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
 me=(NULL)) error:
 (unable to get root cause for java.lang.NoClassDefFoundError)
 (unable to get stack trace for java.lang.NoClassDefFoundError)
 hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
 me=(NULL)) error:
 (unable to get root cause for java.lang.NoClassDefFoundError)
 (unable to get stack trace for java.lang.NoClassDefFoundError)

 

This error may be related to the Hadoop classpath.

Hadoop env variables of launch_container.sh are as follows:

export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
 export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}

 

run-PRIMARY_WORKER.sh looks like this:

export HADOOP_YARN_HOME=
 export HADOOP_HDFS_HOME=/hadoop-3.1.0
 export HADOOP_CONF_DIR=$WORK_DIR

 

  


> Failed to add hadoop dependencies in docker container when submitting a 
> submarine job
> -
>
> Key: YARN-8698
> URL: https://issues.apache.org/jira/browse/YARN-8698
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zac Zhou
>Priority: Major
>
> When a standalone submarine TF job is submitted, the following error occurs:
> INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
>  INFO:tensorflow:Done calling model_fn.
>  INFO:tensorflow:Create CheckpointSaverHook.
>  hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
> kerbTicketCachePath=(NULL), userNa
>  me=(NULL)) error:
>  (unable to get root cause for java.lang.NoClassDefFoundError)
>  (unable to get stack trace for java.lang.NoClassDefFoundError)
>  hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
> kerbTicketCachePath=(NULL), userNa
>  me=(NULL)) error:
>  (unable to get root cause for java.lang.NoClassDefFoundError)
>  (unable to get stack trace for java.lang.NoClassDefFoundError)
>  
> This error may be related to the Hadoop classpath.
> Hadoop env variables of launch_container.sh are as follows:
> export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
>  export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}
>  
> run-PRIMARY_WORKER.sh looks like this:
> export HADOOP_YARN_HOME=
>  export HADOOP_HDFS_HOME=/hadoop-3.1.0
>  export HADOOP_CONF_DIR=$WORK_DIR
>  
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (YARN-8698) Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-21 Thread Zac Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zac Zhou updated YARN-8698:
---
Description: 
When a standalone submarine TF job is submitted, the following error occurs:

INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
 INFO:tensorflow:Done calling model_fn.
 INFO:tensorflow:Create CheckpointSaverHook.
 hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
 me=(NULL)) error:
 (unable to get root cause for java.lang.NoClassDefFoundError)
 (unable to get stack trace for java.lang.NoClassDefFoundError)
 hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
 me=(NULL)) error:
 (unable to get root cause for java.lang.NoClassDefFoundError)
 (unable to get stack trace for java.lang.NoClassDefFoundError)

 

This error may be related to the Hadoop classpath.

Hadoop env variables of launch_container.sh are as follows:

export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
 export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
 export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}

 

run-PRIMARY_WORKER.sh looks like this:

export HADOOP_YARN_HOME=
 export HADOOP_HDFS_HOME=/hadoop-3.1.0
 export HADOOP_CONF_DIR=$WORK_DIR

 

  

  was:
When a standalone submarine TF job is submitted, the following error occurs:

INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
me=(NULL)) error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)
hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
me=(NULL)) error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)

 

This error may be related to the Hadoop classpath.

Hadoop env variables of launch_container.sh are as follows:

export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}

 

run-PRIMARY_WORKER.sh looks like this:

export HADOOP_YARN_HOME=
export HADOOP_HDFS_HOME=/hadoop-3.1.0
export HADOOP_CONF_DIR=$WORK_DIR

 

  


> Failed to add hadoop dependencies in docker container when submitting a 
> submarine job
> -
>
> Key: YARN-8698
> URL: https://issues.apache.org/jira/browse/YARN-8698
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zac Zhou
>Priority: Major
>
> When a standalone submarine TF job is submitted, the following error occurs:
> INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
>  INFO:tensorflow:Done calling model_fn.
>  INFO:tensorflow:Create CheckpointSaverHook.
>  hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
> kerbTicketCachePath=(NULL), userNa
>  me=(NULL)) error:
>  (unable to get root cause for java.lang.NoClassDefFoundError)
>  (unable to get stack trace for java.lang.NoClassDefFoundError)
>  hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
> kerbTicketCachePath=(NULL), userNa
>  me=(NULL)) error:
>  (unable to get root cause for java.lang.NoClassDefFoundError)
>  (unable to get stack trace for java.lang.NoClassDefFoundError)
>  
> This error may be related to the Hadoop classpath.
> Hadoop env variables of launch_container.sh are as follows:
> export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
>  export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
>  export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}
>  
> run-PRIMARY_WORKER.sh looks like this:
> export HADOOP_YARN_HOME=
>  export HADOOP_HDFS_HOME=/hadoop-3.1.0
>  export HADOOP_CONF_DIR=$WORK_DIR
>  
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8698) Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-21 Thread Zac Zhou (JIRA)
Zac Zhou created YARN-8698:
--

 Summary: Failed to add hadoop dependencies in docker container 
when submitting a submarine job
 Key: YARN-8698
 URL: https://issues.apache.org/jira/browse/YARN-8698
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zac Zhou


When a standalone submarine TF job is submitted, the following error occurs:

INFO:tensorflow:image after unit resnet/tower_0/fully_connected/: (?, 11)
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
me=(NULL)) error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)
hdfsBuilderConnect(forceNewInstance=0, nn=submarine, port=0, 
kerbTicketCachePath=(NULL), userNa
me=(NULL)) error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)

 

This error may be related to the Hadoop classpath.

Hadoop env variables of launch_container.sh are as follows:

export HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/home/hadoop/yarn-submarine"}
export HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/home/hadoop/yarn-submarine"}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/home/hadoop/yarn-submarine/conf"}
export HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-"/home/hadoop/yarn-submarine"}
export HADOOP_HOME=${HADOOP_HOME:-"/home/hadoop/yarn-submarine"}

 

run-PRIMARY_WORKER.sh looks like this:

export HADOOP_YARN_HOME=
export HADOOP_HDFS_HOME=/hadoop-3.1.0
export HADOOP_CONF_DIR=$WORK_DIR

 

  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-08-21 Thread Chen Yufei (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588238#comment-16588238
 ] 

Chen Yufei commented on YARN-8513:
--

[~leftnoteasy] My original config did not have the two config options specified, 
so it should be using the default values.

Currently I have applied the configuration suggested by [~cheersyang], so 
maximum-container-assignments is now 10.

> CapacityScheduler infinite loop when queue is near fully utilized
> -
>
> Key: YARN-8513
> URL: https://issues.apache.org/jira/browse/YARN-8513
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 3.1.0, 2.9.1
> Environment: Ubuntu 14.04.5 and 16.04.4
> YARN is configured with one label and 5 queues.
>Reporter: Chen Yufei
>Priority: Major
> Attachments: jstack-1.log, jstack-2.log, jstack-3.log, jstack-4.log, 
> jstack-5.log, top-during-lock.log, top-when-normal.log, yarn3-jstack1.log, 
> yarn3-jstack2.log, yarn3-jstack3.log, yarn3-jstack4.log, yarn3-jstack5.log, 
> yarn3-resourcemanager.log, yarn3-top
>
>
> Sometimes the ResourceManager does not respond to any request when a queue is 
> nearly fully utilized. Sending SIGTERM won't stop the RM; only SIGKILL can. After an 
> RM restart, it can recover running jobs and start accepting new ones.
>  
> It seems like the CapacityScheduler is in an infinite loop printing out the 
> following log messages (more than 25,000 lines per second):
>  
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> assignedContainer queue=root usedCapacity=0.99816763 
> absoluteUsedCapacity=0.99816763 used= 
> cluster=}}
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Failed to accept allocation proposal}}
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator:
>  assignedContainer application attempt=appattempt_1530619767030_1652_01 
> container=null 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@14420943
>  clusterResource= type=NODE_LOCAL 
> requestedPartition=}}
>  
> I have encountered this problem several times after upgrading to YARN 2.9.1, while 
> the same configuration worked fine under version 2.7.3.
>  
> YARN-4477 is an infinite-loop bug in the FairScheduler; I am not sure whether this 
> is a similar problem.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8696) FederationInterceptor upgrade: home sub-cluster heartbeat async

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588205#comment-16588205
 ] 

genericqa commented on YARN-8696:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 21m 
48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
1s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 16s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 5 new + 225 unchanged - 0 fixed = 230 total (was 225) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
22s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 43s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
16s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
26s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
49s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 42s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}210m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
|   | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce 

[jira] [Updated] (YARN-8697) LocalityMulticastAMRMProxyPolicy should fallback to random sub-cluster when cannot resolve resource

2018-08-21 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8697:
---
Issue Type: Sub-task  (was: Task)
Parent: YARN-5597

> LocalityMulticastAMRMProxyPolicy should fallback to random sub-cluster when 
> cannot resolve resource
> ---
>
> Key: YARN-8697
> URL: https://issues.apache.org/jira/browse/YARN-8697
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
>
> Right now in LocalityMulticastAMRMProxyPolicy, whenever we cannot resolve the 
> resource name (node or rack), we always route the request to the home 
> sub-cluster. However, the home sub-cluster might not always be ready to use 
> (timed out, YARN-8581) or enabled (by AMRMProxyPolicy weights). It might also 
> be overwhelmed by requests if the sub-cluster resolver has some issue. In 
> this Jira, we are changing it to pick a random active and enabled sub-cluster 
> for resource requests we cannot resolve. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8697) LocalityMulticastAMRMProxyPolicy should fallback to random sub-cluster when cannot resolve resource

2018-08-21 Thread Botong Huang (JIRA)
Botong Huang created YARN-8697:
--

 Summary: LocalityMulticastAMRMProxyPolicy should fallback to 
random sub-cluster when cannot resolve resource
 Key: YARN-8697
 URL: https://issues.apache.org/jira/browse/YARN-8697
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Botong Huang
Assignee: Botong Huang


Right now in LocalityMulticastAMRMProxyPolicy, whenever we cannot resolve the 
resource name (node or rack), we always route the request to the home sub-cluster. 
However, the home sub-cluster might not always be ready to use (timed out, 
YARN-8581) or enabled (by AMRMProxyPolicy weights). It might also be 
overwhelmed by requests if the sub-cluster resolver has some issue. In this 
Jira, we are changing it to pick a random active and enabled sub-cluster for 
resource requests we cannot resolve. 
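
As a rough illustration of the intended behavior (hypothetical names, not the 
actual LocalityMulticastAMRMProxyPolicy code), the fallback could be sketched 
like this:

{code:java}
// Hypothetical sketch: route an unresolvable resource request to a random
// active-and-enabled sub-cluster instead of always falling back to home.
import java.util.List;
import java.util.Random;

class SubClusterFallbackSketch {
  private final Random rand = new Random();

  String routeUnresolvedRequest(List<String> activeAndEnabledSubClusters,
      String homeSubCluster) {
    if (activeAndEnabledSubClusters == null
        || activeAndEnabledSubClusters.isEmpty()) {
      // Nothing better is known; keep the old behavior as a last resort.
      return homeSubCluster;
    }
    return activeAndEnabledSubClusters.get(
        rand.nextInt(activeAndEnabledSubClusters.size()));
  }
}
{code}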



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service

2018-08-21 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588184#comment-16588184
 ] 

Chandni Singh commented on YARN-8298:
-

Thanks [~eyang] 

> Yarn Service Upgrade: Support express upgrade of a service
> --
>
> Key: YARN-8298
> URL: https://issues.apache.org/jira/browse/YARN-8298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8298.001.patch, YARN-8298.002.patch, 
> YARN-8298.003.patch, YARN-8298.004.patch, YARN-8298.005.patch, 
> YARN-8298.006.patch
>
>
> Currently service upgrade involves 2 steps
>  * initiate upgrade by providing new spec
>  * trigger upgrade of each instance/component
>  
> We need to add the ability to upgrade the service in one shot:
>  # Aborting the upgrade will not be supported
>  # Upgrade finalization will be done automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service

2018-08-21 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588181#comment-16588181
 ] 

Hudson commented on YARN-8298:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14812 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14812/])
YARN-8298.  Added express upgrade for YARN service. (eyang: rev 
e557c6bd8de2811a561210f672f47b4d07a9d5c6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/ComponentEvent.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceEvent.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/ServiceState.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/src/main/java/org/apache/hadoop/yarn/service/client/ApiServiceClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/TestYarnNativeServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/src/main/java/org/apache/hadoop/yarn/service/webapp/ApiServer.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/client/ServiceClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/instance/ComponentInstance.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/utils/TestServiceApiUtil.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/TestServiceManager.java
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/TestServiceApiUtil.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/proto/ClientAMProtocol.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/utils/ServiceApiUtil.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ClientAMService.java


> Yarn Service Upgrade: Support express upgrade of a service
> --
>
> Key: YARN-8298
> URL: https://issues.apache.org/jira/browse/YARN-8298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8298.001.patch, YARN-8298.002.patch, 
> YARN-8298.003.patch, YARN-8298.004.patch, YARN-8298.005.patch, 
> YARN-8298.006.patch
>
>
> Currently service upgrade involves 2 steps
>  * initiate upgrade by providing new spec
>  * trigger upgrade of each instance/component
>  
> We need to add the ability to upgrade the service in one shot:
>  # Aborting the upgrade will not be supported
>  # Upgrade finalization will be done automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service

2018-08-21 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588169#comment-16588169
 ] 

Eric Yang commented on YARN-8298:
-

+1 Patch 6 looks good to me.

> Yarn Service Upgrade: Support express upgrade of a service
> --
>
> Key: YARN-8298
> URL: https://issues.apache.org/jira/browse/YARN-8298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8298.001.patch, YARN-8298.002.patch, 
> YARN-8298.003.patch, YARN-8298.004.patch, YARN-8298.005.patch, 
> YARN-8298.006.patch
>
>
> Currently service upgrade involves 2 steps
>  * initiate upgrade by providing new spec
>  * trigger upgrade of each instance/component
>  
> We need to add the ability to upgrade the service in one shot:
>  # Aborting the upgrade will not be supported
>  # Upgrade finalization will be done automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588153#comment-16588153
 ] 

genericqa commented on YARN-8509:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 13 new + 1374 unchanged - 5 fixed = 1387 total (was 1379) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 70m 
58s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8509 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936517/YARN-8509.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 62bb3eacc75c 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9c3fc3e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21653/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21653/testReport/ |
| Max. process+thread count | 866 (vs. ulimit of 1) |
| modules | C: 

[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588152#comment-16588152
 ] 

genericqa commented on YARN-8298:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 9 new + 387 unchanged - 2 fixed = 396 total (was 389) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
23s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m  
3s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
43s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
51s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8298 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936518/YARN-8298.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  

[jira] [Assigned] (YARN-8672) TestContainerManager#testLocalingResourceWhileContainerRunning occasionally times out

2018-08-21 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh reassigned YARN-8672:
---

Assignee: Chandni Singh

> TestContainerManager#testLocalingResourceWhileContainerRunning occasionally 
> times out
> -
>
> Key: YARN-8672
> URL: https://issues.apache.org/jira/browse/YARN-8672
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0
>Reporter: Jason Lowe
>Assignee: Chandni Singh
>Priority: Major
>
> Precommit builds have been failing in 
> TestContainerManager#testLocalingResourceWhileContainerRunning.  I have been 
> able to reproduce the problem without any patch applied if I run the test 
> enough times.  It looks like something is removing container tokens from the 
> nmPrivate area just as a new localizer starts.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7644) NM gets backed up deleting docker containers

2018-08-21 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588083#comment-16588083
 ] 

Chandni Singh commented on YARN-7644:
-

[~ebadger] I would like to work on this issue. Please re-assign to me if you 
are not working on it.

> NM gets backed up deleting docker containers
> 
>
> Key: YARN-7644
> URL: https://issues.apache.org/jira/browse/YARN-7644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
>
> We are sending a {{docker stop}} to the docker container with a timeout of 10 
> seconds when we shut down a container. If the container does not stop after 
> 10 seconds then we force kill it. However, the {{docker stop}} command is a 
> blocking call. So in cases where lots of containers don't go down with the 
> initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to 
> return. This ties up the ContainerLaunch handler and so these kill events 
> back up. It also appears to be backing up new container launches as well. 
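
As a purely illustrative aside, a minimal sketch of the direction implied above
(offloading the blocking {{docker stop}} to a small thread pool so kill events do
not serialize) might look like the following. The class and method names are
hypothetical; this is not the actual ContainerLaunch or docker runtime code.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hedged sketch: names and structure are hypothetical, not the NodeManager's code path.
public class AsyncDockerStopSketch {
  // A small pool so a slow "docker stop" does not block the event-handler thread.
  private final ExecutorService stopPool = Executors.newFixedThreadPool(4);

  /** Queues the stop; the caller returns immediately instead of blocking 10+ seconds. */
  public void stopContainerAsync(String dockerContainerId) {
    stopPool.submit(() -> {
      try {
        // Blocking call: docker sends SIGTERM, waits up to 10 seconds, then force kills.
        new ProcessBuilder("docker", "stop", "-t", "10", dockerContainerId)
            .inheritIO()
            .start()
            .waitFor();
      } catch (Exception e) {
        System.err.println("docker stop failed for " + dockerContainerId + ": " + e);
      }
    });
  }

  public void shutdown() throws InterruptedException {
    stopPool.shutdown();
    stopPool.awaitTermination(1, TimeUnit.MINUTES);
  }
}
{code}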



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7644) NM gets backed up deleting docker containers

2018-08-21 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7644:

Parent Issue: YARN-8472  (was: YARN-3611)

> NM gets backed up deleting docker containers
> 
>
> Key: YARN-7644
> URL: https://issues.apache.org/jira/browse/YARN-7644
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
>
> We are sending a {{docker stop}} to the docker container with a timeout of 10 
> seconds when we shut down a container. If the container does not stop after 
> 10 seconds then we force kill it. However, the {{docker stop}} command is a 
> blocking call. So in cases where lots of containers don't go down with the 
> initial SIGTERM, we have to wait 10+ seconds for the {{docker stop}} to 
> return. This ties up the ContainerLaunch handler and so these kill events 
> back up. It also appears to be backing up new container launches as well. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8675) Setting hostname of docker container breaks with "host" networking mode for Apps which do not run as a YARN service

2018-08-21 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8675:

Labels: Docker  (was: )

> Setting hostname of docker container breaks with "host" networking mode for 
> Apps which do not run as a YARN service
> ---
>
> Key: YARN-8675
> URL: https://issues.apache.org/jira/browse/YARN-8675
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Major
>  Labels: Docker
>
> Applications like the Spark AM currently do not run as a YARN service, and 
> setting the hostname breaks driver/executor communication if the docker 
> version is >= 1.13.1, especially with wire-encryption turned on.
> YARN-8027 sets the hostname if YARN DNS is enabled. But the cluster could 
> have a mix of YARN service/native applications.
> The proposal is to not set the hostname when "host" networking mode is 
> enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588066#comment-16588066
 ] 

genericqa commented on YARN-7863:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
24s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
24s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
15s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
36s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
0s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 27s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 34s{color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
57s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}199m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-7863 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936498/YARN-7863-YARN-3409.008.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4401f7177d81 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (YARN-8696) FederationInterceptor upgrade: home sub-cluster heartbeat async

2018-08-21 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8696:
---
Attachment: YARN-8696.v1.patch

> FederationInterceptor upgrade: home sub-cluster heartbeat async
> ---
>
> Key: YARN-8696
> URL: https://issues.apache.org/jira/browse/YARN-8696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8696.v1.patch
>
>
> Today in _FederationInterceptor_, the heartbeat to the home sub-cluster is 
> synchronous. After the heartbeat is sent out to the home sub-cluster, it waits 
> for the home response to come back before merging and returning the (merged) 
> heartbeat result back to the AM. If the home sub-cluster is suffering from 
> connection issues, or is down during a YarnRM master-slave switch, all 
> heartbeat threads in _FederationInterceptor_ will be blocked waiting for the 
> home response. As a result, the successful UAM heartbeats from secondary 
> sub-clusters will not be returned to the AM at all. Additionally, because we 
> kept the same heartbeat responseId between the AM and the home RM, lots of 
> tricky handling is needed for the responseId resync when it comes to 
> _FederationInterceptor_ (part of AMRMProxy, NM) work-preserving restart 
> (YARN-6127, YARN-1336), home RM master-slave switch, etc. 
> In this patch, we change the heartbeat to the home sub-cluster to 
> asynchronous, the same way we handle UAM heartbeats in the secondaries, so 
> that any sub-cluster outage or connection issue won't impact the AM getting 
> responses from other sub-clusters. The responseId is also managed separately 
> for the home sub-cluster and the AM, and they increment independently. The 
> resync logic becomes much cleaner. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8696) FederationInterceptor upgrade: home sub-cluster heartbeat async

2018-08-21 Thread Botong Huang (JIRA)
Botong Huang created YARN-8696:
--

 Summary: FederationInterceptor upgrade: home sub-cluster heartbeat 
async
 Key: YARN-8696
 URL: https://issues.apache.org/jira/browse/YARN-8696
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Botong Huang
Assignee: Botong Huang


Today in _FederationInterceptor_, the heartbeat to the home sub-cluster is 
synchronous. After the heartbeat is sent out to the home sub-cluster, it waits 
for the home response to come back before merging and returning the (merged) 
heartbeat result back to the AM. If the home sub-cluster is suffering from 
connection issues, or is down during a YarnRM master-slave switch, all 
heartbeat threads in _FederationInterceptor_ will be blocked waiting for the 
home response. As a result, the successful UAM heartbeats from secondary 
sub-clusters will not be returned to the AM at all. Additionally, because we 
kept the same heartbeat responseId between the AM and the home RM, lots of 
tricky handling is needed for the responseId resync when it comes to 
_FederationInterceptor_ (part of AMRMProxy, NM) work-preserving restart 
(YARN-6127, YARN-1336), home RM master-slave switch, etc. 

In this patch, we change the heartbeat to the home sub-cluster to asynchronous, 
the same way we handle UAM heartbeats in the secondaries, so that any 
sub-cluster outage or connection issue won't impact the AM getting responses 
from other sub-clusters. The responseId is also managed separately for the home 
sub-cluster and the AM, and they increment independently. The resync logic 
becomes much cleaner. 
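
As an illustration only, the general shape of the asynchronous home heartbeat
described above could look like the sketch below. The class and method names are
hypothetical stand-ins, not the actual FederationInterceptor/AMRMClientRelayer code.
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch of the async-heartbeat idea; not the actual FederationInterceptor code.
public class AsyncHomeHeartbeatSketch {
  /** Placeholder types standing in for the allocate request/response records. */
  public interface Request {}
  public interface Response {}
  public interface HomeClient { Response allocate(Request req, int responseId); }

  private final ExecutorService sender = Executors.newSingleThreadExecutor();
  private final BlockingQueue<Response> pendingHomeResponses = new LinkedBlockingQueue<>();
  // The responseId toward the home RM is tracked here, independent of the AM-facing responseId.
  private final AtomicInteger homeResponseId = new AtomicInteger(0);

  /** Fire-and-forget: the AM-facing heartbeat thread never blocks on the home RM. */
  public void heartbeatToHomeAsync(HomeClient home, Request req) {
    sender.submit(() -> {
      Response resp = home.allocate(req, homeResponseId.getAndIncrement());
      pendingHomeResponses.offer(resp);
    });
  }

  /** Called on the AM heartbeat path: merge whatever home responses have arrived so far. */
  public List<Response> drainHomeResponses() {
    List<Response> out = new ArrayList<>();
    pendingHomeResponses.drainTo(out);
    return out;
  }
}
{code}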



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8696) FederationInterceptor upgrade: home sub-cluster heartbeat async

2018-08-21 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8696:
---
Issue Type: Sub-task  (was: Task)
Parent: YARN-5597

> FederationInterceptor upgrade: home sub-cluster heartbeat async
> ---
>
> Key: YARN-8696
> URL: https://issues.apache.org/jira/browse/YARN-8696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
>
> Today in _FederationInterceptor_, the heartbeat to the home sub-cluster is 
> synchronous. After the heartbeat is sent out to the home sub-cluster, it waits 
> for the home response to come back before merging and returning the (merged) 
> heartbeat result back to the AM. If the home sub-cluster is suffering from 
> connection issues, or is down during a YarnRM master-slave switch, all 
> heartbeat threads in _FederationInterceptor_ will be blocked waiting for the 
> home response. As a result, the successful UAM heartbeats from secondary 
> sub-clusters will not be returned to the AM at all. Additionally, because we 
> kept the same heartbeat responseId between the AM and the home RM, lots of 
> tricky handling is needed for the responseId resync when it comes to 
> _FederationInterceptor_ (part of AMRMProxy, NM) work-preserving restart 
> (YARN-6127, YARN-1336), home RM master-slave switch, etc. 
> In this patch, we change the heartbeat to the home sub-cluster to 
> asynchronous, the same way we handle UAM heartbeats in the secondaries, so 
> that any sub-cluster outage or connection issue won't impact the AM getting 
> responses from other sub-clusters. The responseId is also managed separately 
> for the home sub-cluster and the AM, and they increment independently. The 
> resync logic becomes much cleaner. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service

2018-08-21 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588018#comment-16588018
 ] 

Chandni Singh commented on YARN-8298:
-

[~eyang] patch 6 includes the change for upgrading component by component. 
However, if a component upgrade fails, manual intervention is required.

 

> Yarn Service Upgrade: Support express upgrade of a service
> --
>
> Key: YARN-8298
> URL: https://issues.apache.org/jira/browse/YARN-8298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8298.001.patch, YARN-8298.002.patch, 
> YARN-8298.003.patch, YARN-8298.004.patch, YARN-8298.005.patch, 
> YARN-8298.006.patch
>
>
> Currently service upgrade involves 2 steps
>  * initiate upgrade by providing new spec
>  * trigger upgrade of each instance/component
>  
> We need to add the ability to upgrade the service in one shot:
>  # Aborting the upgrade will not be supported
>  # Upgrade finalization will be done automatically.
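
As a hedged illustration of the one-shot flow described above (all API names
below are hypothetical stand-ins, not the real YARN service client API):
{code:java}
import java.util.List;

// Hedged sketch of an express-upgrade flow; the interface here is illustrative only.
public class ExpressUpgradeSketch {
  public interface ServiceClient {
    void initiateUpgrade(String serviceName, String newSpecPath);
    List<String> getComponents(String serviceName);
    void upgradeComponent(String serviceName, String component) throws Exception;
    void finalizeUpgrade(String serviceName);
  }

  /** One-shot upgrade: no abort path; finalization happens automatically at the end. */
  public static void expressUpgrade(ServiceClient client, String service, String specPath)
      throws Exception {
    client.initiateUpgrade(service, specPath);
    for (String component : client.getComponents(service)) {
      // Per the comment above, a failed component upgrade still needs manual intervention.
      client.upgradeComponent(service, component);
    }
    client.finalizeUpgrade(service);
  }
}
{code}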



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service

2018-08-21 Thread Chandni Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-8298:

Attachment: YARN-8298.006.patch

> Yarn Service Upgrade: Support express upgrade of a service
> --
>
> Key: YARN-8298
> URL: https://issues.apache.org/jira/browse/YARN-8298
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8298.001.patch, YARN-8298.002.patch, 
> YARN-8298.003.patch, YARN-8298.004.patch, YARN-8298.005.patch, 
> YARN-8298.006.patch
>
>
> Currently service upgrade involves 2 steps
>  * initiate upgrade by providing new spec
>  * trigger upgrade of each instance/component
>  
> We need to add the ability to upgrade the service in one shot:
>  # Aborting the upgrade will not be supported
>  # Upgrade finalization will be done automatically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent

2018-08-21 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588010#comment-16588010
 ] 

Zian Chen commented on YARN-8509:
-

Fixed the failed UTs and re-uploaded the patch.

> Total pending resource calculation in preemption should use user-limit factor 
> instead of minimum-user-limit-percent
> ---
>
> Key: YARN-8509
> URL: https://issues.apache.org/jira/browse/YARN-8509
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Major
>  Labels: capacityscheduler
> Attachments: YARN-8509.001.patch, YARN-8509.002.patch, 
> YARN-8509.003.patch, YARN-8509.004.patch, YARN-8509.005.patch
>
>
> In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the 
> total pending resource based on the user-limit percent and user-limit factor, 
> which caps the pending resource for each user to the minimum of the user-limit 
> pending and the actual pending. This prevents a queue from taking more pending 
> resource to achieve queue balance after all queues are satisfied with their 
> ideal allocation.
>   
>  We need to change the logic so that a queue's pending resource can go beyond 
> the user limit.
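
A hedged, purely arithmetic sketch of the capping difference described above;
the numbers and helper names are illustrative, not the LeafQueue code:
{code:java}
// Hedged sketch: plain arithmetic to illustrate the capping difference, not the LeafQueue code.
public class PendingCapSketch {
  /** Pending counted toward preemption when capped by a per-user limit. */
  static long cappedPending(long userActualPending, long perUserLimit) {
    return Math.min(userActualPending, perUserLimit);
  }

  public static void main(String[] args) {
    long queueCapacityMb = 100_000;
    long userActualPendingMb = 60_000;

    // Cap derived from minimum-user-limit-percent (e.g. 25%): pending is clipped to 25,000 MB.
    long mulpLimit = queueCapacityMb * 25 / 100;
    // Cap derived from a user-limit-factor of 2.0 on top of that limit: much larger headroom.
    long ulfLimit = (long) (mulpLimit * 2.0);

    System.out.println("capped by MULP: " + cappedPending(userActualPendingMb, mulpLimit));
    System.out.println("capped by ULF : " + cappedPending(userActualPendingMb, ulfLimit));
  }
}
{code}
With these illustrative numbers, the minimum-user-limit-percent cap counts only
25,000 MB of the user's 60,000 MB pending, while the user-limit-factor-based cap
counts 50,000 MB, which is why the patch proposes using the user-limit factor.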



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent

2018-08-21 Thread Zian Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-8509:

Attachment: YARN-8509.005.patch

> Total pending resource calculation in preemption should use user-limit factor 
> instead of minimum-user-limit-percent
> ---
>
> Key: YARN-8509
> URL: https://issues.apache.org/jira/browse/YARN-8509
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Major
>  Labels: capacityscheduler
> Attachments: YARN-8509.001.patch, YARN-8509.002.patch, 
> YARN-8509.003.patch, YARN-8509.004.patch, YARN-8509.005.patch
>
>
> In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate the 
> total pending resource based on the user-limit percent and user-limit factor, 
> which caps the pending resource for each user to the minimum of the user-limit 
> pending and the actual pending. This prevents a queue from taking more pending 
> resource to achieve queue balance after all queues are satisfied with their 
> ideal allocation.
>   
>  We need to change the logic so that a queue's pending resource can go beyond 
> the user limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy

2018-08-21 Thread Botong Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588006#comment-16588006
 ] 

Botong Huang commented on YARN-8581:


Thanks [~giovanni.fumarola] for the review! 

> [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
> ---
>
> Key: YARN-8581
> URL: https://issues.apache.org/jira/browse/YARN-8581
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: amrmproxy, federation
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, 
> YARN-8581.v2.patch
>
>
> In Federation, every time an AM heartbeat comes in, 
> LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to 
> the list of active and enabled sub-clusters. However, if we haven't been able 
> to heartbeat to a sub-cluster for some time (network issues, we keep hitting 
> some exception from YarnRM, a YarnRM master-slave switch is taking a long 
> time, etc.), we should consider the sub-cluster unhealthy and stop routing 
> asks there until the heartbeat channel becomes healthy again. 
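
A minimal sketch of the timeout bookkeeping described above; the names are
hypothetical, not the actual LocalityMulticastAMRMProxyPolicy code:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hedged sketch of per-sub-cluster timeout tracking.
public class SubClusterTimeoutSketch {
  private final Map<String, Long> lastSuccessfulHeartbeat = new ConcurrentHashMap<>();
  private final long timeoutMs;

  public SubClusterTimeoutSketch(long timeoutMs) {
    this.timeoutMs = timeoutMs;
  }

  /** Call whenever a heartbeat to a sub-cluster succeeds. */
  public void markHeartbeatSuccess(String subClusterId) {
    lastSuccessfulHeartbeat.put(subClusterId, System.currentTimeMillis());
  }

  /** A sub-cluster with no recent successful heartbeat is treated as unhealthy for routing. */
  public boolean shouldRouteAsksTo(String subClusterId) {
    Long last = lastSuccessfulHeartbeat.get(subClusterId);
    return last != null && System.currentTimeMillis() - last <= timeoutMs;
  }
}
{code}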



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch

2018-08-21 Thread Botong Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588003#comment-16588003
 ] 

Botong Huang commented on YARN-8673:


Thanks [~giovanni.fumarola]!

> [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
> -
>
> Key: YARN-8673
> URL: https://issues.apache.org/jira/browse/YARN-8673
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: amrmproxy
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8673-branch-2.v2.patch, YARN-8673.v1.patch, 
> YARN-8673.v2.patch
>
>
> After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ 
> will be thrown from the new YarnRM. The AM will re-register and reset the 
> responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ 
> follows the same protocol and does the automatic re-register and responseId 
> resync. However, when exceptions or temporary network issues happen in the 
> allocate call after re-register, the resync logic might break. This patch 
> improves the robustness of the process by parsing the expected responseId 
> from the YarnRM exception message, so that whenever the responseId is out of 
> sync for whatever reason, we can automatically resync and move on. 
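
A hedged sketch of the responseId-parsing idea; the exception-message format
assumed in the regex below is illustrative, not a guaranteed YarnRM format:
{code:java}
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hedged sketch: the message format assumed here is an illustration only.
public class ResponseIdResyncSketch {
  // Example assumed message: "... expected responseId: 42"
  private static final Pattern EXPECTED_ID =
      Pattern.compile("expected\\s+responseId[:\\s]+(\\d+)", Pattern.CASE_INSENSITIVE);

  /** Returns the responseId the RM expects, if it can be parsed from the exception message. */
  public static OptionalInt parseExpectedResponseId(String exceptionMessage) {
    if (exceptionMessage == null) {
      return OptionalInt.empty();
    }
    Matcher m = EXPECTED_ID.matcher(exceptionMessage);
    return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
  }

  public static void main(String[] args) {
    System.out.println(parseExpectedResponseId(
        "Invalid responseId in AllocateRequest from application attempt: expected responseId: 42"));
  }
}
{code}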



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch

2018-08-21 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587946#comment-16587946
 ] 

Giovanni Matteo Fumarola commented on YARN-8673:


Committed to branch-2 as well.

> [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
> -
>
> Key: YARN-8673
> URL: https://issues.apache.org/jira/browse/YARN-8673
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: amrmproxy
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8673-branch-2.v2.patch, YARN-8673.v1.patch, 
> YARN-8673.v2.patch
>
>
> After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ 
> will be thrown from the new YarnRM. The AM will re-register and reset the 
> responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ 
> follows the same protocol and does the automatic re-register and responseId 
> resync. However, when exceptions or temporary network issues happen in the 
> allocate call after re-register, the resync logic might break. This patch 
> improves the robustness of the process by parsing the expected responseId 
> from the YarnRM exception message, so that whenever the responseId is out of 
> sync for whatever reason, we can automatically resync and move on. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587944#comment-16587944
 ] 

Jason Lowe commented on YARN-8649:
--

Thanks for the analysis and patch, [~xiaoheipangzi]!

Is ignoring the null the right thing to do here? This is in the middle of 
trying to find a path to localize a resource, and if the NM doesn't know about 
the resource then it seems inappropriate to go ahead, find a local path to put 
the resource in, and let the localizer download it.  That would be a waste of 
network and disk resources at best, or an outright leak of disk space at worst 
if it's not cleaned up when the localizer finishes the download and reports 
completion on a resource the NM doesn't know about.


> Similar as YARN-4355:NPE while processing localizer heartbeat
> -
>
> Key: YARN-8649
> URL: https://issues.apache.org/jira/browse/YARN-8649
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Attachments: YARN-8649.patch, hadoop-hires-nodemanager-hadoop11.log
>
>
> I have noticed that a nodemanager was getting NPEs while tearing down. The 
> reason may be similar to YARN-4355, which was reported by Jason Lowe. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy

2018-08-21 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586549#comment-16586549
 ] 

Giovanni Matteo Fumarola edited comment on YARN-8581 at 8/21/18 8:07 PM:
-

Thanks [~botong] . Committed to trunk and branch-2.


was (Author: giovanni.fumarola):
Thanks [~botong] . Committed to trunk.

> [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
> ---
>
> Key: YARN-8581
> URL: https://issues.apache.org/jira/browse/YARN-8581
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: amrmproxy, federation
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, 
> YARN-8581.v2.patch
>
>
> In Federation, every time an AM heartbeat comes in, 
> LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to 
> the list of active and enabled sub-clusters. However, if we haven't been able 
> to heartbeat to a sub-cluster for some time (network issues, we keep hitting 
> some exception from YarnRM, a YarnRM master-slave switch is taking a long 
> time, etc.), we should consider the sub-cluster unhealthy and stop routing 
> asks there until the heartbeat channel becomes healthy again. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587903#comment-16587903
 ] 

genericqa commented on YARN-8468:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 14s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 41 new + 558 unchanged - 15 fixed = 599 total (was 573) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 40s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 28s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
36s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}140m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8468 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-8675) Setting hostname of docker container breaks with "host" networking mode for Apps which do not run as a YARN service

2018-08-21 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587873#comment-16587873
 ] 

Eric Yang commented on YARN-8675:
-

Docker only prevents the --hostname and --net=host flags from being combined 
in older versions of docker (1.13). See the [Docker 
issue|https://github.com/moby/moby/pull/29144].  The following table 
illustrates the possible combinations and when we should and should not 
support a customized hostname:

| YARN Registry DNS | YARN Service | Custom AM | Network Type | Custom Hostname |
| Enabled | Yes | No | Host | Yes |
| Enabled | No | Yes | Host | No |
| Disabled | No | Yes | Host | No |
| Disabled | No | Yes | Bridge | N/A |

Today registryDNS and YARN service are coupled together; only YARN service 
knows how to populate hostname information into registryDNS.  If a custom AM 
creates its own logic to generate a custom hostname, it must have some way to 
populate RegistryDNS so the name resolves correctly.  Without using YARN 
service, there is no programmable API to customize the hostname.  This is the 
reason that Spark on a YARN cluster in docker mode fails with bizarre hostname 
composition.

For resolving the Spark issue, it is entirely possible to run Spark using the 
YARN service API without making any code changes to Spark standalone mode.  To 
fix this issue properly, it would be best to provide a hint to the docker 
runtime to decide whether a custom hostname can be supported.  A custom 
hostname is a new concept that didn't exist prior to docker containers.  
Therefore, in my view only new applications using YARN service should be 
supported.
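
For illustration only, the table above can be read as a simple predicate; the
parameter names below are illustrative, and the bridge case is treated as
allowed here because the table marks it N/A:
{code:java}
// Hedged sketch: encodes the decision table above; parameter names are illustrative only.
public class CustomHostnameDecisionSketch {
  /**
   * Host networking: a custom hostname is only safe when the YARN service framework
   * can publish the name into registry DNS. Non-host networks are N/A in the table
   * above and are treated as allowed purely for illustration.
   */
  public static boolean shouldSetCustomHostname(boolean registryDnsEnabled,
                                                boolean isYarnService,
                                                String networkType) {
    if (!"host".equalsIgnoreCase(networkType)) {
      return true;
    }
    return registryDnsEnabled && isYarnService;
  }

  public static void main(String[] args) {
    System.out.println(shouldSetCustomHostname(true, true, "host"));     // Yes (YARN service row)
    System.out.println(shouldSetCustomHostname(true, false, "host"));    // No (custom AM rows)
    System.out.println(shouldSetCustomHostname(false, false, "bridge")); // N/A row, treated as allowed
  }
}
{code}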

> Setting hostname of docker container breaks with "host" networking mode for 
> Apps which do not run as a YARN service
> ---
>
> Key: YARN-8675
> URL: https://issues.apache.org/jira/browse/YARN-8675
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Major
>
> Applications like the Spark AM currently do not run as a YARN service, and 
> setting the hostname breaks driver/executor communication if the docker 
> version is >= 1.13.1, especially with wire-encryption turned on.
> YARN-8027 sets the hostname if YARN DNS is enabled. But the cluster could 
> have a mix of YARN service/native applications.
> The proposal is to not set the hostname when "host" networking mode is 
> enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-7863:
-
Attachment: YARN-7863-YARN-3409.008.patch

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7863-YARN-3409.002.patch, 
> YARN-7863-YARN-3409.003.patch, YARN-7863-YARN-3409.004.patch, 
> YARN-7863-YARN-3409.005.patch, YARN-7863-YARN-3409.006.patch, 
> YARN-7863-YARN-3409.007.patch, YARN-7863-YARN-3409.008.patch, 
> YARN-7863.v0.patch
>
>
> This Jira will track the work to *Modify existing placement constraints to 
> support node attributes.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587849#comment-16587849
 ] 

Sunil Govindan commented on YARN-7863:
--

Updated v8 patch.

cc [~cheersyang] [~Naganarasimha]

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7863-YARN-3409.002.patch, 
> YARN-7863-YARN-3409.003.patch, YARN-7863-YARN-3409.004.patch, 
> YARN-7863-YARN-3409.005.patch, YARN-7863-YARN-3409.006.patch, 
> YARN-7863-YARN-3409.007.patch, YARN-7863-YARN-3409.008.patch, 
> YARN-7863.v0.patch
>
>
> This Jira will track the work to *Modify existing placement constraints to 
> support node attributes.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8675) Setting hostname of docker container breaks with "host" networking mode for Apps which do not run as a YARN service

2018-08-21 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8675:
-
Reporter: Yesha Vora  (was: Suma Shivaprasad)

> Setting hostname of docker container breaks with "host" networking mode for 
> Apps which do not run as a YARN service
> ---
>
> Key: YARN-8675
> URL: https://issues.apache.org/jira/browse/YARN-8675
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Major
>
> Applications like the Spark AM currently do not run as a YARN service, and 
> setting the hostname breaks driver/executor communication if the docker 
> version is >= 1.13.1, especially with wire-encryption turned on.
> YARN-8027 sets the hostname if YARN DNS is enabled. But the cluster could 
> have a mix of YARN service/native applications.
> The proposal is to not set the hostname when "host" networking mode is 
> enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8572) YarnClient getContainers API should support filtering by container status

2018-08-21 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi reassigned YARN-8572:
---

Assignee: Abhishek Modi

> YarnClient getContainers API should support filtering by container status
> -
>
> Key: YARN-8572
> URL: https://issues.apache.org/jira/browse/YARN-8572
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Suma Shivaprasad
>Assignee: Abhishek Modi
>Priority: Major
>
> YarnClient.getContainers should support filtering containers by their status 
> - RUNNING, COMPLETED, etc. This may require corresponding changes in ATS to 
> filter by container status for a given application attempt.
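
Until such a filter parameter exists, the same effect can be approximated on the
client side; a minimal sketch over the existing YarnClient#getContainers and
ContainerReport#getContainerState APIs (the helper class itself is hypothetical):
{code:java}
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;

import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ContainerReport;
import org.apache.hadoop.yarn.api.records.ContainerState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Hedged sketch: client-side filtering illustrating the behavior the improvement would push server-side.
public class ContainerStateFilterSketch {
  public static List<ContainerReport> getContainersInState(YarnClient client,
                                                           ApplicationAttemptId attemptId,
                                                           ContainerState wanted)
      throws YarnException, IOException {
    return client.getContainers(attemptId).stream()
        .filter(report -> report.getContainerState() == wanted)
        .collect(Collectors.toList());
  }
}
{code}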



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8663) Opportunistic Container property "mapreduce.job.num-opportunistic-maps-percent" is throwing wrong exception at wrong sequence

2018-08-21 Thread Abhishek Modi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Modi reassigned YARN-8663:
---

Assignee: Abhishek Modi

> Opportunistic Container property 
> "mapreduce.job.num-opportunistic-maps-percent" is throwing wrong exception at 
> wrong sequence
> -
>
> Key: YARN-8663
> URL: https://issues.apache.org/jira/browse/YARN-8663
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.1.1
> Environment: Secure Installation with Kerberos ON.
>Reporter: Akshay Agarwal
>Assignee: Abhishek Modi
>Priority: Major
>
> Pre-requisites:
> {code:java}
> 1. Install HA cluster.
> 2.Set yarn.nodemanager.opportunistic-containers-max-queue-length=(positive 
> integer value)[NodeManager->yarnsite.xml]
> 3. Set yarn.resourcemanager.opportunistic-container-allocation.enabled= 
> true[ResourceManager->yarnsite.xml]
> {code}
>  
> Steps to reproduce:
> {code:java}
> 1.Keep All NodeManagers Up
> 2. Submit a job with -Dmapreduce.job.num-opportunistic-maps-percent="abh" or 
> "2.5" 
> {code}
> Expected Result: 
> {code:java}
> Should throw an exception stating "NumberFormatException" before writing 
> the input for mappers.
> {code}
> Log Details:
> {code:java}
> 2018-08-14 18:15:54,049 INFO mapreduce.Job:  map 0% reduce 0%
> 2018-08-14 18:15:54,069 INFO mapreduce.Job: Job job_1534236847054_0005 failed 
> with state FAILED due to: Application application_1534236847054_0005 failed 2 
> times due to AM Container for appattempt_1534236847054_0005_02 exited 
> with  exitCode: 1
> Failing this attempt.Diagnostics: [2018-08-14 18:15:53.110]Exception from 
> container-launch.
> Container id: container_e31_1534236847054_0005_02_01
> Exit code: 1
> [2018-08-14 18:15:53.113]Container exited with a non-zero exit code 1. Error 
> file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseSplitVerifier; 
> support was removed in 8.0
> Aug 14, 2018 6:15:51 PM 
> com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering 
> org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider 
> class
> Aug 14, 2018 6:15:51 PM 
> com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a 
> provider class
> Aug 14, 2018 6:15:51 PM 
> com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as 
> a root resource class
> Aug 14, 2018 6:15:51 PM 
> com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
> INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 
> AM'
> Aug 14, 2018 6:15:51 PM 
> com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
> getComponentProvider
> INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver 
> to GuiceManagedComponentProvider with the scope "Singleton"
> Aug 14, 2018 6:15:52 PM 
> com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
> getComponentProvider
> INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to 
> GuiceManagedComponentProvider with the scope "Singleton"
> Aug 14, 2018 6:15:52 PM 
> com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory 
> getComponentProvider
> INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to 
> GuiceManagedComponentProvider with the scope "PerRequest"
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587715#comment-16587715
 ] 

genericqa commented on YARN-8649:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 103 unchanged - 0 fixed = 104 total (was 103) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
58s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8649 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936466/YARN-8649.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 70258d99e914 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9c3fc3e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21649/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21649/testReport/ |
| Max. process+thread count | 439 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 

[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-21 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: YARN-8468.005.patch

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps. The user wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb sets the default maximum container size 
> for all queues, and the per-queue maximum is set with the 
> “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
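
A hedged sketch of the proposed getMaximumResourceCapability(String queueName)
behaviour from the list above, using simplified stand-in types rather than the
FairScheduler/FSQueue and Resource classes:
{code:java}
import java.util.HashMap;
import java.util.Map;

// Hedged sketch with simplified types; the real change would work on FairScheduler queues and Resource objects.
public class PerQueueMaxAllocationSketch {

  /** Minimal stand-in for a (memoryMb, vcores) resource. */
  static final class Res {
    final long memoryMb;
    final int vcores;
    Res(long memoryMb, int vcores) { this.memoryMb = memoryMb; this.vcores = vcores; }
    Res componentwiseMin(Res other) {
      return new Res(Math.min(memoryMb, other.memoryMb), Math.min(vcores, other.vcores));
    }
  }

  private final Res schedulerMax;                            // yarn.scheduler.maximum-allocation-*
  private final Map<String, Res> queueMax = new HashMap<>(); // per-queue "maxContainerResources"

  PerQueueMaxAllocationSketch(Res schedulerMax) {
    this.schedulerMax = schedulerMax;
  }

  void setQueueMax(String queue, Res max) {
    // A queue cap must never exceed the scheduler-wide cap.
    queueMax.put(queue, max.componentwiseMin(schedulerMax));
  }

  /** Mirrors the proposed getMaximumResourceCapability(String queueName). */
  Res getMaximumResourceCapability(String queueName) {
    return queueMax.getOrDefault(queueName, schedulerMax);
  }
}
{code}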



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker

2018-08-21 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587681#comment-16587681
 ] 

Eric Badger commented on YARN-8648:
---

IMO I like the idea of actually dealing with this problem via proposal 1, but 
it seems like a much bigger effort that has many corner cases and realistically 
is going to require a whole new resource controller module. Therefore, I think 
we shouldn't let perfect get in the way of good, and we should move forward with 
the simpler approach of removing the cgroups via the container-executor. I don't 
_like_ this solution, but I think it is a reasonable stopgap until we are able 
to really fix the underlying issues. 
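
For illustration only, here is a minimal sketch of that stopgap idea, written in 
Java rather than the container-executor's native code purely to show the shape 
of the cleanup; the hierarchy name and container id below are examples, not 
values taken from this issue:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Sketch: remove a container's leaked cgroup directory under every controller. */
public class LeakedCgroupCleanup {

  static void removeContainerCgroups(String hierarchy, String containerId) throws IOException {
    Path cgroupRoot = Paths.get("/sys/fs/cgroup");
    try (Stream<Path> controllers = Files.list(cgroupRoot)) {
      controllers.forEach(controller -> {
        Path leaked = controller.resolve(hierarchy).resolve(containerId);
        try {
          // rmdir succeeds on a cgroup directory once it has no child cgroups and no
          // attached tasks; the kernel-generated control files inside do not count.
          Files.deleteIfExists(leaked);
        } catch (IOException e) {
          // Still busy (e.g. docker's leaf cgroup not removed yet); leave it for a later sweep.
          System.err.println("Could not remove " + leaked + ": " + e);
        }
      });
    }
  }

  public static void main(String[] args) throws IOException {
    removeContainerCgroups("hadoop-yarn", "container_1234567890123_0001_01_000002");
  }
}
{code}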

> Container cgroups are leaked when using docker
> --
>
> Key: YARN-8648
> URL: https://issues.apache.org/jira/browse/YARN-8648
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
>  Labels: Docker
>
> When you run with docker and enable cgroups for cpu, docker creates cgroups 
> for all resources on the system, not just for cpu.  For instance, if the 
> {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, 
> the nodemanager will create a cgroup for each container under 
> {{/sys/fs/cgroup/cpu/hadoop-yarn}}.  In the docker case, we pass this path 
> via the {{--cgroup-parent}} command line argument.   Docker then creates a 
> cgroup for the docker container under that, for instance: 
> {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}.
> When the container exits, docker cleans up the {{docker_container_id}} 
> cgroup, and the nodemanager cleans up the {{container_id}} cgroup.  All is 
> good under {{/sys/fs/cgroup/hadoop-yarn}}.
> The problem is that docker also creates that same hierarchy under every 
> resource under {{/sys/fs/cgroup}}.  On the rhel7 system I am using, these 
> are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, 
> perf_event, and systemd.  So, for instance, docker creates 
> {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but 
> it only cleans up the leaf cgroup {{docker_container_id}}.  Nobody cleans up 
> the {{container_id}} cgroups for these other resources.  On one of our busy 
> clusters, we found > 100,000 of these leaked cgroups.
> I found this in our 2.8-based version of hadoop, but I have been able to 
> repro with current hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587634#comment-16587634
 ] 

genericqa commented on YARN-7863:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
54s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
13s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
13s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
38s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
35s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 52s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 24s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 53s{color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}180m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.api.resource.TestPlacementConstraintParser |
|   | hadoop.yarn.server.resourcemanager.TestRMHA |
|   | hadoop.yarn.applications.distributedshell.TestDistributedShell |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-7863 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936438/YARN-7863-YARN-3409.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux da294ff3ca12 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 

[jira] [Assigned] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread lujie (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie reassigned YARN-8649:
---

  Assignee: lujie
Attachment: YARN-8649.patch

Hi [~jlowe], [~pradeepambati], [~$iddhe$h]

I have restudied the bug according to the logs.

*The root cause:*
 # When the NM shuts down, it sends KILL_CONTAINER to the container. The log 
shows this event:

{code:java}
2018-08-21 20:11:08,316 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
 Container container_1534853453424_0001_01_01 transitioned from LOCALIZING 
to KILLING
{code}
This leads KillBeforeRunningTransition to execute.
 # In KillBeforeRunningTransition, it calls "container.cleanup()", and the 
"cleanup" function sends a "ContainerLocalizationCleanupEvent".
 # The ContainerLocalizationCleanupEvent causes 
ResourceLocalizationService.handleCleanupContainerResources to execute, which 
in turn sends a "ResourceReleaseEvent".
 # The ResourceReleaseEvent causes LocalResourcesTrackerImpl.handle to 
execute, and handle (at line 199 in the source code) calls removeResource:

{code:java}
if (event.getType() == ResourceEventType.RELEASE) {
  if (rsrc.getState() == ResourceState.DOWNLOADING &&
      rsrc.getRefCount() <= 0 &&
      rsrc.getRequest().getVisibility() != LocalResourceVisibility.PUBLIC) {
    removeResource(req);
  }
}
{code}



 # In removeResource, it does:

{code:java}
LocalizedResource rsrc = localrsrc.remove(req);
{code}

 # When a localizer heartbeat comes in, 
LocalResourcesTrackerImpl.getPathForLocalization does:

{code:java}
Path localPath = new Path(rPath, req.getPath().getName());
LocalizedResource rsrc = localrsrc.get(req); // rsrc is null here
rsrc.setLocalPath(localPath); // NPE
{code}
NPE happens!

*Unit test:*

The YARN-4355 patch added the test 
"testLocalizerHeartbeatWhenAppCleaningUp" in class 
"TestResourceLocalizationService".

That test also sends the "ContainerLocalizationCleanupEvent", but it does not 
cover the case where a localizer heartbeat arrives at this moment.

In this patch, we change "testLocalizerHeartbeatWhenAppCleaningUp" to cover 
this situation; the change triggers the bug.

*Fixing:*

To fix the NPE, we only add a null check; I think that is suitable here.
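
A rough sketch of the kind of guard meant here (an editorial illustration, not 
the contents of YARN-8649.patch): getPathForLocalization simply skips the 
request when the resource has already been removed.

{code:java}
Path localPath = new Path(rPath, req.getPath().getName());
LocalizedResource rsrc = localrsrc.get(req);
if (rsrc == null) {
  // The resource was already removed (e.g. released while DOWNLOADING during NM
  // shutdown), so there is nothing left to localize for this request.
  LOG.warn("Resource " + req + " has already been removed, skipping localization");
  return null;
}
rsrc.setLocalPath(localPath); // safe now
{code}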

> Similar as YARN-4355:NPE while processing localizer heartbeat
> -
>
> Key: YARN-8649
> URL: https://issues.apache.org/jira/browse/YARN-8649
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: lujie
>Assignee: lujie
>Priority: Major
> Attachments: YARN-8649.patch, hadoop-hires-nodemanager-hadoop11.log
>
>
> I have noticed that a nodemanager was getting NPEs while tearing down. The 
> reason may be similar to YARN-4355, which was reported by [# Jason Lowe]. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8649) Similar as YARN-4355:NPE while processing localizer heartbeat

2018-08-21 Thread lujie (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587597#comment-16587597
 ] 

lujie edited comment on YARN-8649 at 8/21/18 3:35 PM:
--

Hi [~jlowe], [~pradeepambati], [~$iddhe$h]

I have restudied the bug according to the logs.

*The root cause:*
 1. When the NM shuts down, it sends KILL_CONTAINER to the container. The log 
shows this event:

{code:java}
2018-08-21 20:11:08,316 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
 Container container_1534853453424_0001_01_01 transitioned from LOCALIZING 
to KILLING
{code}
      This leads KillBeforeRunningTransition to execute.

     2. In KillBeforeRunningTransition, it calls "container.cleanup()", and 
the "cleanup" function sends a "ContainerLocalizationCleanupEvent".

     3. The ContainerLocalizationCleanupEvent causes 
ResourceLocalizationService.handleCleanupContainerResources to execute, which 
in turn sends a "ResourceReleaseEvent".

    4. The ResourceReleaseEvent causes LocalResourcesTrackerImpl.handle 
to execute, and handle (at line 199 in the source code) calls removeResource:
{code:java}
if (event.getType() == ResourceEventType.RELEASE) {
  if (rsrc.getState() == ResourceState.DOWNLOADING &&
      rsrc.getRefCount() <= 0 &&
      rsrc.getRequest().getVisibility() != LocalResourceVisibility.PUBLIC) {
    removeResource(req);
  }
}
{code}
    5. In removeResource, it does:
{code:java}
LocalizedResource rsrc = localrsrc.remove(req);
{code}
   6. When a localizer heartbeat comes in, 
LocalResourcesTrackerImpl.getPathForLocalization does:
{code:java}
Path localPath = new Path(rPath, req.getPath().getName());
LocalizedResource rsrc = localrsrc.get(req); // rsrc is null here
rsrc.setLocalPath(localPath); // NPE
{code}
NPE happens!

*Unit test:*

The YARN-4355 patch added the test 
"testLocalizerHeartbeatWhenAppCleaningUp" in class 
"TestResourceLocalizationService".

That test also sends the "ContainerLocalizationCleanupEvent", but it does not 
cover the case where a localizer heartbeat arrives at this moment.

In this patch, we change "testLocalizerHeartbeatWhenAppCleaningUp" to cover 
this situation; the change triggers the bug.

*Fixing:*

To fix the NPE, we only add a null check; I think that is suitable here.


was (Author: xiaoheipangzi):
Hi [~jlowe], [~pradeepambati],[~$iddhe$h]

I have restudied the bug according the logs.

*The root cause:*
 # When NM shutdowns, it will sent KILL_CONTAINER to the Container, The log has 
shown this event:

{code:java}
2018-08-21 20:11:08,316 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
 Container container_1534853453424_0001_01_01 transitioned from LOCALIZING 
to KILLING
{code}
this will led the KillBeforeRunningTransition to execute.
 # In KillBeforeRunningTransition, it will call "container.cleanup()", and in 
"cleanup" function, it will sent "ContainerLocalizationCleanupEvent".
 # ContainerLocalizationCleanupEvent will cause the 
ResourceLocalizationService.handleCleanupContainerResources to execute, and in 
"handleCleanupContainerResources", it  will send  "ResourceReleaseEvent".
 # ResourceReleaseEvent will led cause the LocalResourcesTrackerImpl.handle to 
execute, and in handle(at line 199in source code) it will call removeResouce:

{code:java}
if (event.getType() == ResourceEventType.RELEASE) {
if (rsrc.getState() == ResourceState.DOWNLOADING &&
rsrc.getRefCount() <= 0 &&
rsrc.getRequest().getVisibility() != LocalResourceVisibility.PUBLIC) {
removeResource(req);
}
}
{code}



 # in removeResouce, it will do:

{code:java}
LocalizedResource rsrc = localrsrc.remove(req);
{code}

 # when heartbeat come in, the LocalResourcesTrackerImpl.getPathForLocalization 
will  do:

{code:java}
Path localPath = new Path(rPath, req.getPath().getName());
LocalizedResource rsrc = localrsrc.get(req);//rsec is null
rsrc.setLocalPath(localPath);//NPE
{code}
NPE happens!

*Unit test:*


While fixing YARN-4355, the patch added the test 
"testLocalizerHeartbeatWhenAppCleaningUp" in Class 
"TestResourceLocalizationService"

In the test, it also send the "ContainerLocalizationCleanupEvent", but the test 
doesn't  cover that heartbeat can comes at this moment.

In this patch, we change the "testLocalizerHeartbeatWhenAppCleaningUp" to cover 
this situation. This change will trigger the bug.

 

Fixing:

When we fix the NPE, we only add null check, i think it is suitable here!

> Similar as YARN-4355:NPE while processing localizer heartbeat
> -
>
> Key: YARN-8649
> URL: https://issues.apache.org/jira/browse/YARN-8649
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>  

[jira] [Commented] (YARN-7494) Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision

2018-08-21 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587591#comment-16587591
 ] 

Hudson commented on YARN-7494:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14811 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14811/])
YARN-7494. Add muti-node lookup mechanism and pluggable nodes sorting (wwei: 
rev 9c3fc3ef2865164aa5f121793ac914cfeb21a181)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/MultiNodeLookupPolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAppSchedulingInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ClusterNodeTracker.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/MultiNodeSorter.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/ApplicationSchedulingConfig.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/MultiNodePolicySpec.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/MultiNodeSortingManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/RegularContainerAllocator.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/ResourceUsageMultiNodeLookupPolicy.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesLogger.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesManager.java
* (edit) 

[jira] [Commented] (YARN-7494) Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587564#comment-16587564
 ] 

Weiwei Yang commented on YARN-7494:
---

I just pushed the patch to trunk, thanks for all the efforts [~sunilg], and 
also thanks for the reviews [~leftnoteasy].

> Add muti-node lookup mechanism and pluggable nodes sorting policies to 
> optimize placement decision
> --
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, 
> YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, 
> YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, 
> YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, 
> YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7863:
--
Fix Version/s: (was: 3.2.0)

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7863-YARN-3409.002.patch, 
> YARN-7863-YARN-3409.003.patch, YARN-7863-YARN-3409.004.patch, 
> YARN-7863-YARN-3409.005.patch, YARN-7863-YARN-3409.006.patch, 
> YARN-7863-YARN-3409.007.patch, YARN-7863.v0.patch
>
>
> This Jira will track to *Modify existing placement constraints to support 
> node attributes.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7863:
--
Comment: was deleted

(was: I just pushed this to trunk. Thanks for getting it done [~sunilg]! And 
thanks for the reviews from [~leftnoteasy].)

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7863-YARN-3409.002.patch, 
> YARN-7863-YARN-3409.003.patch, YARN-7863-YARN-3409.004.patch, 
> YARN-7863-YARN-3409.005.patch, YARN-7863-YARN-3409.006.patch, 
> YARN-7863-YARN-3409.007.patch, YARN-7863.v0.patch
>
>
> This Jira will track to *Modify existing placement constraints to support 
> node attributes.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reopened YARN-7863:
---

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7863-YARN-3409.002.patch, 
> YARN-7863-YARN-3409.003.patch, YARN-7863-YARN-3409.004.patch, 
> YARN-7863-YARN-3409.005.patch, YARN-7863-YARN-3409.006.patch, 
> YARN-7863-YARN-3409.007.patch, YARN-7863.v0.patch
>
>
> This Jira will track to *Modify existing placement constraints to support 
> node attributes.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7494) Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587535#comment-16587535
 ] 

Weiwei Yang commented on YARN-7494:
---

LGTM, +1 to v20 patch. I will commit this to trunk shortly. Thanks [~sunilg]

> Add muti-node lookup mechanism and pluggable nodes sorting policies to 
> optimize placement decision
> --
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, 
> YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, 
> YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, 
> YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, 
> YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7494) Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-7494:
--
Summary: Add muti-node lookup mechanism and pluggable nodes sorting 
policies to optimize placement decision  (was: Add muti node lookup support for 
better placement)

> Add muti-node lookup mechanism and pluggable nodes sorting policies to 
> optimize placement decision
> --
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, 
> YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, 
> YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, 
> YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, 
> YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8642) Add support for tmpfs mounts with the Docker runtime

2018-08-21 Thread Craig Condit (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit reassigned YARN-8642:
--

Assignee: Craig Condit

> Add support for tmpfs mounts with the Docker runtime
> 
>
> Key: YARN-8642
> URL: https://issues.apache.org/jira/browse/YARN-8642
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Assignee: Craig Condit
>Priority: Major
>  Labels: Docker
>
> Add support to the existing Docker runtime to allow the user to request tmpfs 
> mounts for their containers. For example:
> {code}/usr/bin/docker run --name=container_name --tmpfs /run image 
> /bootstrap/start-systemd
> {code}
> One use case is to allow systemd to run as PID 1 in a non-privileged 
> container, /run is expected to be a tmpfs mount in the container for that to 
> work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8642) Add support for tmpfs mounts with the Docker runtime

2018-08-21 Thread Craig Condit (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587489#comment-16587489
 ] 

Craig Condit commented on YARN-8642:


[~shaneku...@gmail.com], I'd like to work on this.

> Add support for tmpfs mounts with the Docker runtime
> 
>
> Key: YARN-8642
> URL: https://issues.apache.org/jira/browse/YARN-8642
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Shane Kumpf
>Priority: Major
>  Labels: Docker
>
> Add support to the existing Docker runtime to allow the user to request tmpfs 
> mounts for their containers. For example:
> {code}/usr/bin/docker run --name=container_name --tmpfs /run image 
> /bootstrap/start-systemd
> {code}
> One use case is to allow systemd to run as PID 1 in a non-privileged 
> container, /run is expected to be a tmpfs mount in the container for that to 
> work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7680) ContainerMetrics is registered even if yarn.nodemanager.container-metrics.enable is set to false

2018-08-21 Thread Haibo Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen reassigned YARN-7680:


Assignee: Zoltan Siegl

> ContainerMetrics is registered even if 
> yarn.nodemanager.container-metrics.enable is set to false
> 
>
> Key: YARN-7680
> URL: https://issues.apache.org/jira/browse/YARN-7680
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.0
>Reporter: Akira Ajisaka
>Assignee: Zoltan Siegl
>Priority: Critical
>
> ContainerMetrics is unintentionally registered to DefaultMetricsSystem even 
> if yarn.nodemanager.container-metrics.enable is set to false. For example, 
> when we set 
> *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31 to 
> sink all the metrics to Ganglia, the MetricsSystem sinks ContainerMetrics to 
> the Ganglia server (localhost:8649 by default).
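
The expected behavior would be to guard registration behind that flag. A minimal 
sketch of such a guard (editorial illustration only, not the eventual fix):

{code:java}
// Register a ContainerMetrics source only when the NM explicitly enables it, so
// nothing reaches DefaultMetricsSystem (and thus the Ganglia sink) otherwise.
boolean containerMetricsEnabled = conf.getBoolean(
    YarnConfiguration.NM_CONTAINER_METRICS_ENABLE,
    YarnConfiguration.DEFAULT_NM_CONTAINER_METRICS_ENABLE);
if (containerMetricsEnabled) {
  // ... create the ContainerMetrics instance and register it here ...
}
{code}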



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587425#comment-16587425
 ] 

genericqa commented on YARN-8468:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
11s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 27s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 31 new + 562 unchanged - 11 fixed = 593 total (was 573) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 17s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
18s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
30s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}145m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMService |
|   | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart |
|   | hadoop.yarn.server.resourcemanager.TestAppManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce 

[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-21 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587422#comment-16587422
 ] 

Antal Bálint Steinbach commented on YARN-8468:
--

Hi [~wilfreds],

Currently, resource types are not handled at all. The minimum and maximum 
checks are done only for vcores and memory. I think we can create a new ticket 
for this.

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per queue basis.
>  
> The use case: User has two pools, one for ad hoc jobs and one for enterprise 
> apps. User wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would set the default maximum 
> container size for all queues, and the per-queue maximum would be set with 
> the “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf); this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting, and we should 
> not allow that.
>  * make sure that the queue resource cap cannot be larger than the scheduler 
> max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue accordingly.
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement

2018-08-21 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587389#comment-16587389
 ] 

Sunil Govindan commented on YARN-7494:
--

[~cheersyang] Fixed the checkstyle issues where possible. Some line-length 
warnings cannot be fixed since they come from long names, etc.

Also, that class does not need a setter and getter. 

Please check.

> Add muti node lookup support for better placement
> -
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, 
> YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, 
> YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, 
> YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, 
> YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587386#comment-16587386
 ] 

genericqa commented on YARN-7494:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 41s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 6 new + 670 unchanged - 4 fixed = 676 total (was 674) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 69m 
47s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}126m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-7494 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936423/YARN-7494.20.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 469a0c4aa9e0 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d3fef7a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21647/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21647/testReport/ |
| Max. process+thread count | 869 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Updated] (YARN-8695) ERROR: Container complete event for unknown container id

2018-08-21 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-8695:
-
Priority: Minor  (was: Major)

Downgrading the priority since this has no impact on functionality.

Does "container id container_1534394833079_0012_01_06" appear earlier in 
the AM log?  This may simply be a case where the AM has already decided to 
forget about a container it used earlier to run a task, and complains when the 
RM informs the AM of the completion of that container.  If that is indeed what 
is happening then this bug should be moved to the MAPREDUCE project, as it 
would be a bug in the MapReduce AM code rather than YARN.


> ERROR: Container complete event for unknown container id
> 
>
> Key: YARN-8695
> URL: https://issues.apache.org/jira/browse/YARN-8695
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM
>Reporter: sivasankar
>Priority: Minor
>
> Have deployed a cluster with *3 data nodes*. YARN/MapReduce2/HDFS version is 
> *2.7.3* on HDP. While running teragen and Gobblin the following Yarn errors 
> get reported in the logs. Errors get reported only when the number of map 
> tasks defined for the job is less than or equal to the number of data nodes 
> in the cluster.
> For *Teragen* -Dmapreduce.job.maps=4
> For *Gobblin* mr.job.max.mappers=4
> There are no errors if the map tasks (splits) are <= the number of data nodes.
>  2018-08-16 06:54:05,681 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: *Container 
> complete event for unknown container id 
> container_1534394833079_0012_01_06*
> 2018-08-16 05:00:50,138 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container 
> complete event for unknown container id 
> container_1534394833079_0001_01_55 2018-08-16 05:00:50,138 INFO 
> [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received 
> completed container container_1534394833079_0001_01_54 2018-08-16 
> 05:00:50,138 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container 
> complete event for unknown container id 
> container_1534394833079_0001_01_54 2018-08-16 05:00:50,138 INFO 
> [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received 
> completed container container_1534394833079_0001_01_53 2018-08-16 
> 05:00:50,138 ERROR [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container 
> complete event for unknown container id container_1534394833079_0001_01_53
> *Note*: There is no functionality issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587371#comment-16587371
 ] 

Sunil Govindan commented on YARN-7863:
--

Thanks [~cheersyang] [~Naganarasimha]

*Test cases* for DS could be added in another patch, I think. It will cover all 
DS-level cases.

I will add AND & OR cases in this one.

Other cases are covered.

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7863-YARN-3409.002.patch, 
> YARN-7863-YARN-3409.003.patch, YARN-7863-YARN-3409.004.patch, 
> YARN-7863-YARN-3409.005.patch, YARN-7863-YARN-3409.006.patch, 
> YARN-7863-YARN-3409.007.patch, YARN-7863.v0.patch
>
>
> This Jira will track to *Modify existing placement constraints to support 
> node attributes.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7863) Modify placement constraints to support node attributes

2018-08-21 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-7863:
-
Attachment: YARN-7863-YARN-3409.007.patch

> Modify placement constraints to support node attributes
> ---
>
> Key: YARN-7863
> URL: https://issues.apache.org/jira/browse/YARN-7863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7863-YARN-3409.002.patch, 
> YARN-7863-YARN-3409.003.patch, YARN-7863-YARN-3409.004.patch, 
> YARN-7863-YARN-3409.005.patch, YARN-7863-YARN-3409.006.patch, 
> YARN-7863-YARN-3409.007.patch, YARN-7863.v0.patch
>
>
> This Jira will track to *Modify existing placement constraints to support 
> node attributes.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8695) ERROR: Container complete event for unknown container id

2018-08-21 Thread sivasankar (JIRA)
sivasankar created YARN-8695:


 Summary: ERROR: Container complete event for unknown container id
 Key: YARN-8695
 URL: https://issues.apache.org/jira/browse/YARN-8695
 Project: Hadoop YARN
  Issue Type: Bug
  Components: RM
Reporter: sivasankar


Have deployed a cluster with *3 data nodes*. YARN/MapReduce2/HDFS version is 
*2.7.3* on HDP. While running teragen and Gobblin the following Yarn errors get 
reported in the logs. Errors get reported only when the number of map tasks 
defined for the job is less than or equal to the number of data nodes in the 
cluster.

For *Teragen* -Dmapreduce.job.maps=4

For *Gobblin* mr.job.max.mappers=4

There are no errors if the map tasks (splits) are <= the number of data nodes.

 2018-08-16 06:54:05,681 ERROR [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: *Container complete 
event for unknown container id container_1534394833079_0012_01_06*

2018-08-16 05:00:50,138 ERROR [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete 
event for unknown container id container_1534394833079_0001_01_55 
2018-08-16 05:00:50,138 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed 
container container_1534394833079_0001_01_54 2018-08-16 05:00:50,138 ERROR 
[RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete 
event for unknown container id container_1534394833079_0001_01_54 
2018-08-16 05:00:50,138 INFO [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed 
container container_1534394833079_0001_01_53 2018-08-16 05:00:50,138 ERROR 
[RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Container complete 
event for unknown container id container_1534394833079_0001_01_53

*Note*: There is no functionality issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8694) app flex with relative changes does not work

2018-08-21 Thread kyungwan nam (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kyungwan nam updated YARN-8694:
---
Attachment: YARN-8694.001.patch

> app flex with relative changes does not work
> 
>
> Key: YARN-8694
> URL: https://issues.apache.org/jira/browse/YARN-8694
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.1
>Reporter: kyungwan nam
>Priority: Major
> Attachments: YARN-8694.001.patch
>
>
> I'd like to increase the number of containers by 2, as below.
> {code:java}
> yarn app -flex my-sleeper -component sleeper +2{code}
> But it did not work; it seems to request 2, not +2.
>  
> ApiServiceClient.actionFlex
> {code:java}
> @Override
> public int actionFlex(String appName, Map componentCounts)
> throws IOException, YarnException {
>   int result = EXIT_SUCCESS;
>   try {
> Service service = new Service();
> service.setName(appName);
> service.setState(ServiceState.FLEX);
> for (Map.Entry entry : componentCounts.entrySet()) {
>   Component component = new Component();
>   component.setName(entry.getKey());
>   Long numberOfContainers = Long.parseLong(entry.getValue());
>   component.setNumberOfContainers(numberOfContainers);
>   service.addComponent(component);
> }
> String buffer = jsonSerDeser.toJson(service);
> ClientResponse response = getApiClient(getServicePath(appName))
> .put(ClientResponse.class, buffer);{code}
> It looks like there is no code, which handle “+”, “-“ in 
> ApiServiceClient.actionFlex



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587317#comment-16587317
 ] 

Weiwei Yang commented on YARN-8685:
---

Hi [~Tao Yang]

Can we add a new endpoint in the RM, something like
{noformat}
http:///ws/v1/cluster/containers/{nodeId}?states=ALLOCATED
{noformat}
to display RM containers? The query parameter {{states}} would be an optional 
filter, a comma-separated list of states. This way we avoid returning too much 
info in a single API. And second, can we pull out {{ContainerInfo}} to 
{{hadoop-yarn-common/o.a.h.y.webapp.dao}} so it can be shared by both the RM 
and NM containers endpoints?
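
For illustration, such an endpoint could look roughly like the JAX-RS sketch 
below. This is not the committed API: the resource class, the {{ContainerView}} 
DAO and the {{containersOn()}} lookup are placeholders standing in for whatever 
the final patch defines.
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;

// Sketch only: ContainerView and containersOn(nodeId) are placeholders,
// not real ResourceManager classes.
@Path("/ws/v1/cluster")
public class ClusterContainersResource {

  public static class ContainerView {
    public String containerId;
    public String state;
  }

  @GET
  @Path("/containers/{nodeId}")
  @Produces(MediaType.APPLICATION_JSON)
  public List<ContainerView> getContainers(@PathParam("nodeId") String nodeId,
      @QueryParam("states") String states) {
    // "states" is an optional comma-separated filter, e.g. states=ALLOCATED,ACQUIRED.
    Set<String> wanted = (states == null || states.isEmpty())
        ? new HashSet<>()
        : new HashSet<>(Arrays.asList(states.split(",")));
    List<ContainerView> result = new ArrayList<>();
    for (ContainerView c : containersOn(nodeId)) {
      if (wanted.isEmpty() || wanted.contains(c.state)) {
        result.add(c);
      }
    }
    return result;
  }

  // Placeholder for looking up the RM's containers on a given node.
  private List<ContainerView> containersOn(String nodeId) {
    return new ArrayList<>();
  }
}
{code}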

> Add containers query support for nodes/node REST API in RMWebServices
> -
>
> Key: YARN-8685
> URL: https://issues.apache.org/jira/browse/YARN-8685
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: restapi
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8685.001.patch
>
>
> Currently we can only query running containers from NM containers REST API, 
> but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We 
> have the requirements to get all containers allocated on specified nodes for 
> debugging. I want to add a "includeContainers" query param (default false) 
> for nodes/node REST API in RMWebServices, so that we can get valid containers 
> on nodes if "includeContainers=true" specified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support to display pending scheduling requests in RM app attempt page

2018-08-21 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587312#comment-16587312
 ] 

Hudson commented on YARN-8683:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14810 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14810/])
YARN-8683. Support to display pending scheduling requests in RM app (wwei: rev 
54d0bf8935e35aad0f4d67df358ceb970cfcd713)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMAppAttemptBlock.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/AppPlacementAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/LocalityAppPlacementAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/placement/SingleConstraintAppPlacementAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourceRequestInfo.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java


> Support to display pending scheduling requests in RM app attempt page 
> --
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support to display pending scheduling requests in RM app attempt page

2018-08-21 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587300#comment-16587300
 ] 

Tao Yang commented on YARN-8683:


Thanks [~cheersyang]. 

> Support to display pending scheduling requests in RM app attempt page 
> --
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support to display pending scheduling requests in RM app attempt page

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587295#comment-16587295
 ] 

Weiwei Yang commented on YARN-8683:
---

Thanks [~Tao Yang] for the contribution, I have committed this to trunk.

> Support to display pending scheduling requests in RM app attempt page 
> --
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8683) Support to display pending scheduling requests in RM app attempt page

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8683:
--
Issue Type: Improvement  (was: Bug)

> Support to display pending scheduling requests in RM app attempt page 
> --
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8683) Support to display pending scheduling requests in RM app attempt page

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8683:
--
Summary: Support to display pending scheduling requests in RM app attempt 
page   (was: Display pending scheduling requests in RM app attempt page )

> Support to display pending scheduling requests in RM app attempt page 
> --
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8694) app flex with relative changes does not work

2018-08-21 Thread kyungwan nam (JIRA)
kyungwan nam created YARN-8694:
--

 Summary: app flex with relative changes does not work
 Key: YARN-8694
 URL: https://issues.apache.org/jira/browse/YARN-8694
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-native-services
Affects Versions: 3.1.1
Reporter: kyungwan nam


I'd like to increase the number of containers by 2, as below.
{code:java}
yarn app -flex my-sleeper -component sleeper +2{code}
But it did not work; it seems to set the count to 2 rather than add 2.

ApiServiceClient.actionFlex
{code:java}
@Override
public int actionFlex(String appName, Map<String, String> componentCounts)
    throws IOException, YarnException {
  int result = EXIT_SUCCESS;
  try {
    Service service = new Service();
    service.setName(appName);
    service.setState(ServiceState.FLEX);
    for (Map.Entry<String, String> entry : componentCounts.entrySet()) {
      Component component = new Component();
      component.setName(entry.getKey());
      Long numberOfContainers = Long.parseLong(entry.getValue());
      component.setNumberOfContainers(numberOfContainers);
      service.addComponent(component);
    }
    String buffer = jsonSerDeser.toJson(service);
    ClientResponse response = getApiClient(getServicePath(appName))
        .put(ClientResponse.class, buffer);{code}
It looks like there is no code that handles "+" and "-" in
ApiServiceClient.actionFlex.
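
A minimal sketch of how a relative value such as "+2" or "-1" could be resolved 
on the client side before building the FLEX request. The helper below is purely 
illustrative and not part of any attached patch; the current container count 
would still have to be fetched from the service spec.
{code:java}
// Hypothetical helper (not the actual patch): resolve a relative flex value
// against the component's current container count.
public final class FlexCountResolver {
  private FlexCountResolver() { }

  public static long resolve(String requested, long currentCount) {
    String value = requested.trim();
    if (value.startsWith("+") || value.startsWith("-")) {
      // Relative change: Long.parseLong accepts a leading sign, so this
      // yields a signed delta that is applied to the current count.
      long delta = Long.parseLong(value);
      return Math.max(0, currentCount + delta);
    }
    // Absolute value: use it as the new container count.
    return Long.parseLong(value);
  }

  public static void main(String[] args) {
    System.out.println(resolve("+2", 3));  // 5
    System.out.println(resolve("2", 3));   // 2
  }
}
{code}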



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8683) Display pending scheduling requests in RM app attempt page

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8683:
--
Summary: Display pending scheduling requests in RM app attempt page   (was: 
Display outstanding pending scheduling requests in RM app attempt page )

> Display pending scheduling requests in RM app attempt page 
> ---
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8683) Display outstanding pending scheduling requests in RM app attempt page

2018-08-21 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8683:
--
Summary: Display outstanding pending scheduling requests in RM app attempt 
page   (was: Support scheduling request for outstanding requests info in 
RMAppAttemptBlock)

> Display outstanding pending scheduling requests in RM app attempt page 
> ---
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587287#comment-16587287
 ] 

Weiwei Yang commented on YARN-8683:
---

+1, committing now

> Support scheduling request for outstanding requests info in RMAppAttemptBlock
> -
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-21 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: YARN-8468.004.patch

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps, and wants to limit ad hoc jobs to small containers but allow enterprise 
> apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would still set the default maximum 
> container size for all queues, while the per-queue maximum would be set with 
> the “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler (see the sketch after this description)
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue accordingly
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
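
A rough sketch of the per-queue lookup suggested in the description above, 
assuming a queue-level "maxContainerResources" value that falls back to the 
scheduler-wide maximum. The class and field names are illustrative only; the 
real change would live in FairScheduler/FSQueue rather than a standalone class.
{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: how a per-queue maximum could fall back to the
// scheduler-wide yarn.scheduler.maximum-allocation-* values.
public final class QueueMaxAllocation {

  public static final class Resource {
    public final long memoryMb;
    public final int vcores;
    public Resource(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }
  }

  private final Resource schedulerMax;
  private final Map<String, Resource> perQueueMax = new HashMap<>();

  public QueueMaxAllocation(Resource schedulerMax) {
    this.schedulerMax = schedulerMax;
  }

  public void setQueueMax(String queue, Resource max) {
    // A queue cap larger than the scheduler cap should be rejected (or clamped).
    if (max.memoryMb > schedulerMax.memoryMb || max.vcores > schedulerMax.vcores) {
      throw new IllegalArgumentException("queue max exceeds scheduler max");
    }
    perQueueMax.put(queue, max);
  }

  public Resource getMaximumResourceCapability(String queueName) {
    // Fall back to the global maximum when the queue has no explicit cap.
    return perQueueMax.getOrDefault(queueName, schedulerMax);
  }
}
{code}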



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7494) Add muti node lookup support for better placement

2018-08-21 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-7494:
-
Attachment: YARN-7494.20.patch

> Add muti node lookup support for better placement
> -
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil Govindan
>Assignee: Sunil Govindan
>Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, 
> YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, 
> YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, 
> YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, 
> YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, 
> YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-21 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: (was: YARN-8468.005.patch)

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps, and wants to limit ad hoc jobs to small containers but allow enterprise 
> apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would still set the default maximum 
> container size for all queues, while the per-queue maximum would be set with 
> the “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-21 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: YARN-8468.005.patch

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.005.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps, and wants to limit ad hoc jobs to small containers but allow enterprise 
> apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would still set the default maximum 
> container size for all queues, while the per-queue maximum would be set with 
> the “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587266#comment-16587266
 ] 

genericqa commented on YARN-8683:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 76m 
44s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}125m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8683 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936398/YARN-8683.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 53a57a3a4bbf 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 
08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d3fef7a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21644/testReport/ |
| Max. process+thread count | 885 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21644/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT  

[jira] [Created] (YARN-8693) Add signalToContainer REST API for RMWebServices

2018-08-21 Thread Tao Yang (JIRA)
Tao Yang created YARN-8693:
--

 Summary: Add signalToContainer REST API for RMWebServices
 Key: YARN-8693
 URL: https://issues.apache.org/jira/browse/YARN-8693
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: restapi
Affects Versions: 3.2.0
Reporter: Tao Yang
Assignee: Tao Yang


Currently YARN has an RPC-based command, "yarn container -signal ", to send 
OUTPUT_THREAD_DUMP/GRACEFUL_SHUTDOWN/FORCEFUL_SHUTDOWN commands to a container. 
That is not enough; we should add a signalToContainer REST API so that cluster 
administrators or management systems can manage containers more easily.
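
For illustration, a management system might eventually invoke such an API 
roughly as below. The URL path and JSON payload are assumptions for the sake of 
the example, since the REST API has not been designed yet, and the RM address 
and container id are made up.
{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client call; path and payload are illustrative only.
public final class SignalContainerExample {
  public static void main(String[] args) throws Exception {
    String rm = "http://rm-host:8088";                              // assumed RM address
    String containerId = "container_1534394833079_0001_01_000002";  // example id
    URL url = new URL(rm + "/ws/v1/cluster/containers/" + containerId + "/signal");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    byte[] body = "{\"command\": \"OUTPUT_THREAD_DUMP\"}".getBytes(StandardCharsets.UTF_8);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body);                       // send the signal command
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}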



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8692) Support node utilization metrics for SLS

2018-08-21 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8692:
---
Description: 
The distribution of node utilization is an important health factor for a 
YARN cluster; related metrics in SLS can be used to evaluate scheduling 
effects and optimize related configurations.

To implement this improvement, we need to do the following:

(1) Add input configurations (containing avg and stddev for the cpu/memory 
utilization ratio) and generate utilization samples for tasks, excluding the 
AM container since I think its impact is negligible.

(2) Simulate container and node utilization within the node status. 

(3) Calculate and generate the distribution metrics and use the standard 
deviation (stddev for short) to evaluate the effects (smaller is better).  

(4) Show these metrics on the SLS simulator page like this:

!image-2018-08-21-18-04-22-749.png!

For the node memory/CPU utilization distribution graphs, the Y-axis is the 
number of nodes, and P0 represents a 0%~9% utilization ratio 
(containers-utilization / node-total-resource), P1 represents 10%~19%, P2 
represents 20%~29%, ..., and finally P9 represents 90%~100%. 

  was:
The distribution of node utilization is an important healthy factor for the 
YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
effects and optimize related configurations. 

To implement this improvement, we need to do things as below:

(1) Add input configurations (contain avg and stddev for cpu/memory utilization 
ratio) and generate utilization samples for tasks, not include AM container 
cause I think it's negligible. (2) Simulate containers and node utilization 
within node status. 

(3) calculate and generate the distribution metrics and use standard deviation 
metric (stddev for short) to evaluate the effects(smaller is better).  

(4) show these metrics on SLS simulator page like this:

!image-2018-08-21-18-04-22-749.png!

For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
and P0 represents 0%~9% utilization ratio(containers-utilization / 
node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
ratio. 


> Support node utilization metrics for SLS
> 
>
> Key: YARN-8692
> URL: https://issues.apache.org/jira/browse/YARN-8692
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: image-2018-08-21-18-04-22-749.png
>
>
> The distribution of node utilization is an important healthy factor for the 
> YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
> effects and optimize related configurations. 
> To implement this improvement, we need to do things as below:
> (1) Add input configurations (contain avg and stddev for cpu/memory 
> utilization ratio) and generate utilization samples for tasks, not include AM 
> container cause I think it's negligible.
> (2) Simulate containers and node utilization within node status. 
> (3) calculate and generate the distribution metrics and use standard 
> deviation metric (stddev for short) to evaluate the effects(smaller is 
> better).  
> (4) show these metrics on SLS simulator page like this:
> !image-2018-08-21-18-04-22-749.png!
> For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
> and P0 represents 0%~9% utilization ratio(containers-utilization / 
> node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
> 20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
> ratio. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8692) Support node utilization metrics for SLS

2018-08-21 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8692:
---
Attachment: (was: image-2018-08-21-18-03-59-665.png)

> Support node utilization metrics for SLS
> 
>
> Key: YARN-8692
> URL: https://issues.apache.org/jira/browse/YARN-8692
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: image-2018-08-21-18-04-22-749.png
>
>
> The distribution of node utilization is an important healthy factor for the 
> YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
> effects and optimize related configurations. 
> To implement this improvement, we need to do things as below:
> (1) Add input configurations (contain avg and stddev for cpu/memory 
> utilization ratio) and generate utilization samples for tasks, not include AM 
> container cause I think it's negligible. (2) Simulate containers and node 
> utilization within node status. 
> (3) calculate and generate the distribution metrics and use standard 
> deviation metric (stddev for short) to evaluate the effects(smaller is 
> better).  
> (4) show these metrics on SLS simulator page like this:
> !image-2018-08-21-18-04-22-749.png!
> For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
> and P0 represents 0%~9% utilization ratio(containers-utilization / 
> node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
> 20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
> ratio. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8692) Support node utilization metrics for SLS

2018-08-21 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8692:
---
Description: 
The distribution of node utilization is an important healthy factor for the 
YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
effects and optimize related configurations. 

To implement this improvement, we need to do things as below:

(1) Add input configurations (contain avg and stddev for cpu/memory utilization 
ratio) and generate utilization samples for tasks, not include AM container 
cause I think it's negligible. (2) Simulate containers and node utilization 
within node status. 

(3) calculate and generate the distribution metrics and use standard deviation 
metric (stddev for short) to evaluate the effects(smaller is better).  

(4) show these metrics on SLS simulator page like this:

!image-2018-08-21-18-04-22-749.png!

For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
and P0 represents 0%~9% utilization ratio(containers-utilization / 
node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
ratio. 

  was:
The distribution of node utilization is an important healthy factor for the 
YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
effects and optimize related configurations. 

To implement this improvement, we need to do things as below:

(1) Add input configurations (contain avg and stddev for cpu/memory utilization 
ratio) and generate utilization samples for tasks, not include AM container 
cause I think it's negligible. (2) Simulate containers and node utilization 
within node status. 

(3) calculate and generate the distribution metrics and use standard deviation 
metric (stddev for short) to evaluate the effects(smaller is better).  

(4) show these metrics on SLS simulator page like this:

!image-2018-08-21-17-50-04-011.png!

For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
and P0 represents 0%~9% utilization ratio(containers-utilization / 
node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
ratio. 


> Support node utilization metrics for SLS
> 
>
> Key: YARN-8692
> URL: https://issues.apache.org/jira/browse/YARN-8692
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: image-2018-08-21-18-03-59-665.png, 
> image-2018-08-21-18-04-22-749.png
>
>
> The distribution of node utilization is an important healthy factor for the 
> YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
> effects and optimize related configurations. 
> To implement this improvement, we need to do things as below:
> (1) Add input configurations (contain avg and stddev for cpu/memory 
> utilization ratio) and generate utilization samples for tasks, not include AM 
> container cause I think it's negligible. (2) Simulate containers and node 
> utilization within node status. 
> (3) calculate and generate the distribution metrics and use standard 
> deviation metric (stddev for short) to evaluate the effects(smaller is 
> better).  
> (4) show these metrics on SLS simulator page like this:
> !image-2018-08-21-18-04-22-749.png!
> For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
> and P0 represents 0%~9% utilization ratio(containers-utilization / 
> node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
> 20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
> ratio. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8692) Support node utilization metrics for SLS

2018-08-21 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8692:
---
Attachment: image-2018-08-21-18-04-22-749.png

> Support node utilization metrics for SLS
> 
>
> Key: YARN-8692
> URL: https://issues.apache.org/jira/browse/YARN-8692
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: image-2018-08-21-18-03-59-665.png, 
> image-2018-08-21-18-04-22-749.png
>
>
> The distribution of node utilization is an important healthy factor for the 
> YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
> effects and optimize related configurations. 
> To implement this improvement, we need to do things as below:
> (1) Add input configurations (contain avg and stddev for cpu/memory 
> utilization ratio) and generate utilization samples for tasks, not include AM 
> container cause I think it's negligible. (2) Simulate containers and node 
> utilization within node status. 
> (3) calculate and generate the distribution metrics and use standard 
> deviation metric (stddev for short) to evaluate the effects(smaller is 
> better).  
> (4) show these metrics on SLS simulator page like this:
> !image-2018-08-21-17-50-04-011.png!
> For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
> and P0 represents 0%~9% utilization ratio(containers-utilization / 
> node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
> 20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
> ratio. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8692) Support node utilization metrics for SLS

2018-08-21 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8692:
---
Attachment: image-2018-08-21-18-03-59-665.png

> Support node utilization metrics for SLS
> 
>
> Key: YARN-8692
> URL: https://issues.apache.org/jira/browse/YARN-8692
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: image-2018-08-21-18-03-59-665.png, 
> image-2018-08-21-18-04-22-749.png
>
>
> The distribution of node utilization is an important healthy factor for the 
> YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
> effects and optimize related configurations. 
> To implement this improvement, we need to do things as below:
> (1) Add input configurations (contain avg and stddev for cpu/memory 
> utilization ratio) and generate utilization samples for tasks, not include AM 
> container cause I think it's negligible. (2) Simulate containers and node 
> utilization within node status. 
> (3) calculate and generate the distribution metrics and use standard 
> deviation metric (stddev for short) to evaluate the effects(smaller is 
> better).  
> (4) show these metrics on SLS simulator page like this:
> !image-2018-08-21-17-50-04-011.png!
> For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
> and P0 represents 0%~9% utilization ratio(containers-utilization / 
> node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
> 20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
> ratio. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8692) Support node utilization metrics for SLS

2018-08-21 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8692:
---
Attachment: (was: image-2018-08-21-17-50-04-011.png)

> Support node utilization metrics for SLS
> 
>
> Key: YARN-8692
> URL: https://issues.apache.org/jira/browse/YARN-8692
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: image-2018-08-21-18-03-59-665.png, 
> image-2018-08-21-18-04-22-749.png
>
>
> The distribution of node utilization is an important healthy factor for the 
> YARN cluster, related metrics in SLS can be used to evaluate the scheduling 
> effects and optimize related configurations. 
> To implement this improvement, we need to do things as below:
> (1) Add input configurations (contain avg and stddev for cpu/memory 
> utilization ratio) and generate utilization samples for tasks, not include AM 
> container cause I think it's negligible. (2) Simulate containers and node 
> utilization within node status. 
> (3) calculate and generate the distribution metrics and use standard 
> deviation metric (stddev for short) to evaluate the effects(smaller is 
> better).  
> (4) show these metrics on SLS simulator page like this:
> !image-2018-08-21-18-04-22-749.png!
> For Node memory/CPU utilization distribution graphs, Y-axis is nodes number, 
> and P0 represents 0%~9% utilization ratio(containers-utilization / 
> node-total-resource), P1 represents 10%~19% utilization ratio, P2 represents 
> 20%~29% utilization ratio, ..., at last P9 represents 90%~100% utilization 
> ratio. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8692) Support node utilization metrics for SLS

2018-08-21 Thread Tao Yang (JIRA)
Tao Yang created YARN-8692:
--

 Summary: Support node utilization metrics for SLS
 Key: YARN-8692
 URL: https://issues.apache.org/jira/browse/YARN-8692
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler-load-simulator
Affects Versions: 3.2.0
Reporter: Tao Yang
Assignee: Tao Yang
 Attachments: image-2018-08-21-17-50-04-011.png

The distribution of node utilization is an important health factor for a 
YARN cluster; related metrics in SLS can be used to evaluate scheduling 
effects and optimize related configurations. 

To implement this improvement, we need to do the following:

(1) Add input configurations (containing avg and stddev for the cpu/memory 
utilization ratio) and generate utilization samples for tasks, excluding the 
AM container since I think its impact is negligible.

(2) Simulate container and node utilization within the node status. 

(3) Calculate and generate the distribution metrics and use the standard 
deviation (stddev for short) to evaluate the effects (smaller is better).  

(4) Show these metrics on the SLS simulator page like this:

!image-2018-08-21-17-50-04-011.png!

For the node memory/CPU utilization distribution graphs, the Y-axis is the 
number of nodes, and P0 represents a 0%~9% utilization ratio 
(containers-utilization / node-total-resource), P1 represents 10%~19%, P2 
represents 20%~29%, ..., and finally P9 represents 90%~100%. 
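
To make the bucketing concrete, here is a small self-contained sketch of how 
the P0..P9 histogram and the stddev metric could be computed from per-node 
utilization ratios. It is illustrative only and not taken from any attached 
patch; the sample ratios are made up.
{code:java}
import java.util.Arrays;

// Illustrative only: bucket per-node utilization ratios into P0..P9 and
// compute the standard deviation. Inputs are
// containers-utilization / node-total-resource, in the range [0, 1].
public final class NodeUtilizationStats {

  public static int[] histogram(double[] ratios) {
    int[] buckets = new int[10];                         // P0..P9
    for (double r : ratios) {
      int idx = (int) Math.min(9, Math.floor(r * 10));   // 0.00-0.09 -> P0, ..., 0.90-1.00 -> P9
      buckets[idx]++;
    }
    return buckets;
  }

  public static double stddev(double[] ratios) {
    double mean = Arrays.stream(ratios).average().orElse(0);
    double variance = Arrays.stream(ratios)
        .map(r -> (r - mean) * (r - mean))
        .average().orElse(0);
    return Math.sqrt(variance);
  }

  public static void main(String[] args) {
    double[] ratios = {0.05, 0.42, 0.48, 0.95};          // sample per-node ratios
    System.out.println(Arrays.toString(histogram(ratios)));  // [1, 0, 0, 0, 2, 0, 0, 0, 0, 1]
    System.out.println(stddev(ratios));                      // ~0.32
  }
}
{code}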



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement

2018-08-21 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587157#comment-16587157
 ] 

genericqa commented on YARN-7494:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 42s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 10 new + 671 unchanged - 4 fixed = 681 total (was 675) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m  5s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-7494 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936384/YARN-7494.19.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8547b23bbe05 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 770d9d9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21643/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 

[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock

2018-08-21 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-8683:
---
Attachment: YARN-8683.004.patch

> Support scheduling request for outstanding requests info in RMAppAttemptBlock
> -
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock

2018-08-21 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587146#comment-16587146
 ] 

Tao Yang commented on YARN-8683:


Thanks [~cheersyang] for your suggestion! 
 Attached v4 to improve the content of AllocationTags.
 Updates:
{code:java}
-  .append(resourceRequest.getAllocationTags() == null ? "N/A"
-  : resourceRequest.getAllocationTags())
+  .append(resourceRequest.getAllocationTags() == null ? "N/A" :
+  StringUtils.join(resourceRequest.getAllocationTags(), ","))
{code}
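
For context, a standalone illustration of what that join renders. The 
commons-lang3 StringUtils variant is assumed here, and the tag values are 
made up.
{code:java}
import java.util.LinkedHashSet;
import java.util.Set;
import org.apache.commons.lang3.StringUtils;

// Renders an allocation-tag set as a comma-separated string instead of
// the default "[hbase, regionserver]" collection toString().
public final class AllocationTagsRendering {
  public static void main(String[] args) {
    Set<String> tags = new LinkedHashSet<>();
    tags.add("hbase");
    tags.add("regionserver");
    System.out.println(StringUtils.join(tags, ","));  // hbase,regionserver
  }
}
{code}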

> Support scheduling request for outstanding requests info in RMAppAttemptBlock
> -
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, YARN-8683.004.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently outstanding requests info in app attempt page only show pending 
> resource requests, pending scheduling requests should be shown here too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4931) Preempted resources go back to the same application

2018-08-21 Thread Haibo Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587141#comment-16587141
 ] 

Haibo Chen commented on YARN-4931:
--

This should have been fixed by YARN-6432, which has been included since 2.9.

> Preempted resources go back to the same application
> ---
>
> Key: YARN-4931
> URL: https://issues.apache.org/jira/browse/YARN-4931
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
>Priority: Major
> Attachments: resourcemanager.log
>
>
> Sometimes a queue that needs resources causes preemption - but the preempted 
> containers are just allocated right back to the application that just 
> released them!
> Here is a tiny application (0007) that wants resources, and a container is 
> preempted from application 0002 to satisfy it:
> {code}
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Should preempt  res for 
> queue root.default: resDueToMinShare = , 
> resDueToFairShare = 
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Preempting container (prio=1res= vCores:1>) from queue root.milesc
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics
>  (FairSchedulerUpdateThread): Non-AM container preempted, current 
> appAttemptId=appattempt_1460047303577_0002_01, 
> containerId=container_1460047303577_0002_01_001038, resource= vCores:1>
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (FairSchedulerUpdateThread): container_1460047303577_0002_01_001038 Container 
> Transitioned from RUNNING to KILLED
> {code}
> But then a moment later, application 2 gets the container right back:
> {code}
> 2016-04-07 21:08:13,844 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode 
> (ResourceManager Event Processor): Assigned container 
> container_1460047303577_0002_01_001039 of capacity  
> on host ip-10-12-40-63.us-west-2.compute.internal:8041, which has 13 
> containers,  used and  
> available after allocation
> 2016-04-07 21:08:14,555 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (IPC Server handler 59 on 8030): container_1460047303577_0002_01_001039 
> Container Transitioned from ALLOCATED to ACQUIRED
> 2016-04-07 21:08:14,845 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (ResourceManager Event Processor): container_1460047303577_0002_01_001039 
> Container Transitioned from ACQUIRED to RUNNING
> {code}
> This results in new applications being unable to even get an AM, and never 
> starting at all.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8681) Wrong error message in RM placement constraints check

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587140#comment-16587140
 ] 

Weiwei Yang commented on YARN-8681:
---

Hi [~snemeth], let's close this as Won't Fix, since this code will be removed 
by YARN-8015. Thanks!

> Wrong error message in RM placement constraints check
> -
>
> Key: YARN-8681
> URL: https://issues.apache.org/jira/browse/YARN-8681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Daniel Templeton
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-8681.001.patch
>
>
> In 
> {{SingleConstraintAppPlacementAllocator.validateAndSetSchedulingRequest()}} I 
> see the following:
> {code}  if (singleConstraint.getMinCardinality() != 0
>   || singleConstraint.getMaxCardinality() != 0) {
> throwExceptionWithMetaInfo(
> "Only support anti-affinity, which is: minCardinality=0, "
> + "maxCardinality=1");
>   }{code}
> I think the error message should say {{"maxCardinality=0"}}
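For clarity, a minimal sketch of what the block would look like with the message corrected as suggested (for illustration only, since this code is slated to be removed by YARN-8015):
{code:java}
// Same check as quoted above; only the message text is changed so that it
// matches the condition actually being enforced (both cardinalities must be 0).
if (singleConstraint.getMinCardinality() != 0
    || singleConstraint.getMaxCardinality() != 0) {
  throwExceptionWithMetaInfo(
      "Only support anti-affinity, which is: minCardinality=0, "
          + "maxCardinality=0");
}
{code}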



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock

2018-08-21 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587137#comment-16587137
 ] 

Weiwei Yang commented on YARN-8683:
---

Hi [~Tao Yang], yes, that looks better. Could you please update the patch 
accordingly?

> Support scheduling request for outstanding requests info in RMAppAttemptBlock
> -
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently the outstanding requests info on the app attempt page only shows 
> pending resource requests; pending scheduling requests should be shown here 
> too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock

2018-08-21 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587135#comment-16587135
 ] 

Tao Yang commented on YARN-8683:


Just noticed that the AllocationTags content shows "[tag1]" or "[tag1,tag2]". 
Would it be better to remove the brackets so it just shows "tag1" or "tag1,tag2"?

> Support scheduling request for outstanding requests info in RMAppAttemptBlock
> -
>
> Key: YARN-8683
> URL: https://issues.apache.org/jira/browse/YARN-8683
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-8683.001.patch, YARN-8683.002.patch, 
> YARN-8683.003.patch, screenshot-1.png, screenshot-2.png
>
>
> Currently the outstanding requests info on the app attempt page only shows 
> pending resource requests; pending scheduling requests should be shown here 
> too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


