[jira] [Created] (YARN-8803) [UI2] Show flow runs in the order of recently created time in the table and graph widgets

2018-09-19 Thread Akhil PB (JIRA)
Akhil PB created YARN-8803:
--

 Summary: [UI2] Show flow runs in the order of recently created 
time in the table and graph widgets
 Key: YARN-8803
 URL: https://issues.apache.org/jira/browse/YARN-8803
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Akhil PB
Assignee: Akhil PB






[jira] [Updated] (YARN-8549) Adding NoOp timeline writer and reader plugin classes for ATSv2

2018-09-19 Thread Prabha Manepalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabha Manepalli updated YARN-8549:
---
Attachment: (was: YARN-8549.v3.patch)

> Adding NoOp timeline writer and reader plugin classes for ATSv2
> -
>
> Key: YARN-8549
> URL: https://issues.apache.org/jira/browse/YARN-8549
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineclient, timelineserver
>Reporter: Prabha Manepalli
>Assignee: Prabha Manepalli
>Priority: Minor
> Attachments: YARN-8549-branch-2.03.patch, 
> YARN-8549-branch-2.04.patch, YARN-8549.v1.patch, YARN-8549.v2.patch
>
>
> Stub implementations for the TimeLineReader and TimeLineWriter classes. 
> These are useful for functional testing of the writer and reader paths in ATSv2.
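
The pattern the description refers to is straightforward: implement the storage 
interface so that every call succeeds without touching a backend. A minimal 
sketch follows; the interface is simplified for illustration and is not the 
actual ATSv2 {{TimelineWriter}} contract:
{code:java}
import java.io.IOException;

// Simplified stand-in for the ATSv2 writer contract (the real interface
// takes collector-context/entity/UGI parameters and returns a response).
interface SimpleTimelineWriter {
  void write(String entityId, String entityJson) throws IOException;
  void flush() throws IOException;
}

/**
 * No-op stub: accepts every call and discards the data, so the write path
 * can be exercised end to end without a real storage backend.
 */
class NoOpTimelineWriter implements SimpleTimelineWriter {
  @Override
  public void write(String entityId, String entityJson) {
    // intentionally empty: nothing is persisted
  }

  @Override
  public void flush() {
    // intentionally empty: nothing is buffered
  }
}
{code}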



[jira] [Updated] (YARN-8549) Adding NoOp timeline writer and reader plugin classes for ATSv2

2018-09-19 Thread Prabha Manepalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabha Manepalli updated YARN-8549:
---
Attachment: YARN-8549.v3.patch

> Adding NoOp timeline writer and reader plugin classes for ATSv2
> -
>
> Key: YARN-8549
> URL: https://issues.apache.org/jira/browse/YARN-8549
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineclient, timelineserver
>Reporter: Prabha Manepalli
>Assignee: Prabha Manepalli
>Priority: Minor
> Attachments: YARN-8549-branch-2.03.patch, 
> YARN-8549-branch-2.04.patch, YARN-8549.v1.patch, YARN-8549.v2.patch, 
> YARN-8549.v3.patch
>
>
> Stub implementations for the TimeLineReader and TimeLineWriter classes. 
> These are useful for functional testing of the writer and reader paths in ATSv2.



[jira] [Commented] (YARN-8802) [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in jdk9 env

2018-09-19 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621533#comment-16621533
 ] 

Akira Ajisaka commented on YARN-8802:
-

Filed MAPREDUCE-7142 to add the options automatically.

> [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in 
> jdk9 env
> ---
>
> Key: YARN-8802
> URL: https://issues.apache.org/jira/browse/YARN-8802
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liyunzhang
>Priority: Critical
> Attachments: HADOOP-14984.patch
>
>
> After building the latest code with JDK 9 (patches HADOOP-12760.03.patch and 
> HDFS-11610.001.patch) and starting the HDFS and YARN services 
> (HADOOP-14978) successfully, I met an exception when running TestDFSIO:
> {code}
> hadoop jar 
> $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.0-SNAPSHOT-tests.jar
>  TestDFSIO -write -nrFiles 8 -fileSize 1MB -resFile ./write.1MB.8
> {code}
> The exception:
> {code}
> 1) Error injecting constructor, java.lang.NoClassDefFoundError: 
> javax/activation/DataSource
>   at org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver.<init>(JAXBContextResolver.java:72)
>   at org.apache.hadoop.mapreduce.v2.app.webapp.AMWebApp.setup(AMWebApp.java:33)
>   while locating org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver
> 
> 1 error
>   at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1025)
>   at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1051)
>   at com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory$GuiceInstantiatedComponentProvider.getInstance(GuiceComponentProviderFactory.java:345)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory$ManagedSingleton.<init>(IoCProviderFactory.java:202)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory.wrap(IoCProviderFactory.java:123)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory._getComponentProvider(IoCProviderFactory.java:116)
>   at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:153)
>   at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:278)
>   at com.sun.jersey.core.spi.component.ProviderServices.getProviders(ProviderServices.java:151)
>   at com.sun.jersey.core.spi.factory.ContextResolverFactory.init(ContextResolverFactory.java:83)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl._initiate(WebApplicationImpl.java:1332)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.access$700(WebApplicationImpl.java:180)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:799)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:795)
>   at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:795)
>   at com.sun.jersey.guice.spi.container.servlet.GuiceContainer.initiate(GuiceContainer.java:121)
>   at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:339)
>   at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:605)
>   at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:207)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:394)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:744)
> {code}



[jira] [Commented] (YARN-8802) [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in jdk9 env

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621528#comment-16621528
 ] 

Hadoop QA commented on YARN-8802:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 49s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 2 new + 48 unchanged - 0 fixed = 50 total (was 48) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
4s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8802 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12894098/HADOOP-14984.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c29d278a7929 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6fc293f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21896/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21896/testReport/ |
| Max. process+thread count | 305 (vs. 

[jira] [Commented] (YARN-8802) [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in jdk9 env

2018-09-19 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621520#comment-16621520
 ] 

Akira Ajisaka commented on YARN-8802:
-

I tried the following settings and ran a MapReduce job successfully.
{code:title=mapred-site.xml}
<!-- NOTE: the XML body was stripped by the mail archiver; this is a
     reconstructed example only. Per the wiki page linked below, the JDK 9
     workaround passes "--add-modules java.activation" to the MR AM and
     task JVMs. -->
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>--add-modules java.activation</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>--add-modules java.activation</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>--add-modules java.activation</value>
</property>
{code}
Detail: 
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+and+Java+9%2C+10%2C+11

> [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in 
> jdk9 env
> ---
>
> Key: YARN-8802
> URL: https://issues.apache.org/jira/browse/YARN-8802
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liyunzhang
>Priority: Critical
> Attachments: HADOOP-14984.patch
>
>
> After building the latest code with JDK 9 (patches HADOOP-12760.03.patch and 
> HDFS-11610.001.patch) and starting the HDFS and YARN services 
> (HADOOP-14978) successfully, I met an exception when running TestDFSIO:
> {code}
> hadoop jar 
> $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.0-SNAPSHOT-tests.jar
>  TestDFSIO -write -nrFiles 8 -fileSize 1MB -resFile ./write.1MB.8
> {code}
> The exception:
> {code}
> 1) Error injecting constructor, java.lang.NoClassDefFoundError: 
> javax/activation/DataSource
>   at org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver.<init>(JAXBContextResolver.java:72)
>   at org.apache.hadoop.mapreduce.v2.app.webapp.AMWebApp.setup(AMWebApp.java:33)
>   while locating org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver
> 
> 1 error
>   at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1025)
>   at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1051)
>   at com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory$GuiceInstantiatedComponentProvider.getInstance(GuiceComponentProviderFactory.java:345)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory$ManagedSingleton.<init>(IoCProviderFactory.java:202)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory.wrap(IoCProviderFactory.java:123)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory._getComponentProvider(IoCProviderFactory.java:116)
>   at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:153)
>   at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:278)
>   at com.sun.jersey.core.spi.component.ProviderServices.getProviders(ProviderServices.java:151)
>   at com.sun.jersey.core.spi.factory.ContextResolverFactory.init(ContextResolverFactory.java:83)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl._initiate(WebApplicationImpl.java:1332)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.access$700(WebApplicationImpl.java:180)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:799)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:795)
>   at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:795)
>   at com.sun.jersey.guice.spi.container.servlet.GuiceContainer.initiate(GuiceContainer.java:121)
>   at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:339)
>   at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:605)
>   at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:207)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:394)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:744)
> {code}



[jira] [Commented] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621509#comment-16621509
 ] 

Hadoop QA commented on YARN-8769:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
34s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 13s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 3 new + 43 unchanged - 0 fixed = 46 total (was 43) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
generated 0 new + 1 unchanged - 1 fixed = 1 total (was 2) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
30s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8769 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940531/YARN-8769.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 86442f9e0322 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6fc293f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/21895/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine-warnings.html
 |
| checkstyle | 

[jira] [Commented] (YARN-8793) QueuePlacementPolicy bind more information to assigning result

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621499#comment-16621499
 ] 

Hadoop QA commented on YARN-8793:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 32s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 287 unchanged - 10 fixed = 289 total (was 297) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 14 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m  
7s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 17s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8793 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940527/YARN-8793.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 93ae8e115c7e 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6b5838e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21894/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/21894/artifact/out/whitespace-eol.txt
 |
|  Test Results | 

[jira] [Updated] (YARN-8795) Move QueuePlacementRule to separate files

2018-09-19 Thread Shuai Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Zhang updated YARN-8795:
--
Attachment: YARN-8795.004.patch

> Move QueuePlacementRule to separate files
> -
>
> Key: YARN-8795
> URL: https://issues.apache.org/jira/browse/YARN-8795
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 3.1.1
>Reporter: Shuai Zhang
>Priority: Major
> Attachments: YARN-8795.002.patch, YARN-8795.003.patch, 
> YARN-8795.004.patch
>
>




[jira] [Updated] (YARN-7481) Gpu locality support for Better AI scheduling

2018-09-19 Thread Chen Qingcha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qingcha updated YARN-7481:
---
Attachment: hadoop-2.9.0.gpu-port.20180920.patch

> Gpu locality support for Better AI scheduling
> -
>
> Key: YARN-7481
> URL: https://issues.apache.org/jira/browse/YARN-7481
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: api, RM, yarn
>Affects Versions: 2.7.2
>Reporter: Chen Qingcha
>Priority: Major
> Attachments: GPU locality support for Job scheduling.pdf, 
> branch-2.7.2.gpu-port-20180723.patch, hadoop-2.7.2.gpu-port-20180711.patch, 
> hadoop-2.7.2.gpu-port.patch, hadoop-2.9.0.gpu-port.20180725.patch, 
> hadoop-2.9.0.gpu-port.20180920.patch, hadoop-2.9.0.gpu-port.patch, 
> hadoop_2.9.0.patch
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> We enhance Hadoop with GPU support for better AI job scheduling. 
> Currently, YARN-3926 also supports GPU scheduling, but it treats GPUs as a 
> countable resource only. 
> However, GPU placement is also very important to deep learning jobs for 
> better efficiency.
>  For example, a 2-GPU job running on GPUs {0, 1} could be faster than one 
> running on GPUs {0, 7}, if GPUs 0 and 1 are under the same PCI-E switch 
> while 0 and 7 are not.
>  We add support to Hadoop 2.7.2 for GPU locality scheduling, which enables 
> fine-grained GPU placement. 
> A 64-bit bitmap is added to the YARN Resource, which encodes both GPU usage 
> and locality information on a node (up to 64 GPUs per node): a '1' in a bit 
> position means the corresponding GPU is available, '0' otherwise.
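
To make the bitmap convention concrete, here is a small sketch of how such a 
64-bit GPU mask could be queried; the helper names and the 4-GPU grouping are 
illustrative assumptions, not part of the patch:
{code:java}
// Illustrative helpers for the 64-bit GPU availability mask described
// above: bit i == 1 means GPU i is free on the node.
public final class GpuMask {
  private GpuMask() {}

  /** Number of GPUs currently available on the node. */
  public static int availableCount(long mask) {
    return Long.bitCount(mask);
  }

  /** Whether GPU {@code i} (0-63) is available. */
  public static boolean isAvailable(long mask, int i) {
    return ((mask >>> i) & 1L) == 1L;
  }

  /** Example locality test: are both requested GPUs inside the same
   *  4-GPU group (a stand-in for "under the same PCI-E switch")? */
  public static boolean sameGroup(int gpuA, int gpuB) {
    return (gpuA / 4) == (gpuB / 4);
  }
}
{code}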



[jira] [Commented] (YARN-8802) [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in jdk9 env

2018-09-19 Thread Akira Ajisaka (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621477#comment-16621477
 ] 

Akira Ajisaka commented on YARN-8802:
-

Moved this to the YARN project and raised the priority. This issue blocks 
Java 9 support.

> [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in 
> jdk9 env
> ---
>
> Key: YARN-8802
> URL: https://issues.apache.org/jira/browse/YARN-8802
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liyunzhang
>Priority: Critical
> Attachments: HADOOP-14984.patch
>
>
> After building the latest code with JDK 9 (patches HADOOP-12760.03.patch and 
> HDFS-11610.001.patch) and starting the HDFS and YARN services 
> (HADOOP-14978) successfully, I met an exception when running TestDFSIO:
> {code}
> hadoop jar 
> $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.0-SNAPSHOT-tests.jar
>  TestDFSIO -write -nrFiles 8 -fileSize 1MB -resFile ./write.1MB.8
> {code}
> The exception:
> {code}
> 1) Error injecting constructor, java.lang.NoClassDefFoundError: 
> javax/activation/DataSource
>   at org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver.<init>(JAXBContextResolver.java:72)
>   at org.apache.hadoop.mapreduce.v2.app.webapp.AMWebApp.setup(AMWebApp.java:33)
>   while locating org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver
> 
> 1 error
>   at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1025)
>   at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1051)
>   at com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory$GuiceInstantiatedComponentProvider.getInstance(GuiceComponentProviderFactory.java:345)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory$ManagedSingleton.<init>(IoCProviderFactory.java:202)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory.wrap(IoCProviderFactory.java:123)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory._getComponentProvider(IoCProviderFactory.java:116)
>   at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:153)
>   at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:278)
>   at com.sun.jersey.core.spi.component.ProviderServices.getProviders(ProviderServices.java:151)
>   at com.sun.jersey.core.spi.factory.ContextResolverFactory.init(ContextResolverFactory.java:83)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl._initiate(WebApplicationImpl.java:1332)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.access$700(WebApplicationImpl.java:180)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:799)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:795)
>   at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:795)
>   at com.sun.jersey.guice.spi.container.servlet.GuiceContainer.initiate(GuiceContainer.java:121)
>   at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:339)
>   at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:605)
>   at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:207)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:394)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:744)
> {code}



[jira] [Moved] (YARN-8802) [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in jdk9 env

2018-09-19 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka moved HADOOP-14984 to YARN-8802:
--

Key: YARN-8802  (was: HADOOP-14984)
Project: Hadoop YARN  (was: Hadoop Common)

> [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in 
> jdk9 env
> ---
>
> Key: YARN-8802
> URL: https://issues.apache.org/jira/browse/YARN-8802
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liyunzhang
>Priority: Major
> Attachments: HADOOP-14984.patch
>
>
> After building the latest code with JDK 9 (patches HADOOP-12760.03.patch and 
> HDFS-11610.001.patch) and starting the HDFS and YARN services 
> (HADOOP-14978) successfully, I met an exception when running TestDFSIO:
> {code}
> hadoop jar 
> $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.0-SNAPSHOT-tests.jar
>  TestDFSIO -write -nrFiles 8 -fileSize 1MB -resFile ./write.1MB.8
> {code}
> The exception:
> {code}
> 1) Error injecting constructor, java.lang.NoClassDefFoundError: 
> javax/activation/DataSource
>   at org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver.<init>(JAXBContextResolver.java:72)
>   at org.apache.hadoop.mapreduce.v2.app.webapp.AMWebApp.setup(AMWebApp.java:33)
>   while locating org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver
> 
> 1 error
>   at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1025)
>   at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1051)
>   at com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory$GuiceInstantiatedComponentProvider.getInstance(GuiceComponentProviderFactory.java:345)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory$ManagedSingleton.<init>(IoCProviderFactory.java:202)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory.wrap(IoCProviderFactory.java:123)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory._getComponentProvider(IoCProviderFactory.java:116)
>   at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:153)
>   at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:278)
>   at com.sun.jersey.core.spi.component.ProviderServices.getProviders(ProviderServices.java:151)
>   at com.sun.jersey.core.spi.factory.ContextResolverFactory.init(ContextResolverFactory.java:83)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl._initiate(WebApplicationImpl.java:1332)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.access$700(WebApplicationImpl.java:180)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:799)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:795)
>   at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:795)
>   at com.sun.jersey.guice.spi.container.servlet.GuiceContainer.initiate(GuiceContainer.java:121)
>   at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:339)
>   at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:605)
>   at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:207)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:394)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:744)
> {code}



[jira] [Updated] (YARN-8802) [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in jdk9 env

2018-09-19 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-8802:

Priority: Critical  (was: Major)

> [JDK9] Fail to run yarn application after building hadoop pkg with jdk9 in 
> jdk9 env
> ---
>
> Key: YARN-8802
> URL: https://issues.apache.org/jira/browse/YARN-8802
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: liyunzhang
>Priority: Critical
> Attachments: HADOOP-14984.patch
>
>
> After building the latest code with JDK 9 (patches HADOOP-12760.03.patch and 
> HDFS-11610.001.patch) and starting the HDFS and YARN services 
> (HADOOP-14978) successfully, I met an exception when running TestDFSIO:
> {code}
> hadoop jar 
> $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.0-SNAPSHOT-tests.jar
>  TestDFSIO -write -nrFiles 8 -fileSize 1MB -resFile ./write.1MB.8
> {code}
> The exception:
> {code}
> 1) Error injecting constructor, java.lang.NoClassDefFoundError: 
> javax/activation/DataSource
>   at org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver.<init>(JAXBContextResolver.java:72)
>   at org.apache.hadoop.mapreduce.v2.app.webapp.AMWebApp.setup(AMWebApp.java:33)
>   while locating org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver
> 
> 1 error
>   at com.google.inject.internal.InjectorImpl$2.get(InjectorImpl.java:1025)
>   at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1051)
>   at com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory$GuiceInstantiatedComponentProvider.getInstance(GuiceComponentProviderFactory.java:345)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory$ManagedSingleton.<init>(IoCProviderFactory.java:202)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory.wrap(IoCProviderFactory.java:123)
>   at com.sun.jersey.core.spi.component.ioc.IoCProviderFactory._getComponentProvider(IoCProviderFactory.java:116)
>   at com.sun.jersey.core.spi.component.ProviderFactory.getComponentProvider(ProviderFactory.java:153)
>   at com.sun.jersey.core.spi.component.ProviderServices.getComponent(ProviderServices.java:278)
>   at com.sun.jersey.core.spi.component.ProviderServices.getProviders(ProviderServices.java:151)
>   at com.sun.jersey.core.spi.factory.ContextResolverFactory.init(ContextResolverFactory.java:83)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl._initiate(WebApplicationImpl.java:1332)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.access$700(WebApplicationImpl.java:180)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:799)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl$13.f(WebApplicationImpl.java:795)
>   at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
>   at com.sun.jersey.server.impl.application.WebApplicationImpl.initiate(WebApplicationImpl.java:795)
>   at com.sun.jersey.guice.spi.container.servlet.GuiceContainer.initiate(GuiceContainer.java:121)
>   at com.sun.jersey.spi.container.servlet.ServletContainer$InternalWebComponent.initiate(ServletContainer.java:339)
>   at com.sun.jersey.spi.container.servlet.WebComponent.load(WebComponent.java:605)
>   at com.sun.jersey.spi.container.servlet.WebComponent.init(WebComponent.java:207)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:394)
>   at com.sun.jersey.spi.container.servlet.ServletContainer.init(ServletContainer.java:744)
> {code}



[jira] [Updated] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8769:
-
Attachment: YARN-8769.004.patch

> [Submarine] Allow user to specify customized quicklink(s) when submit 
> Submarine job
> ---
>
> Key: YARN-8769
> URL: https://issues.apache.org/jira/browse/YARN-8769
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8769.001.patch, YARN-8769.002.patch, 
> YARN-8769.003.patch, YARN-8769.004.patch
>
>
> This will be helpful when a user submits a job and some links need to be 
> shown on YARN UI2 (service page). For example, the user can specify a quick 
> link to the Zeppelin notebook UI when a Zeppelin notebook gets launched.
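
As a rough sketch of the plumbing such an option implies: user-supplied 
label/URL pairs end up in the name-to-URL quicklinks map that a YARN service 
spec exposes on the UI2 service page. The {{Label=URL}} option format below is 
a hypothetical illustration, not the patch's actual syntax:
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: parses repeated "Label=URL" pairs (a hypothetical CLI option
// format) into the name -> URL quicklinks map shown on the UI2 service page.
public class QuicklinkParser {
  public static Map<String, String> parse(String... pairs) {
    Map<String, String> quicklinks = new LinkedHashMap<>();
    for (String pair : pairs) {
      int eq = pair.indexOf('=');
      if (eq <= 0 || eq == pair.length() - 1) {
        throw new IllegalArgumentException("Expected Label=URL, got: " + pair);
      }
      quicklinks.put(pair.substring(0, eq), pair.substring(eq + 1));
    }
    return quicklinks;
  }

  public static void main(String[] args) {
    // hypothetical label and URL
    System.out.println(parse("Notebook_UI=http://zeppelin.example.com:8080"));
  }
}
{code}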



[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621460#comment-16621460
 ] 

Weiwei Yang commented on YARN-8468:
---

[~leftnoteasy], YARN-8720 already addressed the changes needed on the CS side, 
so I am also under the impression that the changes in this patch should be 
limited to FS code as much as possible. That's why I suggested a lot of 
reverts in my last comment.

[~bsteinbach], thanks for the new patch.
{quote}Some of the changes you requested to revert came from formatting only 
the modified code/method with the defined Hadoop formatter. Also, it can happen 
that the modification was reverted but some small formatting remained there.
{quote}
Sorry, I could not understand this. There are still a lot of unnecessary 
changes. Are you using auto-format tools? If so, please do not do that, even 
if it fixes some formatting issues (e.g. removing spaces, empty lines, etc.).

I don't understand the changes in {{SchedulerUtils}}:
{code:java}
// Before
private static void normalizeAndValidateRequest(ResourceRequest resReq,
    Resource maximumResource, String queueName, YarnScheduler scheduler,
    boolean isRecovery, RMContext rmContext, QueueInfo queueInfo)

// After
public static void normalizeAndValidateRequest(ResourceRequest resReq,
    String queueName, YarnScheduler scheduler, boolean isRecovery,
    RMContext rmContext, QueueInfo queueInfo, Resource maximumAllocation)
{code}
It looks like the patch only moves the position of the parameter 
{{maximumResource}} (renamed to {{maximumAllocation}}).

I looked a bit more at the API changes in {{YarnScheduler}}:
{code:java}
// added param maxResourceCapability
Resource getNormalizedResource(Resource requestedResource,
    Resource maxResourceCapability);
{code}
I am wondering if we have to add {{maxResourceCapability}} to this. Given we 
already have the changes in {{DefaultAMSProcessor#allocate}}, which ensure 
that all incoming resource requests are already normalized against the max 
resource (global or queue level), do we have to do that again in the 
scheduler?

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for 
> enterprise apps, and wants to limit ad hoc jobs to small containers but 
> allow enterprise apps to request as many resources as needed. Setting 
> yarn.scheduler.maximum-allocation-mb would provide the default maximum 
> container size for all queues, while the per-queue maximum would be set with 
> the "maxContainerResources" queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf); this will cover dynamically created queues.
>  * if it is set on the root it would override the scheduler setting, and we 
> should not allow that.
>  * make sure that the queue resource cap cannot be larger than the scheduler 
> max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue (see the sketch after this list)
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc. for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
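
A minimal sketch of the queue-side lookup called for above, assuming a queue 
without its own cap falls back to its parent and ultimately to the 
scheduler-wide maximum; field and method shapes are hypothetical, not the 
patch itself:
{code:java}
// Sketch only: per-queue maximum container size with fallback to the
// parent queue and, at the root, to the scheduler-wide maximum.
// Long values stand in for YARN Resource objects; names are hypothetical.
class QueueNode {
  private final QueueNode parent;        // null for the root queue
  private final Long maxContainerMb;     // null if not set on this queue
  private final long schedulerMaxMb;     // yarn.scheduler.maximum-allocation-mb

  QueueNode(QueueNode parent, Long maxContainerMb, long schedulerMaxMb) {
    this.parent = parent;
    this.maxContainerMb = maxContainerMb;
    this.schedulerMaxMb = schedulerMaxMb;
  }

  long getMaximumResourceCapability() {
    if (maxContainerMb != null) {
      // A queue cap may never exceed the scheduler-wide cap.
      return Math.min(maxContainerMb, schedulerMaxMb);
    }
    return parent != null ? parent.getMaximumResourceCapability()
                          : schedulerMaxMb;
  }
}
{code}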



[jira] [Commented] (YARN-8725) Submarine job staging directory has a lot of useless PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple times

2018-09-19 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621442#comment-16621442
 ] 

Zhankun Tang commented on YARN-8725:


[~leftnoteasy] Thanks for the explanation. There should be a server in the 
cluster to handle this. Could the "submarine gateway" be a possible option?

> Submarine job staging directory has a lot of useless 
> PRIMARY_WORKER-launch-script-***.sh  scripts when submitting a job multiple 
> times
> --
>
> Key: YARN-8725
> URL: https://issues.apache.org/jira/browse/YARN-8725
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8725-trunk.001.patch
>
>
> Submarine jobs upload core-site.xml, hdfs-site.xml, job.info and 
> PRIMARY_WORKER-launch-script.sh to the staging dir.
> The core-site.xml, hdfs-site.xml and job.info would be overwritten if a job 
> is submitted multiple times.
> But PRIMARY_WORKER-launch-script.sh would not be overwritten, as it has 
> random numbers in its name.
> The files in the staging dir are as follows:
> {code:java}
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:11 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script6954941665090337726.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:02 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script7037369696166769734.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:06 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8047707294763488040.sh
> -rw-r- 2 hadoop hdfs 15225 2018-08-17 18:46 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8122565781159446375.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-16 20:48 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8598604480700049845.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 14:53 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script971703616848859353.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:16 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script990214235580089093.sh
> -rw-r- 2 hadoop hdfs 8815 2018-08-27 15:54 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/core-site.xml
> -rw-r- 2 hadoop hdfs 11583 2018-08-27 15:54 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/hdfs-site.xml
> -rw-rw-rw- 2 hadoop hdfs 846 2018-08-22 10:56 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/job.info
> {code}
>  
> We should stop the staging dir from growing, or have a way to clean it up.
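
One possible shape for that cleanup, sketched with the standard Hadoop 
{{FileSystem}} API: delete leftover launch scripts older than a cutoff. The 
staging path layout matches the listing above; the age threshold and class 
name are assumptions:
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: remove PRIMARY_WORKER-launch-script*.sh files in the staging dir
// that are older than maxAgeMs. Path and threshold are illustrative.
public class StagingDirCleaner {
  public static void clean(Configuration conf, Path stagingDir, long maxAgeMs)
      throws IOException {
    FileSystem fs = stagingDir.getFileSystem(conf);
    long cutoff = System.currentTimeMillis() - maxAgeMs;
    FileStatus[] scripts =
        fs.globStatus(new Path(stagingDir, "PRIMARY_WORKER-launch-script*.sh"));
    if (scripts == null) {
      return;                            // staging dir does not exist yet
    }
    for (FileStatus s : scripts) {
      if (s.getModificationTime() < cutoff) {
        fs.delete(s.getPath(), false);   // non-recursive: plain files only
      }
    }
  }
}
{code}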



[jira] [Updated] (YARN-8793) QueuePlacementPolicy bind more information to assigning result

2018-09-19 Thread Shuai Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Zhang updated YARN-8793:
--
Attachment: YARN-8793.005.patch

> QueuePlacementPolicy bind more information to assigning result
> --
>
> Key: YARN-8793
> URL: https://issues.apache.org/jira/browse/YARN-8793
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 3.1.1
>Reporter: Shuai Zhang
>Priority: Major
> Attachments: YARN-8793.001.patch, YARN-8793.002.patch, 
> YARN-8793.003.patch, YARN-8793.004.patch, YARN-8793.005.patch
>
>
> Fair scheduler's QueuePlacementPolicy should bind more information to the 
> assigning result:
>  # Whether to terminate the chain of responsibility
>  # The reason for rejecting a request
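
A minimal sketch of what such a richer placement result could look like; the 
class and method names are hypothetical, not taken from the patch:
{code:java}
// Hypothetical result type for a queue placement rule: besides the target
// queue, it records whether the rule chain should stop here and, on
// rejection, why the request was rejected.
final class PlacementResult {
  private final String queueName;     // null when no queue was assigned
  private final boolean terminal;     // stop the chain of responsibility?
  private final String rejectReason;  // null unless the request was rejected

  private PlacementResult(String queueName, boolean terminal, String reason) {
    this.queueName = queueName;
    this.terminal = terminal;
    this.rejectReason = reason;
  }

  static PlacementResult assign(String queueName, boolean terminal) {
    return new PlacementResult(queueName, terminal, null);
  }

  static PlacementResult reject(String reason) {
    return new PlacementResult(null, true, reason);
  }

  static PlacementResult skip() {      // let the next rule try
    return new PlacementResult(null, false, null);
  }

  String getQueueName()    { return queueName; }
  boolean isTerminal()     { return terminal; }
  String getRejectReason() { return rejectReason; }
}
{code}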



[jira] [Commented] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type

2018-09-19 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621414#comment-16621414
 ] 

Tao Yang commented on YARN-8771:


Thanks [~cheersyang] and [~leftnoteasy] !

> CapacityScheduler fails to unreserve when cluster resource contains empty 
> resource type
> ---
>
> Key: YARN-8771
> URL: https://issues.apache.org/jira/browse/YARN-8771
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8771.001.patch, YARN-8771.002.patch, 
> YARN-8771.003.patch, YARN-8771.004.patch
>
>
> We found this problem when the cluster was almost, but not fully, exhausted 
> (93% used): the scheduler kept allocating for an app but always failed to 
> commit, which can block requests from other apps so that part of the 
> cluster resource can't be used.
> To reproduce this problem:
> (1) use DominantResourceCalculator
> (2) cluster resource has an empty resource type, for example: gpu=0
> (3) the scheduler allocates a container for app1, which has reserved 
> containers and whose queue limit or user limit is reached (used + required > 
> limit). 
> Reference code in RegularContainerAllocator#assignContainer:
> {code:java}
> // How much need to unreserve equals to:
> // max(required - headroom, amountNeedUnreserve)
> Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
> Resource resourceNeedToUnReserve =
>     Resources.max(rc, clusterResource,
>         Resources.subtract(capability, headRoom),
>         currentResoureLimits.getAmountNeededUnreserve());
> boolean needToUnreserve =
>     Resources.greaterThan(rc, clusterResource,
>         resourceNeedToUnReserve, Resources.none());
> {code}
> For example, resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when 
> {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}}; 
> needToUnreserve, which is the result of {{Resources#greaterThan}}, will be 
> {{false}}. This is not reasonable, because the required resource did exceed 
> the headroom and unreserving is needed.
> After that, when reaching the unreserve process in 
> RegularContainerAllocator#assignContainer, the unreserve process will be 
> skipped when shouldAllocOrReserveNewContainer is true (when required 
> containers > reserved containers) and needToUnreserve is wrongly calculated 
> to be false:
> {code:java}
> if (availableContainers > 0) {
>   if (rmContainer == null && reservationsContinueLooking
>       && node.getLabels().isEmpty()) {
>     // the unreserve process can be wrongly skipped when
>     // shouldAllocOrReserveNewContainer=true and needToUnreserve=false
>     // but the required resource did exceed the headroom
>     if (!shouldAllocOrReserveNewContainer || needToUnreserve) {
>       ...
>     }
>   }
> }
> {code}
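
To restate the intended check in the simplest terms: unreserve is needed 
whenever any single resource component of the request exceeds the headroom, 
which an aggregate comparison under DominantResourceCalculator can miss when a 
resource type is zero cluster-wide. A per-component sketch of that idea (not 
the committed fix), with plain arrays standing in for {{Resource}}:
{code:java}
// Sketch: "do we need to unreserve?" asked per resource type, so a
// zero-valued type (e.g. gpu=0) or a negative component cannot mask a
// genuine shortfall. Plain arrays stand in for YARN Resource objects.
public final class UnreserveCheck {
  private UnreserveCheck() {}

  /** true if any required component exceeds the available headroom. */
  static boolean needToUnreserve(long[] required, long[] headroom) {
    for (int i = 0; i < required.length; i++) {
      if (required[i] > headroom[i]) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    long[] required = {8192, 2, 0};   // <8GB, 2 vcores, 0 gpu>
    long[] headroom = {0, 8, 0};      // <0GB, 8 vcores, 0 gpu>
    System.out.println(needToUnreserve(required, headroom));  // true
  }
}
{code}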



[jira] [Commented] (YARN-8801) java doc comments in docker-util.h are confusing

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621407#comment-16621407
 ] 

Hadoop QA commented on YARN-8801:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
33m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
32s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8801 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940517/YARN-8801.001.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 97f04c4a0e50 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6b5838e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21893/testReport/ |
| Max. process+thread count | 294 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21893/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> java doc comments in docker-util.h is confusing
> ---
>
> Key: YARN-8801
> URL: https://issues.apache.org/jira/browse/YARN-8801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Minor
>  Labels: Docker
> Attachments: YARN-8801.001.patch
>
>
> {code:java}
>  /**
> + * Get the Docker exec command line string. The function will verify that 
> the params file is meant for the exec command.
> + * @param command_file File containing the params for the Docker start 
> command

[jira] [Commented] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621368#comment-16621368
 ] 

Hadoop QA commented on YARN-8769:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
34s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
18s{color} | {color:red} hadoop-yarn-submarine in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
18s{color} | {color:red} hadoop-yarn-submarine in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 18s{color} 
| {color:red} hadoop-yarn-submarine in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 13s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 3 new + 43 unchanged - 0 fixed = 46 total (was 43) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-yarn-submarine in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-submarine in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 21s{color} 
| {color:red} hadoop-yarn-submarine in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8769 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940515/YARN-8769.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4157d9d5c654 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6b5838e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/21892/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine-warnings.html
 |
| mvninstall | 

[jira] [Commented] (YARN-8696) [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621359#comment-16621359
 ] 

Hadoop QA commented on YARN-8696:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
18s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
28s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
55s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 51s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}192m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8696 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940505/YARN-8696.v6.patch |
| Optional Tests |  dupname  asflicense  

[jira] [Commented] (YARN-8801) java doc comments in docker-util.h is confusing

2018-09-19 Thread Zian Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621350#comment-16621350
 ] 

Zian Chen commented on YARN-8801:
-

Provided a patch for the fix. No UTs need to be added here.

> java doc comments in docker-util.h is confusing
> ---
>
> Key: YARN-8801
> URL: https://issues.apache.org/jira/browse/YARN-8801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Minor
>  Labels: Docker
>
> {code:java}
>  /**
> + * Get the Docker exec command line string. The function will verify that 
> the params file is meant for the exec command.
> + * @param command_file File containing the params for the Docker start 
> command
> + * @param conf Configuration struct containing the container-executor.cfg 
> details
> + * @param out Buffer to fill with the exec command
> + * @param outlen Size of the output buffer
> + * @return Return code with 0 indicating success and non-zero codes 
> indicating error
> + */
> +int get_docker_exec_command(const char* command_file, const struct 
> configuration* conf, args *args);{code}
> The method's param list has out and outlen, which do not match the signature, 
> and the description for the param args is missing. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8801) java doc comments in docker-util.h is confusing

2018-09-19 Thread Zian Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zian Chen updated YARN-8801:

Labels: Docker  (was: )

> java doc comments in docker-util.h is confusing
> ---
>
> Key: YARN-8801
> URL: https://issues.apache.org/jira/browse/YARN-8801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Minor
>  Labels: Docker
>
> {code:java}
>  /**
> + * Get the Docker exec command line string. The function will verify that 
> the params file is meant for the exec command.
> + * @param command_file File containing the params for the Docker start 
> command
> + * @param conf Configuration struct containing the container-executor.cfg 
> details
> + * @param out Buffer to fill with the exec command
> + * @param outlen Size of the output buffer
> + * @return Return code with 0 indicating success and non-zero codes 
> indicating error
> + */
> +int get_docker_exec_command(const char* command_file, const struct 
> configuration* conf, args *args);{code}
> The method's param list has out and outlen, which do not match the signature, 
> and the description for the param args is missing. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8801) java doc comments in docker-util.h is confusing

2018-09-19 Thread Zian Chen (JIRA)
Zian Chen created YARN-8801:
---

 Summary: java doc comments in docker-util.h is confusing
 Key: YARN-8801
 URL: https://issues.apache.org/jira/browse/YARN-8801
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zian Chen
Assignee: Zian Chen


{code:java}
 /**
+ * Get the Docker exec command line string. The function will verify that the 
params file is meant for the exec command.
+ * @param command_file File containing the params for the Docker start command
+ * @param conf Configuration struct containing the container-executor.cfg 
details
+ * @param out Buffer to fill with the exec command
+ * @param outlen Size of the output buffer
+ * @return Return code with 0 indicating success and non-zero codes indicating 
error
+ */
+int get_docker_exec_command(const char* command_file, const struct 
configuration* conf, args *args);{code}
The method's param list has out and outlen, which do not match the signature, and 
the description for the param args is missing. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-8799) [Submarine] Correct the default directory path in HDFS for "checkout_path"

2018-09-19 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan resolved YARN-8799.
--
Resolution: Duplicate

This is a duplicate of YARN-8757. 

> [Submarine] Correct the default directory path in HDFS for "checkout_path"
> --
>
> Key: YARN-8799
> URL: https://issues.apache.org/jira/browse/YARN-8799
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Fix For: 3.2.0
>
>
>  
> {code:java}
> yarn jar 
> $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
>  job run \
>  -verbose \
>  -wait_job_finish \
>  -keep_staging_dir \
>  --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
>  --name tf-job-001 \
>  --docker_image tangzhankun/tensorflow \
>  --input_path hdfs://default/user/yarn/cifar-10-data \
>  --worker_resources memory=4G,vcores=2 \
>  --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
> --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
> --train-steps=5"{code}
>  
> The above script should work, but the job failed due to an invalid path passed 
> to "--job-dir" in my testing. It should be a URI starting with "hdfs://".
> {code:java}
> 2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker 
> command =[cd /cifar10_estimator && python cifar10_main.py 
> --data-dir=hdfs://default/user/yarn/cifar-10-data 
> --job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 
> --train-steps=2]{code}
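As a hedged illustration of the fix direction (the helper below is hypothetical, while FileSystem#makeQualified is a real Hadoop API), the default staging-relative checkpoint path can be qualified against fs.defaultFS so that %checkpoint_path% expands to a full hdfs:// URI:

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class CheckpointPathDefaults {
  // Hedged sketch, not the change committed under YARN-8757.
  public static String defaultCheckpointPath(Configuration conf, String jobName)
      throws IOException {
    Path relative =
        new Path("submarine/jobs/" + jobName + "/staging/checkpoint_path");
    FileSystem fs = FileSystem.get(conf);
    // makeQualified resolves the relative path against fs.defaultFS and the
    // submitting user's home directory, yielding an hdfs:// URI.
    return fs.makeQualified(relative).toString();
  }
}
{code}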



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8725) Submarine job staging directory has a lot of useless PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple times

2018-09-19 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621339#comment-16621339
 ] 

Wangda Tan commented on YARN-8725:
--

Thanks [~tangzhankun], 
{quote}But the job failed due to an invalid path passed to "--job-dir" in my 
testing. It should be a URI starting with "hdfs://".
{quote}
The issue is handled by YARN-8757. 
{quote}... Because it is better for the user to not know such details
{quote}
I think it is fine, since we don't expect the user to set that location when a 
manual checkpoint_path is specified. 
{quote}Could you please elaborate on this?
{quote}
Sure, the launch of worker/ps relies on the localization of these scripts. In 
the default case, the Submarine client exits once the application is submitted 
to YARN, and this is before the worker/ps launch. If we clean up the staging 
dir right before the Submarine client exits, it is very likely that the 
ps/worker launch will fail. 

To be able to handle cases such as app failure, instead of adding cleanup logic 
to the CLI, it's better to have a server handle this. 

> Submarine job staging directory has a lot of useless 
> PRIMARY_WORKER-launch-script-***.sh  scripts when submitting a job multiple 
> times
> --
>
> Key: YARN-8725
> URL: https://issues.apache.org/jira/browse/YARN-8725
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8725-trunk.001.patch
>
>
> Submarine jobs upload core-site.xml, hdfs-site.xml, job.info and 
> PRIMARY_WORKER-launch-script.sh to staging dir.
> The core-site.xml, hdfs-site.xml and job.info would be overwritten if a job 
> is submitted multiple times.
> But PRIMARY_WORKER-launch-script.sh would not be overwritten, as it has 
> random numbers in its name.
> The files in the staging dir are as follows:
> {code:java}
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:11 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script6954941665090337726.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:02 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script7037369696166769734.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:06 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8047707294763488040.sh
> -rw-r- 2 hadoop hdfs 15225 2018-08-17 18:46 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8122565781159446375.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-16 20:48 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script8598604480700049845.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 14:53 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script971703616848859353.sh
> -rw-r- 2 hadoop hdfs 580 2018-08-17 10:16 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/PRIMARY_WORKER-launch-script990214235580089093.sh
> -rw-r- 2 hadoop hdfs 8815 2018-08-27 15:54 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/core-site.xml
> -rw-r- 2 hadoop hdfs 11583 2018-08-27 15:54 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/hdfs-site.xml
> -rw-rw-rw- 2 hadoop hdfs 846 2018-08-22 10:56 
> hdfs://submarine/user/hadoop/submarine/jobs/standlone-tf/staging/job.info
> {code}
>  
> We should stop the staging dir from growing, or have a way to clean it up.
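One hedged sketch of the overwrite direction (the class and method names are assumptions, not the attached patch): upload the launch script under a fixed name with overwrite enabled, the same way core-site.xml and hdfs-site.xml already behave.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class StagingUpload {
  // Hedged sketch: a fixed destination name plus overwrite=true stops the
  // staging dir from accumulating randomly-suffixed launch scripts.
  public static void uploadLaunchScript(Configuration conf, Path stagingDir,
      Path localScript) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    Path dest = new Path(stagingDir, "PRIMARY_WORKER-launch-script.sh");
    // copyFromLocalFile(delSrc=false, overwrite=true, src, dst)
    fs.copyFromLocalFile(false, true, localScript, dest);
  }
}
{code}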



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621328#comment-16621328
 ] 

Wangda Tan commented on YARN-8769:
--

Attached the ver.3 patch, which fixes the checkstyle and findbugs issues. 

> [Submarine] Allow user to specify customized quicklink(s) when submit 
> Submarine job
> ---
>
> Key: YARN-8769
> URL: https://issues.apache.org/jira/browse/YARN-8769
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8769.001.patch, YARN-8769.002.patch, 
> YARN-8769.003.patch
>
>
> This will be helpful when a user submits a job and some links need to be shown 
> on YARN UI2 (service page). For example, the user can specify a quick link to 
> the Zeppelin notebook UI when a Zeppelin notebook gets launched.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8769:
-
Attachment: YARN-8769.003.patch

> [Submarine] Allow user to specify customized quicklink(s) when submit 
> Submarine job
> ---
>
> Key: YARN-8769
> URL: https://issues.apache.org/jira/browse/YARN-8769
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8769.001.patch, YARN-8769.002.patch, 
> YARN-8769.003.patch
>
>
> This will be helpful when a user submits a job and some links need to be shown 
> on YARN UI2 (service page). For example, the user can specify a quick link to 
> the Zeppelin notebook UI when a Zeppelin notebook gets launched.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621322#comment-16621322
 ] 

Hadoop QA commented on YARN-8777:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
33m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
36s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8777 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940510/YARN-8777.003.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux f1c19ac32cf6 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6b5838e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21891/testReport/ |
| Max. process+thread count | 300 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21891/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch, 
> YARN-8777.003.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> the docker exec command against the running container.

[jira] [Commented] (YARN-7599) [GPG] ApplicationCleaner in Global Policy Generator

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621301#comment-16621301
 ] 

Hadoop QA commented on YARN-7599:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-7402 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
19s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
4s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
34s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
50s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 56s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
52s{color} | {color:green} YARN-7402 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
24s{color} | {color:green} YARN-7402 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
6s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 28s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 229 unchanged - 0 fixed = 231 total (was 229) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 56s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
12s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
20s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
46s{color} | {color:green} hadoop-yarn-server-globalpolicygenerator in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}101m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-7599 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940501/YARN-7599-YARN-7402.v6.patch
 |
| Optional Tests |  dupname  

[jira] [Commented] (YARN-6456) Allow administrators to set a single ContainerRuntime for all containers

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621291#comment-16621291
 ] 

Eric Yang commented on YARN-6456:
-

[~jlowe] The proposed path is a good way to unblock this feature.

> Allow administrators to set a single ContainerRuntime for all containers
> 
>
> Key: YARN-6456
> URL: https://issues.apache.org/jira/browse/YARN-6456
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Craig Condit
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6456-ForceDockerRuntimeIfSupported.patch, 
> YARN-6456.001.patch, YARN-6456.002.patch, YARN-6456.003.patch
>
>
>  
> With LCE, there are multiple ContainerRuntimes available for handling 
> different types of containers; default, docker, java sandbox. Admins should 
> have the ability to override the user decision and set a single global 
> ContainerRuntime to be used for all containers.
> Original Description:
> {quote}One reason to use Docker containers is to be able to isolate different 
> workloads, even if they run as the same user.
> I have noticed some issues in the current design:
>  1. DockerLinuxContainerRuntime mounts containerLocalDirs 
> {{nm-local-dir/usercache/user/appcache/application_1491598755372_0011/}} and 
> userLocalDirs {{nm-local-dir/usercache/user/}}, so that a container can see 
> and modify the files of another container. I think the application file cache 
> directory should be enough for the container to run in most of the cases.
>  2. The whole cgroups directory is mounted. Would the container directory be 
> enough?
>  3. There is no way to enforce exclusive use of Docker for all containers. 
> There should be an option so that it is not the user but the admin who requires 
> the use of Docker.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621278#comment-16621278
 ] 

Eric Yang commented on YARN-8777:
-

[~Zian Chen] [~ebadger] Patch 3 changes the .cmd file format to require a 
launch-command, in order to support generic commands. 

{code}
[docker-command-execution]
  docker-command=exec
  name=container_1536945486532_0004_01_09
  launch-command=bash
{code}

On the Java side, we have to construct the binary path to use for the docker 
container.
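For reference, a hedged sketch of writing such a .cmd file from the Java side (the class and method are illustrative, not the actual NodeManager code; only the key/value format shown above is taken as given):

{code:java}
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;

public final class DockerExecCommandFile {
  // Hedged sketch: serialize an exec request in the .cmd format shown above.
  public static Path write(String containerId, String launchCommand)
      throws IOException {
    Path cmdFile = Files.createTempFile("docker.", ".cmd");
    try (PrintWriter out =
        new PrintWriter(Files.newBufferedWriter(cmdFile))) {
      out.println("[docker-command-execution]");
      out.println("  docker-command=exec");
      out.println("  name=" + containerId);
      out.println("  launch-command=" + launchCommand);
    }
    return cmdFile;
  }
}
{code}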

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch, 
> YARN-8777.003.patch
>
>
> Since Container Executor provides Container execution using the native 
> container-executor binary, we also need to make changes to accept new 
> “dockerExec” method to invoke the corresponding native function to execute 
> docker exec command to the running container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8784) DockerLinuxContainerRuntime prevents access to distributed cache entries on a full disk

2018-09-19 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621276#comment-16621276
 ] 

Hudson commented on YARN-8784:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15025 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15025/])
YARN-8784. DockerLinuxContainerRuntime prevents access to distributed (jlowe: 
rev 6b5838ed3220f992092c7348f92f1d9d0d4a3061)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java


> DockerLinuxContainerRuntime prevents access to distributed cache entries on a 
> full disk
> ---
>
> Key: YARN-8784
> URL: https://issues.apache.org/jira/browse/YARN-8784
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Jason Lowe
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8784.001.patch
>
>
> DockerLinuxContainerRuntime bind mounts the filecache and usercache 
> directories into the container to allow tasks to access entries in the 
> distributed cache. However, it only bind-mounts directories on disks that 
> are considered good, and disks that are full or bad are not in that list. If 
> a container tries to run with a distributed cache entry that has been 
> previously localized to a disk that is now considered full/bad, the dist 
> cache directory will _not_ be bind-mounted into the container's filesystem 
> namespace.  At that point any symlinks in the container's current working 
> directory that point to those disks will reference invalid paths.
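A hedged sketch of the gap described above (the names are stand-ins, not the NodeManager's actual runtime code): the bind-mount list is built only from directories the disk checker currently reports as good, so entries localized to a now-full disk are invisible inside the container and their symlinks dangle.

{code:java}
import java.util.ArrayList;
import java.util.List;

public final class DistCacheMounts {
  // Hedged sketch: full/bad disks are absent from goodLocalDirs, so anything
  // previously localized there is not mounted into the container.
  public static List<String> bindMounts(List<String> goodLocalDirs,
      String user) {
    List<String> mounts = new ArrayList<>();
    for (String dir : goodLocalDirs) {
      mounts.add(dir + "/filecache");
      mounts.add(dir + "/usercache/" + user + "/filecache");
    }
    return mounts;
  }
}
{code}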



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8777:

Attachment: YARN-8777.003.patch

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch, 
> YARN-8777.003.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> the docker exec command against the running container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621259#comment-16621259
 ] 

Eric Yang commented on YARN-8777:
-

Based on today's meeting, there is a possibility that bash doesn't exist in the 
docker image. It would be nice to implement launch-command for docker exec. I 
will make adjustments accordingly.

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> the docker exec command against the running container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8774) Memory leak when CapacityScheduler allocates from reserved container with non-default label

2018-09-19 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621254#comment-16621254
 ] 

Eric Payne commented on YARN-8774:
--

Thanks [~Tao Yang] for reporting this issue and providing a patch. The fix 
looks good to me. I will finish reviewing the unit test tomorrow.

> Memory leak when CapacityScheduler allocates from reserved container with 
> non-default label
> ---
>
> Key: YARN-8774
> URL: https://issues.apache.org/jira/browse/YARN-8774
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0, 2.8.5
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Attachments: YARN-8774.001.patch
>
>
> The cause is that the RMContainerImpl instance of the reserved container lost its 
> node label expression: when the scheduler reserves containers for non-default 
> node-label requests, the instance is wrongly added into 
> LeafQueue#ignorePartitionExclusivityRMContainers and never removed.
> To reproduce this memory leak:
> (1) create reserved container
> RegularContainerAllocator#doAllocation:  create RMContainerImpl instanceA 
> (nodeLabelExpression="")
> LeafQueue#allocateResource:  RMContainerImpl instanceA is put into  
> LeafQueue#ignorePartitionExclusivityRMContainers
> (2) allocate from reserved container
> RegularContainerAllocator#doAllocation: create RMContainerImpl instanceB 
> (nodeLabelExpression="test-label")
> (3) From now on, RMContainerImpl instanceA will be left in memory (be kept in 
> LeafQueue#ignorePartitionExclusivityRMContainers) forever until RM restarted



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8783) Improve the documentation for the docker.trusted.registries configuration

2018-09-19 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621225#comment-16621225
 ] 

Jason Lowe commented on YARN-8783:
--

[~simonprewo] I added you to the YARN contributor list.  Feel free to assign 
this JIRA to yourself if you want to provide a patch.

> Improve the documentation for the docker.trusted.registries configuration
> -
>
> Key: YARN-8783
> URL: https://issues.apache.org/jira/browse/YARN-8783
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Simon Prewo
>Priority: Major
>  Labels: Docker, container-executor, docker
>
> I am deploying the default yarn distributed shell example:
> {code:java}
> yarn jar hadoop-yarn-applications-distributedshell.jar -shell_env 
> YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos -shell_command "sleep 90" -jar 
> hadoop-yarn-applications-distributedshell.jar -num_containers 1{code}
> Having a *single trusted registry configured like this works*:
> {code:java}
> docker.trusted.registries=centos{code}
> But having *a list of trusted registries configured fails* ("Shell error 
> output: image: centos is not trusted."):
> {code:java}
> docker.trusted.registries=centos,ubuntu{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8696) [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async

2018-09-19 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8696:
---
Attachment: YARN-8696.v6.patch

> [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async
> ---
>
> Key: YARN-8696
> URL: https://issues.apache.org/jira/browse/YARN-8696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8696.v1.patch, YARN-8696.v2.patch, 
> YARN-8696.v3.patch, YARN-8696.v4.patch, YARN-8696.v5.patch, YARN-8696.v6.patch
>
>
> Today in _FederationInterceptor_, the heartbeat to home sub-cluster is 
> synchronous. After the heartbeat is sent out to home sub-cluster, it waits 
> for the home response to come back before merging and returning the (merged) 
> heartbeat result back to the AM. If the home sub-cluster is suffering from connection 
> issues, or is down during a YarnRM master-slave switch, all heartbeat threads 
> in _FederationInterceptor_ will be blocked waiting for home response. As a 
> result, the successful UAM heartbeats from secondary sub-clusters will not be 
> returned to the AM at all. Additionally, because we kept the 
> same heartbeat responseId between the AM and the home RM, lots of tricky handling is 
> needed regarding the responseId resync when it comes to 
> _FederationInterceptor_ (part of AMRMProxy, NM) work preserving restart 
> (YARN-6127, YARN-1336), home RM master-slave switch etc. 
> In this patch, we change the heartbeat to the home sub-cluster to be asynchronous, 
> the same way we handle UAM heartbeats in secondaries, so that any 
> sub-cluster outage or connection issue won't prevent the AM from getting 
> responses from other sub-clusters. The responseId is also managed separately for the home 
> sub-cluster and AM, and they increment independently. The resync logic 
> becomes much cleaner. 
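To make the async design concrete, a minimal hedged sketch (class and method names are illustrative, not FederationInterceptor's real API): the home heartbeat is handed to a single-threaded sender, and whatever home responses have arrived are drained without blocking, so UAM responses from secondaries can always be returned to the AM.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncHomeHeartbeat {
  private final ExecutorService sender = Executors.newSingleThreadExecutor();
  private final BlockingQueue<String> homeResponses =
      new LinkedBlockingQueue<>();

  // Queue a heartbeat to the home sub-cluster without blocking the caller.
  public void sendAsync(String allocateRequest) {
    sender.submit(() -> {
      // A real interceptor would invoke the home RM here, managing its own
      // responseId independently of the AM-facing responseId.
      homeResponses.offer("response-for-" + allocateRequest);
    });
  }

  // Merge whatever home responses have arrived so far; never blocks on the
  // home RM.
  public List<String> drainHomeResponses() {
    List<String> out = new ArrayList<>();
    homeResponses.drainTo(out);
    return out;
  }
}
{code}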



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6456) Allow administrators to set a single ContainerRuntime for all containers

2018-09-19 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621224#comment-16621224
 ] 

Jason Lowe commented on YARN-6456:
--

My apologies for the delay -- I missed the last comment being posted.

Since there's quite a bit of discussion around the allowed images part of this 
patch, I think it would make sense to separate that part of it from the rest of 
the patch.  Technically all I think we really need here is the default image so 
the Docker container runtime knows how to force a particular image when the 
user didn't request a Docker container.  The allowed images feature is a 
nice-to-have thing that does not seem completely necessary, and we can file a 
followup JIRA to add the ability to limit not only to a set of trusted 
registries but also to a specified set of images.  The allowed images 
discussion can then be moved there, unblocking this one.

> Allow administrators to set a single ContainerRuntime for all containers
> 
>
> Key: YARN-6456
> URL: https://issues.apache.org/jira/browse/YARN-6456
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Craig Condit
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6456-ForceDockerRuntimeIfSupported.patch, 
> YARN-6456.001.patch, YARN-6456.002.patch, YARN-6456.003.patch
>
>
>  
> With LCE, there are multiple ContainerRuntimes available for handling 
> different types of containers; default, docker, java sandbox. Admins should 
> have the ability to override the user decision and set a single global 
> ContainerRuntime to be used for all containers.
> Original Description:
> {quote}One reason to use Docker containers is to be able to isolate different 
> workloads, even if they run as the same user.
> I have noticed some issues in the current design:
>  1. DockerLinuxContainerRuntime mounts containerLocalDirs 
> {{nm-local-dir/usercache/user/appcache/application_1491598755372_0011/}} and 
> userLocalDirs {{nm-local-dir/usercache/user/}}, so that a container can see 
> and modify the files of another container. I think the application file cache 
> directory should be enough for the container to run in most of the cases.
>  2. The whole cgroups directory is mounted. Would the container directory be 
> enough?
>  3. There is no way to enforce exclusive use of Docker for all containers. 
> There should be an option so that it is not the user but the admin who requires 
> the use of Docker.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7599) [GPG] ApplicationCleaner in Global Policy Generator

2018-09-19 Thread Botong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7599:
---
Attachment: YARN-7599-YARN-7402.v6.patch

> [GPG] ApplicationCleaner in Global Policy Generator
> ---
>
> Key: YARN-7599
> URL: https://issues.apache.org/jira/browse/YARN-7599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>  Labels: federation, gpg
> Attachments: YARN-7599-YARN-7402.v1.patch, 
> YARN-7599-YARN-7402.v2.patch, YARN-7599-YARN-7402.v3.patch, 
> YARN-7599-YARN-7402.v4.patch, YARN-7599-YARN-7402.v5.patch, 
> YARN-7599-YARN-7402.v6.patch
>
>
> In Federation, we need a cleanup service for StateStore as well as Yarn 
> Registry. For the former, we need to remove old application records. For the 
> latter, failed and killed applications might leave records in the Yarn 
> Registry (see YARN-6128). We plan to do both cleanup work in 
> ApplicationCleaner in GPG



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7599) [GPG] ApplicationCleaner in Global Policy Generator

2018-09-19 Thread Botong Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621216#comment-16621216
 ] 

Botong Huang commented on YARN-7599:


Thanks [~bibinchundatt] for the comment! v6 patch uploaded. 

bq. I was thinking of disabling cleaner while the GPG service is live
I see. Yeah, let's leave it as future work. For now, restarting GPG will do; it 
is an out-of-band service anyway. 

bq. Can you change to single configuration similar to 
dfs.http.client.retry.policy.spec {min,max,interval}
I already changed the new configs to something like 
application.cleaner.router.min.success. Is this what you meant? 

Somehow the link from the yetus run hasn't worked at all. I think the checkstyle 
run has some build issue. I just rebased the YARN-7402 base branch onto the 
latest trunk; let's see. 

> [GPG] ApplicationCleaner in Global Policy Generator
> ---
>
> Key: YARN-7599
> URL: https://issues.apache.org/jira/browse/YARN-7599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>  Labels: federation, gpg
> Attachments: YARN-7599-YARN-7402.v1.patch, 
> YARN-7599-YARN-7402.v2.patch, YARN-7599-YARN-7402.v3.patch, 
> YARN-7599-YARN-7402.v4.patch, YARN-7599-YARN-7402.v5.patch
>
>
> In Federation, we need a cleanup service for StateStore as well as Yarn 
> Registry. For the former, we need to remove old application records. For the 
> latter, failed and killed applications might leave records in the Yarn 
> Registry (see YARN-6128). We plan to do both cleanup work in 
> ApplicationCleaner in GPG



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621114#comment-16621114
 ] 

Hadoop QA commented on YARN-8769:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
33s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 12s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine: 
The patch generated 9 new + 42 unchanged - 0 fixed = 51 total (was 42) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 42s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
38s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine 
generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hadoop-yarn-submarine in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine
 |
|  |  
org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.TENSORBOARD_QUICKLINK_LABEL
 isn't final but should be  At YarnServiceJobSubmitter.java:be  At 
YarnServiceJobSubmitter.java:[line 60] |
|  |  Self assignment of field YarnServiceJobSubmitter.serviceSpec in 
org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.submitJob(RunJobParameters)
  At YarnServiceJobSubmitter.java:in 
org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.submitJob(RunJobParameters)
  At YarnServiceJobSubmitter.java:[line 538] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8769 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940481/YARN-8769.002.patch |
| Optional Tests 

[jira] [Commented] (YARN-8665) Yarn Service Upgrade: Support cancelling upgrade

2018-09-19 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621108#comment-16621108
 ] 

Chandni Singh commented on YARN-8665:
-

[~eyang] Thanks for the review
{quote}
Instances at READY/FAILED_UPGRADE state should be marked for NEEDS_UPGRADE, and 
instances at NEEDS_UPGRADE state should be reset to RUNNING_BUT_NOT_READY to 
revert the process. 
{quote}
In my second patch I will fix this so that if an instance has not been 
upgraded yet, cancellation is not triggered for it and it immediately goes 
back to the READY state.
Otherwise there would be race conditions when the component reads the state of 
the container and then updates it to a new state based on a condition.


> Yarn Service Upgrade:  Support cancelling upgrade
> -
>
> Key: YARN-8665
> URL: https://issues.apache.org/jira/browse/YARN-8665
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8665.001.patch
>
>
> When a service is upgraded without auto-finalization or express upgrade, the 
> upgrade can be cancelled. This gives the user the ability to test the 
> upgrade on a single instance and, if that doesn't go well, a chance to 
> cancel it.






[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621059#comment-16621059
 ] 

Eric Yang commented on YARN-8777:
-

[~ebadger] {quote}Where do the shell expansion cases come from? We invoke the 
docker binary using exec.{quote}

There is no shell expansion in this patch, but a junior developer who 
interacts with docker exec from a terminal might not know that their command 
is expanded by the shell before docker exec runs.
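
A tiny terminal example of that pitfall (the container name c1 is made up): 
the first command expands $HOME on the client host before docker ever sees 
it, while the second defers expansion to a shell inside the container:
{code}
$ docker exec c1 echo $HOME          # $HOME expanded by the host shell
$ docker exec c1 sh -c 'echo $HOME'  # $HOME expanded inside the container
{code}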

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to 
> execute the docker exec command against the running container.






[jira] [Commented] (YARN-8696) [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async

2018-09-19 Thread Botong Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621060#comment-16621060
 ] 

Botong Huang commented on YARN-8696:


Unit test failure in TestCapacityOverTimePolicy is irrelevant. 

> [AMRMProxy] FederationInterceptor upgrade: home sub-cluster heartbeat async
> ---
>
> Key: YARN-8696
> URL: https://issues.apache.org/jira/browse/YARN-8696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-8696.v1.patch, YARN-8696.v2.patch, 
> YARN-8696.v3.patch, YARN-8696.v4.patch, YARN-8696.v5.patch
>
>
> Today in _FederationInterceptor_, the heartbeat to the home sub-cluster is 
> synchronous. After the heartbeat is sent out to the home sub-cluster, it 
> waits for the home response to come back before merging and returning the 
> (merged) heartbeat result back to the AM. If the home sub-cluster is 
> suffering from connection issues, or is down during a YarnRM master-slave 
> switch, all heartbeat threads in _FederationInterceptor_ will be blocked 
> waiting for the home response. As a result, the successful UAM heartbeats 
> from secondary sub-clusters will not be returned to the AM at all. 
> Additionally, because we kept the same heartbeat responseId between the AM 
> and the home RM, lots of tricky handling is needed for the responseId resync 
> when it comes to _FederationInterceptor_ (part of AMRMProxy, NM) 
> work-preserving restart (YARN-6127, YARN-1336), home RM master-slave 
> switch, etc. 
> In this patch, we change the heartbeat to the home sub-cluster to 
> asynchronous, the same way we handle UAM heartbeats in the secondaries, so 
> that a sub-cluster outage or connection issue won't prevent the AM from 
> getting responses from other sub-clusters. The responseId is also managed 
> separately for the home sub-cluster and the AM, and they increment 
> independently. The resync logic becomes much cleaner. 
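
For illustration only (a sketch, not the FederationInterceptor code; the 
request/response types and sendToHomeRM below are placeholders), the heart of 
the async pattern could look like this:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Home heartbeats run on a dedicated thread with their own responseId, so a
// slow or failed home RM never blocks the AM-facing threads that merge and
// return the UAM responses.
public class AsyncHomeHeartbeatSketch {
  private final ExecutorService homeThread =
      Executors.newSingleThreadExecutor();
  private final AtomicInteger homeResponseId =
      new AtomicInteger();                        // independent of the AM's id
  private volatile String lastHomeResponse = "";  // placeholder response type

  public void heartbeatAsync(String request) {
    homeThread.submit(() ->
        lastHomeResponse =
            sendToHomeRM(request, homeResponseId.getAndIncrement()));
  }

  // The AM-facing path merges this with UAM responses without waiting.
  public String latestHomeResponse() {
    return lastHomeResponse;
  }

  private String sendToHomeRM(String req, int id) {  // placeholder RPC
    return "response-" + id;
  }
}
{code}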






[jira] [Commented] (YARN-8665) Yarn Service Upgrade: Support cancelling upgrade

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621049#comment-16621049
 ] 

Eric Yang commented on YARN-8665:
-

[~csingh] Thank you for the patch.  When cancel upgrade is triggered, the app 
master seems to reset all instances to the NEEDS_UPGRADE state and set the 
service state to CANCEL_UPGRADING.  That makes it hard to identify whether an 
instance should be restarted with the original configuration.  I think we can 
avoid resetting the NEEDS_UPGRADE state for all instances.  Instances at 
READY/FAILED_UPGRADE state should be marked NEEDS_UPGRADE, and instances at 
NEEDS_UPGRADE state should be reset to RUNNING_BUT_NOT_READY to revert the 
process.  This inversion approach might restore the service to its original 
form better, without the version-controlled reinit process.
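
A minimal sketch of that inversion, assuming instance-state names as used in 
this discussion (the real component-instance state machine has more states 
and guards):
{code:java}
public class CancelUpgradeSketch {
  // Hypothetical state names taken from this discussion.
  enum InstanceState {
    READY, FAILED_UPGRADE, NEEDS_UPGRADE, RUNNING_BUT_NOT_READY
  }

  static InstanceState onCancelUpgrade(InstanceState s) {
    switch (s) {
      case READY:            // upgraded successfully
      case FAILED_UPGRADE:   // upgrade attempted and failed
        return InstanceState.NEEDS_UPGRADE;          // revert to the old spec
      case NEEDS_UPGRADE:    // never upgraded; nothing to revert
        return InstanceState.RUNNING_BUT_NOT_READY;
      default:
        return s;
    }
  }
}
{code}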

If you choose to stay on course with the current implementation, the node 
manager's reports back to the app master might need a new versioning 
mechanism for the operation performed.  That would help track whether a 
reinit operation was performed for an upgrade or for an upgrade-cancel 
operation, like you described as a separate JIRA.  However, I would feel more 
comfortable solving the problem in this JIRA to make sure we don't 
destabilize the code base.

I also tried to launch the app, trigger an upgrade with the -initiate flag, 
and then cancel with the -cancel flag without actually upgrading any 
instance.  When this is done, the service gets stuck in the CANCEL_UPGRADING 
state without reverting back to the STABLE state.

> Yarn Service Upgrade:  Support cancelling upgrade
> -
>
> Key: YARN-8665
> URL: https://issues.apache.org/jira/browse/YARN-8665
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8665.001.patch
>
>
> When a service is upgraded without auto-finalization or express upgrade, the 
> upgrade can be cancelled. This gives the user the ability to test the 
> upgrade on a single instance and, if that doesn't go well, a chance to 
> cancel it.






[jira] [Created] (YARN-8800) Updated documentation of Submarine with latest examples.

2018-09-19 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8800:


 Summary: Updated documentation of Submarine with latest examples.
 Key: YARN-8800
 URL: https://issues.apache.org/jira/browse/YARN-8800
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan









[jira] [Commented] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621042#comment-16621042
 ] 

Wangda Tan commented on YARN-8769:
--

Thanks [~sunilg], attached ver.2

> [Submarine] Allow user to specify customized quicklink(s) when submit 
> Submarine job
> ---
>
> Key: YARN-8769
> URL: https://issues.apache.org/jira/browse/YARN-8769
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8769.001.patch, YARN-8769.002.patch
>
>
> This will be helpful when a user submits a job and some links need to be 
> shown on YARN UI2 (service page). For example, a user can specify a quick 
> link to the Zeppelin notebook UI when a Zeppelin notebook gets launched.
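
As a sketch of what the option could look like on the submit command line 
(the flag name and label=URL syntax here are assumptions until the patch 
lands):
{code}
yarn jar hadoop-yarn-submarine-<version>.jar job run \
  --name zeppelin-job-001 \
  --docker_image <image> \
  ... \
  --quicklink Notebook_UI=https://master-0:8080
{code}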






[jira] [Updated] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8769:
-
Attachment: YARN-8769.002.patch

> [Submarine] Allow user to specify customized quicklink(s) when submit 
> Submarine job
> ---
>
> Key: YARN-8769
> URL: https://issues.apache.org/jira/browse/YARN-8769
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8769.001.patch, YARN-8769.002.patch
>
>
> This will be helpful when a user submits a job and some links need to be 
> shown on YARN UI2 (service page). For example, a user can specify a quick 
> link to the Zeppelin notebook UI when a Zeppelin notebook gets launched.






[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621002#comment-16621002
 ] 

Wangda Tan commented on YARN-8468:
--

[~cheersyang], 

I saw YARN-8720 was just committed; what is the relationship between 
YARN-8720 and this ticket? I haven't checked the details yet. 

It seems CS-related enforcement is already resolved by YARN-8720, correct? If 
so, do we still require massive changes to common code? 

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for 
> enterprise apps, and wants to limit ad hoc jobs to small containers but 
> allow enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb would set the default maximum container 
> size for all queues, and the per-queue maximum would be set with the 
> “maxContainerResources” queue config value (a sketch follows the list below).
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that the queue resource cap cannot be larger than the scheduler 
> max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler.
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as well.
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
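
Below is a hypothetical allocation-file sketch of the proposed per-queue cap; 
the element name reuses the “maxContainerResources” value from the 
description, and the final name and format may differ:
{code:xml}
<allocations>
  <queue name="adhoc">
    <!-- hypothetical per-queue cap for ad hoc jobs -->
    <maxContainerResources>4096 mb, 2 vcores</maxContainerResources>
  </queue>
  <queue name="enterprise">
    <!-- no cap: falls back to yarn.scheduler.maximum-allocation-mb -->
  </queue>
</allocations>
{code}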






[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620989#comment-16620989
 ] 

Eric Badger commented on YARN-8777:
---

bq. Shell expansion corner cases will not handle gracefully and cause people to 
scratch their heads.
Where do the shell expansion cases come from? We invoke the docker binary using 
exec. 

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to 
> execute the docker exec command against the running container.






[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620969#comment-16620969
 ] 

Hadoop QA commented on YARN-8777:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
32m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
46s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8777 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940444/YARN-8777.002.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 20df1217f3f7 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1824d5d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21887/testReport/ |
| Max. process+thread count | 337 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21887/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to 
> execute the docker exec command against the running container.




[jira] [Commented] (YARN-8791) When STOPSIGNAL is not present then docker inspect returns an extra line feed

2018-09-19 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620958#comment-16620958
 ] 

Hudson commented on YARN-8791:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15018 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15018/])
YARN-8791. Trim docker inspect output for line feed for STOPSIGNAL (eyang: rev 
efdea85ad1cd4cc5a2a306898dbdb2c14b952d02)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java


> When STOPSIGNAL is not present then docker inspect returns an extra line feed
> -
>
> Key: YARN-8791
> URL: https://issues.apache.org/jira/browse/YARN-8791
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Fix For: 3.2.0
>
> Attachments: YARN-8791.001.patch
>
>
> When the STOPSIGNAL is missing, an extra line feed is appended to the 
> output, which messes with the signal sent to the docker container.
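
The idea of the fix, as a hedged sketch (the helper name below is made up; 
the actual change lives in DockerLinuxContainerRuntime): trim the docker 
inspect output so a missing STOPSIGNAL cannot leave a bare line feed as the 
signal:
{code:java}
// runDockerInspect is a hypothetical helper wrapping
// "docker inspect --format={{.Config.StopSignal}} <container>".
String raw = runDockerInspect(containerId, "{{.Config.StopSignal}}");
String stopSignal = raw.trim();   // strip the trailing line feed
if (stopSignal.isEmpty()) {
  stopSignal = "SIGTERM";         // Docker's default stop signal
}
{code}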






[jira] [Commented] (YARN-8757) [Submarine] Add Tensorboard component when --tensorboard is specified

2018-09-19 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620923#comment-16620923
 ] 

Hudson commented on YARN-8757:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15017 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15017/])
YARN-8757. [Submarine] Add Tensorboard component when --tensorboard is (sunilg: 
rev 1824d5d1c49c16db6341141fa204d4a4c02d0944)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/common/fs/RemoteDirectoryManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/runtimes/common/FSBasedSubmarineStorageImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/test/java/org/apache/hadoop/yarn/submarine/common/fs/MockRemoteDirectoryManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/param/RunJobParameters.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/CliConstants.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/runtimes/yarnservice/YarnServiceJobSubmitter.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/CliUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/common/fs/DefaultRemoteDirectoryManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/test/java/org/apache/hadoop/yarn/submarine/client/cli/yarnservice/TestYarnServiceRunJobCli.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/client/cli/RunJobCli.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/src/main/java/org/apache/hadoop/yarn/submarine/runtimes/yarnservice/YarnServiceUtils.java


> [Submarine] Add Tensorboard component when --tensorboard is specified
> -
>
> Key: YARN-8757
> URL: https://issues.apache.org/jira/browse/YARN-8757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Fix For: 3.2.0
>
> Attachments: YARN-8757.001.patch, YARN-8757.002.patch, 
> YARN-8757.003.patch
>
>
> We need to have a Tensorboard component when --tensorboard is specified, and 
> we need to set quicklinks to let users view Tensorboard.






[jira] [Commented] (YARN-8791) When STOPSIGNAL is not present then docker inspect returns an extra line feed

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620865#comment-16620865
 ] 

Eric Yang commented on YARN-8791:
-

+1

> When STOPSIGNAL is not present then docker inspect returns an extra line feed
> -
>
> Key: YARN-8791
> URL: https://issues.apache.org/jira/browse/YARN-8791
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8791.001.patch
>
>
> When the STOPSIGNAL is missing, an extra line feed is appended to the 
> output, which messes with the signal sent to the docker container.






[jira] [Comment Edited] (YARN-8725) Submarine job staging directory has a lot of useless PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple times

2018-09-19 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620834#comment-16620834
 ] 

Zhankun Tang edited comment on YARN-8725 at 9/19/18 4:52 PM:
-

[~leftnoteasy]
{quote}cleanup whole staging dir seems overkill because models, etc. by default 
is placed under the directory as well.
{quote}
Thanks for pointing out this model directory issue. This patch does seem like 
overkill; let's put it on hold.

I double-checked the relations between "--checkpoint_path", 
"--saved_model_path" and "--input_path".

For "--checkpoint_path", the situations are as below:

1. If set (in my case I set "--checkpoint_path 
hdfs://default/user/yarn/cifar-10-jobdir"), it'll be safe to delete the 
staging dir, since the checkpoint dir that contains the model data is not in 
the staging dir.

2. If not set, it'll replace "%checkpoint_path%" with 
"submarine/jobs/tf-job-001/staging/checkpoint_path" due to the code below:

 
{code:java}
public static String replacePatternsInLaunchCommand(String specifiedCli,
    RunJobParameters jobRunParameters,
    RemoteDirectoryManager directoryManager) throws IOException {
  String jobDir = jobRunParameters.getCheckpointPath();
  if (null == jobDir) {
    jobDir = directoryManager.getJobCheckpointDir(jobRunParameters.getName(),
        true).toString();
  }

  String input = jobRunParameters.getInputPath();
  String savedModelDir = jobRunParameters.getSavedModelPath();
  if (null == savedModelDir) {
    savedModelDir = jobDir;
  }

  Map<String, String> replacePattern = new HashMap<>();
  if (jobDir != null) {
    replacePattern.put("%" + CliConstants.CHECKPOINT_PATH + "%", jobDir);
  }
  ...
  if (savedModelDir != null) {
    replacePattern.put("%" + CliConstants.SAVED_MODEL_PATH + "%",
        savedModelDir);
  }
}{code}
 

 
{code:java}
2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker 
command =[cd /cifar10_estimator && python cifar10_main.py 
--data-dir=hdfs://default/user/yarn/cifar-10-data 
--job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 
--train-steps=2]{code}
 

But the job failed due to an invalid path passed to "--job-dir" per my 
testing. It should be a URI starting with "hdfs://".

Attached is the script I used for this testing. I have submitted a JIRA to 
track this: YARN-8799.
{code:java}
yarn jar 
$HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 -verbose \
 -wait_job_finish \
 -keep_staging_dir \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --input_path hdfs://default/user/yarn/cifar-10-data \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
--data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
--train-steps=2"{code}
 

*I thought that if a user sets the "--checkpoint_path" option explicitly, it 
is unlikely that they would point it at a path under the internal staging 
dir, like 
"hdfs://default/user/yarn/submarine/jobs/tf-job-001/staging/checkpoint_path" 
(the user shouldn't need to know such internal details). On the other hand, 
not setting this option but still using the "%checkpoint_path%" pattern in 
the worker command seems strange to me.*

*So I assumed we'd put checkpoint_path outside of the staging dir, and missed 
both the above script test and the fact that the current code intends to put 
it under the staging dir by default. :)*

 

For the "–input_path", it's a must option. Nothing more to discuss except we 
should check invalid value(YARN-8798).

For the "--saved_model_path", it might have the same default value issue(needs 
more tests).  But it's mainly for serving. won't discuss here.

 
{quote}And logics in your patch cleans up dirs after job submitted. It is 
possible that workers get launched after dir got deleted.
{quote}
Could you please elaborate on this?
{quote}I'm not sure if we can do many meaningful things here in the client 
code. It might be better to do this in the server side, I don't have a clear 
idea about how to handle the service part. Maybe it should be a plugin of 
ApiServer, or it is a completely new service like a system service.
{quote}
Maybe let's talk about this offline.


was (Author: tangzhankun):
[~leftnoteasy]
{quote}cleanup whole staging dir seems overkill because models, etc. by default 
is placed under the directory as well.
{quote}
Thanks for pointing out this model directory stuff. This patch seems overkill. 
Let's hold on it.

I double-checked the relations between "--checkpoint_path", 
"--saved_model_path" and "–input_path".

For the --checkpoint_path", the situations are as below:

1. if set, in my case I set this "--checkpoint_path 
hdfs://default/user/yarn/cifar-10-jobdir". So it'll be safe to delete the 
staging dir since the checkout_dir that contains model data is not in staging 
dir.

2. if not 

[jira] [Commented] (YARN-8769) [Submarine] Allow user to specify customized quicklink(s) when submit Submarine job

2018-09-19 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620839#comment-16620839
 ] 

Sunil Govindan commented on YARN-8769:
--

This patch doesn't apply post YARN-8757. [~leftnoteasy] could you please help 
rebase it?

> [Submarine] Allow user to specify customized quicklink(s) when submit 
> Submarine job
> ---
>
> Key: YARN-8769
> URL: https://issues.apache.org/jira/browse/YARN-8769
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8769.001.patch
>
>
> This will be helpful when a user submits a job and some links need to be 
> shown on YARN UI2 (service page). For example, a user can specify a quick 
> link to the Zeppelin notebook UI when a Zeppelin notebook gets launched.






[jira] [Commented] (YARN-8725) Submarine job staging directory has a lot of useless PRIMARY_WORKER-launch-script-***.sh scripts when submitting a job multiple times

2018-09-19 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620834#comment-16620834
 ] 

Zhankun Tang commented on YARN-8725:


[~leftnoteasy]
{quote}cleanup whole staging dir seems overkill because models, etc. by default 
is placed under the directory as well.
{quote}
Thanks for pointing out this model directory issue. This patch does seem like 
overkill; let's put it on hold.

I double-checked the relations between "--checkpoint_path", 
"--saved_model_path" and "--input_path".

For "--checkpoint_path", the situations are as below:

1. If set (in my case I set "--checkpoint_path 
hdfs://default/user/yarn/cifar-10-jobdir"), it'll be safe to delete the 
staging dir, since the checkpoint dir that contains the model data is not in 
the staging dir.

2. If not set, it'll replace "%checkpoint_path%" with 
"submarine/jobs/tf-job-001/staging/checkpoint_path" due to the code below:

 
{code:java}
public static String replacePatternsInLaunchCommand(String specifiedCli,
    RunJobParameters jobRunParameters,
    RemoteDirectoryManager directoryManager) throws IOException {
  String jobDir = jobRunParameters.getCheckpointPath();
  if (null == jobDir) {
    jobDir = directoryManager.getJobCheckpointDir(jobRunParameters.getName(),
        true).toString();
  }

  String input = jobRunParameters.getInputPath();
  String savedModelDir = jobRunParameters.getSavedModelPath();
  if (null == savedModelDir) {
    savedModelDir = jobDir;
  }

  Map<String, String> replacePattern = new HashMap<>();
  if (jobDir != null) {
    replacePattern.put("%" + CliConstants.CHECKPOINT_PATH + "%", jobDir);
  }
  ...
  if (savedModelDir != null) {
    replacePattern.put("%" + CliConstants.SAVED_MODEL_PATH + "%",
        savedModelDir);
  }
}{code}
 

 
{code:java}
2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker 
command =[cd /cifar10_estimator && python cifar10_main.py 
--data-dir=hdfs://default/user/yarn/cifar-10-data 
--job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 
--train-steps=2]{code}
 

But the job failed due to an invalid path passed to "--job-dir" per my 
testing. It should be a URI starting with "hdfs://".

Attached is the script I used for this testing. I have submitted a JIRA to 
track this: YARN-8799.
{code}
yarn jar 
$HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 -verbose \
 -wait_job_finish \
 -keep_staging_dir \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --input_path hdfs://default/user/yarn/cifar-10-data \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
--data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
--train-steps=2"{code}
 

*I thought that if a user sets the "--checkpoint_path" option explicitly, it 
is unlikely that they would point it at a path under the internal staging 
dir, like 
"hdfs://default/user/yarn/submarine/jobs/tf-job-001/staging/checkpoint_path" 
(the user shouldn't need to know such internal details). On the other hand, 
not setting this option but still using the "%checkpoint_path%" pattern in 
the worker command seems strange to me.*

*So I assumed we'd put checkpoint_path outside of the staging dir, and missed 
both the above script test and the fact that the current code intends to put 
it under the staging dir by default. :)*

 

For the "–input_path", it's a must option. Nothing more to discuss except we 
should check invalid value(YARN-8798).

For the "--saved_model_path", it might have the same default value issue(needs 
more tests).  But it's mainly for serving. won't discuss here.

 
{quote}And logics in your patch cleans up dirs after job submitted. It is 
possible that workers get launched after dir got deleted.
{quote}
Could you please elaborate on this?
{quote}I'm not sure if we can do many meaningful things here in the client 
code. It might be better to do this in the server side, I don't have a clear 
idea about how to handle the service part. Maybe it should be a plugin of 
ApiServer, or it is a completely new service like a system service.
{quote}
Maybe let's talk about this offline.

> Submarine job staging directory has a lot of useless 
> PRIMARY_WORKER-launch-script-***.sh  scripts when submitting a job multiple 
> times
> --
>
> Key: YARN-8725
> URL: https://issues.apache.org/jira/browse/YARN-8725
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zac Zhou
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-8725-trunk.001.patch
>
>
> Submarine jobs upload core-site.xml, hdfs-site.xml, job.info and 
> 

[jira] [Commented] (YARN-8757) [Submarine] Add Tensorboard component when --tensorboard is specified

2018-09-19 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620833#comment-16620833
 ] 

Sunil Govindan commented on YARN-8757:
--

+1. Looks good to me.

Committing shortly if no objections.

> [Submarine] Add Tensorboard component when --tensorboard is specified
> -
>
> Key: YARN-8757
> URL: https://issues.apache.org/jira/browse/YARN-8757
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8757.001.patch, YARN-8757.002.patch, 
> YARN-8757.003.patch
>
>
> We need to have a Tensorboard component when --tensorboard is specified, and 
> we need to set quicklinks to let users view Tensorboard.






[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620831#comment-16620831
 ] 

Eric Yang commented on YARN-8777:
-

[~Zian Chen] Patch 002 fixed the comment in the header file.

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to 
> execute the docker exec command against the running container.






[jira] [Updated] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8777:

Attachment: YARN-8777.002.patch

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch, YARN-8777.002.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to 
> execute the docker exec command against the running container.






[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620815#comment-16620815
 ] 

Eric Yang commented on YARN-8777:
-

[~ebadger] {quote}Opening up a bash session allows the user to then execute 
whatever commands they want to anyway. Am I missing something here?{quote}

The difference is the risk of corrupting container-executor's memory before 
docker exec -it bash is launched.  The argv is constructed and checked in 
container-executor and forwarded to docker exec.  If it is a fixed number of 
arguments, the container-executor logic has little chance of getting it 
wrong.  If the number is dynamic, passing more args than docker's parameter 
parser expects has unpredictable results, along with other corner cases such 
as:

{code}
cmd="mongo --eval 'rs.isMaster()"
docker exec d886e775dfad "$cmd"
{code}

Docker will look for a binary named "mongo --eval 'rs.isMaster()'" instead of 
mongo followed by its parameters.  Shell expansion corner cases will not be 
handled gracefully and will cause people to scratch their heads.  It is 
entirely possible to use ProcessBuilder to launch container-executor, run 
docker exec, and send a unix command to be executed.  The added bash gives 
less experienced developers the ability to script their execution without 
thinking about parameter-passing overflow and shell expansion.  The commands 
are run inside the container without leaking problems to the 
container-executor level.  Hope this explains the choice of parameter passing 
for this use case.
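
For contrast, a minimal self-contained sketch of the argv-style invocation 
mentioned above (container id and command reused from the example; in real 
use the argv would be built by container-executor, not a demo class): each 
argument is its own array element, so no shell re-parsing can collapse 
"mongo --eval 'rs.isMaster()'" into a single binary name:
{code:java}
import java.util.Arrays;
import java.util.List;

public class DockerExecArgv {
  public static void main(String[] args) throws Exception {
    List<String> argv = Arrays.asList(
        "docker", "exec", "-i", "d886e775dfad",  // container id from above
        "mongo", "--eval", "rs.isMaster()");     // each token its own element
    Process p = new ProcessBuilder(argv).inheritIO().start();
    System.exit(p.waitFor());
  }
}
{code}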

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to 
> execute the docker exec command against the running container.






[jira] [Updated] (YARN-8799) [Submarine] Correct the default directory path in HDFS for "checkout_path"

2018-09-19 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-8799:
---
Description: 
 
{code:java}
yarn jar 
$HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 -verbose \
 -wait_job_finish \
 -keep_staging_dir \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --input_path hdfs://default/user/yarn/cifar-10-data \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
--data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
--train-steps=5"{code}
 

The above script should work, but the job failed due to an invalid path 
passed to "--job-dir" per my testing. It should be a URI starting with 
"hdfs://".
{code:java}
2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker 
command =[cd /cifar10_estimator && python cifar10_main.py 
--data-dir=hdfs://default/user/yarn/cifar-10-data 
--job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 
--train-steps=2]{code}

  was:
It might be simpler for the user to use "--checkpoint_path" if we provide a 
default path in HDFS for checkpoint_path. It could be under the current 
staging dir or another place that makes sense.
{code:java}
yarn jar 
$HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 -verbose \
 -wait_job_finish \
 -keep_staging_dir \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --input_path hdfs://default/user/yarn/cifar-10-data \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
--data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
--train-steps=5"{code}
The above script should work, but the job failed due to an invalid path 
passed to "--job-dir" per my testing. It should be a URI starting with 
"hdfs://".


> [Submarine] Correct the default directory path in HDFS for "checkout_path"
> --
>
> Key: YARN-8799
> URL: https://issues.apache.org/jira/browse/YARN-8799
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Fix For: 3.2.0
>
>
>  
> {code:java}
> yarn jar 
> $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
>  job run \
>  -verbose \
>  -wait_job_finish \
>  -keep_staging_dir \
>  --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
>  --name tf-job-001 \
>  --docker_image tangzhankun/tensorflow \
>  --input_path hdfs://default/user/yarn/cifar-10-data \
>  --worker_resources memory=4G,vcores=2 \
>  --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
> --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
> --train-steps=5"{code}
>  
> The above script should work, but the job failed due to an invalid path 
> passed to "--job-dir" per my testing. It should be a URI starting with 
> "hdfs://".
> {code:java}
> 2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker 
> command =[cd /cifar10_estimator && python cifar10_main.py 
> --data-dir=hdfs://default/user/yarn/cifar-10-data 
> --job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 
> --train-steps=2]{code}






[jira] [Updated] (YARN-8799) [Submarine] Correct the default directory path in HDFS for "checkout_path"

2018-09-19 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-8799:
---
Summary: [Submarine] Correct the default directory path in HDFS for 
"checkout_path"  (was: [Submarine] A default directory path in HDFS for 
"checkout_path"?)

> [Submarine] Correct the default directory path in HDFS for "checkout_path"
> --
>
> Key: YARN-8799
> URL: https://issues.apache.org/jira/browse/YARN-8799
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Fix For: 3.2.0
>
>
> It might be simpler for the user to use "--checkpoint_path" if we provide a 
> default path in HDFS for checkpoint_path. It could be under the current 
> staging dir or another place that makes sense.
> {code:java}
> yarn jar 
> $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
>  job run \
>  -verbose \
>  -wait_job_finish \
>  -keep_staging_dir \
>  --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
>  --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
>  --name tf-job-001 \
>  --docker_image tangzhankun/tensorflow \
>  --input_path hdfs://default/user/yarn/cifar-10-data \
>  --worker_resources memory=4G,vcores=2 \
>  --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
> --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
> --train-steps=5"{code}
> The above script should work, but the job failed due to an invalid path 
> passed to "--job-dir" per my testing. It should be a URI starting with 
> "hdfs://".






[jira] [Commented] (YARN-8791) When STOPSIGNAL is not present then docker inspect returns an extra line feed

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620784#comment-16620784
 ] 

Hadoop QA commented on YARN-8791:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
7s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8791 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940324/YARN-8791.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux af03f61a7de9 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 15ed74f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21886/testReport/ |
| Max. process+thread count | 300 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21886/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |

[jira] [Created] (YARN-8799) [Submarine] A default directory path in HDFS for "checkout_path"?

2018-09-19 Thread Zhankun Tang (JIRA)
Zhankun Tang created YARN-8799:
--

 Summary: [Submarine] A default directory path in HDFS for 
"checkout_path"?
 Key: YARN-8799
 URL: https://issues.apache.org/jira/browse/YARN-8799
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhankun Tang
Assignee: Zhankun Tang
 Fix For: 3.2.0


It might be simpler for users to use "--checkout_path" if we provide a 
default path in HDFS for checkout_path. It could be under the current staging dir 
or some other place that makes sense.
{code:java}
yarn jar 
$HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 -verbose \
 -wait_job_finish \
 -keep_staging_dir \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --input_path hdfs://default/user/yarn/cifar-10-data \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
--data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
--train-steps=5"{code}
The above script should work, but the job failed due to an invalid path passed to 
"--job-dir" per my testing. It should be a URI starting with "hdfs://".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8783) Improve the documentation for the docker.trusted.registries configuration

2018-09-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620778#comment-16620778
 ] 

Eric Yang commented on YARN-8783:
-

[~simonprewo] The suggested changes look good to me.  Would you like to take 
ownership of this JIRA and generate a patch?  I don't have access to assign the 
JIRA to you.  We may need to find a PMC member to add you as a contributor.

> Improve the documentation for the docker.trusted.registries configuration
> -
>
> Key: YARN-8783
> URL: https://issues.apache.org/jira/browse/YARN-8783
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.1
>Reporter: Simon Prewo
>Priority: Major
>  Labels: Docker, container-executor, docker
>
> I am deploying the default yarn distributed shell example:
> {code:java}
> yarn jar hadoop-yarn-applications-distributedshell.jar -shell_env 
> YARN_CONTAINER_RUNTIME_TYPE=docker -shell_env 
> YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos -shell_command "sleep 90" -jar 
> hadoop-yarn-applications-distributedshell.jar -num_containers 1{code}
> Having a *single trusted registry configured like this works*:
> {code:java}
> docker.trusted.registries=centos{code}
> But having *a list of trusted registries configured fails* ("Shell error 
> output: image: centos is not trusted."):
> {code:java}
> docker.trusted.registries=centos,ubuntu{code}
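For the documentation, the semantic to describe is a comma-separated prefix match. 
A minimal Java sketch of that semantic (an assumption for illustration only; the 
real check lives in the native container-executor, not in Java):
{code:java}
// Illustrative only: split docker.trusted.registries on commas and treat each
// entry as a registry/image prefix. Not the actual container-executor C code.
static boolean isTrusted(String image, String trustedRegistries) {
  for (String entry : trustedRegistries.split(",")) {
    String trusted = entry.trim();
    if (image.equals(trusted) || image.startsWith(trusted + "/")) {
      return true;  // e.g. image "centos" matches entry "centos"
    }
  }
  return false;
}
{code}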



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8777) Container Executor C binary change to execute interactive docker command

2018-09-19 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620751#comment-16620751
 ] 

Eric Badger commented on YARN-8777:
---

bq. The enum approach can be used for fixed number of parameters or a small set 
of parameters. It is probably not an ideal interface to pass arbitrary commands 
to container-executor for docker exec. One possible danger is sending hex code 
as argv to trigger buffer overflow in container-executor or docker, where there 
is no logic to validate the arbitrary command.
I don't see how the attack surface is any different with bash vs arbitrary 
commands. Opening up a bash session allows the user to then execute whatever 
commands they want to anyway. Am I missing something here?

> Container Executor C binary change to execute interactive docker command
> 
>
> Key: YARN-8777
> URL: https://issues.apache.org/jira/browse/YARN-8777
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zian Chen
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8777.001.patch
>
>
> Since Container Executor provides container execution using the native 
> container-executor binary, we also need to make changes to accept a new 
> “dockerExec” method that invokes the corresponding native function to execute 
> the docker exec command against the running container.
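On the validation point debated above, a sketch of the kind of argv construction 
in question: the subcommand is fixed, the container id is checked, and arguments 
are passed as an array rather than a concatenated shell string. This is an 
illustrative assumption, not the actual C binary change:
{code:java}
// Illustrative sketch only; the real change is in the native container-executor.
import java.util.List;

class DockerExecArgvSketch {
  static List<String> execArgv(String containerId) {
    // Reject ids that could smuggle extra arguments or shell metacharacters.
    if (!containerId.matches("[a-zA-Z0-9_.-]+")) {
      throw new IllegalArgumentException("Suspicious container id: " + containerId);
    }
    // Passed as argv, never concatenated into a shell command line.
    return List.of("docker", "exec", "-it", containerId, "/bin/bash");
  }
}
{code}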



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7599) [GPG] ApplicationCleaner in Global Policy Generator

2018-09-19 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620437#comment-16620437
 ] 

Bibin A Chundatt edited comment on YARN-7599 at 9/19/18 3:26 PM:
-

Thanks [~botong] for the updated patch
{quote}Application cleaner is disabled when 
YarnConfiguration.GPG_APPCLEANER_INTERVAL_MS is set to zero or negative value:
{quote} # I was thinking of disabling the cleaner while the GPG service is live, 
with a shell command to reset the clean interval, similar to 
{{HSAdminServer#refreshLogRetentionSettings}}. As future work.
 # Related to the testcase name, IMHO better to rename it to 
testFederationStoreAppsCleanUp.
 # Can you change to a single configuration similar to 
dfs.http.client.retry.policy.spec \{min,max,interval}? Any comments on this 
point?
{code:java}
60if (LOG.isDebugEnabled()) {
61  LOG.debug("List of apps: ", 
routerApps.stream().map(Object::toString)
62  .collect(Collectors.joining(",")));
63}
{code}

 # Regarding logs, I think it is better to print the candidates for deletion, 
instead of all apps.
 # Please handle the applicable checkstyle issues too. I am not able to open the 
CI results now.


was (Author: bibinchundatt):
Thanks [~botong] for updated patch

{quote}
Application cleaner is disabled when 
YarnConfiguration.GPG_APPCLEANER_INTERVAL_MS is set to zero or negative value:
{quote}
# I was thinking of disabling cleaner while the GPG service is live. Shell 
command to reset clean interval. Similar to  
{{HSAdminServer#refreshLogRetentionSettings}}. As future work 
# Related to testcase name,IMHO better rename to testFederationStoreAppsCleanUp.
# Can you change to single configuration similar to 
dfs.http.client.retry.policy.spec {min,max,interval}. Any comments on this 
point?
{code}
60if (LOG.isDebugEnabled()) {
61  LOG.debug("List of apps: ", 
routerApps.stream().map(Object::toString)
62  .collect(Collectors.joining(",")));
63}
{code}
# Regarding logs, i think better to print candidates for delete . Instead of 
all apps
# Handle checkstyle issues .I am not able to CI results now.

> [GPG] ApplicationCleaner in Global Policy Generator
> ---
>
> Key: YARN-7599
> URL: https://issues.apache.org/jira/browse/YARN-7599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>  Labels: federation, gpg
> Attachments: YARN-7599-YARN-7402.v1.patch, 
> YARN-7599-YARN-7402.v2.patch, YARN-7599-YARN-7402.v3.patch, 
> YARN-7599-YARN-7402.v4.patch, YARN-7599-YARN-7402.v5.patch
>
>
> In Federation, we need a cleanup service for StateStore as well as Yarn 
> Registry. For the former, we need to remove old application records. For the 
> latter, failed and killed applications might leave records in the Yarn 
> Registry (see YARN-6128). We plan to do both pieces of cleanup work in the 
> ApplicationCleaner in the GPG
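A minimal sketch of the enable/disable convention mentioned in the review above 
(a zero or negative interval disables the cleaner); names are illustrative, not 
the patch:
{code:java}
// Hedged sketch: schedule the cleanup task only when the interval is positive.
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class CleanerScheduleSketch {
  static void maybeSchedule(ScheduledExecutorService scheduler,
      Runnable cleanerTask, long intervalMs) {
    if (intervalMs <= 0) {
      return;  // disabled: the cleaner thread is never started
    }
    scheduler.scheduleAtFixedRate(cleanerTask, intervalMs, intervalMs,
        TimeUnit.MILLISECONDS);
  }
}
{code}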



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8791) When STOPSIGNAL is not present then docker inspect returns an extra line feed

2018-09-19 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620748#comment-16620748
 ] 

Shane Kumpf commented on YARN-8791:
---

Thanks for the patch, [~csingh]! I have confirmed this fixes the issue. +1 
pending Jenkins

> When STOPSIGNAL is not present then docker inspect returns an extra line feed
> -
>
> Key: YARN-8791
> URL: https://issues.apache.org/jira/browse/YARN-8791
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8791.001.patch
>
>
> When the STOPSIGNAL is missing, an extra line feed is appended to the 
> output. This messes with the signal sent to the docker container.
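A sketch of the kind of defensive parsing this implies (an assumption for 
illustration, not the attached patch): trim the docker inspect output before 
treating it as a signal name, and fall back to docker's default stop signal 
when it is empty:
{code:java}
// Illustrative sketch; not the actual YARN-8791 patch.
static String parseStopSignal(String dockerInspectOutput) {
  String signal = dockerInspectOutput == null ? "" : dockerInspectOutput.trim();
  // Docker defaults to SIGTERM when no STOPSIGNAL is set in the image.
  return signal.isEmpty() ? "SIGTERM" : signal;
}
{code}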



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8784) DockerLinuxContainerRuntime prevents access to distributed cache entries on a full disk

2018-09-19 Thread Jason Lowe (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620747#comment-16620747
 ] 

Jason Lowe commented on YARN-8784:
--

Thanks for the patch!  We should be fine bind-mounting the full and good disk 
locations and ignoring the bad disks.  After YARN-3591 containers should no 
longer be referencing bad disks when launching.

+1 lgtm.  I'll commit this later today if there are no objections.

> DockerLinuxContainerRuntime prevents access to distributed cache entries on a 
> full disk
> ---
>
> Key: YARN-8784
> URL: https://issues.apache.org/jira/browse/YARN-8784
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Jason Lowe
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8784.001.patch
>
>
> DockerLinuxContainerRuntime bind mounts the filecache and usercache 
> directories into the container to allow tasks to access entries in the 
> distributed cache.  However it only bind mounts  directories on disks that 
> are considered good, and disks that are full or bad are not in that list.  If 
> a container tries to run with a distributed cache entry that has been 
> previously localized to a disk that is now considered full/bad, the dist 
> cache directory will _not_ be bind-mounted into the container's filesystem 
> namespace.  At that point any symlinks in the container's current working 
> directory that point to those disks will reference invalid paths.
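A sketch of the direction of the fix as described (illustrative names, not the 
NodeManager API): include localized directories from full disks in the 
bind-mount list, since previously localized files there remain readable even 
though new writes are not allowed:
{code:java}
// Hedged sketch only; the real logic lives in DockerLinuxContainerRuntime.
import java.util.ArrayList;
import java.util.List;

class MountListSketch {
  static List<String> dirsToBindMount(List<String> goodDirs, List<String> fullDirs) {
    List<String> mounts = new ArrayList<>(goodDirs);
    mounts.addAll(fullDirs);  // full disks: read-only for new data, but existing
                              // dist-cache entries remain readable
    return mounts;
  }
}
{code}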



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8655) FairScheduler: FSStarvedApps is not thread safe

2018-09-19 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620735#comment-16620735
 ] 

Antal Bálint Steinbach commented on YARN-8655:
--

Hi [~uranus],

Thanks for uploading the patch.

If I understand correctly, the reason to replace take() with poll() is that you 
don't want to do synchronization on a blocking call.

I have some suggestions about it:

1) The method name FSStarvedApps.take() is not consistent, since it is a poll, 
not a take.

2) The comments in FSStarvedApps say that take() is a blocking method.

3) The behavior of the class changed: with your solution the thread will spin 
through an endless loop very often and take a high-cost lock in every iteration. 
Maybe it would be worth adding some sleep in the loop, just like before (see the 
sketch below).
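To make point 3 concrete, a minimal sketch (assuming the class shape from the 
issue description quoted below, not the actual patch) where both the queue poll 
and the appBeingProcessed update happen under the same lock, with a short sleep 
to avoid busy-spinning:
{code:java}
// Illustrative sketch, not the actual FSStarvedApps patch.
import java.util.concurrent.LinkedBlockingQueue;

class StarvedAppsSketch<T> {
  private final LinkedBlockingQueue<T> appsToProcess = new LinkedBlockingQueue<>();
  private T appBeingProcessed;

  synchronized void addStarvedApp(T app) {
    if (!app.equals(appBeingProcessed) && !appsToProcess.contains(app)) {
      appsToProcess.add(app);
    }
  }

  T poll() throws InterruptedException {
    while (true) {
      synchronized (this) {
        T app = appsToProcess.poll();   // removal and bookkeeping are atomic,
        if (app != null) {              // so the re-add race cannot occur
          appBeingProcessed = app;
          return app;
        }
        appBeingProcessed = null;
      }
      Thread.sleep(10);                 // avoid a hot loop while the queue is empty
    }
  }
}
{code}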

> FairScheduler: FSStarvedApps is not thread safe
> ---
>
> Key: YARN-8655
> URL: https://issues.apache.org/jira/browse/YARN-8655
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Zhaohui Xin
>Assignee: Zhaohui Xin
>Priority: Major
> Attachments: YARN-8655.patch
>
>
> *FSStarvedApps is not thread safe; this may cause one starved app to be 
> processed twice in a row.*
> For example, when app1 is fair-share starved, it is added to 
> appsToProcess. After that, app1 is taken, but appBeingProcessed is not yet 
> updated to app1. At that moment, app1 is starved by min share, so the app is 
> added to appsToProcess again, because appBeingProcessed is null and 
> appsToProcess does not contain it either. 
> {code:java}
> void addStarvedApp(FSAppAttempt app) {
> if (!app.equals(appBeingProcessed) && !appsToProcess.contains(app)) {
> appsToProcess.add(app);
> }
> }
> FSAppAttempt take() throws InterruptedException {
>   // Reset appBeingProcessed before the blocking call
>   appBeingProcessed = null;
>   // Blocking call to fetch the next starved application
>   FSAppAttempt app = appsToProcess.take();
>   appBeingProcessed = app;
>   return app;
> }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8798) [Submarine] Job should not be submitted if "--input_path" option is missing

2018-09-19 Thread Zhankun Tang (JIRA)
Zhankun Tang created YARN-8798:
--

 Summary: [Submarine] Job should not be submitted if "--input_path" 
option is missing
 Key: YARN-8798
 URL: https://issues.apache.org/jira/browse/YARN-8798
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhankun Tang
Assignee: Zhankun Tang
 Fix For: 3.2.0


If a user doesn't set the "--input_path" option, the job will still be submitted. 
Here is my command to run the job:
{code:java}
yarn jar 
$HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar
 job run \
 -verbose \
 -wait_job_finish \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
 --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
 --name tf-job-001 \
 --docker_image tangzhankun/tensorflow \
 --worker_resources memory=4G,vcores=2 \
 --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py 
--data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 
--train-steps=5"{code}
Due to the lack of a validity check, the job is still submitted. We should add a 
check for this.
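A tiny sketch of the missing check (names are hypothetical, not the actual 
Submarine parameter-parsing code):
{code:java}
// Illustrative sketch only: fail fast before submitting the job.
static void validateInputPath(String inputPath) {
  if (inputPath == null || inputPath.trim().isEmpty()) {
    throw new IllegalArgumentException(
        "--input_path is mandatory; refusing to submit the job");
  }
}
{code}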



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620713#comment-16620713
 ] 

Hadoop QA commented on YARN-8468:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 15 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 29s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 22 new + 880 unchanged - 22 fixed = 902 total (was 902) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 45s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m 
26s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8468 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940422/YARN-8468.014.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  

[jira] [Updated] (YARN-8791) When STOPSIGNAL is not present then docker inspect returns an extra line feed

2018-09-19 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-8791:
--
Issue Type: Sub-task  (was: Bug)
Parent: YARN-8472

> When STOPSIGNAL is not present then docker inspect returns an extra line feed
> -
>
> Key: YARN-8791
> URL: https://issues.apache.org/jira/browse/YARN-8791
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8791.001.patch
>
>
> When the STOPSIGNAL is missing, an extra line feed is appended to the 
> output. This messes with the signal sent to the docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8791) When STOPSIGNAL is not present then docker inspect returns an extra line feed

2018-09-19 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-8791:
--
Labels: Docker  (was: )

> When STOPSIGNAL is not present then docker inspect returns an extra line feed
> -
>
> Key: YARN-8791
> URL: https://issues.apache.org/jira/browse/YARN-8791
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8791.001.patch
>
>
> When the STOPSIGNAL is missing, an extra line feed is appended to the 
> output. This messes with the signal sent to the docker container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8744) In some cases docker kill is used to stop non-privileged containers instead of sending the signal directly

2018-09-19 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-8744:
--
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-8472

> In some cases docker kill is used to stop non-privileged containers instead 
> of sending the signal directly
> --
>
> Key: YARN-8744
> URL: https://issues.apache.org/jira/browse/YARN-8744
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
>  Labels: docker
>
> With YARN-8706, stopping docker containers was achieved by 
> 1. parsing the user specified {{STOPSIGNAL}} via docker inspect
> 2. executing {{docker kill --signal=}}
> Quoting [~ebadger]
> {quote}
> Additionally, for non-privileged containers, we don't need to call docker 
> kill. Instead, we can follow the code in handleContainerKill() and send the 
> signal directly. I think this code could probably be combined, since at this 
> point handleContainerKill() and handleContainerStop() will be doing the same 
> thing. The only difference is that the STOPSIGNAL will be used for the stop.
> {quote}
> To achieve the above, we need native code that accepts the name of the signal 
> rather than the value (number) of the signal. 
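As an illustration of "accepts the name of the signal rather than the value", a 
partial name-to-number mapping (an assumption for sketching only; the real 
change is in native code, and numbers can vary by platform):
{code:java}
// Hedged sketch; the signal numbers shown are the common Linux values.
import java.util.Map;

class SignalNameSketch {
  private static final Map<String, Integer> SIGNALS = Map.of(
      "SIGINT", 2, "SIGQUIT", 3, "SIGKILL", 9, "SIGTERM", 15);

  static int toNumber(String name) {
    String key = name.startsWith("SIG") ? name : "SIG" + name;
    Integer n = SIGNALS.get(key);
    if (n == null) {
      throw new IllegalArgumentException("Unsupported STOPSIGNAL: " + name);
    }
    return n;
  }
}
{code}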



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6307) Refactor FairShareComparator#compare

2018-09-19 Thread stefanlee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620523#comment-16620523
 ] 

stefanlee commented on YARN-6307:
-

Hi [~yufeigu], after merging this patch, we hit the problem described in YARN-4743.

> Refactor FairShareComparator#compare
> 
>
> Key: YARN-6307
> URL: https://issues.apache.org/jira/browse/YARN-6307
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>Priority: Major
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: YARN-6307.001.patch, YARN-6307.002.patch, 
> YARN-6307.003.patch
>
>
> The method does three things: compare the min share usage, compare fair share 
> usage by checking the weight ratio, and break ties by submit time and name. They 
> are mixed with each other, which makes the code hard to read and maintain: poor 
> style. Additionally, there are potential performance issues; for example, there 
> is no need to check the weight ratio if the minShare usage comparison already 
> indicates the order. It is worth improving, given the huge number of invocations 
> in the scheduler.
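The refactor the description asks for is essentially a staged comparator: each 
stage decides the order when it can, and later stages run only on ties. A hedged 
sketch using the JDK's comparator combinators (illustrative, not the committed 
patch):
{code:java}
// Illustrative sketch: later stages run only when earlier stages tie.
import java.util.Comparator;

class StagedComparatorSketch {
  static <S> Comparator<S> staged(Comparator<S> minShareUsage,
      Comparator<S> fairShareUsage, Comparator<S> tieBreaker) {
    return minShareUsage
        .thenComparing(fairShareUsage)   // skipped if min share already decides
        .thenComparing(tieBreaker);      // submit time and name, as described
  }
}
{code}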



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8793) QueuePlacementPolicy bind more information to assgining result

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620521#comment-16620521
 ] 

Hadoop QA commented on YARN-8793:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 36s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 287 unchanged - 10 fixed = 289 total (was 297) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 14 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 70m 
28s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8793 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940405/YARN-8793.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d38e05d44f89 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 
17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 28ceb34 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21883/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/21883/artifact/out/whitespace-eol.txt
 |
|  Test Results | 

[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: YARN-8468.014.patch

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps, and wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb sets the default maximum 
> container size for all queues; the per-queue maximum is set with the 
> “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
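A standalone sketch of the fallback-and-cap logic described above (all names 
illustrative, not the patch): a queue without an override inherits the 
scheduler-wide maximum, and a per-queue value may never exceed it:
{code:java}
// Hedged sketch only; the real change touches FairScheduler and FSQueue.
import java.util.Map;

class MaxAllocationSketch {
  private final Map<String, Long> queueMaxMb; // per-queue overrides, if any
  private final long schedulerMaxMb;          // yarn.scheduler.maximum-allocation-mb

  MaxAllocationSketch(Map<String, Long> queueMaxMb, long schedulerMaxMb) {
    this.queueMaxMb = queueMaxMb;
    this.schedulerMaxMb = schedulerMaxMb;
  }

  long maxAllocationMb(String queueName) {
    Long queueMax = queueMaxMb.get(queueName);
    // The queue cap must not be larger than the scheduler-wide cap.
    return queueMax == null ? schedulerMaxMb : Math.min(queueMax, schedulerMaxMb);
  }
}
{code}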



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: (was: YARN-8468.014.patch)

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps, and wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb sets the default maximum 
> container size for all queues; the per-queue maximum is set with the 
> “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8436) FSParentQueue: Comparison method violates its general contract

2018-09-19 Thread stefanlee (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620509#comment-16620509
 ] 

stefanlee commented on YARN-8436:
-

[~wilfreds], thanks for this jira. As you mentioned above:
{quote}If during this sorting a different node update changes a child queue 
then we allow that. 
{quote}
doesn't the RM handle NODE_UPDATE events one by one?

> FSParentQueue: Comparison method violates its general contract
> --
>
> Key: YARN-8436
> URL: https://issues.apache.org/jira/browse/YARN-8436
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: YARN-8436.001.patch, YARN-8436.002.patch, 
> YARN-8436.003.patch
>
>
> The ResourceManager can fail while sorting queues if an update comes in:
> {code:java}
> FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type NODE_UPDATE to the scheduler
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>   at java.util.TimSort.mergeLo(TimSort.java:777)
>   at java.util.TimSort.mergeAt(TimSort.java:514)
> ...
>   at java.util.Collections.sort(Collections.java:175)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:223){code}
> The reason it breaks is a change in the sorted object itself. 
> This is why it fails:
>  * an update from a node comes in as a heartbeat.
>  * the update triggers a check to see if we can assign a container on the 
> node.
>  * walk over the queue hierarchy to find a queue to assign a container to: 
> top down.
>  * for each parent queue we sort the child queues in {{assignContainer}} to 
> decide which queue to descend into.
>  * we lock the parent queue while sorting to prevent changes, but we do not lock 
> the child queues that we are sorting.
> If during this sorting a different node update changes a child queue then we 
> allow that. This means that the objects that we are trying to sort now might 
> be out of order. That causes the issue with the comparator. The comparator 
> itself is not broken.
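One way to keep the comparator's contract despite concurrent updates is to 
freeze each child queue's sort key once, before sorting, so no mid-sort mutation 
can reorder the inputs. A hedged sketch of that idea (illustrative, not 
necessarily the committed fix):
{code:java}
// Illustrative sketch: sort on usage values captured exactly once per queue.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToLongFunction;

class StableSortSketch {
  private static final class Keyed<Q> {
    final Q queue;
    final long usage;
    Keyed(Q queue, long usage) { this.queue = queue; this.usage = usage; }
  }

  static <Q> List<Q> sortByFrozenUsage(List<Q> queues, ToLongFunction<Q> usageOf) {
    List<Keyed<Q>> keyed = new ArrayList<>();
    for (Q q : queues) {
      keyed.add(new Keyed<>(q, usageOf.applyAsLong(q)));  // snapshot the key
    }
    keyed.sort(Comparator.comparingLong(k -> k.usage));   // contract-safe sort
    List<Q> sorted = new ArrayList<>(keyed.size());
    for (Keyed<Q> k : keyed) {
      sorted.add(k.queue);
    }
    return sorted;
  }
}
{code}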



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: YARN-8468.014.patch

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps, and wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb sets the default maximum 
> container size for all queues; the per-queue maximum is set with the 
> “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8797) [UI2] Fix YARN UI2 error pages

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620502#comment-16620502
 ] 

Hadoop QA commented on YARN-8797:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
30m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8797 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940411/YARN-8797.001.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux 9030c395246c 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 28ceb34 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 311 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21884/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [UI2] Fix YARN UI2 error pages
> --
>
> Key: YARN-8797
> URL: https://issues.apache.org/jira/browse/YARN-8797
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Akhil PB
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-8797.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type

2018-09-19 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620498#comment-16620498
 ] 

Hudson commented on YARN-8771:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15011 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15011/])
YARN-8771. CapacityScheduler fails to unreserve when cluster resource (wwei: 
rev 0712537e799bc03855d548d1f4bd690dd478b871)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/allocator/RegularContainerAllocator.java


> CapacityScheduler fails to unreserve when cluster resource contains empty 
> resource type
> ---
>
> Key: YARN-8771
> URL: https://issues.apache.org/jira/browse/YARN-8771
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8771.001.patch, YARN-8771.002.patch, 
> YARN-8771.003.patch, YARN-8771.004.patch
>
>
> We found this problem when the cluster was almost, but not fully, exhausted 
> (93% used): the scheduler kept allocating for an app but always failed to 
> commit, which can block requests from other apps and leave parts of the cluster 
> resource unusable.
> To reproduce this problem:
> (1) use DominantResourceCalculator
> (2) the cluster resource has an empty resource type, for example: gpu=0
> (3) the scheduler allocates a container for app1, which has reserved containers 
> and whose queue limit or user limit is reached (used + required > limit). 
> Reference code in RegularContainerAllocator#assignContainer:
> {code:java}
> // How much need to unreserve equals to:
> // max(required - headroom, amountNeedUnreserve)
> Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
> Resource resourceNeedToUnReserve =
> Resources.max(rc, clusterResource,
> Resources.subtract(capability, headRoom),
> currentResoureLimits.getAmountNeededUnreserve());
> boolean needToUnreserve =
> Resources.greaterThan(rc, clusterResource,
> resourceNeedToUnReserve, Resources.none());
> {code}
> For example, resourceNeedToUnReserve can be <8GB, -6 cores, 0 gpu> when 
> {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capacity=<8GB, 2 vcores, 0 gpu>}}; 
> needToUnreserve, which is the result of {{Resources#greaterThan}}, will be 
> {{false}}. This is not reasonable, because the required resource did exceed the 
> headroom and unreserving is needed.
> After that, when reaching the unreserve process in 
> RegularContainerAllocator#assignContainer, the unreserve process will be skipped 
> when shouldAllocOrReserveNewContainer is true (when required containers > 
> reserved containers) and needToUnreserve is wrongly calculated to be false:
> {code:java}
> if (availableContainers > 0) {
>  if (rmContainer == null && reservationsContinueLooking
>   && node.getLabels().isEmpty()) {
>   // unreserve process can be wrongly skipped when 
> shouldAllocOrReserveNewContainer=true and needToUnreserve=false but required 
> resource did exceed the headroom
>   if (!shouldAllocOrReserveNewContainer || needToUnreserve) { 
> ... 
>   }
>  }
> }
> {code}
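For contrast, the check the description argues for is component-wise: unreserve 
as soon as any resource type exceeds the headroom, rather than relying on a 
single dominant-resource comparison. A hedged sketch of that semantic 
(illustrative arrays, not the Resources API):
{code:java}
// Illustrative sketch only; the actual fix is in RegularContainerAllocator.
static boolean exceedsHeadroom(long[] required, long[] headroom) {
  for (int i = 0; i < required.length; i++) {
    if (required[i] > headroom[i]) {
      return true;   // e.g. memory 8GB > 0GB, even though vcores and gpu do not
    }
  }
  return false;
}
{code}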



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: (was: YARN-8468.014.patch)

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited per queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: a user has two pools, one for ad hoc jobs and one for enterprise 
> apps, and wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb sets the default maximum 
> container size for all queues; the per-queue maximum is set with the 
> “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that queue resource cap can not be larger than scheduler max 
> resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type

2018-09-19 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620492#comment-16620492
 ] 

Weiwei Yang commented on YARN-8771:
---

Oops, it should not be cherry-picked to branch-3.0 as YARN-8292 was fixed in 
3.1.1. Just reverted it from branch-3.0.

> CapacityScheduler fails to unreserve when cluster resource contains empty 
> resource type
> ---
>
> Key: YARN-8771
> URL: https://issues.apache.org/jira/browse/YARN-8771
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8771.001.patch, YARN-8771.002.patch, 
> YARN-8771.003.patch, YARN-8771.004.patch
>
>
> We found this problem when the cluster was almost, but not fully, exhausted 
> (93% used): the scheduler kept allocating for an app but always failed to 
> commit, which can block requests from other apps and leave parts of the cluster 
> resource unusable.
> To reproduce this problem:
> (1) use DominantResourceCalculator
> (2) the cluster resource has an empty resource type, for example: gpu=0
> (3) the scheduler allocates a container for app1, which has reserved containers 
> and whose queue limit or user limit is reached (used + required > limit). 
> Reference code in RegularContainerAllocator#assignContainer:
> {code:java}
> // How much need to unreserve equals to:
> // max(required - headroom, amountNeedUnreserve)
> Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
> Resource resourceNeedToUnReserve =
> Resources.max(rc, clusterResource,
> Resources.subtract(capability, headRoom),
> currentResoureLimits.getAmountNeededUnreserve());
> boolean needToUnreserve =
> Resources.greaterThan(rc, clusterResource,
> resourceNeedToUnReserve, Resources.none());
> {code}
> For example, resourceNeedToUnReserve can be <8GB, -6 cores, 0 gpu> when 
> {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capacity=<8GB, 2 vcores, 0 gpu>}}; 
> needToUnreserve, which is the result of {{Resources#greaterThan}}, will be 
> {{false}}. This is not reasonable, because the required resource did exceed the 
> headroom and unreserving is needed.
> After that, when reaching the unreserve process in 
> RegularContainerAllocator#assignContainer, the unreserve process will be skipped 
> when shouldAllocOrReserveNewContainer is true (when required containers > 
> reserved containers) and needToUnreserve is wrongly calculated to be false:
> {code:java}
> if (availableContainers > 0) {
>  if (rmContainer == null && reservationsContinueLooking
>   && node.getLabels().isEmpty()) {
>   // unreserve process can be wrongly skipped when 
> shouldAllocOrReserveNewContainer=true and needToUnreserve=false but required 
> resource did exceed the headroom
>   if (!shouldAllocOrReserveNewContainer || needToUnreserve) { 
> ... 
>   }
>  }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type

2018-09-19 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8771:
--
Fix Version/s: (was: 3.0.4)

> CapacityScheduler fails to unreserve when cluster resource contains empty 
> resource type
> ---
>
> Key: YARN-8771
> URL: https://issues.apache.org/jira/browse/YARN-8771
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Fix For: 3.2.0, 3.1.2
>
> Attachments: YARN-8771.001.patch, YARN-8771.002.patch, 
> YARN-8771.003.patch, YARN-8771.004.patch
>
>
> We found this problem when the cluster was almost, but not fully, exhausted 
> (93% used): the scheduler kept allocating for an app but always failed to 
> commit, which can block requests from other apps and leave parts of the cluster 
> resource unusable.
> To reproduce this problem:
> (1) use DominantResourceCalculator
> (2) the cluster resource has an empty resource type, for example: gpu=0
> (3) the scheduler allocates a container for app1, which has reserved containers 
> and whose queue limit or user limit is reached (used + required > limit). 
> Reference code in RegularContainerAllocator#assignContainer:
> {code:java}
> // How much need to unreserve equals to:
> // max(required - headroom, amountNeedUnreserve)
> Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
> Resource resourceNeedToUnReserve =
> Resources.max(rc, clusterResource,
> Resources.subtract(capability, headRoom),
> currentResoureLimits.getAmountNeededUnreserve());
> boolean needToUnreserve =
> Resources.greaterThan(rc, clusterResource,
> resourceNeedToUnReserve, Resources.none());
> {code}
> For example, resourceNeedToUnReserve can be <8GB, -6 cores, 0 gpu> when 
> {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capacity=<8GB, 2 vcores, 0 gpu>}}; 
> needToUnreserve, which is the result of {{Resources#greaterThan}}, will be 
> {{false}}. This is not reasonable, because the required resource did exceed the 
> headroom and unreserving is needed.
> After that, when reaching the unreserve process in 
> RegularContainerAllocator#assignContainer, the unreserve process will be skipped 
> when shouldAllocOrReserveNewContainer is true (when required containers > 
> reserved containers) and needToUnreserve is wrongly calculated to be false:
> {code:java}
> if (availableContainers > 0) {
>  if (rmContainer == null && reservationsContinueLooking
>   && node.getLabels().isEmpty()) {
>   // unreserve process can be wrongly skipped when 
> shouldAllocOrReserveNewContainer=true and needToUnreserve=false but required 
> resource did exceed the headroom
>   if (!shouldAllocOrReserveNewContainer || needToUnreserve) { 
> ... 
>   }
>  }
> }
> {code}
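
For illustration, here is a minimal reproduction sketch of the numbers in the description above. It is ours, not part of the attached patches, and it assumes a third resource type "gpu" has been registered and is 0 in the cluster resource:
{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.DominantResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

ResourceCalculator rc = new DominantResourceCalculator();
Resource cluster = Resource.newInstance(100 * 1024, 100);  // gpu stays 0
Resource headRoom = Resource.newInstance(0, 8);            // <0GB, 8 vcores, 0 gpu>
Resource capability = Resource.newInstance(8 * 1024, 2);   // <8GB, 2 vcores, 0 gpu>

// required - headroom = <8GB, -6 vcores, 0 gpu>
Resource needed = Resources.subtract(capability, headRoom);

// Expected: true, since the required resource exceeds the headroom.
// Per this issue, it wrongly evaluates to false when the cluster
// resource contains an empty resource type.
boolean needToUnreserve =
    Resources.greaterThan(rc, cluster, needed, Resources.none());
{code}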



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620490#comment-16620490
 ] 

Antal Bálint Steinbach commented on YARN-8468:
--

Hi [~cheersyang] ,

Thanks for the detailed feedback. Some of the changes you asked me to revert 
came from formatting only the modified code/method with the defined Hadoop 
formatter. It can also happen that a modification was reverted but a small 
formatting change remained.

1) Reverted; the changes I needed were already introduced in an earlier commit.

2) Fixed

3) Using maxContainerAllocation / "Max Container Allocation", as suggested by 
[~haibochen] earlier.

4) Fixed

5) Fixed

6) Removed in the earlier steps.

7) Reverted

8) Reverted

Many of the checkstyle issues were not related to my patch. I can't see how to 
filter out the ones introduced by my changes, and using a checkstyle plugin 
makes my IDE unusable because of the sheer number of issues. Anyway, I'll try 
to eliminate some more of them, but fixing some would reintroduce the very 
changes I just reverted, such as renaming variables.

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers; it cannot be limited per queue and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: User has two pools, one for ad hoc jobs and one for enterprise 
> apps. User wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. In that setup, 
> yarn.scheduler.maximum-allocation-mb sets the default maximum container size 
> for all queues, and the per-queue maximum is set with the 
> “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * setting it on the root would override the scheduler setting, so we should 
> not allow that.
>  * make sure that the queue resource cap cannot be larger than the 
> scheduler's max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue accordingly.
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
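
To make the proposal concrete, here is a hypothetical fair-scheduler.xml sketch. The element name follows the maxContainerAllocation naming discussed above; the exact value syntax is an assumption, not the final format:
{code:xml}
<?xml version="1.0"?>
<!-- Hypothetical per-queue container cap (sketch, not final syntax). -->
<allocations>
  <queue name="adhoc">
    <!-- Limit ad hoc jobs to small containers. -->
    <maxContainerAllocation>4096 mb, 2 vcores</maxContainerAllocation>
  </queue>
  <queue name="enterprise">
    <!-- No per-queue cap: falls back to the scheduler-wide
         yarn.scheduler.maximum-allocation-mb limit. -->
  </queue>
</allocations>
{code}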



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-09-19 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Bálint Steinbach updated YARN-8468:
-
Attachment: YARN-8468.014.patch

> Limit container sizes per queue in FairScheduler
> 
>
> Key: YARN-8468
> URL: https://issues.apache.org/jira/browse/YARN-8468
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Antal Bálint Steinbach
>Priority: Critical
> Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch, YARN-8468.004.patch, 
> YARN-8468.005.patch, YARN-8468.006.patch, YARN-8468.007.patch, 
> YARN-8468.008.patch, YARN-8468.009.patch, YARN-8468.010.patch, 
> YARN-8468.011.patch, YARN-8468.012.patch, YARN-8468.013.patch, 
> YARN-8468.014.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers; it cannot be limited per queue and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per-queue basis.
>  
> The use case: User has two pools, one for ad hoc jobs and one for enterprise 
> apps. User wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. In that setup, 
> yarn.scheduler.maximum-allocation-mb sets the default maximum container size 
> for all queues, and the per-queue maximum is set with the 
> “maxContainerResources” queue config value.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * setting it on the root would override the scheduler setting, so we should 
> not allow that.
>  * make sure that the queue resource cap cannot be larger than the 
> scheduler's max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue accordingly.
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8793) QueuePlacementPolicy bind more information to assigning result

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620487#comment-16620487
 ] 

Hadoop QA commented on YARN-8793:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 33s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 7 new + 287 unchanged - 10 fixed = 294 total (was 297) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 21 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
1s{color} | {color:red} The patch 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m  
5s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}121m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8793 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940399/YARN-8793.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 62daab0c5ede 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 
07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e435e12 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/21882/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 

[jira] [Commented] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type

2018-09-19 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620474#comment-16620474
 ] 

Weiwei Yang commented on YARN-8771:
---

LGTM, +1. Committing soon.

> CapacityScheduler fails to unreserve when cluster resource contains empty 
> resource type
> ---
>
> Key: YARN-8771
> URL: https://issues.apache.org/jira/browse/YARN-8771
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Attachments: YARN-8771.001.patch, YARN-8771.002.patch, 
> YARN-8771.003.patch, YARN-8771.004.patch
>
>
> We found this problem when the cluster was almost, but not fully, exhausted
> (93% used): the scheduler kept allocating for an app but always failed to
> commit. This can block requests from other apps, and parts of the cluster
> resource can't be used.
> Reproduce this problem:
> (1) use DominantResourceCalculator
> (2) cluster resource has empty resource type, for example: gpu=0
> (3) the scheduler allocates a container for app1, which has reserved
> containers and whose queue limit or user limit is reached
> (used + required > limit).
> Reference codes in RegularContainerAllocator#assignContainer:
> {code:java}
> // How much need to unreserve equals to:
> // max(required - headroom, amountNeedUnreserve)
> Resource headRoom = Resources.clone(currentResoureLimits.getHeadroom());
> Resource resourceNeedToUnReserve =
> Resources.max(rc, clusterResource,
> Resources.subtract(capability, headRoom),
> currentResoureLimits.getAmountNeededUnreserve());
> boolean needToUnreserve =
> Resources.greaterThan(rc, clusterResource,
> resourceNeedToUnReserve, Resources.none());
> {code}
> For example, resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when
> {{headRoom=<0GB, 8 vcores, 0 gpu>}} and {{capability=<8GB, 2 vcores, 0 gpu>}};
> needToUnreserve, the result of {{Resources#greaterThan}}, will then be
> {{false}}. This is not reasonable because the required resource did exceed the
> headroom, so unreserving is needed.
> After that, when the unreserve process in
> RegularContainerAllocator#assignContainer is reached, it will be skipped when
> shouldAllocOrReserveNewContainer is true (required containers > reserved
> containers) and needToUnreserve is wrongly calculated to be false:
> {code:java}
> if (availableContainers > 0) {
>  if (rmContainer == null && reservationsContinueLooking
>   && node.getLabels().isEmpty()) {
>   // unreserve process can be wrongly skipped when 
> shouldAllocOrReserveNewContainer=true and needToUnreserve=false but required 
> resource did exceed the headroom
>   if (!shouldAllocOrReserveNewContainer || needToUnreserve) { 
> ... 
>   }
>  }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620469#comment-16620469
 ] 

Hadoop QA commented on YARN-8771:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 73m 
46s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}131m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8771 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940394/YARN-8771.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1ac8077cf8a5 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e435e12 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21880/testReport/ |
| Max. process+thread count | 903 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21880/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> CapacityScheduler fails to unreserve when 

[jira] [Commented] (YARN-8627) EntityGroupFSTimelineStore hdfs done directory keeps on accumulating

2018-09-19 Thread Tarun Parimi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620464#comment-16620464
 ] 

Tarun Parimi commented on YARN-8627:


Thanks for the review [~rohithsharma]. 

I tested with the folder path appid/appid/appid and this patch handles it fine: 
only the first appid directory encountered is deleted recursively, after its 
child directories have been checked for modification time.
I agree that we should try to find the root cause of the creation of these 
repeated directories. I wasn't able to reproduce this locally, so I couldn't 
dig much deeper.

I looked at the hadoop fs -ls -R output of /ats/done for the cluster where I 
observed the issue. One thing I noticed is that only the "domainlog" file was 
present in this type of repeated appid directory; other types such as 
summarylog/entitylog were present only in the normal, expected directory 
structure. Also, two domainlogs are created with different sizes, and the 
modification time of the one causing the problem is much later, at 13:00. I'm 
not sure of the exact scenario that causes this to happen. A sample is below.

 
{code:java}
drwxrwx--- - appuser hadoop 0 2017-10-16 13:01 
/ats/done/1508116310016//000/application_1508116310016_0010
drwxrwx--- - appuser hadoop 0 2017-10-16 12:16 
/ats/done/1508116310016//000/application_1508116310016_0010/appattempt_1508116310016_0010_01
-rw-r- 3 appuser hadoop 88 2017-10-16 12:20 
/ats/done/1508116310016//000/application_1508116310016_0010/appattempt_1508116310016_0010_01/domainlog-appattempt_1508116310016_0010_01
-rw-r- 3 appuser hadoop 92324 2017-10-16 12:22 
/ats/done/1508116310016//000/application_1508116310016_0010/appattempt_1508116310016_0010_01/summarylog-appattempt_1508116310016_0010_01
drwxrwxrwx - appuser hadoop 0 2017-10-16 13:00 
/ats/done/1508116310016//000/application_1508116310016_0010/application_1508116310016_0010
drwxrwxrwx - appuser hadoop 0 2017-10-16 13:00 
/ats/done/1508116310016//000/application_1508116310016_0010/application_1508116310016_0010/appattempt_1508116310016_0010_01
-rw-r- 3 appuser hadoop 90 2017-10-16 13:00 
/ats/done/1508116310016//000/application_1508116310016_0010/application_1508116310016_0010/appattempt_1508116310016_0010_01/domainlog-appattempt_1508116310016_0010_01
 
{code}
 

> EntityGroupFSTimelineStore hdfs done directory keeps on accumulating
> 
>
> Key: YARN-8627
> URL: https://issues.apache.org/jira/browse/YARN-8627
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-8627.001.patch, YARN-8627.002.patch
>
>
> The EntityLogCleaner thread exits with the following ERROR every time it 
> runs.  
> {code:java}
> 2018-07-18 19:59:39,837 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18268
> 2018-07-18 19:59:39,844 INFO timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:cleanLogs(462)) - Deleting 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
> 2018-07-18 19:59:39,848 ERROR timeline.EntityGroupFSTimelineStore 
> (EntityGroupFSTimelineStore.java:run(899)) - Error cleaning files  
> java.io.FileNotFoundException: File 
> hdfs://namenode/ats/done/1499684568068//018/application_1499684568068_18270
>  does not exist.  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1062)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1069)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1040)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1019)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:1015)
>   at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.shouldCleanAppLogDir(EntityGroupFSTimelineStore.java:480)
>  
> {code}
>  
>  Each time the thread gets scheduled, a different folder encounters the 
> error. As a result, the thread cannot clean all the old done directories, 
> since it stops after this error.
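
A minimal sketch of the kind of defensive handling the cleaner loop needs; this is ours, not the attached patch, and {{isOldEnough}} and {{LOG}} are assumed helpers:
{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// Keep cleaning the remaining app directories even if one vanishes between
// listing and visiting, e.g. because a nested duplicate was already removed
// as part of an earlier recursive delete.
void cleanAppDirs(FileSystem fs, Path doneDir) throws IOException {
  RemoteIterator<FileStatus> iter = fs.listStatusIterator(doneDir);
  while (iter.hasNext()) {
    FileStatus appDir = iter.next();
    try {
      if (isOldEnough(appDir)) {            // hypothetical retention check
        fs.delete(appDir.getPath(), true);  // recursive delete
      }
    } catch (FileNotFoundException e) {
      // The directory disappeared under us; skip it instead of aborting
      // the whole cleanup run.
      LOG.warn("App log dir already removed: " + appDir.getPath());
    }
  }
}
{code}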



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (YARN-8797) [UI2] Fix YARN UI2 error pages

2018-09-19 Thread Akhil PB (JIRA)
Akhil PB created YARN-8797:
--

 Summary: [UI2] Fix YARN UI2 error pages
 Key: YARN-8797
 URL: https://issues.apache.org/jira/browse/YARN-8797
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Akhil PB
Assignee: Akhil PB






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7599) [GPG] ApplicationCleaner in Global Policy Generator

2018-09-19 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620437#comment-16620437
 ] 

Bibin A Chundatt commented on YARN-7599:


Thanks [~botong] for updated patch

{quote}
Application cleaner is disabled when 
YarnConfiguration.GPG_APPCLEANER_INTERVAL_MS is set to zero or negative value:
{quote}
# I was thinking of a way to disable the cleaner while the GPG service is 
live, e.g. a shell command to reset the clean interval, similar to 
{{HSAdminServer#refreshLogRetentionSettings}}. This could be future work.
# Regarding the test case name, IMHO it would be better to rename it to 
testFederationStoreAppsCleanUp.
# Can you change it to a single configuration similar to 
dfs.http.client.retry.policy.spec {min,max,interval}? Any comments on this 
point?
{code}
if (LOG.isDebugEnabled()) {
  // Note: "{}" placeholder added; without it the joined app list is
  // never actually printed.
  LOG.debug("List of apps: {}",
      routerApps.stream().map(Object::toString)
          .collect(Collectors.joining(",")));
}
{code}
# Regarding logs, I think it's better to print only the candidates for 
deletion instead of all apps.
# Please handle the checkstyle issues. I am not able to see the CI results now.
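
On the logging point, a possible shape for the debug line; {{candidateApps}} is an assumed name for the apps selected for deletion:
{code:java}
// Assumes java.util.stream.Collectors is imported, as in the quoted patch code.
if (LOG.isDebugEnabled()) {
  // Print only the apps selected for deletion, not every app known to
  // the router.
  LOG.debug("Applications to clean up: {}",
      candidateApps.stream().map(Object::toString)
          .collect(Collectors.joining(",")));
}
{code}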

> [GPG] ApplicationCleaner in Global Policy Generator
> ---
>
> Key: YARN-7599
> URL: https://issues.apache.org/jira/browse/YARN-7599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>  Labels: federation, gpg
> Attachments: YARN-7599-YARN-7402.v1.patch, 
> YARN-7599-YARN-7402.v2.patch, YARN-7599-YARN-7402.v3.patch, 
> YARN-7599-YARN-7402.v4.patch, YARN-7599-YARN-7402.v5.patch
>
>
> In Federation, we need a cleanup service for StateStore as well as Yarn 
> Registry. For the former, we need to remove old application records. For the 
> latter, failed and killed applications might leave records in the Yarn 
> Registry (see YARN-6128). We plan to do both cleanup work in 
> ApplicationCleaner in GPG



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8793) QueuePlacementPolicy bind more information to assigning result

2018-09-19 Thread Shuai Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Zhang updated YARN-8793:
--
Attachment: YARN-8793.004.patch

> QueuePlacementPolicy bind more information to assigning result
> --
>
> Key: YARN-8793
> URL: https://issues.apache.org/jira/browse/YARN-8793
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 3.1.1
>Reporter: Shuai Zhang
>Priority: Major
> Attachments: YARN-8793.001.patch, YARN-8793.002.patch, 
> YARN-8793.003.patch, YARN-8793.004.patch
>
>
> Fair scheduler's QueuePlacementPolicy should bind more information to the 
> assignment result:
>  # Whether to terminate the chain of responsibility
>  # The reason to reject a request
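
A minimal sketch of what such a richer result could look like; the class and field names here are assumptions for illustration, not the API in the attached patches:
{code:java}
/** Hypothetical assignment result for a queue placement rule. */
public final class PlacementResult {
  private final String queueName;       // null when no queue was assigned
  private final boolean terminal;       // stop consulting later rules?
  private final String rejectionReason; // set when the request was rejected

  public PlacementResult(String queueName, boolean terminal,
      String rejectionReason) {
    this.queueName = queueName;
    this.terminal = terminal;
    this.rejectionReason = rejectionReason;
  }

  public String getQueueName() { return queueName; }
  public boolean isTerminal() { return terminal; }
  public String getRejectionReason() { return rejectionReason; }
}
{code}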



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8767) TestStreamingStatus fails

2018-09-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620425#comment-16620425
 ] 

Hadoop QA commented on YARN-8767:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-tools_hadoop-streaming generated 0 new + 78 
unchanged - 5 fixed = 78 total (was 83) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} hadoop-tools/hadoop-streaming: The patch generated 0 
new + 49 unchanged - 16 fixed = 49 total (was 65) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  6m  
1s{color} | {color:green} hadoop-streaming in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 |
| JIRA Issue | YARN-8767 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12940395/YARN-8767.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux e0a1724f05c4 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e435e12 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/21881/testReport/ |
| Max. process+thread count | 708 (vs. ulimit of 1) |
| modules | C: hadoop-tools/hadoop-streaming U: hadoop-tools/hadoop-streaming |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/21881/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> TestStreamingStatus 

[jira] [Updated] (YARN-8793) QueuePlacementPolicy bind more information to assigning result

2018-09-19 Thread Shuai Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Zhang updated YARN-8793:
--
Attachment: YARN-8793.003.patch

> QueuePlacementPolicy bind more information to assigning result
> --
>
> Key: YARN-8793
> URL: https://issues.apache.org/jira/browse/YARN-8793
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 3.1.1
>Reporter: Shuai Zhang
>Priority: Major
> Attachments: YARN-8793.001.patch, YARN-8793.002.patch, 
> YARN-8793.003.patch
>
>
> Fair scheduler's QueuePlacementPolicy should bind more information to the 
> assignment result:
>  # Whether to terminate the chain of responsibility
>  # The reason to reject a request



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8767) TestStreamingStatus fails

2018-09-19 Thread Andras Bokor (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated YARN-8767:
---
Attachment: YARN-8767.004.patch

> TestStreamingStatus fails
> -
>
> Key: YARN-8767
> URL: https://issues.apache.org/jira/browse/YARN-8767
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Major
> Attachments: YARN-8767.001.patch, YARN-8767.002.patch, 
> YARN-8767.003.patch, YARN-8767.004.patch
>
>
> The test tries to connect to the RM through 0.0.0.0:8032, but it cannot.
> On the console I see the following error message:
> {code}Your endpoint configuration is wrong; For more details see:  
> http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking 
> ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 
> failover attempts. Trying to failover after sleeping for 44892ms.{code}
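
A sketch of the usual shape of such a fix, under the assumption that the test starts a MiniYARNCluster and should read the concrete RM address back from it instead of relying on the unset default (illustrative only, not the committed patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Use the configuration the mini cluster actually bound to, so clients
// resolve a concrete host:port rather than the 0.0.0.0:8032 placeholder.
Configuration conf = new Configuration(miniYarnCluster.getConfig());
String rmAddress = conf.get(YarnConfiguration.RM_ADDRESS);
// Pass 'conf' to the streaming job client instead of a fresh default
// Configuration that still carries the unresolved default address.
{code}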



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


