[jira] [Commented] (YARN-7492) Set up SASS for UI styling

2017-11-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253076#comment-16253076
 ] 

Sunil G commented on YARN-7492:
---

Please revert the local changes to configs.env.

> Set up SASS for UI styling
> --
>
> Key: YARN-7492
> URL: https://issues.apache.org/jira/browse/YARN-7492
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7492.001.patch, YARN-7492.002.patch, 
> YARN-7492.003.patch
>
>
> SASS will help improve the quality and maintainability of our styles. 






[jira] [Updated] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2017-11-14 Thread Jiandan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated YARN-7497:

Description: 
YARN-5947 added LeveldbConfigurationStore, which uses LevelDB as the backing store, 
but it does not support YARN RM HA. 
YARN-6840 supports RM HA, but too many scheduler configurations (for example, 
10 thousand queues) may exceed the ZooKeeper znode size limit.
HDFSSchedulerConfigurationStore stores the configuration file in HDFS, so when the 
RM fails over, the new active RM can load the scheduler configuration from HDFS.

  was:YARN-5947 added LeveldbConfigurationStore, which uses LevelDB as the backing 
store, but it does not support YARN RM HA. HDFSSchedulerConfigurationStore stores 
the configuration file in HDFS, so when the RM fails over, the new active RM can 
load the scheduler configuration from HDFS.


> Add HDFSSchedulerConfigurationStore for RM HA
> -
>
> Key: YARN-7497
> URL: https://issues.apache.org/jira/browse/YARN-7497
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn
>Reporter: Jiandan Yang 
> Attachments: YARN-7497.001.patch
>
>
> YARN-5947 added LeveldbConfigurationStore, which uses LevelDB as the backing 
> store, but it does not support YARN RM HA. 
> YARN-6840 supports RM HA, but too many scheduler configurations (for example, 
> 10 thousand queues) may exceed the ZooKeeper znode size limit.
> HDFSSchedulerConfigurationStore stores the configuration file in HDFS, so when 
> the RM fails over, the new active RM can load the scheduler configuration from HDFS.
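
For illustration, the description above boils down to writing the scheduler
configuration to a file in HDFS and re-reading it when a standby RM becomes active.
A minimal, hypothetical sketch of that idea (the class, method, and path names are
invented here and are not taken from the attached patch):

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustration only: persist a scheduler Configuration to HDFS so that a
// newly active RM can reload it after failover.
public class HdfsConfStoreSketch {

  private final Path confPath;          // e.g. hdfs://nn/rm/scheduler-conf.xml
  private final Configuration hadoopConf;

  public HdfsConfStoreSketch(Configuration hadoopConf, String uri) {
    this.hadoopConf = hadoopConf;
    this.confPath = new Path(uri);
  }

  // Write the current scheduler configuration as an XML file in HDFS.
  public void save(Configuration schedulerConf) throws IOException {
    FileSystem fs = confPath.getFileSystem(hadoopConf);
    try (FSDataOutputStream out = fs.create(confPath, true /* overwrite */)) {
      schedulerConf.writeXml(out);
    }
  }

  // Reload the scheduler configuration, e.g. when this RM becomes active.
  public Configuration load() throws IOException {
    FileSystem fs = confPath.getFileSystem(hadoopConf);
    Configuration loaded = new Configuration(false);  // skip the default resources
    try (FSDataInputStream in = fs.open(confPath)) {
      loaded.addResource(in);
      loaded.size();  // force the stream to be parsed before it is closed
    }
    return loaded;
  }
}
{code}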






[jira] [Updated] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2017-11-14 Thread Jiandan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiandan Yang  updated YARN-7497:

Attachment: YARN-7497.001.patch

> Add HDFSSchedulerConfigurationStore for RM HA
> -
>
> Key: YARN-7497
> URL: https://issues.apache.org/jira/browse/YARN-7497
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn
>Reporter: Jiandan Yang 
> Attachments: YARN-7497.001.patch
>
>
> YARN-5947 added LeveldbConfigurationStore, which uses LevelDB as the backing store, 
> but it does not support YARN RM HA. HDFSSchedulerConfigurationStore stores the 
> configuration file in HDFS, so when the RM fails over, the new active RM can load 
> the scheduler configuration from HDFS.






[jira] [Commented] (YARN-7492) Set up SASS for UI styling

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253065#comment-16253065
 ] 

Hadoop QA commented on YARN-7492:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
24m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 38s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7492 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897685/YARN-7492.003.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux bf7b81ab47e2 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c4c57b8 |
| maven | version: Apache Maven 3.3.9 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/18495/artifact/out/whitespace-eol.txt
 |
| Max. process+thread count | 402 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18495/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Set up SASS for UI styling
> --
>
> Key: YARN-7492
> URL: https://issues.apache.org/jira/browse/YARN-7492
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7492.001.patch, YARN-7492.002.patch, 
> YARN-7492.003.patch
>
>
> SASS will help improve the quality and maintainability of our styles. 






[jira] [Commented] (YARN-7464) Allow filters on Nodes page

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253063#comment-16253063
 ] 

Hadoop QA commented on YARN-7464:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
23m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 51s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7464 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897687/YARN-7464.004.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux 8250c511efc3 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c4c57b8 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 436 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18494/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Allow filters on Nodes page
> --
>
> Key: YARN-7464
> URL: https://issues.apache.org/jira/browse/YARN-7464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 4.56.04 PM.png, Screen Shot 
> 2017-11-08 at 4.56.12 PM.png, YARN-7464.001.patch, YARN-7464.002.patch, 
> YARN-7464.003.patch, YARN-7464.004.patch
>
>







[jira] [Updated] (YARN-7464) Allow filters on Nodes page

2017-11-14 Thread Vasudevan Skm (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasudevan Skm updated YARN-7464:

Attachment: YARN-7464.004.patch

> Allow filters on Nodes page
> --
>
> Key: YARN-7464
> URL: https://issues.apache.org/jira/browse/YARN-7464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 4.56.04 PM.png, Screen Shot 
> 2017-11-08 at 4.56.12 PM.png, YARN-7464.001.patch, YARN-7464.002.patch, 
> YARN-7464.003.patch, YARN-7464.004.patch
>
>







[jira] [Updated] (YARN-7492) Set up SASS for UI styling

2017-11-14 Thread Vasudevan Skm (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasudevan Skm updated YARN-7492:

Attachment: YARN-7492.003.patch

> Set up SASS for UI styling
> --
>
> Key: YARN-7492
> URL: https://issues.apache.org/jira/browse/YARN-7492
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7492.001.patch, YARN-7492.002.patch, 
> YARN-7492.003.patch
>
>
> SASS will help improve the quality and maintainability of our styles. 






[jira] [Commented] (YARN-7464) Allow filters on Nodes page

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253035#comment-16253035
 ] 

Hadoop QA commented on YARN-7464:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
1s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} YARN-7464 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7464 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897683/YARN-7464.003.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18493/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Allow filters on Nodes page
> --
>
> Key: YARN-7464
> URL: https://issues.apache.org/jira/browse/YARN-7464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 4.56.04 PM.png, Screen Shot 
> 2017-11-08 at 4.56.12 PM.png, YARN-7464.001.patch, YARN-7464.002.patch, 
> YARN-7464.003.patch
>
>







[jira] [Commented] (YARN-7462) Render outstanding resource requests on application page of new YARN UI

2017-11-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253034#comment-16253034
 ] 

Hudson commented on YARN-7462:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13237 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13237/])
YARN-7462. Render outstanding resource requests on application page of (sunilg: 
rev c4c57b80e1e43391417e958f455e25fd7ff67d07)
* (edit) .gitignore
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/app/models/yarn-app.js
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/app/serializers/yarn-app.js
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/app/templates/yarn-app/info.hbs


> Render outstanding resource requests on application page of new YARN UI
> ---
>
> Key: YARN-7462
> URL: https://issues.apache.org/jira/browse/YARN-7462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Fix For: 3.1.0
>
> Attachments: Screen Shot 2017-11-08 at 3.24.30 PM.png, Screen Shot 
> 2017-11-08 at 3.38.48 PM.png, YARN-7462.001.patch, YARN-7462.002.patch, 
> YARN-7462.003.patch, YARN-7462.004.patch
>
>







[jira] [Updated] (YARN-7464) Allow filters on Nodes page

2017-11-14 Thread Vasudevan Skm (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasudevan Skm updated YARN-7464:

Attachment: YARN-7464.003.patch

> Allow filters on Nodes page
> --
>
> Key: YARN-7464
> URL: https://issues.apache.org/jira/browse/YARN-7464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 4.56.04 PM.png, Screen Shot 
> 2017-11-08 at 4.56.12 PM.png, YARN-7464.001.patch, YARN-7464.002.patch, 
> YARN-7464.003.patch
>
>







[jira] [Resolved] (YARN-7498) NM failed to start if the namespace of remote log dirs differs from fs.defaultFS

2017-11-14 Thread sandflee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sandflee resolved YARN-7498.

Resolution: Duplicate

> NM failed to start if the namespace of remote log dirs differs from 
> fs.defaultFS
> 
>
> Key: YARN-7498
> URL: https://issues.apache.org/jira/browse/YARN-7498
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: sandflee
>Assignee: sandflee
>
> fs.defaultFS is hdfs://nameservice1 and yarn.nodemanager.remote-app-log-dir 
> is hdfs://nameservice2; when the NM starts, the following errors appear:
> {quote}
> java.lang.IllegalArgumentException: Wrong FS: hdfs://nameservice2/yarn-logs, 
> expected: hdfs://nameservice1
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:192)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>   at java.lang.Thread.run(Thread.java:745)
> {quote}
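
For context, the "Wrong FS" failure above comes from resolving the remote log dir
through the default FileSystem (fs.defaultFS) instead of through the path's own
namespace. A small stand-alone sketch of the mismatch and the usual remedy (this is
an illustration only, not a patch, since the issue was resolved as a duplicate):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoteLogDirFsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assume fs.defaultFS points at nameservice1 while the log dir lives on nameservice2.
    Path remoteLogDir = new Path("hdfs://nameservice2/yarn-logs");

    // Failing pattern: the default FileSystem (nameservice1) rejects a path that
    // belongs to another namespace with the "Wrong FS" IllegalArgumentException.
    FileSystem defaultFs = FileSystem.get(conf);
    System.out.println("default FS: " + defaultFs.getUri());
    // defaultFs.getFileStatus(remoteLogDir);  // would throw "Wrong FS"

    // Resolving the FileSystem from the path itself avoids the mismatch.
    FileSystem logFs = remoteLogDir.getFileSystem(conf);
    System.out.println("log dir FS: " + logFs.getUri());
  }
}
{code}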






[jira] [Updated] (YARN-7462) Render outstanding resource requests on application page of new YARN UI

2017-11-14 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-7462:
--
Summary: Render outstanding resource requests on application page of new 
YARN UI  (was: Render outstanding resource requests on application details page)

> Render outstanding resource requests on application page of new YARN UI
> ---
>
> Key: YARN-7462
> URL: https://issues.apache.org/jira/browse/YARN-7462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 3.24.30 PM.png, Screen Shot 
> 2017-11-08 at 3.38.48 PM.png, YARN-7462.001.patch, YARN-7462.002.patch, 
> YARN-7462.003.patch, YARN-7462.004.patch
>
>







[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-11-14 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253000#comment-16253000
 ] 

Vrushali C commented on YARN-7346:
--

Thanks for the heads-up, [~ram_krish], I will look at it tomorrow. 

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.






[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-11-14 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252991#comment-16252991
 ] 

ramkrishna.s.vasudevan commented on YARN-7346:
--

[~vrushalic]
Please have a look at HBASE-19092 as well. I have added my observations from going 
through the timeline server code on YARN's master branch.
I feel the recent patch in HBASE-19092 would help expose the necessary APIs for the 
timeline server's Tag usage.
Please feel free to share your thoughts.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.






[jira] [Updated] (YARN-5983) [Umbrella] Support for FPGA as a Resource in YARN

2017-11-14 Thread Zhankun Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-5983:
---
Attachment: YARN-5983-implementation-notes.pdf

Added a design and implementation note for YARN-6507 and YARN-7443. Please 
review, [~wangda].

> [Umbrella] Support for FPGA as a Resource in YARN
> -
>
> Key: YARN-5983
> URL: https://issues.apache.org/jira/browse/YARN-5983
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
> Attachments: YARN-5983-Support-FPGA-resource-on-NM-side_v1.pdf, 
> YARN-5983-implementation-notes.pdf
>
>
> As more big data workloads run on YARN, CPU alone will eventually stop scaling 
> and heterogeneous systems will become more important. ML/DL has been a rising 
> star in recent years, and applications in these areas have to utilize GPUs or 
> FPGAs to boost performance. Hardware vendors such as Intel are also investing in 
> such hardware, so FPGAs are likely to become as common in data centers as CPUs 
> in the near future.
> As a resource management and scheduling system, YARN should evolve to support 
> this. This JIRA proposes making FPGA a first-class citizen. 
> The changes roughly include:
> 1. FPGA resource detection and heartbeat
> 2. Scheduler changes
> 3. FPGA-related preparation and isolation before launching containers
> We know that YARN-3926 is extending the current resource model, but we can still 
> keep some FPGA-related discussion here.






[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-11-14 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252927#comment-16252927
 ] 

ramkrishna.s.vasudevan commented on YARN-7346:
--

[~tedyu]
Thanks for the pointer. I will watch this JIRA to follow the compilation issues 
and to see whether the Tag-related cleanup for beta-1 helps here.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.






[jira] [Created] (YARN-7498) NM failed to start if the namespace of remote log dirs differs from fs.defaultFS

2017-11-14 Thread sandflee (JIRA)
sandflee created YARN-7498:
--

 Summary: NM failed to start if the namespace of remote log dirs 
differs from fs.defaultFS
 Key: YARN-7498
 URL: https://issues.apache.org/jira/browse/YARN-7498
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: sandflee
Assignee: sandflee


fs.defaultFS is hdfs://nameservice1 and yarn.nodemanager.remote-app-log-dir is 
hdfs://nameservice2; when the NM starts, the following errors appear:
{quote}
java.lang.IllegalArgumentException: Wrong FS: hdfs://nameservice2/yarn-logs, 
expected: hdfs://nameservice1
  at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
  at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
  at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.verifyAndCreateRemoteLogDir(LogAggregationService.java:192)
  at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:319)
  at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
  at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
  at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
  at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
  at java.lang.Thread.run(Thread.java:745)
{quote}






[jira] [Updated] (YARN-7489) ConcurrentModificationException in RMAppImpl#getRMAppMetrics

2017-11-14 Thread Tao Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-7489:
---
Attachment: YARN-7489.001.patch

> ConcurrentModificationException in RMAppImpl#getRMAppMetrics
> 
>
> Key: YARN-7489
> URL: https://issues.apache.org/jira/browse/YARN-7489
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
> Attachments: YARN-7489.001.patch
>
>
> REST clients have sometimes failed to query applications through the apps 
> REST API in RMWebServices. It happens when the attempts are being iterated 
> (RMWebServices#getApps --> AppInfo# --> 
> RMAppImpl#getRMAppMetrics) while those attempts are concurrently 
> modified (AttemptFailedTransition#transition --> 
> RMAppImpl#createAndStartNewAttempt --> RMAppImpl#createNewAttempt). 
> The application state is changed under the writeLock in RMAppImpl, so we can 
> take the readLock before iterating the attempts to fix this problem.
> Exception stack:
> {noformat}
> java.util.ConcurrentModificationException
> at 
> java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
> at 
> java.util.LinkedHashMap$LinkedValueIterator.next(LinkedHashMap.java:747)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.getRMAppMetrics(RMAppImpl.java:1487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppInfo.(AppInfo.java:199)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:597)
> at sun.reflect.GeneratedMethodAccessor81.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
> at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
> at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
> at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
> at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
> at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
> at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
> at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:886)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:178)
> at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
> at 
> com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
> at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
> at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
> {noformat}
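
The fix described in the summary above, taking the application's read lock around
the iteration so that attempt creation (done under the write lock) cannot race with
it, can be sketched in isolation like this (illustrative names only, not the actual
RMAppImpl fields or the attached patch):

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Readers take the read lock while iterating the attempts map; writers take the
// write lock while adding a new attempt, so the iteration can no longer hit a
// ConcurrentModificationException.
public class AttemptMetricsSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<Integer, Long> attempts = new LinkedHashMap<>();

  public void addAttempt(int attemptId, long memorySeconds) {
    lock.writeLock().lock();
    try {
      attempts.put(attemptId, memorySeconds);
    } finally {
      lock.writeLock().unlock();
    }
  }

  public long aggregateMemorySeconds() {
    lock.readLock().lock();
    try {
      long total = 0;
      for (long v : attempts.values()) {  // safe: writers are excluded
        total += v;
      }
      return total;
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}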






[jira] [Commented] (YARN-7484) Use yarn application -kill and kill by rm webpage the app information cannot log to userlogs directory so jobhistory cannot display it.

2017-11-14 Thread maobaolong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252902#comment-16252902
 ] 

maobaolong commented on YARN-7484:
--

[~brahma] Thank you for correcting it.

> Use yarn application -kill and kill by rm webpage the app information cannot 
> log to userlogs directory so jobhistory cannot display it.
> ---
>
> Key: YARN-7484
> URL: https://issues.apache.org/jira/browse/YARN-7484
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: maobaolong
>  Labels: YARN
>
> Using `yarn application -kill` successfully kills the job, but the app 
> information is not generated under
> "/userlogs/history/done_intermediate", so the jobhistory cannot display 
> the job information.
> But, use `yarn application -kill` can work well.






[jira] [Created] (YARN-7497) Add HDFSSchedulerConfigurationStore for RM HA

2017-11-14 Thread Jiandan Yang (JIRA)
Jiandan Yang  created YARN-7497:
---

 Summary: Add HDFSSchedulerConfigurationStore for RM HA
 Key: YARN-7497
 URL: https://issues.apache.org/jira/browse/YARN-7497
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: yarn
Reporter: Jiandan Yang 


YARN-5947 add LeveldbConfigurationStore using Leveldb as backing store, but it 
does not support Yarn RM HA. HDFSSchedulerConfigurationStore store conf file in 
HDFS, when RM failover, new active RM can load scheduler configuration from 
HDFS.






[jira] [Commented] (YARN-2331) Distinguish shutdown during supervision vs. shutdown for rolling upgrade

2017-11-14 Thread Karthik Palaniappan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252784#comment-16252784
 ] 

Karthik Palaniappan commented on YARN-2331:
---

Toggling this new configuration property (yarn.nodemanager.recovery.supervised) 
isn't very different from just toggling the property that enables recovery 
(yarn.nodemanager.recovery.enabled). It's surprising that you now need to flip 
two properties to get NM work preservation to work.

Is there a reason that you need to distinguish between a supervised NM shutdown 
and a rolling-upgrade-related shutdown?

I'm complaining because the instructions in the 2.7 line are incorrect in 2.8: 
https://hadoop.apache.org/docs/r2.7.4/hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html.
 Equivalent docs don't exist in the 2.8 line (i.e. if you change the url to be 
r2.8.2), so I couldn't find any documentation of this new property.

> Distinguish shutdown during supervision vs. shutdown for rolling upgrade
> 
>
> Key: YARN-2331
> URL: https://issues.apache.org/jira/browse/YARN-2331
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>  Labels: BB2015-05-RFC
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-2331.patch, YARN-2331v2.patch, YARN-2331v3.patch
>
>
> When the NM is shutting down with restart support enabled there are scenarios 
> we'd like to distinguish and behave accordingly:
> # The NM is running under supervision.  In that case containers should be 
> preserved so the automatic restart can recover them.
> # The NM is not running under supervision and a rolling upgrade is not being 
> performed.  In that case the shutdown should kill all containers since it is 
> unlikely the NM will be restarted in a timely manner to recover them.
> # The NM is not running under supervision and a rolling upgrade is being 
> performed.  In that case the shutdown should not kill all containers since a 
> restart is imminent due to the rolling upgrade and the containers will be 
> recovered.






[jira] [Comment Edited] (YARN-7274) Ability to disable elasticity at leaf queue level

2017-11-14 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252759#comment-16252759
 ] 

Zian Chen edited comment on YARN-7274 at 11/15/17 12:39 AM:


I investigated this issue and figured out how we could solve it. Basically, the 
sanity check in capacitiesSanityCheck requires that capacity be smaller than or 
equal to maxCapacity. We don't actually need this constraint, because we want to 
allow absolute capacity to equal absolute max capacity in the extreme case. If we 
insist that capacity cannot be bigger than maxCapacity, we can never reach 
absCapacity == absMaxCapacity, since the absCapacity of queue x is the product of 
the capacity of x, the capacity of x's parent, and so on, which keeps the 
absCapacity of x strictly below the absolute max capacity of x. So relaxing the 
constraint that capacity must be smaller than or equal to maxCapacity solves this 
problem. Opinions, [~leftnoteasy]?


was (Author: zian chen):
I investigated this issue and figured out how we could solve it. Basically, the 
sanity check in capacitiesSanityCheck requires that capacity be smaller than or 
equal to maxCapacity. We don't actually need this constraint, because we want to 
allow absolute capacity to equal absolute max capacity in the extreme case. If we 
insist that capacity cannot be bigger than maxCapacity, we can never reach 
absCapacity == absMaxCapacity, since absCapacity is computed as abs(x) = cap(x) * 
cap(x.parent) * ..., which keeps abs(x) strictly below absMax(x). So relaxing the 
constraint that capacity must be smaller than or equal to maxCapacity solves this 
problem. Opinions, [~leftnoteasy]?

> Ability to disable elasticity at leaf queue level
> -
>
> Key: YARN-7274
> URL: https://issues.apache.org/jira/browse/YARN-7274
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Scott Brokaw
>Assignee: Zian Chen
>
> The 
> [documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html]
>  defines yarn.scheduler.capacity.<queue-path>.maximum-capacity as "Maximum 
> queue capacity in percentage (%) as a float. This limits the elasticity for 
> applications in the queue. Defaults to -1 which disables it."
> However, setting this value to -1 sets maximum capacity to 100%, whereas I 
> thought (perhaps incorrectly) that the intention of the -1 setting was to 
> disable elasticity. This is confirmed by looking at the code:
> {code:java}
> public static final float MAXIMUM_CAPACITY_VALUE = 100;
> public static final float DEFAULT_MAXIMUM_CAPACITY_VALUE = -1.0f;
> ..
> maxCapacity = (maxCapacity == DEFAULT_MAXIMUM_CAPACITY_VALUE) ? 
> MAXIMUM_CAPACITY_VALUE : maxCapacity;
> {code}
> The sum of yarn.scheduler.capacity.<queue-path>.capacity for all queues at 
> each level must be equal to 100, but 
> yarn.scheduler.capacity.<queue-path>.maximum-capacity is actually 
> a percentage of the entire cluster, not just the parent queue. Yet it cannot 
> be set lower than the leaf queue's capacity setting. This seems to make it 
> impossible to disable elasticity at the leaf queue level.
> This improvement proposes that YARN be able to disable elasticity 
> at the leaf queue level even if a parent queue permits elasticity, by 
> having a yarn.scheduler.capacity.<queue-path>.maximum-capacity greater than 
> its yarn.scheduler.capacity.<queue-path>.capacity






[jira] [Commented] (YARN-7274) Ability to disable elasticity at leaf queue level

2017-11-14 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252759#comment-16252759
 ] 

Zian Chen commented on YARN-7274:
-

I investigated this issue and figured out how we could solve it. Basically, the 
sanity check in capacitiesSanityCheck requires that capacity be smaller than or 
equal to maxCapacity. We don't actually need this constraint, because we want to 
allow absolute capacity to equal absolute max capacity in the extreme case. If we 
insist that capacity cannot be bigger than maxCapacity, we can never reach 
absCapacity == absMaxCapacity, since absCapacity is computed as abs(x) = cap(x) * 
cap(x.parent) * ..., which keeps abs(x) strictly below absMax(x). So relaxing the 
constraint that capacity must be smaller than or equal to maxCapacity solves this 
problem. Opinions, [~leftnoteasy]?
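
To make the arithmetic in the comment concrete, here is a toy example (the queue 
names and percentages are invented for illustration):

{code:java}
public class AbsCapacitySketch {
  public static void main(String[] args) {
    double capA  = 50.0;  // root.a capacity: 50% of root
    double capA1 = 60.0;  // root.a.a1 capacity: 60% of root.a

    // absCapacity(a1) = cap(a1) * cap(a) * ... = 60% of 50% = 30% of the cluster
    double absCapA1 = (capA1 / 100.0) * (capA / 100.0) * 100.0;
    System.out.println("absCapacity(a1) = " + absCapA1 + "% of the cluster");

    // To disable elasticity, a1's maximum-capacity would have to be set to 30,
    // but the current sanity check rejects that because 30 < capacity (60).
  }
}
{code}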

> Ability to disable elasticity at leaf queue level
> -
>
> Key: YARN-7274
> URL: https://issues.apache.org/jira/browse/YARN-7274
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Scott Brokaw
>Assignee: Zian Chen
>
> The 
> [documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html]
>  defines yarn.scheduler.capacity.<queue-path>.maximum-capacity as "Maximum 
> queue capacity in percentage (%) as a float. This limits the elasticity for 
> applications in the queue. Defaults to -1 which disables it."
> However, setting this value to -1 sets maximum capacity to 100%, whereas I 
> thought (perhaps incorrectly) that the intention of the -1 setting was to 
> disable elasticity. This is confirmed by looking at the code:
> {code:java}
> public static final float MAXIMUM_CAPACITY_VALUE = 100;
> public static final float DEFAULT_MAXIMUM_CAPACITY_VALUE = -1.0f;
> ..
> maxCapacity = (maxCapacity == DEFAULT_MAXIMUM_CAPACITY_VALUE) ? 
> MAXIMUM_CAPACITY_VALUE : maxCapacity;
> {code}
> The sum of yarn.scheduler.capacity.<queue-path>.capacity for all queues at 
> each level must be equal to 100, but 
> yarn.scheduler.capacity.<queue-path>.maximum-capacity is actually 
> a percentage of the entire cluster, not just the parent queue. Yet it cannot 
> be set lower than the leaf queue's capacity setting. This seems to make it 
> impossible to disable elasticity at the leaf queue level.
> This improvement proposes that YARN be able to disable elasticity 
> at the leaf queue level even if a parent queue permits elasticity, by 
> having a yarn.scheduler.capacity.<queue-path>.maximum-capacity greater than 
> its yarn.scheduler.capacity.<queue-path>.capacity






[jira] [Comment Edited] (YARN-6124) Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues

2017-11-14 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252742#comment-16252742
 ] 

Zian Chen edited comment on YARN-6124 at 11/15/17 12:23 AM:


[~leftnoteasy] No problem Wangda, I'm taking this issue and have written some UTs 
as well as a local cluster test for the second patch. I found a minor issue with 
the patch: monitor_interval was not being set, which caused the RM to fail to 
start. The reason for this bug is that the SchedulingMonitorManager was started 
before the YARN configuration had been loaded, so I'll fix this bug and redo the 
test. 


was (Author: zian chen):
[~wangda] No problem Wangda, I'm taking this issue and have written some UTs as 
well as a local cluster test for the second patch. I found a minor issue with the 
patch: monitor_interval was not being set, which caused the RM to fail to start. 
The reason for this bug is that the SchedulingMonitorManager was started before 
the YARN configuration had been loaded, so I'll fix this bug and redo the test. 

> Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin 
> -refreshQueues
> -
>
> Key: YARN-6124
> URL: https://issues.apache.org/jira/browse/YARN-6124
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Zian Chen
> Attachments: YARN-6124.wip.1.patch, YARN-6124.wip.2.patch
>
>
> Currently, enabling / disabling / updating the SchedulingEditPolicy config 
> requires restarting the RM. This is inconvenient when an admin wants to make 
> changes to SchedulingEditPolicies.






[jira] [Comment Edited] (YARN-6124) Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues

2017-11-14 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252742#comment-16252742
 ] 

Zian Chen edited comment on YARN-6124 at 11/15/17 12:21 AM:


[~wangda] No problem Wangda, I'm taking this issue and have written some UTs as 
well as a local cluster test for the second patch. I found a minor issue with the 
patch: monitor_interval was not being set, which caused the RM to fail to start. 
The reason for this bug is that the SchedulingMonitorManager was started before 
the YARN configuration had been loaded, so I'll fix this bug and redo the test. 


was (Author: zian chen):
[~wangdxa] No problem Wangda, I'm taking this issue and have written some UTs as 
well as a local cluster test for the second patch. I found a minor issue with the 
patch: monitor_interval was not being set, which caused the RM to fail to start. 
The reason for this bug is that the SchedulingMonitorManager was started before 
the YARN configuration had been loaded, so I'll fix this bug and redo the test. 

> Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin 
> -refreshQueues
> -
>
> Key: YARN-6124
> URL: https://issues.apache.org/jira/browse/YARN-6124
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Zian Chen
> Attachments: YARN-6124.wip.1.patch, YARN-6124.wip.2.patch
>
>
> Currently, enabling / disabling / updating the SchedulingEditPolicy config 
> requires restarting the RM. This is inconvenient when an admin wants to make 
> changes to SchedulingEditPolicies.






[jira] [Commented] (YARN-6124) Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin -refreshQueues

2017-11-14 Thread Zian Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252742#comment-16252742
 ] 

Zian Chen commented on YARN-6124:
-

[~wangdxa] No problem Wangda, I'm taking this issue and have written some UTs as 
well as a local cluster test for the second patch. I found a minor issue with the 
patch: monitor_interval was not being set, which caused the RM to fail to start. 
The reason for this bug is that the SchedulingMonitorManager was started before 
the YARN configuration had been loaded, so I'll fix this bug and redo the test. 
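
The bug described above is an initialization-ordering problem: the monitor was 
started before the YARN configuration had been loaded, so its interval was never 
set. As a generic illustration of the intended ordering (the class and property 
names below are invented and are not the actual SchedulingMonitorManager code):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Read all monitor settings from the Configuration during init(), and only then
// start the monitoring thread, so the interval can never be picked up unset.
public class MonitorSketch {
  private long monitorIntervalMs;
  private Thread monitorThread;

  public void init(Configuration conf) {
    monitorIntervalMs = conf.getLong("example.monitor.interval-ms", 3000L);
  }

  public void start() {
    monitorThread = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        // ... invoke the scheduling edit policy here ...
        try {
          Thread.sleep(monitorIntervalMs);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }, "example-monitor");
    monitorThread.start();
  }

  public void stop() {
    if (monitorThread != null) {
      monitorThread.interrupt();
    }
  }
}
{code}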

> Make SchedulingEditPolicy can be enabled / disabled / updated with RMAdmin 
> -refreshQueues
> -
>
> Key: YARN-6124
> URL: https://issues.apache.org/jira/browse/YARN-6124
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Wangda Tan
>Assignee: Zian Chen
> Attachments: YARN-6124.wip.1.patch, YARN-6124.wip.2.patch
>
>
> Currently, enabling / disabling / updating the SchedulingEditPolicy config 
> requires restarting the RM. This is inconvenient when an admin wants to make 
> changes to SchedulingEditPolicies.






[jira] [Commented] (YARN-7218) ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2

2017-11-14 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252723#comment-16252723
 ] 

Eric Yang commented on YARN-7218:
-

[~billie.rinaldi] It is good practice to add a root-level wrapper to help 
developers understand the context of the returned JSON. For a large-scale project 
that involves a lot of entities, it is easier to manage with a high-level wrapper 
indicating the data payload. For YARN, at this time, there are custom resolvers 
for each of the web applications attached to the resource manager, node manager, 
and application history server. Per unit tests and manual inspection, the change 
does not affect the timeline service or the web proxy. Hence, this seems like a 
good time to add it and avoid having to add a custom resolver per web application. 
I agree that the YarnJacksonJaxbJsonProvider change is optional and can be removed 
on request.

> ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2
> 
>
> Key: YARN-7218
> URL: https://issues.apache.org/jira/browse/YARN-7218
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7218.001.patch, YARN-7218.002.patch, 
> YARN-7218.003.patch
>
>
> In YARN-6626, there is a desire to be able to run the ApiServer REST API in the 
> Resource Manager; this can eliminate the requirement to deploy another daemon 
> service for submitting Docker applications. In YARN-5698, a new UI has been 
> implemented as a separate web application. There are some problems in this 
> arrangement that can cause conflicts in how Java sessions are managed. 
> The root context of the Resource Manager web application is /ws, which is 
> hard-coded in the startWebapp method in ResourceManager.java. This means all 
> session management is applied to web URLs under the /ws prefix. /ui2 is 
> independent of the /ws context, therefore session management code doesn't apply 
> to /ui2. This could become a session management problem if servlet-based code is 
> introduced into the /ui2 web application.
> The ApiServer code base is designed as a separate web application. There is no 
> easy way to inject a separate web application into the same /ws context 
> because the ResourceManager is already set up to bind to RMWebServices. Unless 
> the ApiServer code is moved into RMWebServices, they will not share 
> the same session management.
> The alternative solution is to keep the ApiServer prefix URL independent of the 
> /ws context. However, this will be a departure from the YARN web services naming 
> convention. It can be loaded as a separate web application in the Resource 
> Manager Jetty server. One possible proposal is /app/v1/services. This can 
> keep the ApiServer code modular and independent from the Resource Manager.






[jira] [Commented] (YARN-7495) Improve robustness of the AggregatedLogDeletionService

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252679#comment-16252679
 ] 

Hadoop QA commented on YARN-7495:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 17s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 1 new + 
7 unchanged - 2 fixed = 8 total (was 9) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 44s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
40s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7495 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897638/YARN-7495.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 5ab69f2de0c0 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 18621af |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/18492/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/18492/artifact/out/whitespace-tabs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/18492/testReport/ |
| Max. process+thread c

[jira] [Commented] (YARN-7496) CS Intra-queue preemption user-limit calculations are not in line with LeafQueue user-limit calculations

2017-11-14 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252643#comment-16252643
 ] 

Eric Payne commented on YARN-7496:
--

Cluster Configuration:
- Cluster Memory: 20GB
- Queue1 capacity and max capacity: 50% : 100%
- Queue2 capacity and max capacity: 50% : 100%
- Queue1: Intra-queue preemption: enabled
- Default container size: 0.5GB

Use Case:
- User1 submits App1 in Queue1 and consumes 12.5GB
- User2 submits App2 in Queue1 and consumes 7.5GB
- User3 submits App3 in Queue1
- Preemption monitor calculates user limit to be {{((total used resources in 
Queue1) / (number of all users)) + (1 container) = normalizeup((20GB/3),0.5GB) 
+ 0.5GB = 7GB + 0.5GB = 7.5GB}}
- Preemption monitor sees that App1 is the only one that has resources, so it 
tries to preempt containers down to 7.5GB from {{App1}}.
- The problem comes here: Capacity Scheduler calculates user limit to be 
{{((total used resources in Queue1) / (number of active users)) + (1 container) 
= normalizeup((20GB/2),0.5GB) + 0.5GB = 10GB + 0.5GB = 10.5GB}}
- Therefore, once {{App1}} gets to 10.5GB, the preemption monitor will try to 
preempt 2.5GB more resources from {{App1}}, but the Capacity Scheduler gives 
them back. This creates oscillation.
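
To make the gap concrete, here is a small standalone sketch (plain Java, not RM code; {{normalizeUp}} is a stand-in for the scheduler's resource rounding) that reproduces the 7.5GB vs. 10.5GB figures above:

{code}
// Hypothetical illustration of the two user-limit formulas from the example
// above, using 0.5GB containers and 20GB used in Queue1.
public class UserLimitExample {
  // Round value up to the next multiple of step (stand-in for resource normalization).
  static long normalizeUp(long value, long step) {
    return ((value + step - 1) / step) * step;
  }

  public static void main(String[] args) {
    long queueUsedMB = 20 * 1024; // 20GB used in Queue1
    long containerMB = 512;       // 0.5GB default container size

    // Preemption monitor divides by ALL users (3): 7GB + 0.5GB = 7.5GB
    long preemptionUserLimit =
        normalizeUp(queueUsedMB / 3, containerMB) + containerMB;

    // 2.8 Capacity Scheduler divides by ACTIVE users (2): 10GB + 0.5GB = 10.5GB
    long schedulerUserLimit =
        normalizeUp(queueUsedMB / 2, containerMB) + containerMB;

    System.out.println("preemption monitor user limit (MB): " + preemptionUserLimit);
    System.out.println("scheduler user limit (MB): " + schedulerUserLimit);
  }
}
{code}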

> CS Intra-queue preemption user-limit calculations are not in line with 
> LeafQueue user-limit calculations
> 
>
> Key: YARN-7496
> URL: https://issues.apache.org/jira/browse/YARN-7496
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.2
>Reporter: Eric Payne
>Assignee: Eric Payne
>
> Only a problem in 2.8.
> Preemption could oscillate due to the difference in how user limit is 
> calculated between 2.8 and later releases.
> Basically (ignoring ULF, MULP, and maybe others), the calculation for user 
> limit on the Capacity Scheduler side in 2.8 is {{total used resources / 
> number of active users}} while the calculation in later releases is {{total 
> active resources / number of active users}}. When intra-queue preemption was 
> backported to 2.8, its calculations for user limit were more aligned with 
> the latter algorithm, which is in 2.9 and later releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7486) Race condition in service AM that can cause NPE

2017-11-14 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252638#comment-16252638
 ] 

Billie Rinaldi commented on YARN-7486:
--

+1 for patch 01, this solves the race condition. I also verified that the new 
test fails without the main changes.

> Race condition in service AM that can cause NPE
> ---
>
> Key: YARN-7486
> URL: https://issues.apache.org/jira/browse/YARN-7486
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7486.01.patch
>
>
> 1. container1 completed for instance1
> 2. instance1 is added to pending list, and sends an event asynchronously to 
> instance1 to run ContainerStoppedTransition
> 3. container2 allocated, and assigned to instance1, it records the container2 
> inside instance1
> 4. in the meantime, instance1 ContainerStoppedTransition is called and that 
> sets the container back to null. 
> This causes the recorded container to be lost.
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402)
>   at 
> org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70)
>   at 
> org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7496) CS Intra-queue preemption user-limit calculations are not in line with LeafQueue user-limit calculations

2017-11-14 Thread Eric Payne (JIRA)
Eric Payne created YARN-7496:


 Summary: CS Intra-queue preemption user-limit calculations are not 
in line with LeafQueue user-limit calculations
 Key: YARN-7496
 URL: https://issues.apache.org/jira/browse/YARN-7496
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.2
Reporter: Eric Payne
Assignee: Eric Payne


Only a problem in 2.8.

Preemption could oscillate due to the difference in how user limit is 
calculated between 2.8 and later releases.

Basically (ignoring ULF, MULP, and maybe others), the calculation for user 
limit on the Capacity Scheduler side in 2.8 is {{total used resources / number 
of active users}} while the calculation in later releases is {{total active 
resources / number of active users}}. When intra-queue preemption was 
backported to 2.8, it's calculations for user limit were more aligned with the 
latter algorithm, which is in 2.9 and later releases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7495) Improve robustness of the AggregatedLogDeletionService

2017-11-14 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-7495:
--
Attachment: YARN-7495.001.patch

> Improve robustness of the AggregatedLogDeletionService
> --
>
> Key: YARN-7495
> URL: https://issues.apache.org/jira/browse/YARN-7495
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: YARN-7495.001.patch
>
>
> The deletion tasks are TimerTasks scheduled via Timer.scheduleAtFixedRate. If 
> an exception escapes a log deletion task, the Timer treats this as a task 
> cancellation and stops scheduling future deletion tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7218) ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2

2017-11-14 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252211#comment-16252211
 ] 

Billie Rinaldi commented on YARN-7218:
--

I am trying out patch 03. Is the change to YarnJacksonJaxbJsonProvider 
necessary? It seems like this is not being used and that it could change 
behavior for other webapps using the provider.

> ApiServer REST API naming convention /ws/v1 is already used in Hadoop v2
> 
>
> Key: YARN-7218
> URL: https://issues.apache.org/jira/browse/YARN-7218
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, applications
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: YARN-7218.001.patch, YARN-7218.002.patch, 
> YARN-7218.003.patch
>
>
> In YARN-6626, there is a desire to have ability to run ApiServer REST API in 
> Resource Manager, this can eliminate the requirement to deploy another daemon 
> service for submitting docker applications.  In YARN-5698, a new UI has been 
> implemented as a separate web application.  There are some problems in the 
> arrangement that can cause conflicts of how Java session are being managed.  
> The root context of Resource Manager web application is /ws.  This is hard 
> coded in startWebapp method in ResourceManager.java.  This means all the 
> session management is applied to Web URL of /ws prefix.  /ui2 is independent 
> of /ws context, therefore session management code doesn't apply to /ui2.  
> This could be a session management problem, if servlet based code is going to 
> be introduced into /ui2 web application.
> ApiServer code base is designed as a separate web application.  There is no 
> easy way to inject a separate web application into the same /ws context 
> because ResourceManager is already set up to bind to RMWebServices.  Unless 
> the ApiServer code is moved into RMWebServices, they will not share the same 
> session management.
> The alternate solution is to keep ApiServer prefix URL independent of /ws 
> context.  However, this will be a departure from YARN web services naming 
> convention.  This can be loaded as a separate web application in Resource 
> Manager jetty server.  One possible proposal is /app/v1/services.  This can 
> keep ApiServer code modular and independent from Resource Manager.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1015) FS should watch node resource utilization and allocate opportunistic containers if appropriate

2017-11-14 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252212#comment-16252212
 ] 

Haibo Chen commented on YARN-1015:
--

[~asuresh] Have you got a chance to look at the latest patch yet?

> FS should watch node resource utilization and allocate opportunistic 
> containers if appropriate
> --
>
> Key: YARN-1015
> URL: https://issues.apache.org/jira/browse/YARN-1015
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun C Murthy
>Assignee: Haibo Chen
> Attachments: YARN-1015-YARN-1011.00.patch, 
> YARN-1015-YARN-1011.01.patch, YARN-1015-YARN-1011.02.patch, 
> YARN-1015-YARN-1011.prelim.patch
>
>
> FS should look at resource utilization of nodes (provided by NM in 
> heartbeat) and allocate opportunistic containers if the resource utilization 
> of the node is below its allocation threshold.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7491) Make sure AM is not scheduled on an opportunistic container

2017-11-14 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-7491:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-1011

> Make sure AM is not scheduled on an opportunistic container
> ---
>
> Key: YARN-7491
> URL: https://issues.apache.org/jira/browse/YARN-7491
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7495) Improve robustness of the AggregatedLogDeletionService

2017-11-14 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created YARN-7495:
-

 Summary: Improve robustness of the AggregatedLogDeletionService
 Key: YARN-7495
 URL: https://issues.apache.org/jira/browse/YARN-7495
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: log-aggregation
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles


The deletion tasks are TimerTasks scheduled via Timer.scheduleAtFixedRate. If an 
exception escapes a log deletion task, the Timer treats this as a task 
cancellation and stops scheduling future deletion tasks.
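
A generic sketch of the failure mode and the usual defensive pattern (catching Throwable inside {{run()}}); this is only an illustration, not the attached patch, and {{deleteExpiredLogs}} is a hypothetical stand-in:

{code}
import java.util.Timer;
import java.util.TimerTask;

public class RobustTimerTaskExample {
  public static void main(String[] args) {
    Timer timer = new Timer("log-deletion");

    TimerTask task = new TimerTask() {
      @Override
      public void run() {
        try {
          deleteExpiredLogs(); // hypothetical stand-in for the deletion work
        } catch (Throwable t) {
          // Without this catch, an escaping RuntimeException cancels the Timer
          // and no further deletion runs are ever scheduled.
          System.err.println("Log deletion failed, will retry next cycle: " + t);
        }
      }
    };

    timer.scheduleAtFixedRate(task, 0, 60_000L); // run every minute
  }

  private static void deleteExpiredLogs() {
    // placeholder for the real aggregated-log deletion logic
  }
}
{code}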



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7361) Improve the docker container runtime documentation

2017-11-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252168#comment-16252168
 ] 

Jason Lowe commented on YARN-7361:
--

Sorry for the late reply.  The capabilities property example should be updated 
to include the property description changes from YARN-7286, otherwise the user 
may think that setting this to an empty value (or not setting it at all) does 
not apply any capabilities when it actually does.


> Improve the docker container runtime documentation
> --
>
> Key: YARN-7361
> URL: https://issues.apache.org/jira/browse/YARN-7361
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Shane Kumpf
>Assignee: Shane Kumpf
> Attachments: YARN-7361.001.patch
>
>
> During review of YARN-7230, it was found that 
> yarn.nodemanager.runtime.linux.docker.capabilities is missing from the docker 
> containers documentation in most of the active branches. We can also improve 
> the warning that was introduced in YARN-6622.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6078) Containers stuck in Localizing state

2017-11-14 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-6078:
-
Fix Version/s: 2.10.0

> Containers stuck in Localizing state
> 
>
> Key: YARN-6078
> URL: https://issues.apache.org/jira/browse/YARN-6078
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jagadish
>Assignee: Billie Rinaldi
> Fix For: 3.0.0, 3.1.0, 2.10.0, 2.9.1
>
> Attachments: YARN-6078-branch-2.001.patch, YARN-6078.001.patch, 
> YARN-6078.002.patch, YARN-6078.003.patch
>
>
> I encountered an interesting issue in one of our Yarn clusters (where the 
> containers are stuck in localizing phase).
> Our AM requests a container, and starts a process using the NMClient.
> According to the NM the container is in LOCALIZING state:
> {code}
> 1. 2017-01-09 22:06:18,362 [INFO] [AsyncDispatcher event handler] 
> container.ContainerImpl.handle(ContainerImpl.java:1135) - Container 
> container_e03_1481261762048_0541_02_60 transitioned from NEW to LOCALIZING
> 2017-01-09 22:06:18,363 [INFO] [AsyncDispatcher event handler] 
> localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:711)
>  - Created localizer for container_e03_1481261762048_0541_02_60
> 2017-01-09 22:06:18,364 [INFO] [LocalizerRunner for 
> container_e03_1481261762048_0541_02_60] 
> localizer.ResourceLocalizationService$LocalizerRunner.writeCredentials(ResourceLocalizationService.java:1191)
>  - Writing credentials to the nmPrivate file 
> /../..//.nmPrivate/container_e03_1481261762048_0541_02_60.tokens. 
> Credentials list:
> {code}
> According to the RM the container is in RUNNING state:
> {code}
> 2017-01-09 22:06:17,110 [INFO] [IPC Server handler 19 on 8030] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_60 Container Transitioned from 
> ALLOCATED to ACQUIRED
> 2017-01-09 22:06:19,084 [INFO] [ResourceManager Event Processor] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_60 Container Transitioned from 
> ACQUIRED to RUNNING
> {code}
> When I click the Yarn RM UI to view the logs for the container,  I get an 
> error
> that
> {code}
> No logs were found. state is LOCALIZING
> {code}
> The Node manager's stack trace seems to indicate that the NM's 
> LocalizerRunner is stuck waiting to read from the sub-process's outputstream.
> {code}
> "LocalizerRunner for container_e03_1481261762048_0541_02_60" #27007081 
> prio=5 os_prio=0 tid=0x7fa518849800 nid=0x15f7 runnable 
> [0x7fa5076c3000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.FileInputStream.readBytes(Native Method)
>   at java.io.FileInputStream.read(FileInputStream.java:255)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>   - locked <0xc6dc9c50> (a 
> java.lang.UNIXProcess$ProcessPipeInputStream)
>   at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>   at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>   at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>   - locked <0xc6dc9c78> (a java.io.InputStreamReader)
>   at java.io.InputStreamReader.read(InputStreamReader.java:184)
>   at java.io.BufferedReader.fill(BufferedReader.java:161)
>   at java.io.BufferedReader.read1(BufferedReader.java:212)
>   at java.io.BufferedReader.read(BufferedReader.java:286)
>   - locked <0xc6dc9c78> (a java.io.InputStreamReader)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:786)
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:568)
>   at org.apache.hadoop.util.Shell.run(Shell.java:479)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:237)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1113)
> {code}
> I did a {code}ps aux{code} and confirmed that there was no container-executor 
> process running with INITIALIZE_CONTAINER that the localizer starts. It seems 
> that the output stream pipe of the process is still not closed (even though 
> the localizer process is no longer present).
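
A small standalone demo (not NM code; the shell command is just an illustration) of the blocking behavior in the stack trace above: a read on a child's stdout blocks until every holder of the pipe's write end has closed it, even if the direct child process has already exited.

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class PipeBlockDemo {
  public static void main(String[] args) throws Exception {
    // The child exits immediately, but it spawns a background process that
    // inherits stdout and sleeps, so the pipe stays open and read() blocks.
    Process p = new ProcessBuilder("/bin/sh", "-c", "sleep 600 & exit 0").start();
    try (BufferedReader r =
             new BufferedReader(new InputStreamReader(p.getInputStream()))) {
      System.out.println("child exited with: " + p.waitFor());
      System.out.println("blocking on read() until the pipe closes...");
      System.out.println("read returned: " + r.read()); // blocks ~10 minutes
    }
  }
}
{code}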



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6078) Containers stuck in Localizing state

2017-11-14 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252163#comment-16252163
 ] 

Junping Du commented on YARN-6078:
--

+1 on the branch-2 patch. I have committed the patch to trunk, branch-3.0, 
branch-2 and branch-2.9. Thanks [~billie.rinaldi] for the patch and 
[~bibinchundatt] for the review!

> Containers stuck in Localizing state
> 
>
> Key: YARN-6078
> URL: https://issues.apache.org/jira/browse/YARN-6078
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jagadish
>Assignee: Billie Rinaldi
> Fix For: 3.0.0, 3.1.0, 2.10.0, 2.9.1
>
> Attachments: YARN-6078-branch-2.001.patch, YARN-6078.001.patch, 
> YARN-6078.002.patch, YARN-6078.003.patch
>
>
> I encountered an interesting issue in one of our Yarn clusters (where the 
> containers are stuck in localizing phase).
> Our AM requests a container, and starts a process using the NMClient.
> According to the NM the container is in LOCALIZING state:
> {code}
> 1. 2017-01-09 22:06:18,362 [INFO] [AsyncDispatcher event handler] 
> container.ContainerImpl.handle(ContainerImpl.java:1135) - Container 
> container_e03_1481261762048_0541_02_60 transitioned from NEW to LOCALIZING
> 2017-01-09 22:06:18,363 [INFO] [AsyncDispatcher event handler] 
> localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:711)
>  - Created localizer for container_e03_1481261762048_0541_02_60
> 2017-01-09 22:06:18,364 [INFO] [LocalizerRunner for 
> container_e03_1481261762048_0541_02_60] 
> localizer.ResourceLocalizationService$LocalizerRunner.writeCredentials(ResourceLocalizationService.java:1191)
>  - Writing credentials to the nmPrivate file 
> /../..//.nmPrivate/container_e03_1481261762048_0541_02_60.tokens. 
> Credentials list:
> {code}
> According to the RM the container is in RUNNING state:
> {code}
> 2017-01-09 22:06:17,110 [INFO] [IPC Server handler 19 on 8030] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_60 Container Transitioned from 
> ALLOCATED to ACQUIRED
> 2017-01-09 22:06:19,084 [INFO] [ResourceManager Event Processor] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_60 Container Transitioned from 
> ACQUIRED to RUNNING
> {code}
> When I click the Yarn RM UI to view the logs for the container,  I get an 
> error
> that
> {code}
> No logs were found. state is LOCALIZING
> {code}
> The Node manager's stack trace seems to indicate that the NM's 
> LocalizerRunner is stuck waiting to read from the sub-process's outputstream.
> {code}
> "LocalizerRunner for container_e03_1481261762048_0541_02_60" #27007081 
> prio=5 os_prio=0 tid=0x7fa518849800 nid=0x15f7 runnable 
> [0x7fa5076c3000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.FileInputStream.readBytes(Native Method)
>   at java.io.FileInputStream.read(FileInputStream.java:255)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>   - locked <0xc6dc9c50> (a 
> java.lang.UNIXProcess$ProcessPipeInputStream)
>   at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>   at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>   at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>   - locked <0xc6dc9c78> (a java.io.InputStreamReader)
>   at java.io.InputStreamReader.read(InputStreamReader.java:184)
>   at java.io.BufferedReader.fill(BufferedReader.java:161)
>   at java.io.BufferedReader.read1(BufferedReader.java:212)
>   at java.io.BufferedReader.read(BufferedReader.java:286)
>   - locked <0xc6dc9c78> (a java.io.InputStreamReader)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:786)
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:568)
>   at org.apache.hadoop.util.Shell.run(Shell.java:479)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:237)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1113)
> {code}
> I did a {code}ps aux{code} and confirmed that there was no container-executor 
> process running with INITIALIZE_CONTAINER that the localizer starts. It seems 
> that the output stream pipe of the process is still not closed (even though 
> the localizer process is no longer present).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (YARN-6078) Containers stuck in Localizing state

2017-11-14 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-6078:
-
Fix Version/s: 2.9.1

> Containers stuck in Localizing state
> 
>
> Key: YARN-6078
> URL: https://issues.apache.org/jira/browse/YARN-6078
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jagadish
>Assignee: Billie Rinaldi
> Fix For: 3.0.0, 3.1.0, 2.9.1
>
> Attachments: YARN-6078-branch-2.001.patch, YARN-6078.001.patch, 
> YARN-6078.002.patch, YARN-6078.003.patch
>
>
> I encountered an interesting issue in one of our Yarn clusters (where the 
> containers are stuck in localizing phase).
> Our AM requests a container, and starts a process using the NMClient.
> According to the NM the container is in LOCALIZING state:
> {code}
> 1. 2017-01-09 22:06:18,362 [INFO] [AsyncDispatcher event handler] 
> container.ContainerImpl.handle(ContainerImpl.java:1135) - Container 
> container_e03_1481261762048_0541_02_60 transitioned from NEW to LOCALIZING
> 2017-01-09 22:06:18,363 [INFO] [AsyncDispatcher event handler] 
> localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:711)
>  - Created localizer for container_e03_1481261762048_0541_02_60
> 2017-01-09 22:06:18,364 [INFO] [LocalizerRunner for 
> container_e03_1481261762048_0541_02_60] 
> localizer.ResourceLocalizationService$LocalizerRunner.writeCredentials(ResourceLocalizationService.java:1191)
>  - Writing credentials to the nmPrivate file 
> /../..//.nmPrivate/container_e03_1481261762048_0541_02_60.tokens. 
> Credentials list:
> {code}
> According to the RM the container is in RUNNING state:
> {code}
> 2017-01-09 22:06:17,110 [INFO] [IPC Server handler 19 on 8030] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_60 Container Transitioned from 
> ALLOCATED to ACQUIRED
> 2017-01-09 22:06:19,084 [INFO] [ResourceManager Event Processor] 
> rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:410) - 
> container_e03_1481261762048_0541_02_60 Container Transitioned from 
> ACQUIRED to RUNNING
> {code}
> When I click the Yarn RM UI to view the logs for the container,  I get an 
> error
> that
> {code}
> No logs were found. state is LOCALIZING
> {code}
> The Node manager's stack trace seems to indicate that the NM's 
> LocalizerRunner is stuck waiting to read from the sub-process's outputstream.
> {code}
> "LocalizerRunner for container_e03_1481261762048_0541_02_60" #27007081 
> prio=5 os_prio=0 tid=0x7fa518849800 nid=0x15f7 runnable 
> [0x7fa5076c3000]
>java.lang.Thread.State: RUNNABLE
>   at java.io.FileInputStream.readBytes(Native Method)
>   at java.io.FileInputStream.read(FileInputStream.java:255)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>   - locked <0xc6dc9c50> (a 
> java.lang.UNIXProcess$ProcessPipeInputStream)
>   at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
>   at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
>   at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
>   - locked <0xc6dc9c78> (a java.io.InputStreamReader)
>   at java.io.InputStreamReader.read(InputStreamReader.java:184)
>   at java.io.BufferedReader.fill(BufferedReader.java:161)
>   at java.io.BufferedReader.read1(BufferedReader.java:212)
>   at java.io.BufferedReader.read(BufferedReader.java:286)
>   - locked <0xc6dc9c78> (a java.io.InputStreamReader)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:786)
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:568)
>   at org.apache.hadoop.util.Shell.run(Shell.java:479)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:237)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1113)
> {code}
> I did a {code}ps aux{code} and confirmed that there was no container-executor 
> process running with INITIALIZE_CONTAINER that the localizer starts. It seems 
> that the output stream pipe of the process is still not closed (even though 
> the localizer process is no longer present).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6384) Add configuration property to set max CPU usage when strict-resource-usage is false with cgroups

2017-11-14 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-6384:
---
Summary: Add configuration property to set max CPU usage when 
strict-resource-usage is false with cgroups  (was: Add configuratin to set max 
cpu usage when strict-resource-usage is false with cgroups)

> Add configuration property to set max CPU usage when strict-resource-usage is 
> false with cgroups
> 
>
> Key: YARN-6384
> URL: https://issues.apache.org/jira/browse/YARN-6384
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: dengkai
> Attachments: YARN-6384-0.patch, YARN-6384-1.patch, YARN-6384-2.patch, 
> YARN-6384-3.patch
>
>
> When using cgroups on YARN, if 
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage is 
> false, a user may get much more CPU time than expected based on the vcores. 
> There should be an upper limit even when resource usage is not strict, e.g. a 
> percentage by which a user can exceed what is promised by the vcores. I think 
> it's important in a shared cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6426) Compress ZK YARN keys to scale up (especially AppStateData

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252060#comment-16252060
 ] 

Hadoop QA commented on YARN-6426:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} YARN-6426 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6426 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12861550/zkcompression.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18491/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Compress ZK YARN keys to scale up (especially AppStateData
> --
>
> Key: YARN-6426
> URL: https://issues.apache.org/jira/browse/YARN-6426
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0-alpha2
>Reporter: Roni Burd
>Assignee: Roni Burd
>  Labels: patch
> Attachments: zkcompression.patch
>
>
> ZK today stores the protobuf files uncompressed. This is not an issue except 
> that if a customer job has thousands of files, AppStateData will store the 
> user context as a string with multiple URLs and it is easy to get to 1MB or 
> more. 
> This can put unnecessary strain on ZK and make the process slow. 
> The proposal is to simply compress protobufs before sending them to ZK
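
A minimal sketch of the idea (gzip the serialized protobuf bytes before handing them to the ZK store); the class name and placeholder payload are hypothetical, and this is not the attached patch:

{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class ZkValueCompressionSketch {
  // Gzip-compress the bytes that would otherwise be written to ZK as-is.
  static byte[] gzip(byte[] raw) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(raw);
    }
    return bos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // In the RM the input would be the serialized AppStateData protobuf;
    // a repetitive placeholder payload stands in for it here.
    byte[] payload = new String(new char[100_000]).replace('\0', 'u')
        .getBytes(StandardCharsets.UTF_8);
    byte[] compressed = gzip(payload);
    System.out.println(payload.length + " -> " + compressed.length + " bytes");
  }
}
{code}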



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7414) FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()

2017-11-14 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252050#comment-16252050
 ] 

Daniel Templeton commented on YARN-7414:


I don't see any difference between patch 1 and patch 2.

> FairScheduler#getAppWeight() should be moved into FSAppAttempt#getWeight()
> --
>
> Key: YARN-7414
> URL: https://issues.apache.org/jira/browse/YARN-7414
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0-beta1
>Reporter: Daniel Templeton
>Assignee: Soumabrata Chakraborty
>Priority: Minor
>  Labels: newbie
> Attachments: YARN-7414.001.patch, YARN-7414.002.patch
>
>
> It's illogical that {{FSAppAttempt}} defers to {{FairScheduler}} for its own 
> weight, especially when {{FairScheduler}} has to call back to 
> {{FSAppAttempt}} to get the details to return a value. Instead, 
> {{FSAppAttempt}} should do the work and call out to {{FairScheduler}} to get 
> the details it needs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7088) Fix application start time and add submit time to UIs

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252049#comment-16252049
 ] 

Hadoop QA commented on YARN-7088:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} YARN-7088 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7088 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12885042/YARN-7088.008.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18489/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix application start time and add submit time to UIs
> -
>
> Key: YARN-7088
> URL: https://issues.apache.org/jira/browse/YARN-7088
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
> Attachments: YARN-7088.001.patch, YARN-7088.002.patch, 
> YARN-7088.003.patch, YARN-7088.004.patch, YARN-7088.005.patch, 
> YARN-7088.006.patch, YARN-7088.007.patch, YARN-7088.008.patch
>
>
> Currently, the start time in the old and new UI actually shows the app 
> submission time. There should actually be two different fields; one for the 
> app's submission and one for its start, as well as the elapsed pending time 
> between the two.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6668) Use cgroup to get container resource utilization

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252047#comment-16252047
 ] 

Hadoop QA commented on YARN-6668:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-6668 does not apply to YARN-1011. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6668 |
| GITHUB PR | https://github.com/apache/hadoop/pull/241 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18488/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Use cgroup to get container resource utilization
> 
>
> Key: YARN-6668
> URL: https://issues.apache.org/jira/browse/YARN-6668
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
> Attachments: YARN-6668.000.patch, YARN-6668.001.patch, 
> YARN-6668.002.patch, YARN-6668.003.patch, YARN-6668.004.patch, 
> YARN-6668.005.patch, YARN-6668.006.patch, YARN-6668.007.patch, 
> YARN-6668.008.patch, YARN-6668.009.patch
>
>
> Container Monitor relies on the proc file system to get container resource 
> utilization, which is not as efficient as reading cgroup accounting. In the NM, 
> when cgroups are enabled, we should read cgroup stats instead. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7190) Ensure only NM classpath in 2.x gets TSv2 related hbase jars, not the user classpath

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-7190:
--
Target Version/s: 3.1.0  (was: 3.0.0, 3.1.0)

> Ensure only NM classpath in 2.x gets TSv2 related hbase jars, not the user 
> classpath
> 
>
> Key: YARN-7190
> URL: https://issues.apache.org/jira/browse/YARN-7190
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineclient, timelinereader, timelineserver
>Reporter: Vrushali C
>Assignee: Varun Saxena
> Fix For: 2.9.0, YARN-5355_branch2
>
> Attachments: YARN-7190-YARN-5355_branch2.01.patch, 
> YARN-7190-YARN-5355_branch2.02.patch, YARN-7190-YARN-5355_branch2.03.patch, 
> YARN-7190.01.patch
>
>
> [~jlowe] had a good observation about the user classpath getting extra jars 
> in hadoop 2.x brought in with TSv2.  If users start picking up Hadoop 2.x's 
> version of HBase jars instead of the ones they shipped with their job, it 
> could be a problem.
> So when TSv2 is to be used in 2.x, the HBase-related jars should go onto only 
> the NM classpath, not the user classpath.
> Here is a list of some jars
> {code}
> commons-csv-1.0.jar
> commons-el-1.0.jar
> commons-httpclient-3.1.jar
> disruptor-3.3.0.jar
> findbugs-annotations-1.3.9-1.jar
> hbase-annotations-1.2.6.jar
> hbase-client-1.2.6.jar
> hbase-common-1.2.6.jar
> hbase-hadoop2-compat-1.2.6.jar
> hbase-hadoop-compat-1.2.6.jar
> hbase-prefix-tree-1.2.6.jar
> hbase-procedure-1.2.6.jar
> hbase-protocol-1.2.6.jar
> hbase-server-1.2.6.jar
> htrace-core-3.1.0-incubating.jar
> jamon-runtime-2.4.1.jar
> jasper-compiler-5.5.23.jar
> jasper-runtime-5.5.23.jar
> jcodings-1.0.8.jar
> joni-2.1.2.jar
> jsp-2.1-6.1.14.jar
> jsp-api-2.1-6.1.14.jar
> jsr311-api-1.1.1.jar
> metrics-core-2.2.0.jar
> servlet-api-2.5-6.1.14.jar
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6907) Node information page in the old web UI should report resource types

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-6907:
--
Target Version/s: 3.1.0  (was: 3.0.0, 3.1.0)

> Node information page in the old web UI should report resource types
> 
>
> Key: YARN-6907
> URL: https://issues.apache.org/jira/browse/YARN-6907
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Daniel Templeton
>Assignee: Gergely Novák
>  Labels: newbie
> Attachments: YARN-6907.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7342) Application page doesn't show correct metrics for reservation runs

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-7342:
--
Target Version/s: 3.1.0  (was: 3.0.0, 3.1.0)

> Application page doesn't show correct metrics for reservation runs 
> ---
>
> Key: YARN-7342
> URL: https://issues.apache.org/jira/browse/YARN-7342
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, reservation system
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
> Attachments: Screen Shot 2017-10-16 at 17.27.48.png
>
>
> As the screenshot shows, there are some bugs in the web UI while running jobs 
> with reservations. For example, the queue name should just be "root.queueA" 
> instead of the internal queue name. All metrics (Allocated CPU, % of queue, 
> etc.) are missing for reservation runs. These should be a blocker though. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6426) Compress ZK YARN keys to scale up (especially AppStateData

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-6426:
--
Target Version/s: 3.1.0  (was: 3.0.0, 3.1.0)

> Compress ZK YARN keys to scale up (especially AppStateData
> --
>
> Key: YARN-6426
> URL: https://issues.apache.org/jira/browse/YARN-6426
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0-alpha2
>Reporter: Roni Burd
>Assignee: Roni Burd
>  Labels: patch
> Attachments: zkcompression.patch
>
>
> ZK today stores the protobuf files uncompressed. This is not an issue except 
> that if a customer job has thousands of files, AppStateData will store the 
> user context as a string with multiple URLs and it is easy to get to 1MB or 
> more. 
> This can put unnecessary strain on ZK and make the process slow. 
> The proposal is to simply compress protobufs before sending them to ZK



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7430) User and Group mapping are incorrect in docker container

2017-11-14 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252023#comment-16252023
 ] 

Eric Yang commented on YARN-7430:
-

[~ebadger]
{quote}
I agree, but I was under the assumption that this was acceptable behavior for 
hadoop. I would also like to get rid of this so that we can ensure that hadoop 
is secured, but this removes the ability to run containers based on arbitrary 
docker images. Basically, this would modify the longterm plan for docker in 
hadoop so we need to make sure that we understand what the longterm plan is.
{quote}

Hadoop provides the flexibility to run in both secure and insecure environments.  
HDFS is not governed by uid:gid because the security design is to secure the 
perimeter of the cluster nodes to guarantee consistency in ACLs.  Username and 
group name are unique identifiers without the need for uid:gid.  For YARN, the 
Linux container-executor already enforces uid:gid to ensure that data written 
locally can be read back under the ACLs enforced by Linux.  Hadoop implicitly 
followed the Linux security model without being fully compliant.  There are pros 
and cons to the extra flexibility.  It lets the system run in secure mode 
(Kerberos enabled) and behave like Linux.  It also enables cloud systems to 
simulate simple security, where containers can run with the default container 
executor or another container executor implementation to keep the system secure.  

This JIRA is focused on a default setting that prevents an unintended user from 
gaining extra root privileges even on a system that is configured for "simple" or 
"Kerberos" security mode with the Linux container executor.  The Linux container 
executor, by its own name, represents a security model that honors what the 
Linux system expects.  It would be good for Docker containers to play by the 
same rules.

It is entirely possible to implement another security mechanism that validates 
process rights against file system rights with Docker for cloud use cases where 
not all users are translated to a Unix user.  However, that new branch of code 
would not reside in the Linux container executor.  Does this address the concern 
about the long-term plan?

> User and Group mapping are incorrect in docker container
> 
>
> Key: YARN-7430
> URL: https://issues.apache.org/jira/browse/YARN-7430
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security, yarn
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7430.001.patch, YARN-7430.png
>
>
> In YARN-4266, the recommendation was to use -u [uid]:[gid] numeric values to 
> enforce user and group for the running user.  In YARN-6623, this translated 
> to --user=test --group-add=group1.  The code no longer enforce group 
> correctly for launched process.  
> In addition, the implementation in YARN-6623 requires the user and group 
> information to exist in container to translate username and group to uid/gid. 
>  For users on LDAP, there is no good way to populate container with user and 
> group information. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7486) Race condition in service AM that can cause NPE

2017-11-14 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-7486:
--
Description: 
1. container1 completed for instance1
2. instance1 is added to pending list, and sends an event asynchronously to 
instance1 to run ContainerStoppedTransition
3. container2 allocated, and assigned to instance1, it records the container2 
inside instance1
4. in the meantime, instance1 ContainerStoppedTransition is called and that sets 
the container back to null. 
This causes the recorded container to be lost.

{code}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402)
at 
org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70)
at 
org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

  was:
1. container1 completed for instance1
2. instance1 is added to pending list
3. container2 allocated, and assigned to instance1, it records the container2 
inside instance1
4. in the meantime, instance1 ContainerStoppedTransition is called and that set 
the container back to null. 
This cause the recorded container lost.

{code}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402)
at 
org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70)
at 
org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}
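
An illustrative sketch of the kind of guard that avoids the race described above (class, field and method names are hypothetical, not the actual patch): only clear the recorded container if the stopped container is still the one currently held by the instance.

{code}
final class ComponentInstanceSketch {
  private String currentContainerId; // id of the container assigned to this instance

  synchronized void onContainerAssigned(String containerId) {
    currentContainerId = containerId;
  }

  synchronized void onContainerStopped(String stoppedContainerId) {
    // If container2 was assigned between steps 2 and 4 above, the stale
    // "stopped" event for container1 must not wipe out the new assignment.
    if (stoppedContainerId.equals(currentContainerId)) {
      currentContainerId = null;
    }
  }
}
{code}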


> Race condition in service AM that can cause NPE
> ---
>
> Key: YARN-7486
> URL: https://issues.apache.org/jira/browse/YARN-7486
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7486.01.patch
>
>
> 1. container1 completed for instance1
> 2. instance1 is added to pending list, and sends an event asynchronously to 
> instance1 to run ContainerStoppedTransition
> 3. container2 allocated, and assigned to instance1, it records the container2 
> inside instance1
> 4. in the meantime, instance1 ContainerStoppedTransition is called and that 
> sets the container back to null. 
> This causes the recorded container to be lost.
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402)
>   at 
> org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70)
>   at 
> org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7430) User and Group mapping are incorrect in docker container

2017-11-14 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251940#comment-16251940
 ] 

Eric Badger commented on YARN-7430:
---

bq. If someone is allowing jobs in the mix uid:gid environment without taking 
the effort to manage user uid/gid, they are inherently running insecured 
environment.
I agree, but I was under the assumption that this was acceptable behavior for 
hadoop. I would also like to get rid of this so that we can ensure that hadoop 
is secured, but this removes the ability to run containers based on arbitrary 
docker images. Basically, this would modify the longterm plan for docker in 
hadoop so we need to make sure that we understand what the longterm plan is. 

> User and Group mapping are incorrect in docker container
> 
>
> Key: YARN-7430
> URL: https://issues.apache.org/jira/browse/YARN-7430
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security, yarn
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7430.001.patch, YARN-7430.png
>
>
> In YARN-4266, the recommendation was to use -u [uid]:[gid] numeric values to 
> enforce user and group for the running user.  In YARN-6623, this translated 
> to --user=test --group-add=group1.  The code no longer enforce group 
> correctly for launched process.  
> In addition, the implementation in YARN-6623 requires the user and group 
> information to exist in container to translate username and group to uid/gid. 
>  For users on LDAP, there is no good way to populate container with user and 
> group information. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6789) Add Client API to get all supported resource types from RM

2017-11-14 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-6789:
---
Fix Version/s: 3.1.0

> Add Client API to get all supported resource types from RM
> --
>
> Key: YARN-6789
> URL: https://issues.apache.org/jira/browse/YARN-6789
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 3.0.0, 3.1.0
>
> Attachments: YARN-6789-YARN-3926.001.patch, 
> YARN-6789-YARN-3926.002_incomplete_.patch, YARN-6789-YARN-3926.003.patch, 
> YARN-6789-YARN-3926.004.patch, YARN-6789-YARN-3926.005.patch, 
> YARN-6789-YARN-3926.006.patch, YARN-6789.branch-3.0.001.patch
>
>
> It will be better to provide an api to get all supported resource types from 
> RM.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6789) Add Client API to get all supported resource types from RM

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-6789:
--
Fix Version/s: (was: 3.1.0)
   3.0.0

> Add Client API to get all supported resource types from RM
> --
>
> Key: YARN-6789
> URL: https://issues.apache.org/jira/browse/YARN-6789
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 3.0.0
>
> Attachments: YARN-6789-YARN-3926.001.patch, 
> YARN-6789-YARN-3926.002_incomplete_.patch, YARN-6789-YARN-3926.003.patch, 
> YARN-6789-YARN-3926.004.patch, YARN-6789-YARN-3926.005.patch, 
> YARN-6789-YARN-3926.006.patch, YARN-6789.branch-3.0.001.patch
>
>
> It will be better to provide an api to get all supported resource types from 
> RM.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7412) test_docker_util.test_check_mount_permitted() is failing

2017-11-14 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated YARN-7412:
--
Fix Version/s: 3.0.0

> test_docker_util.test_check_mount_permitted() is failing
> 
>
> Key: YARN-7412
> URL: https://issues.apache.org/jira/browse/YARN-7412
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Eric Badger
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: YARN-7412.001.patch
>
>
> Test output
> failure in TestDockerUtil at 
> /home/haibochen/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:444
>   Expected: itr->second
>     Which is: 1
>   To be equal to: ret
>     Which is: 0
>   for input /usr/bin/touch
> failure in TestDockerUtil at 
> /home/haibochen/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:462
>   Expected: expected[i]
>     Which is: "/usr/bin/touch"
>   To be equal to: ptr[i]
>     Which is: "/bin/touch"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7330) Add support to show GPU on UI/metrics

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251875#comment-16251875
 ] 

Hadoop QA commented on YARN-7330:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
33s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  8s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 19 new + 39 unchanged - 2 fixed = 58 total (was 41) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
21s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
50s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
15s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
13s{color} | {color:green} hadoop-yarn-ui in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}106m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Find

[jira] [Commented] (YARN-7330) Add support to show GPU on UI/metrics

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251848#comment-16251848
 ] 

Hadoop QA commented on YARN-7330:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
5s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
8s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 56s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 19 new + 40 unchanged - 2 fixed = 59 total (was 42) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
55s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
1s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
28s{color} | {color:green} hadoop-yarn-ui in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}102m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Find

[jira] [Commented] (YARN-7159) Normalize unit of resource objects in RM and avoid to do unit conversion in critical path

2017-11-14 Thread Manikandan R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251833#comment-16251833
 ] 

Manikandan R commented on YARN-7159:


The JUnit failures are not related; they run successfully in my setup.

> Normalize unit of resource objects in RM and avoid to do unit conversion in 
> critical path
> -
>
> Key: YARN-7159
> URL: https://issues.apache.org/jira/browse/YARN-7159
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-7159.001.patch, YARN-7159.002.patch, 
> YARN-7159.003.patch, YARN-7159.004.patch, YARN-7159.005.patch, 
> YARN-7159.006.patch, YARN-7159.007.patch, YARN-7159.008.patch, 
> YARN-7159.009.patch, YARN-7159.010.patch, YARN-7159.011.patch, 
> YARN-7159.012.patch, YARN-7159.013.patch, YARN-7159.015.patch, 
> YARN-7159.016.patch, YARN-7159.017.patch, YARN-7159.018.patch
>
>
> Currently, resource unit conversion can happen in the critical code path when a 
> different unit is specified by the client. This can significantly impact the 
> performance and throughput of the RM. We should normalize units when a resource 
> is passed to the RM and avoid the expensive unit conversion every time.
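
A minimal sketch of the idea, with an illustrative class and unit table rather 
than the real RM code: convert the client-supplied value to one canonical unit 
once at submission time, so the scheduler's hot path only compares plain longs.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: normalize memory-like values to "Mi" once, when the
// request enters the RM, instead of converting units in every scheduling cycle.
public class ResourceUnitNormalizer {

  private static final Map<String, Long> TO_MI = new HashMap<>();
  static {
    TO_MI.put("Mi", 1L);
    TO_MI.put("Gi", 1024L);
    TO_MI.put("Ti", 1024L * 1024L);
  }

  public static long toMi(long value, String unit) {
    Long factor = TO_MI.get(unit);
    if (factor == null) {
      throw new IllegalArgumentException("Unsupported unit: " + unit);
    }
    return value * factor;
  }

  public static void main(String[] args) {
    // Done once per request at submission; the scheduler then sees 4096 Mi.
    System.out.println(toMi(4, "Gi") + " Mi");
  }
}
{code}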



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 alpha release

2017-11-14 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251819#comment-16251819
 ] 

Ted Yu commented on YARN-7346:
--

[~ram_krish]:
You can find the branch used by Haibo above.

> Fix compilation errors against hbase2 alpha release
> ---
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7430) User and Group mapping are incorrect in docker container

2017-11-14 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251805#comment-16251805
 ] 

Eric Yang commented on YARN-7430:
-

[~ebadger] IT infrastructure keeps the entire company's user credentials 
consistent through LDAP/AD servers with uniform uid:gid mappings to ensure 
there is no security inconsistency.  From that point of view, it is important 
to keep uid:gid uniform.  In the Docker world, if someone runs an Ubuntu image 
on a RedHat host OS cluster, all system default accounts inside the Ubuntu 
image have different uid:gid numbers.  The typical reaction from a system 
admin is to secure the container by configuring LDAP/sssd in the container, or 
to ban Docker containers from mounting external file systems altogether.

If someone is allowing jobs in a mixed uid:gid environment without taking the 
effort to manage user uid/gid, they are inherently running in an insecure 
environment.  I don't think anyone would support their system being insecure 
for the sake of backward compatibility.
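
As a rough illustration of the numeric uid:gid approach from YARN-4266 (this is 
not the container-executor code; the class name and flow are just a sketch): 
resolve the ids on the host and hand Docker a purely numeric --user flag, so the 
image does not need the user in its own /etc/passwd.

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class NumericUserFlag {

  // Run a short command and return its first line of output.
  private static String run(String... cmd) throws Exception {
    Process p = new ProcessBuilder(cmd).start();
    try (BufferedReader r = new BufferedReader(
        new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
      String out = r.readLine();
      if (p.waitFor() != 0 || out == null) {
        throw new IllegalStateException("Command failed: " + String.join(" ", cmd));
      }
      return out.trim();
    }
  }

  public static void main(String[] args) throws Exception {
    String user = args.length > 0 ? args[0] : System.getProperty("user.name");
    String uid = run("id", "-u", user);  // numeric uid on the host
    String gid = run("id", "-g", user);  // primary numeric gid on the host
    // The flag handed to 'docker run', e.g. "--user=1004:1004"
    System.out.println("--user=" + uid + ":" + gid);
  }
}
{code}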

> User and Group mapping are incorrect in docker container
> 
>
> Key: YARN-7430
> URL: https://issues.apache.org/jira/browse/YARN-7430
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security, yarn
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7430.001.patch, YARN-7430.png
>
>
> In YARN-4266, the recommendation was to use -u [uid]:[gid] numeric values to 
> enforce the user and group for the running user.  In YARN-6623, this translated 
> to --user=test --group-add=group1.  The code no longer enforces the group 
> correctly for the launched process.  
> In addition, the implementation in YARN-6623 requires the user and group 
> information to exist in the container to translate the username and group to 
> uid/gid.  For users on LDAP, there is no good way to populate the container 
> with user and group information. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7430) User and Group mapping are incorrect in docker container

2017-11-14 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251805#comment-16251805
 ] 

Eric Yang edited comment on YARN-7430 at 11/14/17 5:40 PM:
---

[~ebadger] IT infrastructure keeps the entire company's user credentials 
consistent through LDAP/AD servers with uniform uid:gid mappings to ensure 
there is no security inconsistency.  From that point of view, it is important 
to keep uid:gid uniform.  In the Docker world, if someone runs an Ubuntu image 
on a RedHat host OS cluster, all system default accounts inside the Ubuntu 
image have different uid:gid numbers.  The typical reaction from a system 
admin is to secure the container by configuring LDAP/sssd in the container, or 
to ban Docker containers from mounting external file systems altogether.

If someone is allowing jobs in a mixed uid:gid environment without taking the 
effort to manage user uid/gid, they are inherently running in an insecure 
environment.  I don't think anyone would support their system being insecure 
for the sake of backward compatibility.


was (Author: eyang):
[~ebadger] IT infrastructure keeps the entire company's user credential 
consistent through the usage of LDAP/AD servers with uniformed uid:gid to 
ensure there is no security inconsistency.  From that point of view, it is 
important to keep uid:gid uniformed.  In the docker world, if someone runs a 
Ubuntu image on a RedHat Host OS cluster, all system default accounts inside 
Ubuntu image have different uid:gid numbers.  Typical reaction from system 
admin is to secure the container by configuring LDAP/sssd in the container, or 
ban docker container to mount external file system all together.

If someone is allowing jobs in the mix uid:gid environment without taking the 
effort to manage user uid/gid, they are inherently running on a insecured 
environment.  I don't think anyone would support their system being insecure 
for the sake of backward compatibility.

> User and Group mapping are incorrect in docker container
> 
>
> Key: YARN-7430
> URL: https://issues.apache.org/jira/browse/YARN-7430
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security, yarn
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7430.001.patch, YARN-7430.png
>
>
> In YARN-4266, the recommendation was to use -u [uid]:[gid] numeric values to 
> enforce the user and group for the running user.  In YARN-6623, this translated 
> to --user=test --group-add=group1.  The code no longer enforces the group 
> correctly for the launched process.  
> In addition, the implementation in YARN-6623 requires the user and group 
> information to exist in the container to translate the username and group to 
> uid/gid.  For users on LDAP, there is no good way to populate the container 
> with user and group information. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7330) Add support to show GPU on UI/metrics

2017-11-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251770#comment-16251770
 ] 

Sunil G commented on YARN-7330:
---

[~skmvasu], could you please take a look?

> Add support to show GPU on UI/metrics
> -
>
> Key: YARN-7330
> URL: https://issues.apache.org/jira/browse/YARN-7330
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7330.0-wip.patch, YARN-7330.003.patch, 
> YARN-7330.004.patch, YARN-7330.006.patch, YARN-7330.007.patch, 
> YARN-7330.008.patch, YARN-7330.1-wip.patch, YARN-7330.2-wip.patch, 
> screencapture-0-wip.png
>
>
> We should be able to view GPU metrics from UI/REST API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7430) User and Group mapping are incorrect in docker container

2017-11-14 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251740#comment-16251740
 ] 

Eric Badger commented on YARN-7430:
---

bq. I'm fine with it. Eric Badger - does it sounds ok to you?

I'm less familiar with use cases outside of using uid:gid to enter the 
container. However, I'm wondering if this would cause some other use cases to 
fail. If an image has a user {{foo}} with different uid:gid pairs inside and 
outside of the container or if the user doesn't exist outside of the container, 
then the process may fail due to permissions issues or due to user lookup 
failures. I imagine this might be the case for standing up simple long running 
services, like a web server or something like that. Basically, enabling uid:gid 
remapping by default will require the docker image and the host to be in sync 
with their users. This isn't currently a requirement and could possibly break 
jobs. Hopefully someone else is more familiar with these cases and can shed 
some more light on whether or not this would break jobs.

> User and Group mapping are incorrect in docker container
> 
>
> Key: YARN-7430
> URL: https://issues.apache.org/jira/browse/YARN-7430
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: security, yarn
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-7430.001.patch, YARN-7430.png
>
>
> In YARN-4266, the recommendation was to use -u [uid]:[gid] numeric values to 
> enforce the user and group for the running user.  In YARN-6623, this translated 
> to --user=test --group-add=group1.  The code no longer enforces the group 
> correctly for the launched process.  
> In addition, the implementation in YARN-6623 requires the user and group 
> information to exist in the container to translate the username and group to 
> uid/gid.  For users on LDAP, there is no good way to populate the container 
> with user and group information. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7159) Normalize unit of resource objects in RM and avoid to do unit conversion in critical path

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251732#comment-16251732
 ] 

Hadoop QA commented on YARN-7159:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
11s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
41s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
40s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 47s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-yarn-server-resourcemanager:1 |
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing |
|   | hadoop.yarn.server.resourcemanager.TestRMAdminService |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
| Timed out junit tests | 
org.apache.hadoop.yarn.server.re

[jira] [Commented] (YARN-7486) Race condition in service AM that can cause NPE

2017-11-14 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251722#comment-16251722
 ] 

Billie Rinaldi commented on YARN-7486:
--

Thanks [~jianhe], this patch is looking good. I will do some testing and let 
you know how it goes.

> Race condition in service AM that can cause NPE
> ---
>
> Key: YARN-7486
> URL: https://issues.apache.org/jira/browse/YARN-7486
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7486.01.patch
>
>
> 1. container1 completed for instance1
> 2. instance1 is added to the pending list
> 3. container2 is allocated and assigned to instance1, which records container2 
> inside instance1
> 4. in the meantime, instance1's ContainerStoppedTransition is called and sets 
> the container back to null. 
> This causes the recorded container to be lost.
> {code}
>   java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.service.provider.ProviderUtils.initCompTokensForSubstitute(ProviderUtils.java:402)
>   at 
> org.apache.hadoop.yarn.service.provider.AbstractProviderService.buildContainerLaunchContext(AbstractProviderService.java:70)
>   at 
> org.apache.hadoop.yarn.service.containerlaunch.ContainerLaunchService$ContainerLauncher.run(ContainerLaunchService.java:89)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
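
As a rough sketch of the kind of guard being discussed (class, field, and method 
names here are hypothetical, not the actual service AM code or the attached 
patch): do the assignment, the stop transition, and the launcher's read under 
one lock, and only null the field when the stopped container is still the one 
assigned.

{code:java}
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerId;

public class ComponentInstanceSketch {

  private Container container;            // container currently assigned to this instance
  private final Object lock = new Object();

  // Called when the AM assigns a freshly allocated container to this instance.
  public void assignContainer(Container newContainer) {
    synchronized (lock) {
      container = newContainer;
    }
  }

  // Called from the stop transition when a container of this instance completes.
  public void onContainerStopped(ContainerId stoppedId) {
    synchronized (lock) {
      // Only clear the field if the stopped container is still the assigned one,
      // so a newly assigned container2 is not wiped out by container1's stop event.
      if (container != null && container.getId().equals(stoppedId)) {
        container = null;
      }
    }
  }

  // Called by the container launcher before building the launch context.
  public Container getAssignedContainerOrFail() {
    synchronized (lock) {
      if (container == null) {
        throw new IllegalStateException("Container was released before launch");
      }
      return container;
    }
  }
}
{code}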



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7330) Add support to show GPU on UI/metrics

2017-11-14 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-7330:
--
Attachment: YARN-7330.008.patch

Fixing compilation issue.

> Add support to show GPU on UI/metrics
> -
>
> Key: YARN-7330
> URL: https://issues.apache.org/jira/browse/YARN-7330
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7330.0-wip.patch, YARN-7330.003.patch, 
> YARN-7330.004.patch, YARN-7330.006.patch, YARN-7330.007.patch, 
> YARN-7330.008.patch, YARN-7330.1-wip.patch, YARN-7330.2-wip.patch, 
> screencapture-0-wip.png
>
>
> We should be able to view GPU metrics from UI/REST API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7330) Add support to show GPU on UI/metrics

2017-11-14 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-7330:
--
Attachment: YARN-7330.007.patch

Attaching patch addressing comment from Vasudevan.

> Add support to show GPU on UI/metrics
> -
>
> Key: YARN-7330
> URL: https://issues.apache.org/jira/browse/YARN-7330
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7330.0-wip.patch, YARN-7330.003.patch, 
> YARN-7330.004.patch, YARN-7330.006.patch, YARN-7330.007.patch, 
> YARN-7330.1-wip.patch, YARN-7330.2-wip.patch, screencapture-0-wip.png
>
>
> We should be able to view GPU metrics from UI/REST API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7462) Render outstanding resource requests on application details page

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251537#comment-16251537
 ] 

Hadoop QA commented on YARN-7462:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
30m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 52m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7462 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897521/YARN-7462.004.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux 57f44563d06f 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 18621af |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 442 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18485/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Render outstanding resource requests on application details page
> 
>
> Key: YARN-7462
> URL: https://issues.apache.org/jira/browse/YARN-7462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 3.24.30 PM.png, Screen Shot 
> 2017-11-08 at 3.38.48 PM.png, YARN-7462.001.patch, YARN-7462.002.patch, 
> YARN-7462.003.patch, YARN-7462.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7480) Render tooltips on columns where text is clipped

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251529#comment-16251529
 ] 

Hadoop QA commented on YARN-7480:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
27m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7480 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897518/YARN-7480.002.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux 81b5e445d8de 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 18621af |
| maven | version: Apache Maven 3.3.9 |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/18484/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 353 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18484/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Render tooltips on columns where text is clipped
> 
>
> Key: YARN-7480
> URL: https://issues.apache.org/jira/browse/YARN-7480
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7480.001.patch, YARN-7480.002.patch
>
>
> In em-table, when text gets clipped the information is lost. Need to render a 
> tooltip to show the full text in these cases



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7464) Allow filters on Nodes page

2017-11-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251528#comment-16251528
 ] 

Sunil G commented on YARN-7464:
---

[~skmvasu], it seems like the patch is not getting applied to trunk. I think 
yarn-node-managers was renamed to yarn-node-status, which is causing this. Could 
you please help to rebase this?

> Allow filters on Nodes page
> --
>
> Key: YARN-7464
> URL: https://issues.apache.org/jira/browse/YARN-7464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 4.56.04 PM.png, Screen Shot 
> 2017-11-08 at 4.56.12 PM.png, YARN-7464.001.patch, YARN-7464.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7492) Set up SASS for UI styling

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251511#comment-16251511
 ] 

Hadoop QA commented on YARN-7492:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
24m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 4 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 36m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7492 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897525/YARN-7492.002.patch |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux 4dc8062ffb7d 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 
19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 18621af |
| maven | version: Apache Maven 3.3.9 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/18483/artifact/out/whitespace-eol.txt
 |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/18483/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 403 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18483/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Set up SASS for UI styling
> --
>
> Key: YARN-7492
> URL: https://issues.apache.org/jira/browse/YARN-7492
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7492.001.patch, YARN-7492.002.patch
>
>
> SASS will help in improving the quality and maintainability of our styles. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7464) Allow filters on Nodes page

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251461#comment-16251461
 ] 

Hadoop QA commented on YARN-7464:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} YARN-7464 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-7464 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897506/YARN-7464.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/18482/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Allow filters on Nodes page
> --
>
> Key: YARN-7464
> URL: https://issues.apache.org/jira/browse/YARN-7464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 4.56.04 PM.png, Screen Shot 
> 2017-11-08 at 4.56.12 PM.png, YARN-7464.001.patch, YARN-7464.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7474) Yarn resourcemanager stop allocating container when cluster resource is sufficient

2017-11-14 Thread wuchang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wuchang resolved YARN-7474.
---
Resolution: Fixed

A submitted application's containers had made reservations on all NodeManagers, 
which made all NodeManagers unavailable.

I think I configured the *yarn.scheduler.maximum-allocation-mb* far too big 
(about half of *yarn.nodemanager.resource.memory-mb*), so it is possible that a 
badly configured application's containers make reservations on all nodes and can 
never switch to allocated, resulting in a deadlock.
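
As a rough illustration of this kind of misconfiguration (the class name, the 
0.25 threshold, and the fallback defaults are made up for the example; only the 
two property names come from this report), a small check that warns when one 
container ask can cover a large share of a single NodeManager:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class MaxAllocationCheck {
  public static void main(String[] args) {
    // YarnConfiguration picks up yarn-site.xml from the classpath.
    Configuration conf = new YarnConfiguration();
    long maxAllocMb = conf.getLong("yarn.scheduler.maximum-allocation-mb", 8192);
    long nodeMemMb = conf.getLong("yarn.nodemanager.resource.memory-mb", 8192);

    double ratio = (double) maxAllocMb / nodeMemMb;
    if (ratio > 0.25) {
      System.err.printf(
          "WARNING: maximum-allocation-mb (%d) is %.0f%% of a node's memory (%d); "
              + "a single large ask can reserve space on many nodes at once.%n",
          maxAllocMb, ratio * 100, nodeMemMb);
    }
  }
}
{code}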

> Yarn resourcemanager stop allocating container when cluster resource is 
> sufficient 
> ---
>
> Key: YARN-7474
> URL: https://issues.apache.org/jira/browse/YARN-7474
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: wuchang
>Priority: Critical
> Attachments: rm.log
>
>
> Hadoop Version: *2.7.2*
> My Yarn cluster has *(1100GB, 368 vCores)* in total, with 15 NodeManagers. 
> My cluster uses the fair scheduler and I have 4 queues for different kinds of jobs:
>  
> {quote}
> 
> 
>10 mb, 30 vcores
>422280 mb, 132 vcores
>0.5f
>90
>90
>50
> 
> 
>25000 mb, 20 vcores
>600280 mb, 150 vcores
>0.6f
>90
>90
>50
> 
> 
>10 mb, 30 vcores
>647280 mb, 132 vcores
>0.8f
>90
>90
>50
> 
>   
> 
>8 mb, 20 vcores
>12 mb, 30 vcores
>0.5f
>90
>90
>50
>  
> 
>  {quote}
> From about 9:00 am, all newly arriving applications got stuck for nearly 5 hours, 
> but the cluster resource usage was about *(600GB, 120 vCores)*, which means the 
> cluster resources were still *sufficient*.
> *The resource usage of the whole yarn cluster AND of each single queue stayed 
> unchanged for 5 hours*, which is really strange. Obviously, if it were a resource 
> insufficiency problem, it would be impossible for the used resources of all queues 
> to stay unchanged for 5 hours. So it is a problem with the ResourceManager.
> Since my cluster scale is not large, only 15 nodes with 1100G memory ,I 
> exclude the possibility showed in [YARN-4618].
>  
> Besides that, none of the running applications ever seemed to finish; the Yarn RM 
> seemed static, and the RM log had no more state-change logs about running 
> applications, except for logs about more and more applications being submitted 
> and becoming ACCEPTED, but never going from ACCEPTED to RUNNING.
> *The resource usage of the whole yarn cluster AND of each single queue stayed 
> unchanged for 5 hours*, which is really strange.
> The cluster seemed like a zombie.
>  
> I checked the ApplicationMaster log of one of the running but stuck 
> applications: 
>  
>  {quote}
> 2017-11-11 09:04:55,896 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for MAP job_1507795051888_183385. Report-size will be 4
> 2017-11-11 09:04:55,957 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for REDUCE job_1507795051888_183385. Report-size will be 0
> 2017-11-11 09:04:56,037 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2017-11-11 09:04:56,061 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1507795051888_183385: ask=6 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=15
> 2017-11-11 13:58:56,736 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job 
> job_1507795051888_183385 received from appuser (auth:SIMPLE) at 10.120.207.11
>  {quote}
>  
> You can see that at *2017-11-11 09:04:56,061* it sent a resource request to the 
> ResourceManager, but the RM allocated zero containers. Then there were no more 
> logs for 5 hours. At 13:58, I had to kill it manually.
>  
> After 5 hours, I killed some pending applications and then everything 
> recovered; the remaining cluster resources could be allocated again, and the 
> ResourceManager seemed to be alive again.
>  
> I have excluded the possibility of the maxRunningApps and maxAMShare 
> restrictions because they would only affect a single queue, but my problem is 
> that applications across the whole yarn cluster get stuck.
>  
>  
>  
> Also , I exclude the possibility of a  resourceman

[jira] [Comment Edited] (YARN-7474) Yarn resourcemanager stop allocating container when cluster resource is sufficient

2017-11-14 Thread wuchang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251347#comment-16251347
 ] 

wuchang edited comment on YARN-7474 at 11/14/17 1:05 PM:
-

Finally, I found the reason.
A submitted application's containers had made reservations on all NodeManagers, 
which made all NodeManagers unavailable.

I think I configured the *yarn.scheduler.maximum-allocation-mb* far too big 
(about half of *yarn.nodemanager.resource.memory-mb*), so it is possible that a 
badly configured application's containers make reservations on all nodes and can 
never switch to allocated, resulting in a deadlock.


was (Author: wuchang1989):
A submitted application's container has make reservations on all NodeManagers , 
which make all NodeManagers become unavailable.

I think I have configured the *yarn.scheduler.maximum-allocation-mb* far too 
big (about half of *yarn.nodemanager.resource.memory-mb*) so that it is 
possible that a bad-configured application's containers will make reservation 
on all nodes and can never switched to allocated ,namely result in a deadlock.

> Yarn resourcemanager stop allocating container when cluster resource is 
> sufficient 
> ---
>
> Key: YARN-7474
> URL: https://issues.apache.org/jira/browse/YARN-7474
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: wuchang
>Priority: Critical
> Attachments: rm.log
>
>
> Hadoop Version: *2.7.2*
> My Yarn cluster has *(1100GB, 368 vCores)* in total, with 15 NodeManagers. 
> My cluster uses the fair scheduler and I have 4 queues for different kinds of jobs:
>  
> {quote}
> 
> 
>10 mb, 30 vcores
>422280 mb, 132 vcores
>0.5f
>90
>90
>50
> 
> 
>25000 mb, 20 vcores
>600280 mb, 150 vcores
>0.6f
>90
>90
>50
> 
> 
>10 mb, 30 vcores
>647280 mb, 132 vcores
>0.8f
>90
>90
>50
> 
>   
> 
>8 mb, 20 vcores
>12 mb, 30 vcores
>0.5f
>90
>90
>50
>  
> 
>  {quote}
> From about 9:00 am, all newly arriving applications got stuck for nearly 5 hours, 
> but the cluster resource usage was about *(600GB, 120 vCores)*, which means the 
> cluster resources were still *sufficient*.
> *The resource usage of the whole yarn cluster AND of each single queue stayed 
> unchanged for 5 hours*, which is really strange. Obviously, if it were a resource 
> insufficiency problem, it would be impossible for the used resources of all queues 
> to stay unchanged for 5 hours. So it is a problem with the ResourceManager.
> Since my cluster scale is not large, only 15 nodes with 1100G memory ,I 
> exclude the possibility showed in [YARN-4618].
>  
> Besides that, none of the running applications ever seemed to finish; the Yarn RM 
> seemed static, and the RM log had no more state-change logs about running 
> applications, except for logs about more and more applications being submitted 
> and becoming ACCEPTED, but never going from ACCEPTED to RUNNING.
> *The resource usage of the whole yarn cluster AND of each single queue stayed 
> unchanged for 5 hours*, which is really strange.
> The cluster seemed like a zombie.
>  
> I checked the ApplicationMaster log of one of the running but stuck 
> applications: 
>  
>  {quote}
> 2017-11-11 09:04:55,896 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for MAP job_1507795051888_183385. Report-size will be 4
> 2017-11-11 09:04:55,957 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for REDUCE job_1507795051888_183385. Report-size will be 0
> 2017-11-11 09:04:56,037 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2017-11-11 09:04:56,061 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1507795051888_183385: ask=6 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=15
> 2017-11-11 13:58:56,736 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job 
> job_1507795051888_183385 received from appuser (auth:SIMPLE) at 10.120.207.11
>  {quote}
>  
> You can see that at *2017-11-11 09:04:56,061* it sent a resource request to the 
> ResourceM

[jira] [Created] (YARN-7494) Add multi node lookup support for better placement

2017-11-14 Thread Sunil G (JIRA)
Sunil G created YARN-7494:
-

 Summary: Add multi node lookup support for better placement
 Key: YARN-7494
 URL: https://issues.apache.org/jira/browse/YARN-7494
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacity scheduler
Reporter: Sunil G
Assignee: Sunil G


Instead of a single node, for effectiveness we can consider a multi-node lookup 
based on partition, to start with.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7493) Yarn UI does not display applications page when application is submitted to zero capacity queue

2017-11-14 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-7493:

Attachment: capacity-scheduler.xml
Rendering error.png

> Yarn UI does not display applications page when application is submitted to 
> zero capacity queue
> ---
>
> Key: YARN-7493
> URL: https://issues.apache.org/jira/browse/YARN-7493
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Rohith Sharma K S
>Priority: Critical
> Attachments: Rendering error.png, capacity-scheduler.xml
>
>
> It is observed that if a child queue has zero capacity and an application is 
> submitted to that queue, then the whole yarn-ui goes for a toss. The application 
> page and the cluster overview page do not display at all. 
> Later, even if we try to kill the app, the UI does not display. This needs to be 
> investigated.
> cc :/ [~sunilg]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7493) Yarn UI does not display applications page when application is submitted to zero capacity queue

2017-11-14 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-7493:
---

 Summary: Yarn UI does not display applications page when 
application is submitted to zero capacity queue
 Key: YARN-7493
 URL: https://issues.apache.org/jira/browse/YARN-7493
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Rohith Sharma K S
Priority: Critical


It is observed that if a child queue has zero capacity and an application is 
submitted to that queue, then the whole yarn-ui goes for a toss. The application 
page and the cluster overview page do not display at all. 
Later, even if we try to kill the app, the UI does not display. This needs to be 
investigated.
cc :/ [~sunilg]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7462) Render outstanding resource requests on application details page

2017-11-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251228#comment-16251228
 ] 

Sunil G commented on YARN-7462:
---

pending jenkins

> Render outstanding resource requests on application details page
> 
>
> Key: YARN-7462
> URL: https://issues.apache.org/jira/browse/YARN-7462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 3.24.30 PM.png, Screen Shot 
> 2017-11-08 at 3.38.48 PM.png, YARN-7462.001.patch, YARN-7462.002.patch, 
> YARN-7462.003.patch, YARN-7462.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7464) Allow filters on Nodes page

2017-11-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251230#comment-16251230
 ] 

Sunil G commented on YARN-7464:
---

+1 pending jenkins for commit. 

> Allow filters on Nodes page
> --
>
> Key: YARN-7464
> URL: https://issues.apache.org/jira/browse/YARN-7464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 4.56.04 PM.png, Screen Shot 
> 2017-11-08 at 4.56.12 PM.png, YARN-7464.001.patch, YARN-7464.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7474) Yarn resourcemanager stops allocating containers when cluster resource is sufficient

2017-11-14 Thread wuchang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251226#comment-16251226
 ] 

wuchang commented on YARN-7474:
---

[~yufeigu] [~templedf]

From the ResourceManager log, I see:
At 09:04, when the problem started to occur, all NodeManagers in my YARN cluster 
had just been reserved. Below is the result of grepping *Making 
reservation* from the log:

{code:java}
2017-11-11 09:00:30,343 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.106 app_id=application_1507795051888_183354
2017-11-11 09:00:30,346 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.105 app_id=application_1507795051888_183354
2017-11-11 09:00:30,401 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.84 app_id=application_1507795051888_183354
2017-11-11 09:00:30,412 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.85 app_id=application_1507795051888_183354
2017-11-11 09:00:30,535 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.102 app_id=application_1507795051888_183354
2017-11-11 09:00:30,687 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.86 app_id=application_1507795051888_183354
2017-11-11 09:00:30,824 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.108 app_id=application_1507795051888_183354
2017-11-11 09:00:30,865 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.104 app_id=application_1507795051888_183354
2017-11-11 09:00:30,991 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.103 app_id=application_1507795051888_183354
2017-11-11 09:00:31,232 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.107 app_id=application_1507795051888_183354
2017-11-11 09:00:31,249 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.101 app_id=application_1507795051888_183354
2017-11-11 09:00:34,547 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183358
2017-11-11 09:01:06,277 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183342
2017-11-11 09:01:16,525 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183342
2017-11-11 09:01:25,348 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183342
2017-11-11 09:01:28,351 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183342
2017-11-11 09:02:29,658 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183342
2017-11-11 09:04:14,788 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183376
2017-11-11 09:04:26,307 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183380
2017-11-11 09:04:51,200 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=10.120.117.100 app_id=application_1507795051888_183383
{code}

So I guess it is caused by a reservation deadlock, which means: all 
nodes have been reserved, the reserved containers cannot be turned into 
allocated ones, and newly arriving applications cannot make reservations anymore, 
so they all stay pending; thus my YARN cluster becomes dead.
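
As a rough illustration of the suspected state (a toy model only; the real 
FairScheduler reservation logic is Java and far more involved), once every node 
carries a reservation, a scheduler that skips reserved nodes can allocate nothing 
even though plenty of memory is free:

{code:javascript}
// Toy model of the suspected reservation deadlock (illustrative only).
const nodes = Array.from({ length: 15 }, (_, i) => ({
  id: 'node-' + i,
  freeMB: 40000,            // plenty of free space...
  reservedBy: 'app_183354'  // ...but every node already holds a reservation
}));

function tryAllocate(nodes, requestMB) {
  // Nodes with an outstanding reservation are skipped for other applications.
  const usable = nodes.filter(n => n.reservedBy === null && n.freeMB >= requestMB);
  return usable.length > 0 ? usable[0].id : null;
}

console.log(tryAllocate(nodes, 4096)); // null: no progress for any pending app
{code}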


> Yarn resourcemanager stops allocating containers when cluster resource is 
> sufficient 
> ---
>
> Key: YARN-7474
> URL: https://issues.apache.org/jira/browse/YARN-7474
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: wuchang
>Priority: Critical
> Attachments: rm.log

[jira] [Updated] (YARN-7159) Normalize unit of resource objects in RM and avoid doing unit conversion in critical path

2017-11-14 Thread Manikandan R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-7159:
---
Attachment: YARN-7159.018.patch

> Normalize unit of resource objects in RM and avoid doing unit conversion in 
> critical path
> -
>
> Key: YARN-7159
> URL: https://issues.apache.org/jira/browse/YARN-7159
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-7159.001.patch, YARN-7159.002.patch, 
> YARN-7159.003.patch, YARN-7159.004.patch, YARN-7159.005.patch, 
> YARN-7159.006.patch, YARN-7159.007.patch, YARN-7159.008.patch, 
> YARN-7159.009.patch, YARN-7159.010.patch, YARN-7159.011.patch, 
> YARN-7159.012.patch, YARN-7159.013.patch, YARN-7159.015.patch, 
> YARN-7159.016.patch, YARN-7159.017.patch, YARN-7159.018.patch
>
>
> Currently, resource unit conversion can happen in the critical code path when a 
> different unit is specified by the client. This can significantly impact the 
> performance and throughput of the RM. We should normalize units when a resource 
> is passed to the RM and avoid the expensive unit conversion every time.
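
The general idea, sketched below in JavaScript for brevity (the real change lives 
in the RM's Java resource handling): convert incoming values to a canonical unit 
once at the boundary, so the scheduling hot path only does plain integer 
comparisons.

{code:javascript}
// Sketch of normalize-once versus convert-per-comparison (not RM code).
const FACTOR = { '': 1, Ki: 1024, Mi: 1024 * 1024, Gi: 1024 * 1024 * 1024 };

// Done once, when the resource enters the RM.
function normalize(value, unit) {
  return value * FACTOR[unit];
}

// Hot path: no unit handling left, just an integer comparison.
function fits(requestNormalized, availableNormalized) {
  return requestNormalized <= availableNormalized;
}

console.log(fits(normalize(2, 'Gi'), normalize(4096, 'Mi'))); // true
{code}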



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7159) Normalize unit of resource objects in RM and avoid doing unit conversion in critical path

2017-11-14 Thread Manikandan R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251158#comment-16251158
 ] 

Manikandan R commented on YARN-7159:


Fixed the checkstyle and whitespace issues. The JUnit failures are not related to 
this patch.

> Normalize unit of resource objects in RM and avoid doing unit conversion in 
> critical path
> -
>
> Key: YARN-7159
> URL: https://issues.apache.org/jira/browse/YARN-7159
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Manikandan R
>Priority: Critical
> Attachments: YARN-7159.001.patch, YARN-7159.002.patch, 
> YARN-7159.003.patch, YARN-7159.004.patch, YARN-7159.005.patch, 
> YARN-7159.006.patch, YARN-7159.007.patch, YARN-7159.008.patch, 
> YARN-7159.009.patch, YARN-7159.010.patch, YARN-7159.011.patch, 
> YARN-7159.012.patch, YARN-7159.013.patch, YARN-7159.015.patch, 
> YARN-7159.016.patch, YARN-7159.017.patch
>
>
> Currently, resource unit conversion can happen in the critical code path when a 
> different unit is specified by the client. This can significantly impact the 
> performance and throughput of the RM. We should normalize units when a resource 
> is passed to the RM and avoid the expensive unit conversion every time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7492) Set up SASS for UI styling

2017-11-14 Thread Vasudevan Skm (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasudevan Skm updated YARN-7492:

Attachment: YARN-7492.002.patch

> Set up SASS for UI styling
> --
>
> Key: YARN-7492
> URL: https://issues.apache.org/jira/browse/YARN-7492
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7492.001.patch, YARN-7492.002.patch
>
>
> SASS will help improve the quality and maintainability of our styles. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7492) Set up SASS for UI styling

2017-11-14 Thread Vasudevan Skm (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasudevan Skm updated YARN-7492:

Attachment: YARN-7492.001.patch

[~sunil.gov...@gmail.com] [~wangda]

> Set up SASS for UI styling
> --
>
> Key: YARN-7492
> URL: https://issues.apache.org/jira/browse/YARN-7492
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7492.001.patch
>
>
> SASS will help improve the quality and maintainability of our styles. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7492) Set up SASS for UI styling

2017-11-14 Thread Vasudevan Skm (JIRA)
Vasudevan Skm created YARN-7492:
---

 Summary: Set up SASS for UI styling
 Key: YARN-7492
 URL: https://issues.apache.org/jira/browse/YARN-7492
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Vasudevan Skm
Assignee: Vasudevan Skm


SASS will help improve the quality and maintainability of our styles. 
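
For context, wiring SASS into an ember-cli application (which yarn-ui-v2 is) 
typically means installing the ember-cli-sass addon and renaming app.css to 
app.scss. The snippet below is a generic ember-cli-build.js sketch of where the 
SASS options go, not the patch itself:

{code:javascript}
/* Generic ember-cli-build.js sketch, assuming ember-cli-sass has been installed
   (ember install ember-cli-sass). Not taken from the attached patch. */
const EmberApp = require('ember-cli/lib/broccoli/ember-app');

module.exports = function(defaults) {
  const app = new EmberApp(defaults, {
    sassOptions: {
      // Lets partials under app/styles be @import-ed without relative paths.
      includePaths: ['app/styles']
    }
  });
  return app.toTree();
};
{code}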



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7462) Render outstanding resource requests on application details page

2017-11-14 Thread Vasudevan Skm (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251113#comment-16251113
 ] 

Vasudevan Skm commented on YARN-7462:
-

Looks like the API fields have changed in v3. Updating the patch to work with 
the v3 API. 

> Render outstanding resource requests on application details page
> 
>
> Key: YARN-7462
> URL: https://issues.apache.org/jira/browse/YARN-7462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 3.24.30 PM.png, Screen Shot 
> 2017-11-08 at 3.38.48 PM.png, YARN-7462.001.patch, YARN-7462.002.patch, 
> YARN-7462.003.patch, YARN-7462.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7462) Render outstanding resource requests on application details page

2017-11-14 Thread Vasudevan Skm (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasudevan Skm updated YARN-7462:

Attachment: YARN-7462.004.patch

> Render outstanding resource requests on application details page
> 
>
> Key: YARN-7462
> URL: https://issues.apache.org/jira/browse/YARN-7462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 3.24.30 PM.png, Screen Shot 
> 2017-11-08 at 3.38.48 PM.png, YARN-7462.001.patch, YARN-7462.002.patch, 
> YARN-7462.003.patch, YARN-7462.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251107#comment-16251107
 ] 

Hadoop QA commented on YARN-6507:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
5s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
58s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
20s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 185 new + 215 unchanged - 0 fixed = 400 total (was 215) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
9s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
51s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
46s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}132m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6507 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12897457/YARN-6507-trunk.007.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 3ef487f434

[jira] [Updated] (YARN-7480) Render tooltips on columns where text is clipped

2017-11-14 Thread Vasudevan Skm (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasudevan Skm updated YARN-7480:

Attachment: YARN-7480.002.patch

> Render tooltips on columns where text is clipped
> 
>
> Key: YARN-7480
> URL: https://issues.apache.org/jira/browse/YARN-7480
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: YARN-7480.001.patch, YARN-7480.002.patch
>
>
> In em-table, when text gets clipped, the information is lost. We need to render 
> a tooltip showing the full text in these cases.
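
As a generic illustration of the approach (plain DOM code, not em-table's actual 
API; the selector names are made up): detect clipping by comparing scroll and 
client widths, then expose the full text through the native title attribute.

{code:javascript}
// Generic sketch (not em-table code): when a cell's content overflows its box,
// copy the full text into the title attribute so the browser shows a tooltip.
function addOverflowTooltips(root) {
  root.querySelectorAll('.table-cell').forEach(function (cell) {
    if (cell.scrollWidth > cell.clientWidth) {
      cell.title = cell.textContent.trim();
    }
  });
}

// Example usage after the table has rendered:
// addOverflowTooltips(document.querySelector('.em-table'));
{code}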



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7474) Yarn resourcemanager stops allocating containers when cluster resource is sufficient

2017-11-14 Thread wuchang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251093#comment-16251093
 ] 

wuchang commented on YARN-7474:
---

Below is some explanation about the log:

Before 2017-11-11 09:04, everything seemed OK: applications were submitted and 
entered the RUNNING state.

At about 2017-11-11 09:04:47, no more applications became RUNNING; all the newly 
submitted applications stayed in the pending state, and the already-running 
applications never seemed to finish.

At about 2017-11-11 13:58, namely about 5 hours after the problem occurred, I 
killed some applications and YARN seemed alive again. You can see that many 
pending applications became RUNNING and everything looked OK again.

During the problem, YARN's cluster resource usage was about half of the total 
cluster resources and stayed unchanged, absolutely unchanged; it seemed static 
and dead.

> Yarn resourcemanager stops allocating containers when cluster resource is 
> sufficient 
> ---
>
> Key: YARN-7474
> URL: https://issues.apache.org/jira/browse/YARN-7474
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: wuchang
>Priority: Critical
> Attachments: rm.log
>
>
> Hadoop Version: *2.7.2*
> My YARN cluster has *(1100GB, 368 vCores)* in total across 15 NodeManagers. 
> My cluster uses the fair scheduler and I have 4 queues for different kinds of jobs:
>  
> {quote}
> 
> 
>10 mb, 30 vcores
>422280 mb, 132 vcores
>0.5f
>90
>90
>50
> 
> 
>25000 mb, 20 vcores
>600280 mb, 150 vcores
>0.6f
>90
>90
>50
> 
> 
>10 mb, 30 vcores
>647280 mb, 132 vcores
>0.8f
>90
>90
>50
> 
>   
> 
>8 mb, 20 vcores
>12 mb, 30 vcores
>0.5f
>90
>90
>50
>  
> 
>  {quote}
> From about 9:00 am, all newly arriving applications got stuck for nearly 5 
> hours, but the cluster resource usage was about *(600GB, 120 vCores)*, which 
> means the cluster resources were still *sufficient*.
> *The resource usage of the whole YARN cluster AND of each single queue stayed 
> unchanged for 5 hours*, which is really strange. Obviously, if it were a 
> resource insufficiency problem, it would be impossible for the used resources 
> of all queues to stay unchanged for 5 hours. So it is a problem in the 
> ResourceManager. Since my cluster is not large, only 15 nodes with 1100G of 
> memory, I exclude the possibility described in [YARN-4618].
>  
> Besides that, none of the running applications ever finished; the YARN RM 
> seemed static. The RM log had no more state-change entries for running 
> applications, only entries for more and more applications being submitted and 
> becoming ACCEPTED, but never going from ACCEPTED to RUNNING.
> The cluster seemed like a zombie.
>  
> I checked the ApplicationMaster log of one running but stuck 
> application: 
>  
>  {quote}
> 2017-11-11 09:04:55,896 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for MAP job_1507795051888_183385. Report-size will be 4
> 2017-11-11 09:04:55,957 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for REDUCE job_1507795051888_183385. Report-size will be 0
> 2017-11-11 09:04:56,037 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2017-11-11 09:04:56,061 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1507795051888_183385: ask=6 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=15
> 2017-11-11 13:58:56,736 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job 
> job_1507795051888_183385 received from appuser (auth:SIMPLE) at 10.120.207.11
>  {quote}
>  
> You can see that at *2017-11-11 09:04:56,061* it sent a resource request to the 
> ResourceManager, but the RM allocated zero containers. Then there were no more 
> logs for 5 hours. At 13:58, I had to kill it manually.
>  
> After 5 hours , I kill some pending applications and then everything 
> recovered,remaining cluster resources can be allocated again, Resour

[jira] [Commented] (YARN-7474) Yarn resourcemanager stops allocating containers when cluster resource is sufficient

2017-11-14 Thread wuchang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251092#comment-16251092
 ] 

wuchang commented on YARN-7474:
---

[~yufeigu] [~templedf] I have attached my ResourceManager log from the time the 
problem occurred.

> Yarn resourcemanager stops allocating containers when cluster resource is 
> sufficient 
> ---
>
> Key: YARN-7474
> URL: https://issues.apache.org/jira/browse/YARN-7474
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: wuchang
>Priority: Critical
> Attachments: rm.log
>
>
> Hadoop Version: *2.7.2*
> My YARN cluster has *(1100GB, 368 vCores)* in total across 15 NodeManagers. 
> My cluster uses the fair scheduler and I have 4 queues for different kinds of jobs:
>  
> {quote}
> 
> 
>10 mb, 30 vcores
>422280 mb, 132 vcores
>0.5f
>90
>90
>50
> 
> 
>25000 mb, 20 vcores
>600280 mb, 150 vcores
>0.6f
>90
>90
>50
> 
> 
>10 mb, 30 vcores
>647280 mb, 132 vcores
>0.8f
>90
>90
>50
> 
>   
> 
>8 mb, 20 vcores
>12 mb, 30 vcores
>0.5f
>90
>90
>50
>  
> 
>  {quote}
> From about 9:00 am, all newly arriving applications got stuck for nearly 5 
> hours, but the cluster resource usage was about *(600GB, 120 vCores)*, which 
> means the cluster resources were still *sufficient*.
> *The resource usage of the whole YARN cluster AND of each single queue stayed 
> unchanged for 5 hours*, which is really strange. Obviously, if it were a 
> resource insufficiency problem, it would be impossible for the used resources 
> of all queues to stay unchanged for 5 hours. So it is a problem in the 
> ResourceManager. Since my cluster is not large, only 15 nodes with 1100G of 
> memory, I exclude the possibility described in [YARN-4618].
>  
> Besides that, none of the running applications ever finished; the YARN RM 
> seemed static. The RM log had no more state-change entries for running 
> applications, only entries for more and more applications being submitted and 
> becoming ACCEPTED, but never going from ACCEPTED to RUNNING.
> The cluster seemed like a zombie.
>  
> I checked the ApplicationMaster log of one running but stuck 
> application: 
>  
>  {quote}
> 2017-11-11 09:04:55,896 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for MAP job_1507795051888_183385. Report-size will be 4
> 2017-11-11 09:04:55,957 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for REDUCE job_1507795051888_183385. Report-size will be 0
> 2017-11-11 09:04:56,037 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2017-11-11 09:04:56,061 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1507795051888_183385: ask=6 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=15
> 2017-11-11 13:58:56,736 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job 
> job_1507795051888_183385 received from appuser (auth:SIMPLE) at 10.120.207.11
>  {quote}
>  
> You can see that at *2017-11-11 09:04:56,061* it sent a resource request to the 
> ResourceManager, but the RM allocated zero containers. Then there were no more 
> logs for 5 hours. At 13:58, I had to kill it manually.
>  
> After 5 hours, I killed some pending applications and then everything 
> recovered; the remaining cluster resources could be allocated again, and the 
> ResourceManager seemed to be alive again.
>  
> I have excluded the possibility of the maxRunningApps and maxAMShare 
> restrictions, because they would only affect a single queue, but my problem is 
> that applications across the whole YARN cluster get stuck.
>  
>  
>  
> Also, I exclude the possibility of a ResourceManager full-GC problem, because I 
> checked with gcutil: no full GC happened, and ResourceManager memory is OK.
>  
> So, could anyone give me some suggestions?
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubsc

[jira] [Commented] (YARN-7474) Yarn resourcemanager stops allocating containers when cluster resource is sufficient

2017-11-14 Thread wuchang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251082#comment-16251082
 ] 

wuchang commented on YARN-7474:
---

[~yufeigu] [~templedf] Big thanks for your reply.
I noticed that the bug mentioned in 
[YARN-4477|https://issues.apache.org/jira/browse/YARN-4477] applies only to Hadoop 
2.8.0 or higher, but my Hadoop version is 2.7.2. I have already checked my 2.7.2 
source code, and the method *reservationExceedsThreshold()* mentioned there does 
not exist in it.
Would you please give me more suggestions?

> Yarn resourcemanager stops allocating containers when cluster resource is 
> sufficient 
> ---
>
> Key: YARN-7474
> URL: https://issues.apache.org/jira/browse/YARN-7474
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: wuchang
>Priority: Critical
>
> Hadoop Version: *2.7.2*
> My YARN cluster has *(1100GB, 368 vCores)* in total across 15 NodeManagers. 
> My cluster uses the fair scheduler and I have 4 queues for different kinds of jobs:
>  
> {quote}
> 
> 
>10 mb, 30 vcores
>422280 mb, 132 vcores
>0.5f
>90
>90
>50
> 
> 
>25000 mb, 20 vcores
>600280 mb, 150 vcores
>0.6f
>90
>90
>50
> 
> 
>10 mb, 30 vcores
>647280 mb, 132 vcores
>0.8f
>90
>90
>50
> 
>   
> 
>8 mb, 20 vcores
>12 mb, 30 vcores
>0.5f
>90
>90
>50
>  
> 
>  {quote}
> From about 9:00 am, all newly arriving applications got stuck for nearly 5 
> hours, but the cluster resource usage was about *(600GB, 120 vCores)*, which 
> means the cluster resources were still *sufficient*.
> *The resource usage of the whole YARN cluster AND of each single queue stayed 
> unchanged for 5 hours*, which is really strange. Obviously, if it were a 
> resource insufficiency problem, it would be impossible for the used resources 
> of all queues to stay unchanged for 5 hours. So it is a problem in the 
> ResourceManager. Since my cluster is not large, only 15 nodes with 1100G of 
> memory, I exclude the possibility described in [YARN-4618].
>  
> Besides that, none of the running applications ever finished; the YARN RM 
> seemed static. The RM log had no more state-change entries for running 
> applications, only entries for more and more applications being submitted and 
> becoming ACCEPTED, but never going from ACCEPTED to RUNNING.
> The cluster seemed like a zombie.
>  
> I checked the ApplicationMaster log of one running but stuck 
> application: 
>  
>  {quote}
> 2017-11-11 09:04:55,896 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for MAP job_1507795051888_183385. Report-size will be 4
> 2017-11-11 09:04:55,957 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Getting task 
> report for REDUCE job_1507795051888_183385. Report-size will be 0
> 2017-11-11 09:04:56,037 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 
> AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 
> HostLocal:0 RackLocal:0
> 2017-11-11 09:04:56,061 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() 
> for application_1507795051888_183385: ask=6 release= 0 newContainers=0 
> finishedContainers=0 resourcelimit= knownNMs=15
> 2017-11-11 13:58:56,736 INFO [IPC Server handler 0 on 42899] 
> org.apache.hadoop.mapreduce.v2.app.client.MRClientService: Kill job 
> job_1507795051888_183385 received from appuser (auth:SIMPLE) at 10.120.207.11
>  {quote}
>  
> You can see that at *2017-11-11 09:04:56,061* it sent a resource request to the 
> ResourceManager, but the RM allocated zero containers. Then there were no more 
> logs for 5 hours. At 13:58, I had to kill it manually.
>  
> After 5 hours, I killed some pending applications and then everything 
> recovered; the remaining cluster resources could be allocated again, and the 
> ResourceManager seemed to be alive again.
>  
> I have excluded the possibility of the maxRunningApps and maxAMShare 
> restrictions, because they would only affect a single queue, but my problem is 
> that applications across the whole YARN cluster get stuck.
>  
>  
>  
> Also , I exclude the possibility of a  resourcemanger  full gc problem 
> because I check that wi

[jira] [Commented] (YARN-7159) Normalize unit of resource objects in RM and avoid doing unit conversion in critical path

2017-11-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251062#comment-16251062
 ] 

Hadoop QA commented on YARN-7159:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
21s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 56s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 6 new + 112 unchanged - 0 fixed = 118 total (was 112) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 7 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
40s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m  0s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 |
|   | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy 
|
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
| Timed out junit tests | 
org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands |
|   | 
o

[jira] [Commented] (YARN-7462) Render outstanding resource requests on application details page

2017-11-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251058#comment-16251058
 ] 

Sunil G commented on YARN-7462:
---

v3 patch looks good to me. Committing later today if there are no objections.

> Render outstanding resource requests on application details page
> 
>
> Key: YARN-7462
> URL: https://issues.apache.org/jira/browse/YARN-7462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
> Attachments: Screen Shot 2017-11-08 at 3.24.30 PM.png, Screen Shot 
> 2017-11-08 at 3.38.48 PM.png, YARN-7462.001.patch, YARN-7462.002.patch, 
> YARN-7462.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7330) Add support to show GPU on UI/metrics

2017-11-14 Thread Vasudevan Skm (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251049#comment-16251049
 ] 

Vasudevan Skm edited comment on YARN-7330 at 11/14/17 8:03 AM:
---

[~wangda]

1) There are a lot of console.log calls in the code. Ideally, production code 
should not contain debugger or console statements. 
2) /converter.js hard-codes a lot of repeated magic constants. 

{code:javascript}
if (unit === "Ki") {
normalizedValue = normalizedValue * 1024;
} else if (unit === "Mi") {
normalizedValue = normalizedValue * 1024 * 1024;
} else if (unit === "Gi") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024;
} else if (unit === "Ti") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024 * 1024;
} else if (unit === "Pi") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024 * 1024 * 1024;
}

{code}


can be refactored to 


{code:javascript}
// Note: the factors are spelled out, since ^ is bitwise XOR in JavaScript.
const exponents = {
  Ki: 1024,
  Mi: 1024 * 1024,
  Gi: 1024 * 1024 * 1024
};

normalizedValue = normalizedValue * exponents[unit];
{code}


Also, all the if blocks here have the same condition:

{code:javascript}
  var finalUnit = "";
  if (normalizedValue / 1024 >= 0.9) {
    normalizedValue = normalizedValue / 1024;
    finalUnit = "Ki";
  }
  if (normalizedValue / 1024 >= 0.9) {
    normalizedValue = normalizedValue / 1024;
    finalUnit = "Mi";
  }
  if (normalizedValue / 1024 >= 0.9) {
    normalizedValue = normalizedValue / 1024;
    finalUnit = "Gi";
  }
  if (normalizedValue / 1024 >= 0.9) {
    normalizedValue = normalizedValue / 1024;
    finalUnit = "Ti";
  }
  if (normalizedValue / 1024 >= 0.9) {
    normalizedValue = normalizedValue / 1024;
    finalUnit = "Pi";
  }
{code}

Am I missing something here?

3) In donut-chart.js, can strings like "resource" and "memory" be added to a 
constant called ResourceType and used? 
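
Putting point 2) and the repeated-if observation together, one possible 
table-driven shape for both directions, assuming converter.js only deals with 
powers of 1024 and keeping the existing 0.9 threshold (a sketch, not a drop-in 
patch):

{code:javascript}
const UNITS = ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi'];

// Value expressed in `unit` -> value in the base unit.
function toBaseUnit(value, unit) {
  const idx = UNITS.indexOf(unit);
  return idx <= 0 ? value : value * Math.pow(1024, idx);
}

// Base-unit value -> { value, unit }, using the same 0.9 threshold as converter.js.
function humanize(value) {
  let idx = 0;
  while (idx < UNITS.length - 1 && value / 1024 >= 0.9) {
    value = value / 1024;
    idx++;
  }
  return { value: value, unit: UNITS[idx] };
}
{code}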





was (Author: skmvasu):
[~wangda]

1) There are a lot of console.logs in the code. Ideally any prod code should 
not have debugger/ console statements. 
2) /converter.js has a lot of constants. 

```

if (unit === "Ki") {
normalizedValue = normalizedValue * 1024;
} else if (unit === "Mi") {
normalizedValue = normalizedValue * 1024 * 1024;
} else if (unit === "Gi") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024;
} else if (unit === "Ti") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024 * 1024;
} else if (unit === "Pi") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024 * 1024 * 1024;
}

```

can be refactored to 

```
const exponents = {

ki:  1024;
 Mi: 1024 ^ 2,
 Gi: 1024 ^ 3
}

normalizedValue = normalizedValue * exponents[]

``

Also, all the if blocks here have the same condition
```
  var finalUnit = "";
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Ki";
  }
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Mi";
  }
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Gi";
  }
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Ti";
 
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Pi";
  }
```

Am I missing something here?

3. In donut-chart.js can the strings like "resource","memory" be added to a 
constant called ResourceType and used? 




> Add support to show GPU on UI/metrics
> -
>
> Key: YARN-7330
> URL: https://issues.apache.org/jira/browse/YARN-7330
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7330.0-wip.patch, YARN-7330.003.patch, 
> YARN-7330.004.patch, YARN-7330.006.patch, YARN-7330.1-wip.patch, 
> YARN-7330.2-wip.patch, screencapture-0-wip.png
>
>
> We should be able to view GPU metrics from UI/REST API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7330) Add support to show GPU on UI/metrics

2017-11-14 Thread Vasudevan Skm (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251049#comment-16251049
 ] 

Vasudevan Skm commented on YARN-7330:
-

[~wangda]

1) There are a lot of console.logs in the code. Ideally any prod code should 
not have debugger/ console statements. 
2) /converter.js has a lot of constants. 

```

if (unit === "Ki") {
normalizedValue = normalizedValue * 1024;
} else if (unit === "Mi") {
normalizedValue = normalizedValue * 1024 * 1024;
} else if (unit === "Gi") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024;
} else if (unit === "Ti") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024 * 1024;
} else if (unit === "Pi") {
normalizedValue = normalizedValue * 1024 * 1024 * 1024 * 1024 * 1024;
}

```

can be refactored to 

```
const exponents = {

ki:  1024;
 Mi: 1024 ^ 2,
 Gi: 1024 ^ 3
}

normalizedValue = normalizedValue * exponents[]

``

Also, all the if blocks here have the same condition
```
  var finalUnit = "";
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Ki";
  }
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Mi";
  }
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Gi";
  }
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Ti";
 
  if (normalizedValue / 1024 >= 0.9) {
normalizedValue = normalizedValue / 1024;
finalUnit = "Pi";
  }
```

Am I missing something here?

3. In donut-chart.js can the strings like "resource","memory" be added to a 
constant called ResourceType and used? 




> Add support to show GPU on UI/metrics
> -
>
> Key: YARN-7330
> URL: https://issues.apache.org/jira/browse/YARN-7330
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7330.0-wip.patch, YARN-7330.003.patch, 
> YARN-7330.004.patch, YARN-7330.006.patch, YARN-7330.1-wip.patch, 
> YARN-7330.2-wip.patch, screencapture-0-wip.png
>
>
> We should be able to view GPU metrics from UI/REST API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org