[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039670#comment-15039670
 ] 

Wangda Tan commented on YARN-4405:
--

Thanks for reporting, [~drankye], [~sunilg].

Fixing this issue now.

> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> The existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation that supports such non-appendable file systems as well. 
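To illustrate the direction (not the actual patch; class and file names here are hypothetical), a non-appendable store can rewrite the whole mirror file on every change, using a write-then-rename sequence so readers never see a partial file:

{noformat}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch of a node label store for file systems without append support. */
public class NonAppendableStoreSketch {
  private final FileSystem fs;
  private final Path storeRoot;

  public NonAppendableStoreSketch(FileSystem fs, Path storeRoot) {
    this.fs = fs;
    this.storeRoot = storeRoot;
  }

  /** Rewrite the complete mirror instead of appending an edit-log record. */
  public synchronized void writeNewMirror(byte[] serializedLabels)
      throws IOException {
    Path newMirror = new Path(storeRoot, "nodelabel.mirror.new");
    Path mirror = new Path(storeRoot, "nodelabel.mirror");
    // 1. Write the full current state to a temporary file.
    try (FSDataOutputStream os = fs.create(newMirror, true)) {
      os.write(serializedLabels);
    }
    // 2. Swap it in; recovery reads whichever mirror exists, so a crash
    //    between delete and rename can fall back to the ".new" file.
    fs.delete(mirror, false);
    fs.rename(newMirror, mirror);
  }
}
{noformat}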



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4419) Trunk building failed

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039708#comment-15039708
 ] 

Hudson commented on YARN-4419:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8918 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8918/])
Add missing file for YARN-4419 (jianhe: rev 
e84d6ca2df775bb4c93f6c08b345ac30b3a4525b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NonAppendableFSNodeLabelStore.java


> Trunk building failed
> -
>
> Key: YARN-4419
> URL: https://issues.apache.org/jira/browse/YARN-4419
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Jian He
>
> Checking out the latest code, mvn clean package -DskipTests failed as below.
> {noformat}
> [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
> hadoop-yarn-common ---
> [INFO] Changes detected - recompiling the module!
> [INFO] Compiling 72 source files to 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/test-classes
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java:[75,15]
>  cannot find symbol
>   symbol:   class NonAppendableFSNodeLabelStore
>   location: class 
> org.apache.hadoop.yarn.nodelabels.TestFileSystemNodeLabelsStore
> [INFO] 1 error
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2885) Create AMRMProxy request interceptor for distributed scheduling decisions for queueable containers

2015-12-03 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041176#comment-15041176
 ] 

Konstantinos Karanasos commented on YARN-2885:
--

[~leftnoteasy], I think the Distributed Scheduler AM Service makes sense.

Given that we will already add the Distributed Scheduling Coordinator in the RM 
(which will be used for the top-k technique, and later for the corrective 
mechanisms in YARN-2888), what about using the same service for delegating the 
AMProtocol wrapper (rather than creating an additional one)?

> Create AMRMProxy request interceptor for distributed scheduling decisions for 
> queueable containers
> --
>
> Key: YARN-2885
> URL: https://issues.apache.org/jira/browse/YARN-2885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
> Attachments: YARN-2885-yarn-2877.001.patch
>
>
> We propose to add a Local ResourceManager (LocalRM) to the NM in order to 
> support distributed scheduling decisions. 
> Architecturally, we leverage the RMProxy introduced in YARN-2884. 
> The LocalRM makes distributed decisions for queueable container requests. 
> Guaranteed-start requests are still handled by the central RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039671#comment-15039671
 ] 

Jian He commented on YARN-4405:
---

Sorry, I missed a file while committing; fixed now.

> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> The existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation that supports such non-appendable file systems as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4419) Trunk building failed

2015-12-03 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He resolved YARN-4419.
---
Resolution: Fixed

> Trunk building failed
> -
>
> Key: YARN-4419
> URL: https://issues.apache.org/jira/browse/YARN-4419
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Jian He
>
> Checking out the latest code, mvn clean package -DskipTests failed as below.
> {noformat}
> [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
> hadoop-yarn-common ---
> [INFO] Changes detected - recompiling the module!
> [INFO] Compiling 72 source files to 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/test-classes
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java:[75,15]
>  cannot find symbol
>   symbol:   class NonAppendableFSNodeLabelStore
>   location: class 
> org.apache.hadoop.yarn.nodelabels.TestFileSystemNodeLabelsStore
> [INFO] 1 error
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4419) Trunk building failed

2015-12-03 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039673#comment-15039673
 ] 

Jian He commented on YARN-4419:
---

Sorry, I missed a file while committing; fixed now.

> Trunk building failed
> -
>
> Key: YARN-4419
> URL: https://issues.apache.org/jira/browse/YARN-4419
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Jian He
>
> Checking out the latest code, mvn clean package -DskipTests failed as below.
> {noformat}
> [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
> hadoop-yarn-common ---
> [INFO] Changes detected - recompiling the module!
> [INFO] Compiling 72 source files to 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/test-classes
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java:[75,15]
>  cannot find symbol
>   symbol:   class NonAppendableFSNodeLabelStore
>   location: class 
> org.apache.hadoop.yarn.nodelabels.TestFileSystemNodeLabelsStore
> [INFO] 1 error
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4417) Make RM and Timeline-server REST APIs more consistent

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039714#comment-15039714
 ] 

Hadoop QA commented on YARN-4417:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
2s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s 
{color} | {color:red} Patch generated 4 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 (total was 48, now 51). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 46s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 53s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
26s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 148m 7s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 

[jira] [Commented] (YARN-4340) Add "list" API to reservation system

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041238#comment-15041238
 ] 

Hadoop QA commented on YARN-4340:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 32s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 53s 
{color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 56s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 8m 56s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 19m 31s 
{color} | {color:red} root-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new 
issues (was 751, now 751). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 56s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 25s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 25s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 28m 56s 
{color} | {color:red} root-jdk1.7.0_85 with JDK v1.7.0_85 generated 1 new 
issues (was 745, now 745). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 25s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 15s 
{color} | {color:red} Patch generated 32 new checkstyle issues in root (total 
was 352, now 382). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 37s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
introduced 4 new FindBugs issues. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 3m 20s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdk1.8.0_66 with JDK v1.8.0_66 
generated 9 new issues (was 100, now 100). {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 8m 49s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdk1.7.0_85 with JDK v1.7.0_85 
generated 5 new issues (was 0, now 5). {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s 
{color} | 

[jira] [Updated] (YARN-4340) Add "list" API to reservation system

2015-12-03 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-4340:
--
Attachment: (was: YARN-4340.v5.patch)

> Add "list" API to reservation system
> 
>
> Key: YARN-4340
> URL: https://issues.apache.org/jira/browse/YARN-4340
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
> Attachments: YARN-4340.v1.patch, YARN-4340.v2.patch, 
> YARN-4340.v3.patch, YARN-4340.v4.patch
>
>
> This JIRA tracks changes to the APIs of the reservation system, and enables 
> querying the reservation system on which reservation exists by "time-range, 
> reservation-id".
> YARN-4420 has a dependency on this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4411) ResourceManager IllegalArgumentException error

2015-12-03 Thread yarntime (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yarntime updated YARN-4411:
---
Attachment: YARN-4411.001.patch

A simple patch which replaces
  YarnApplicationAttemptState.valueOf(this.getState().toString())
with
  this.createApplicationAttemptState()

No tests added.
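For context, a minimal sketch of why the valueOf call blows up and what an explicit mapping avoids (enums abbreviated; the target state chosen for the internal-only value is an assumption of this sketch, while the real mapping lives in RMAppAttemptImpl.createApplicationAttemptState()):

{noformat}
/** Internal RM-side attempt states (abbreviated for illustration). */
enum RMAppAttemptState { LAUNCHED, RUNNING, LAUNCHED_UNMANAGED_SAVING }

/** Public client-facing attempt states (abbreviated). */
enum YarnApplicationAttemptState { LAUNCHED, RUNNING }

class AttemptStateSketch {
  /** valueOf() needs an exact name match, so internal-only states throw. */
  static YarnApplicationAttemptState unsafeConvert(RMAppAttemptState s) {
    // Throws IllegalArgumentException for LAUNCHED_UNMANAGED_SAVING.
    return YarnApplicationAttemptState.valueOf(s.toString());
  }

  /** An explicit mapping collapses internal-only states to a public one. */
  static YarnApplicationAttemptState safeConvert(RMAppAttemptState s) {
    switch (s) {
      case LAUNCHED_UNMANAGED_SAVING:
        return YarnApplicationAttemptState.LAUNCHED; // assumed target state
      default:
        return YarnApplicationAttemptState.valueOf(s.toString());
    }
  }
}
{noformat}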

> ResourceManager IllegalArgumentException error
> --
>
> Key: YARN-4411
> URL: https://issues.apache.org/jira/browse/YARN-4411
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: yarntime
>Assignee: yarntime
> Attachments: YARN-4411.001.patch
>
>
> In version 2.7.1, line 1914 may cause an IllegalArgumentException in 
> RMAppAttemptImpl:
>   YarnApplicationAttemptState.valueOf(this.getState().toString())
> caused by this.getState() returning type RMAppAttemptState, which may not be 
> convertible to YarnApplicationAttemptState.
> {noformat}
> java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.LAUNCHED_UNMANAGED_SAVING
> at java.lang.Enum.valueOf(Enum.java:236)
> at 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:1870)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttemptReport(ClientRMService.java:355)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationAttemptReport(ApplicationClientProtocolPBServiceImpl.java:355)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:425)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041158#comment-15041158
 ] 

Hudson commented on YARN-4405:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #663 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/663/])
YARN-4405. Support node label store in non-appendable file system. (jianhe: rev 
755dda8dd8bb23864abc752bad506f223fcac010)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabelsStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfigurationFields.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/NullRMNodeLabelsManager.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfigurationFieldsBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/DummyCommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/FileSystemNodeLabelsStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java


> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> The existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation that supports such non-appendable file systems as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4292) ResourceUtilization should be a part of NodeInfo REST API

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041156#comment-15041156
 ] 

Hudson commented on YARN-4292:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #663 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/663/])
YARN-4292. ResourceUtilization should be a part of NodeInfo REST API. (wangda: 
rev a2c3bfc8c1349102a7f2bc4ea96b80b429ac227b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourceUtilizationInfo.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodeInfo.java


> ResourceUtilization should be a part of NodeInfo REST API
> -
>
> Key: YARN-4292
> URL: https://issues.apache.org/jira/browse/YARN-4292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4292.patch, 0002-YARN-4292.patch, 
> 0003-YARN-4292.patch, 0004-YARN-4292.patch, 0005-YARN-4292.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling

2015-12-03 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041169#comment-15041169
 ] 

Konstantinos Karanasos commented on YARN-2877:
--

Hi [~wangda],

Thanks for pointing out HADOOP-11552. It seems it can also be used for the same 
purpose.
I would suggest following the technique of frequent AM-LocalRM heartbeats and 
less frequent LocalRM-RM heartbeats to start with. Once HADOOP-11552 gets 
resolved, we can consider using it.

bq. I think the top-k node list technique cannot completely solve the 
over-subscription issue. In a production cluster, applications come in waves; 
it is possible that a few large applications can exhaust all resources in a 
cluster within a few seconds. Maybe another possible approach to mitigate the 
issue is: propagating queue-able containers from NM to RM periodically, so the 
NM can still make decisions but the RM can also be aware of these queue-able 
containers.
As long as k is sufficiently big, the phenomenon you describe should not be 
very pronounced. 
Moreover, corrective mechanisms (YARN-2888) will lead to moving tasks from 
highly-loaded nodes to less busy ones.
Going further, what you are suggesting would also make sense.

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: distributed-scheduling-design-doc_v1.pdf
>
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., task execution time is much less than the time required for obtaining 
> a container from the RM).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4411) ResourceManager IllegalArgumentException error

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041235#comment-15041235
 ] 

Hadoop QA commented on YARN-4411:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} 
| {color:red} YARN-4411 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12775739/YARN-4411.001.patch |
| JIRA Issue | YARN-4411 |
| Powered by | Apache Yetus   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9858/console |


This message was automatically generated.



> ResourceManager IllegalArgumentException error
> --
>
> Key: YARN-4411
> URL: https://issues.apache.org/jira/browse/YARN-4411
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: yarntime
>Assignee: yarntime
> Attachments: YARN-4411.001.patch
>
>
> In version 2.7.1, line 1914 may cause an IllegalArgumentException in 
> RMAppAttemptImpl:
>   YarnApplicationAttemptState.valueOf(this.getState().toString())
> caused by this.getState() returning type RMAppAttemptState, which may not be 
> convertible to YarnApplicationAttemptState.
> {noformat}
> java.lang.IllegalArgumentException: No enum constant 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.LAUNCHED_UNMANAGED_SAVING
> at java.lang.Enum.valueOf(Enum.java:236)
> at 
> org.apache.hadoop.yarn.api.records.YarnApplicationAttemptState.valueOf(YarnApplicationAttemptState.java:27)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.createApplicationAttemptReport(RMAppAttemptImpl.java:1870)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationAttemptReport(ClientRMService.java:355)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationAttemptReport(ApplicationClientProtocolPBServiceImpl.java:355)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:425)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4419) Trunk building failed

2015-12-03 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned YARN-4419:
-

Assignee: Jian He

> Trunk building failed
> -
>
> Key: YARN-4419
> URL: https://issues.apache.org/jira/browse/YARN-4419
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Kai Zheng
>Assignee: Jian He
>
> Checking out the latest code, mvn clean package -DskipTests failed as below.
> {noformat}
> [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
> hadoop-yarn-common ---
> [INFO] Changes detected - recompiling the module!
> [INFO] Compiling 72 source files to 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/test-classes
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /home/workspace/hadoop3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java:[75,15]
>  cannot find symbol
>   symbol:   class NonAppendableFSNodeLabelStore
>   location: class 
> org.apache.hadoop.yarn.nodelabels.TestFileSystemNodeLabelsStore
> [INFO] 1 error
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4340) Add "list" API to reservation system

2015-12-03 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-4340:
--
Attachment: YARN-4340.v5.patch

> Add "list" API to reservation system
> 
>
> Key: YARN-4340
> URL: https://issues.apache.org/jira/browse/YARN-4340
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
> Attachments: YARN-4340.v1.patch, YARN-4340.v2.patch, 
> YARN-4340.v3.patch, YARN-4340.v4.patch, YARN-4340.v5.patch
>
>
> This JIRA tracks changes to the APIs of the reservation system, and enables 
> querying the reservation system on which reservation exists by "time-range, 
> reservation-id".
> YARN-4420 has a dependency on this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4356) ensure the timeline service v.2 is disabled cleanly and has no impact when it's turned off

2015-12-03 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-4356:
--
Attachment: YARN-4356-feature-YARN-2928.poc.001.patch

I'm posting a POC patch to get early feedback.

I haven't added a config that checks the version of the timeline service yet, 
and I need to sort out various configuration parameters a little more.

But assuming those things will be in place later on, please take a look at 
whether the timeline service v.2 code/memory/behavior is cleanly turned off if 
the timeline service v.2 is disabled. I would greatly appreciate your feedback. 
Thanks!

> ensure the timeline service v.2 is disabled cleanly and has no impact when 
> it's turned off
> --
>
> Key: YARN-4356
> URL: https://issues.apache.org/jira/browse/YARN-4356
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4356-feature-YARN-2928.poc.001.patch
>
>
> For us to be able to merge the first milestone drop to trunk, we want to 
> ensure that, once disabled, the timeline service v.2 has no impact from the 
> server side to the client side. If the timeline service is not enabled, no 
> action should be taken. If v.1 is enabled but not v.2, v.1 should behave the 
> same as it did before the merge.
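As a rough sketch of the kind of gating being asked for (yarn.timeline-service.enabled is an existing property; the version property name below is only a placeholder, since the POC explicitly leaves the version config to be sorted out):

{noformat}
import org.apache.hadoop.conf.Configuration;

class TimelineGateSketch {
  /**
   * Only wire up v.2 writers/publishers when the service is on; a disabled
   * timeline service must cost nothing on either the server or client side.
   */
  static boolean timelineV2Active(Configuration conf) {
    boolean enabled = conf.getBoolean("yarn.timeline-service.enabled", false);
    // Placeholder version property for this sketch only.
    float version = conf.getFloat("yarn.timeline-service.version", 1.0f);
    return enabled && version >= 2.0f;
  }
}
{noformat}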



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4002) make ResourceTrackerService.nodeHeartbeat more concurrent

2015-12-03 Thread Hong Zhiguo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Zhiguo updated YARN-4002:
--
Attachment: YARN-4002-rwlock.patch
YARN-4002-lockless-read.patch

Two patches submitted, one for each of the two proposed solutions.

> make ResourceTrackerService.nodeHeartbeat more concurrent
> -
>
> Key: YARN-4002
> URL: https://issues.apache.org/jira/browse/YARN-4002
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Hong Zhiguo
>Assignee: Hong Zhiguo
>Priority: Critical
> Attachments: YARN-4002-lockless-read.patch, YARN-4002-rwlock.patch, 
> YARN-4002-v0.patch
>
>
> We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By 
> design the method ResourceTrackerService.nodeHeartbeat should be concurrent 
> enough to scale for large clusters.
> But we have a "BIG" lock in NodesListManager.isValidNode which I think it's 
> unnecessary.
> First, the fields "includes" and "excludes" of HostsFileReader are only 
> updated on "refresh nodes".  All RPC threads handling node heartbeats are 
> only readers.  So RWLock could be used to  alow concurrent access by RPC 
> threads.
> Second, since he fields "includes" and "excludes" of HostsFileReader are 
> always updated by "reference assignment", which is atomic in Java, the reader 
> side lock could just be skipped.
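A minimal sketch of the two proposed solutions against a simplified holder of the host lists (the real code lives in NodesListManager/HostsFileReader); note the lockless variant declares the field volatile so the swap is visible to reader threads, not just atomic:

{noformat}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Solution 1: RWLock -- heartbeat RPC threads read concurrently. */
class RwLockHostsHolder {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private Set<String> includes = new HashSet<>();

  boolean isValidNode(String host) {
    lock.readLock().lock();                 // shared, non-exclusive
    try {
      return includes.isEmpty() || includes.contains(host);
    } finally {
      lock.readLock().unlock();
    }
  }

  void refresh(Set<String> newIncludes) {
    lock.writeLock().lock();                // exclusive, only on refresh
    try {
      includes = newIncludes;
    } finally {
      lock.writeLock().unlock();
    }
  }
}

/** Solution 2: lockless read -- readers dereference an immutable snapshot. */
class LocklessHostsHolder {
  private volatile Set<String> includes = Collections.emptySet();

  boolean isValidNode(String host) {
    Set<String> snapshot = includes;        // single volatile read
    return snapshot.isEmpty() || snapshot.contains(host);
  }

  void refresh(Set<String> newIncludes) {
    includes = Collections.unmodifiableSet(newIncludes); // atomic swap
  }
}
{noformat}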



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039679#comment-15039679
 ] 

Sunil G commented on YARN-4405:
---

Thanks [~jianhe] and [~leftnoteasy]. It's fine now.

> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> The existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation that supports such non-appendable file systems as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15039677#comment-15039677
 ] 

Wangda Tan commented on YARN-4405:
--

Thanks [~jianhe], I was committing just now and found it is already fixed.

> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> The existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation that supports such non-appendable file systems as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id > 9999)

2015-12-03 Thread Mohammad Shahid Khan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037736#comment-15037736
 ] 

Mohammad Shahid Khan commented on YARN-3840:


+1
The latest patch looks good to me. Thanks, Varun Saxena, for handling the 
performance issue.

> Resource Manager web ui issue when sorting application by id (with 
> application having id > 9999)
> 
>
> Key: YARN-3840
> URL: https://issues.apache.org/jira/browse/YARN-3840
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: LINTE
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.3
>
> Attachments: RMApps.png, RMApps_Sorted.png, YARN-3840-1.patch, 
> YARN-3840-2.patch, YARN-3840-3.patch, YARN-3840-4.patch, YARN-3840-5.patch, 
> YARN-3840-6.patch, YARN-3840.reopened.001.patch, yarn-3840-7.patch
>
>
> On the WEBUI, the global main view page 
> http://resourcemanager:8088/cluster/apps doesn't display applications over 
> 9999.
> With command line it works (# yarn application -list).
> Regards,
> Alexandre



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4403) (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating period

2015-12-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037612#comment-15037612
 ] 

Junping Du commented on YARN-4403:
--

Thanks [~xinxianyin] for linking YARN-4177. I missed that discussion before. 
Making YARN-4177 a general ticket for YARN sounds good to me.

> (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating 
> period
> 
>
> Key: YARN-4403
> URL: https://issues.apache.org/jira/browse/YARN-4403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: YARN-4403.patch
>
>
> Currently, (AM/NM/Container)LivelinessMonitor uses the current system time to 
> calculate the expiry period, which could be broken by settimeofday. We 
> should use Time.monotonicNow() instead.
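A minimal sketch of the difference (Time.monotonicNow() is the existing Hadoop utility; the monitor shape here is illustrative):

{noformat}
import org.apache.hadoop.util.Time;

/** Illustrative expiry check for a liveliness monitor. */
class ExpiryCheckSketch {
  private static final long EXPIRE_INTERVAL_MS = 10 * 60 * 1000;

  // Fragile: wall-clock time jumps with settimeofday/NTP steps, so a
  // heartbeat can suddenly look expired (or never expire) after a clock step.
  static boolean isExpiredWallClock(long lastHeartbeatWallMs) {
    return System.currentTimeMillis() - lastHeartbeatWallMs > EXPIRE_INTERVAL_MS;
  }

  // Robust: monotonic time only moves forward, independent of the wall clock.
  static boolean isExpiredMonotonic(long lastHeartbeatMonoMs) {
    return Time.monotonicNow() - lastHeartbeatMonoMs > EXPIRE_INTERVAL_MS;
  }
}
{noformat}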



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4408) NodeManager still reports negative running containers

2015-12-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037603#comment-15037603
 ] 

Junping Du commented on YARN-4408:
--

+1. The test failure is not related and already tracked in YARN-4393. 
Committing this in.

> NodeManager still reports negative running containers
> -
>
> Key: YARN-4408
> URL: https://issues.apache.org/jira/browse/YARN-4408
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-4408.001.patch, YARN-4408.002.patch, 
> YARN-4408.003.patch
>
>
> YARN-1697 fixed a problem where the NodeManager metrics could report a 
> negative number of running containers.  However, it missed a rare case where 
> this can still happen.
> YARN-1697 added a flag to indicate whether the container was actually 
> launched ({{LOCALIZED}} to {{RUNNING}}) or not ({{LOCALIZED}} to 
> {{KILLING}}), which is then checked when transitioning from 
> {{CONTAINER_CLEANEDUP_AFTER_KILL}} to {{DONE}} and from 
> {{EXITED_WITH_FAILURE}} to {{DONE}}, so the gauge is only decremented if we 
> actually ran the container and incremented it. However, this flag is not 
> checked while transitioning from {{EXITED_WITH_SUCCESS}} to {{DONE}}.
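A minimal sketch of the missing guard (field and metric names are illustrative, not the actual ContainerImpl code):

{noformat}
/** Illustrative cleanup path: only undo metrics the launch actually did. */
class ContainerDoneSketch {
  private boolean wasLaunched;              // set on LOCALIZED -> RUNNING
  private final NodeMetricsSketch metrics = new NodeMetricsSketch();

  /** Shared by all *-to-DONE transitions, including EXITED_WITH_SUCCESS. */
  void onDone() {
    // Without this guard, a container that was never launched decrements a
    // gauge it never incremented, driving the running count negative.
    if (wasLaunched) {
      metrics.runningContainers--;
    }
  }

  static class NodeMetricsSketch {
    int runningContainers;
  }
}
{noformat}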



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4008) HTTP ERROR 404 Problem accessing /track. Reason: NOT_FOUND Powered by Jetty://

2015-12-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037849#comment-15037849
 ] 

Junping Du commented on YARN-4008:
--

Hi [~targio], is there a concrete issue being reported here? If not, I will 
close this JIRA as invalid.

> HTTP ERROR 404  Problem accessing /track. Reason:  NOT_FOUND  Powered by 
> Jetty://
> -
>
> Key: YARN-4008
> URL: https://issues.apache.org/jira/browse/YARN-4008
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ibrahim Hashem
>
> HTTP ERROR 404
> Problem accessing /track. Reason:
> NOT_FOUND
> Powered by Jetty://



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4408) NodeManager still reports negative running containers

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037871#comment-15037871
 ] 

Hudson commented on YARN-4408:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8913 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8913/])
YARN-4408. Fix issue that NodeManager still reports negative running 
(junping_du: rev 62e9348bc10bb97a5fcb4281f7996a09d8e69c60)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java


> NodeManager still reports negative running containers
> -
>
> Key: YARN-4408
> URL: https://issues.apache.org/jira/browse/YARN-4408
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.8.0
>
> Attachments: YARN-4408.001.patch, YARN-4408.002.patch, 
> YARN-4408.003.patch
>
>
> YARN-1697 fixed a problem where the NodeManager metrics could report a 
> negative number of running containers.  However, it missed a rare case where 
> this can still happen.
> YARN-1697 added a flag to indicate whether the container was actually 
> launched ({{LOCALIZED}} to {{RUNNING}}) or not ({{LOCALIZED}} to 
> {{KILLING}}), which is then checked when transitioning from 
> {{CONTAINER_CLEANEDUP_AFTER_KILL}} to {{DONE}} and from 
> {{EXITED_WITH_FAILURE}} to {{DONE}}, so the gauge is only decremented if we 
> actually ran the container and incremented it. However, this flag is not 
> checked while transitioning from {{EXITED_WITH_SUCCESS}} to {{DONE}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4409) Fix javadoc and checkstyle issues in timelineservice code

2015-12-03 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4409:
---
Affects Version/s: YARN-2928
 Target Version/s: YARN-2928

> Fix javadoc and checkstyle issues in timelineservice code
> -
>
> Key: YARN-4409
> URL: https://issues.apache.org/jira/browse/YARN-4409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
>
> There are a large number of javadoc and checkstyle issues currently open in 
> timelineservice code. We need to fix them before we merge it into trunk.
> Refer to 
> https://issues.apache.org/jira/browse/YARN-3862?focusedCommentId=15035267&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15035267
> We still have 94 open checkstyle issues and javadocs failing for Java 8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4409) Fix javadoc and checkstyle issues in timelineservice code

2015-12-03 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4409:
---
Labels: yarn-2928-1st-milestone  (was: )

> Fix javadoc and checkstyle issues in timelineservice code
> -
>
> Key: YARN-4409
> URL: https://issues.apache.org/jira/browse/YARN-4409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
>
> There are a large number of javadoc and checkstyle issues currently open in 
> timelineservice code. We need to fix them before we merge it into trunk.
> Refer to 
> https://issues.apache.org/jira/browse/YARN-3862?focusedCommentId=15035267&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15035267
> We still have 94 open checkstyle issues and javadocs failing for Java 8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3603) Application Attempts page confusing

2015-12-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038101#comment-15038101
 ] 

Sunil G commented on YARN-3603:
---

I think this patch has gone stale, and I feel we can move this ticket forward.
[~rohithsharma] / [~xgong] / [~tgraves], could you please help check the 
attached screenshots and see if it's a good addition?

I would also like to raise one point, discussed with Jeff, about killed 
containers: is it good to have that information in the RM UI? Are there any 
known reasons why we do not have a killed/preempted container list in the UI?
Thank you.

> Application Attempts page confusing
> ---
>
> Key: YARN-3603
> URL: https://issues.apache.org/jira/browse/YARN-3603
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.8.0
>Reporter: Thomas Graves
>Assignee: Sunil G
> Attachments: 0001-YARN-3603.patch, 0002-YARN-3603.patch, 
> 0003-YARN-3603.patch, ahs1.png
>
>
> The application attempts page 
> (http://RM:8088/cluster/appattempt/appattempt_1431101480046_0003_01)
> is a bit confusing about what is going on. I think the table of containers 
> there is only for running containers, and when the app is completed or 
> killed it's empty. The table should have a label on it stating so.
> Also, the "AM Container" field is a link when running but not when it's 
> killed. That might be confusing.
> There is no link to the logs on this page, but there is in the app attempt 
> table when looking at 
> http://rm:8088/cluster/app/application_1431101480046_0003



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4414) Nodemanager connection errors are retried at multiple levels

2015-12-03 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-4414:


 Summary: Nodemanager connection errors are retried at multiple 
levels
 Key: YARN-4414
 URL: https://issues.apache.org/jira/browse/YARN-4414
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.6.2, 2.7.1
Reporter: Jason Lowe


This is related to YARN-3238.  Ran into more scenarios where connection errors 
are being retried at multiple levels, like NoRouteToHostException.  The fix for 
YARN-3238 was too specific, and I think we need a more general solution to 
catch a wider array of connection errors that can occur to avoid retrying them 
both at the RPC layer and at the NM proxy layer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4413) Nodes in the includes list should not be listed as decommissioned in the UI

2015-12-03 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038135#comment-15038135
 ] 

Daniel Templeton commented on YARN-4413:


bq. But a restart will help here to clear the metrics.

True, but it will also cause an outage, which comes with its own potential 
impact.

bq. So I feel we could look at both lists upon refresh and remove/add nodes 
based on the entries in both files and from memory.

Agreed. I'll post a patch with my general approach shortly.

> Nodes in the includes list should not be listed as decommissioned in the UI
> ---
>
> Key: YARN-4413
> URL: https://issues.apache.org/jira/browse/YARN-4413
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>
> If I decommission a node and then move it from the excludes list back to the 
> includes list, but I don't restart the node, the node will still be listed 
> by the web UI as decommissioned until either the NM or RM is restarted. 
> Ideally, removing the node from the excludes list and putting it back into 
> the includes list should cause the node to be reported as shutdown instead.
> CC [~kshukla]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4293) ResourceUtilization should be a part of yarn node CLI

2015-12-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038085#comment-15038085
 ] 

Sunil G commented on YARN-4293:
---

A couple of test cases are related, along with a few warnings. I will handle 
those once [~ka...@cloudera.com] and [~leftnoteasy] confirm the approach. 
Thank you!

> ResourceUtilization should be a part of yarn node CLI
> -
>
> Key: YARN-4293
> URL: https://issues.apache.org/jira/browse/YARN-4293
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Attachments: 0001-YARN-4293.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038169#comment-15038169
 ] 

Hadoop QA commented on YARN-3816:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
24s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 12s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 4s 
{color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
53s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 21s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in 
feature-YARN-2928 has 3 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 31s 
{color} | {color:red} hadoop-yarn-common in feature-YARN-2928 failed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 10s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 28s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s 
{color} | {color:red} Patch generated 18 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn (total was 362, now 367). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
15s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 3m 34s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdk1.8.0_66 with JDK v1.8.0_66 
generated 2 new issues (was 100, now 100). {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 33s 
{color} | {color:red} hadoop-yarn-common in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 35s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 5s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 20s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 51s 
{color} | {color:green} 

[jira] [Commented] (YARN-4414) Nodemanager connection errors are retried at multiple levels

2015-12-03 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038107#comment-15038107
 ] 

Jason Lowe commented on YARN-4414:
--

I noticed that the HA proxies for the namenode and resourcemanager explicitly 
disable connection retries in the RPC layer by default, since they know the HA 
proxy will do the retries.  I think the same should apply to nodemanager 
proxies, since we're seeing even connection timeouts retried too often in the 
RPC layer, given that a container allocation is worthless after 10 minutes by 
default.  By disabling retries in the RPC layer, we can add 
ConnectTimeoutException back to the list of exceptions retried at the NM proxy 
layer and simply retry all appropriate exceptions there.
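
A minimal sketch of that wiring, assuming the standard Hadoop IPC client retry 
keys; the helper class and the exact hook into the NM proxy creation are 
illustrative, not the actual patch:
{code}
import org.apache.hadoop.conf.Configuration;

public final class NMProxyConfSketch {
  // Disable connection retries in the RPC layer, mirroring what the HA
  // proxy providers do, so connection errors are retried only once, in
  // the NM proxy layer's retry policy.
  public static Configuration withRpcRetriesDisabled(Configuration base) {
    Configuration conf = new Configuration(base); // copy; don't mutate caller's conf
    conf.setInt("ipc.client.connect.max.retries", 0);
    conf.setInt("ipc.client.connect.max.retries.on.timeouts", 0);
    return conf;
  }
}
{code}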

> Nodemanager connection errors are retried at multiple levels
> 
>
> Key: YARN-4414
> URL: https://issues.apache.org/jira/browse/YARN-4414
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Jason Lowe
>
> This is related to YARN-3238.  Ran into more scenarios where connection 
> errors are being retried at multiple levels, like NoRouteToHostException.  
> The fix for YARN-3238 was too specific, and I think we need a more general 
> solution to catch a wider array of connection errors that can occur to avoid 
> retrying them both at the RPC layer and at the NM proxy layer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3196) [Compatibility] Make TS next gen be compatible with the current TS

2015-12-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du reassigned YARN-3196:


Assignee: Junping Du  (was: Zhijie Shen)

> [Compatibility] Make TS next gen be compatible with the current TS
> --
>
> Key: YARN-3196
> URL: https://issues.apache.org/jira/browse/YARN-3196
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Junping Du
>  Labels: yarn-2928-1st-milestone
>
> File a jira to make sure that we don't forget to be compatible with the 
> current TS, so that we can smoothly move users to the new TS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3196) [Compatibility] Make TS next gen be compatible with the current TS

2015-12-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037879#comment-15037879
 ] 

Junping Du commented on YARN-3196:
--

We should make sure ATS v2 is compatible with ATS v1/v1.5 in some ways. [~zjshen], 
I will take it over if you are not actively working on this.

> [Compatibility] Make TS next gen be compatible with the current TS
> --
>
> Key: YARN-3196
> URL: https://issues.apache.org/jira/browse/YARN-3196
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>  Labels: yarn-2928-1st-milestone
>
> File a jira to make sure that we don't forget to be compatible with the 
> current TS, so that we can smoothly move users to the new TS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3196) [Compatibility] Make TS next gen be compatible with the current TS

2015-12-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3196:
-
Labels: yarn-2928-1st-milestone  (was: )

> [Compatibility] Make TS next gen be compatible with the current TS
> --
>
> Key: YARN-3196
> URL: https://issues.apache.org/jira/browse/YARN-3196
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>  Labels: yarn-2928-1st-milestone
>
> File a jira to make sure that we don't forget to be compatible with the 
> current TS, so that we can smoothly move users to the new TS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2015-12-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3816:
-
Attachment: YARN-3816-feature-YARN-2928.v4.1.patch

Renamed the patch.

> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> 
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>  Labels: yarn-2928-1st-milestone
> Attachments: Application Level Aggregation of Timeline Data.pdf, 
> YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, 
> YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, 
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, 
> YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, 
> YARN-3816-feature-YARN-2928-v4.1.patch, 
> YARN-3816-feature-YARN-2928.v4.1.patch, YARN-3816-poc-v1.patch, 
> YARN-3816-poc-v2.patch
>
>
> We need application-level aggregation of Timeline data:
> - To present end users with aggregated states for each application, including: 
> resource (CPU, memory) consumption across all containers, number of 
> containers launched/completed/failed, etc. We need this for apps while they 
> are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be 
> aggregated to show details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be done more efficiently 
> on top of application-level aggregations than on raw entity-level data, as 
> far fewer rows need to be scanned (after filtering out non-aggregated 
> entities, like events, configurations, etc.).
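
As a rough illustration of the app-level roll-up described above; all class and 
method names here are hypothetical, not the patch's API:
{code}
import java.util.HashMap;
import java.util.Map;

public final class AppLevelAggregatorSketch {
  // Hypothetical sketch: sum each metric (e.g. CPU, memory,
  // HDFS_BYTES_READ) across an application's containers to get the
  // app-level totals that flow/user/queue aggregation can then reuse.
  public static Map<String, Long> aggregate(
      Iterable<Map<String, Long>> perContainerMetrics) {
    Map<String, Long> appTotals = new HashMap<>();
    for (Map<String, Long> container : perContainerMetrics) {
      for (Map.Entry<String, Long> m : container.entrySet()) {
        appTotals.merge(m.getKey(), m.getValue(), Long::sum);
      }
    }
    return appTotals;
  }
}
{code}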



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3053) [Security] Review and implement for property security in ATS v.2

2015-12-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3053:
-
Assignee: Junping Du  (was: Zhijie Shen)

> [Security] Review and implement for property security in ATS v.2
> 
>
> Key: YARN-3053
> URL: https://issues.apache.org/jira/browse/YARN-3053
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Junping Du
>
> Per design in YARN-2928, we want to evaluate and review the system for 
> security, and ensure proper security in the system.
> This includes proper authentication, token management, access control, and 
> any other relevant security aspects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2015-12-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3816:
-
Attachment: YARN-3816-feature-YARN-2928-v4.1.patch

Rebased the v4 patch onto the new YARN-2928 branch. Haven't addressed the above 
comments; will address them in the next patch.

> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> 
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>  Labels: yarn-2928-1st-milestone
> Attachments: Application Level Aggregation of Timeline Data.pdf, 
> YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, 
> YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, 
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, 
> YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, 
> YARN-3816-feature-YARN-2928-v4.1.patch, YARN-3816-poc-v1.patch, 
> YARN-3816-poc-v2.patch
>
>
> We need application-level aggregation of Timeline data:
> - To present end users with aggregated states for each application, including: 
> resource (CPU, memory) consumption across all containers, number of 
> containers launched/completed/failed, etc. We need this for apps while they 
> are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be 
> aggregated to show details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be done more efficiently 
> on top of application-level aggregations than on raw entity-level data, as 
> far fewer rows need to be scanned (after filtering out non-aggregated 
> entities, like events, configurations, etc.).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4408) NodeManager still reports negative running containers

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037987#comment-15037987
 ] 

Hudson commented on YARN-4408:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #661 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/661/])
YARN-4408. Fix issue that NodeManager still reports negative running 
(junping_du: rev 62e9348bc10bb97a5fcb4281f7996a09d8e69c60)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* hadoop-yarn-project/CHANGES.txt


> NodeManager still reports negative running containers
> -
>
> Key: YARN-4408
> URL: https://issues.apache.org/jira/browse/YARN-4408
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.8.0
>
> Attachments: YARN-4408.001.patch, YARN-4408.002.patch, 
> YARN-4408.003.patch
>
>
> YARN-1697 fixed a problem where the NodeManager metrics could report a 
> negative number of running containers.  However, it missed a rare case where 
> this can still happen.
> YARN-1697 added a flag to indicate whether the container was actually launched 
> ({{LOCALIZED}} to {{RUNNING}}) or not ({{LOCALIZED}} to {{KILLING}}), which 
> is then checked when transitioning from {{CONTAINER_CLEANEDUP_AFTER_KILL}} to 
> {{DONE}} and from {{EXITED_WITH_FAILURE}} to {{DONE}}, so that the gauge is 
> only decremented if we actually ran the container and incremented it.  
> However, this flag is not checked while transitioning from 
> {{EXITED_WITH_SUCCESS}} to {{DONE}}.
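
A sketch of the missed guard, assuming the {{wasLaunched}} flag and the 
{{NodeManagerMetrics#endRunningContainer}} gauge update introduced around 
YARN-1697 (the surrounding transition class is omitted):
{code}
// On the EXITED_WITH_SUCCESS -> DONE transition, decrement the
// running-containers gauge only if the container was actually launched,
// matching what the other transitions to DONE already check.
if (container.wasLaunched) {
  container.metrics.endRunningContainer();
}
{code}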



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037993#comment-15037993
 ] 

Sunil G commented on YARN-4405:
---

Hi [~leftnoteasy]
Thank you for starting this ticket.

A couple of minor points:
1. Can mirror.new.tmp also pre-exist in the system (maybe the RM crashed earlier, 
just after creating this file)?
2. I feel {{close()}} may need to be overridden. Otherwise, for 
NonAppendableFSNodeLabelStore, we will still have the base implementation 
closing {{editlogOs}}, which is not needed.
{code}
IOUtils.cleanup(LOG, fs, editlogOs);
{code}
3. Some logs at the end of {{NonAppendableFSNodeLabelStore#recover}} could help 
indicate that file creation/deletion/modification completed successfully.
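
A minimal sketch of the suggested override, assuming {{LOG}} and {{fs}} are the 
store's existing fields and that the non-appendable store never opens 
{{editlogOs}}:
{code}
import java.io.IOException;
import org.apache.hadoop.io.IOUtils;

// Sketch only: close just the FileSystem handle; there is no open
// edit-log stream to clean up in the non-appendable store.
@Override
public void close() throws IOException {
  IOUtils.cleanup(LOG, fs);
}
{code}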


> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch
>
>
> Existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation that supports such non-appendable file systems as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037947#comment-15037947
 ] 

Hadoop QA commented on YARN-3816:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} 
| {color:red} YARN-3816 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12775575/YARN-3816-feature-YARN-2928-v4.1.patch
 |
| JIRA Issue | YARN-3816 |
| Powered by | Apache Yetus   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9847/console |


This message was automatically generated.



> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> 
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>  Labels: yarn-2928-1st-milestone
> Attachments: Application Level Aggregation of Timeline Data.pdf, 
> YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, 
> YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, 
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, 
> YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, 
> YARN-3816-feature-YARN-2928-v4.1.patch, YARN-3816-poc-v1.patch, 
> YARN-3816-poc-v2.patch
>
>
> We need application-level aggregation of Timeline data:
> - To present end users with aggregated states for each application, including: 
> resource (CPU, memory) consumption across all containers, number of 
> containers launched/completed/failed, etc. We need this for apps while they 
> are running as well as when they are done.
> - Also, framework-specific metrics, e.g. HDFS_BYTES_READ, should be 
> aggregated to show details of states at the framework level.
> - Aggregation at other levels (Flow/User/Queue) can be done more efficiently 
> on top of application-level aggregations than on raw entity-level data, as 
> far fewer rows need to be scanned (after filtering out non-aggregated 
> entities, like events, configurations, etc.).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4416) Deadlock due to synchronised get Methods in AbstractCSQueue

2015-12-03 Thread Naganarasimha G R (JIRA)
Naganarasimha G R created YARN-4416:
---

 Summary: Deadlock due to synchronised get Methods in 
AbstractCSQueue
 Key: YARN-4416
 URL: https://issues.apache.org/jira/browse/YARN-4416
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler, resourcemanager
Affects Versions: 2.7.1
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
Priority: Minor


While debugging in Eclipse, I came across a scenario where I had to find out 
the name of a queue, but every time I tried to inspect the queue it hung. On 
seeing the stack I realized there was a deadlock, but on analysis found it was 
triggered only by *queue.toString()* during debugging, as 
{{AbstractCSQueue.getAbsoluteUsedCapacity}} was synchronized.
Still, I feel {{AbstractCSQueue}}'s getter methods need not be synchronized and 
would be better handled through read and write locks.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4416) Deadlock due to synchronised get Methods in AbstractCSQueue

2015-12-03 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4416:

Attachment: deadlock.log

Attaching the stack trace for the deadlock.

> Deadlock due to synchronised get Methods in AbstractCSQueue
> ---
>
> Key: YARN-4416
> URL: https://issues.apache.org/jira/browse/YARN-4416
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.1
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: deadlock.log
>
>
> While debugging in Eclipse, I came across a scenario where I had to find out 
> the name of a queue, but every time I tried to inspect the queue it hung. On 
> seeing the stack I realized there was a deadlock, but on analysis found it 
> was triggered only by *queue.toString()* during debugging, as 
> {{AbstractCSQueue.getAbsoluteUsedCapacity}} was synchronized.
> Still, I feel {{AbstractCSQueue}}'s getter methods need not be synchronized 
> and would be better handled through read and write locks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4002) make ResourceTrackerService.nodeHeartbeat more concurrent

2015-12-03 Thread Brook Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038392#comment-15038392
 ] 

Brook Zhou commented on YARN-4002:
--

If this is not currently being worked on, I will assign it to myself.

> make ResourceTrackerService.nodeHeartbeat more concurrent
> -
>
> Key: YARN-4002
> URL: https://issues.apache.org/jira/browse/YARN-4002
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Hong Zhiguo
>Assignee: Hong Zhiguo
>Priority: Critical
> Attachments: YARN-4002-v0.patch
>
>
> We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By 
> design the method ResourceTrackerService.nodeHeartbeat should be concurrent 
> enough to scale for large clusters.
> But we have a "BIG" lock in NodesListManager.isValidNode which I think is 
> unnecessary.
> First, the fields "includes" and "excludes" of HostsFileReader are only 
> updated on "refresh nodes".  All RPC threads handling node heartbeats are 
> only readers.  So an RWLock could be used to allow concurrent access by RPC 
> threads.
> Second, since the fields "includes" and "excludes" of HostsFileReader are 
> always updated by "reference assignment", which is atomic in Java, the 
> reader-side lock could simply be skipped.
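
A minimal sketch of the second point, using a volatile field so the atomic 
reference swap is also safely published to other threads; class and field names 
are illustrative, not HostsFileReader's actual structure:
{code}
import java.util.Collections;
import java.util.Set;

public final class HostsSnapshotSketch {
  // Replaced atomically on "refresh nodes"; volatile also guarantees the
  // new set is visible to the heartbeat-handling RPC threads.
  private volatile Set<String> includes = Collections.emptySet();

  public void refresh(Set<String> newIncludes) {
    includes = Collections.unmodifiableSet(newIncludes); // atomic publish
  }

  // Called concurrently by RPC threads; no lock, just one volatile read.
  public boolean isValidNode(String host) {
    Set<String> snapshot = includes;
    return snapshot.isEmpty() || snapshot.contains(host);
  }
}
{code}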



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4415) Scheduler Web Ui shows max capacity for the queue is 100% but when we submit application doesnt get assigned

2015-12-03 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4415:

Attachment: screenshot-1.png

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit 
> application doesnt get assigned
> 
>
> Key: YARN-4415
> URL: https://issues.apache.org/jira/browse/YARN-4415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: App info with diagnostics info.png, screenshot-1.png
>
>
> Steps to reproduce the issue:
> Scenario 1:
> # Configure a queue (default) with accessible node labels as *
> # Create an exclusive partition *xxx* and map an NM to it
> # Ensure no capacities are configured for default for label xxx
> # Start an RM app with queue as default and label as xxx
> # The application is stuck, but the scheduler UI shows 100% as max capacity 
> for that queue
> Scenario 2:
> # Create a non-exclusive partition *sharedPartition* and map an NM to it
> # Ensure no capacities are configured for the default queue
> # Start an RM app with queue as *default* and label as *sharedPartition*
> # The application is stuck, but the scheduler UI shows 100% as max capacity 
> for that queue for *sharedPartition*
> For both scenarios the cause is the same: the default max capacity and 
> absolute max capacity are set to zero percent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4415) Scheduler Web Ui shows max capacity for the queue is 100% but when we submit application doesnt get assigned

2015-12-03 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038332#comment-15038332
 ] 

Naganarasimha G R commented on YARN-4415:
-

As per an offline discussion with [~wangda], he mentioned that this was done 
with the intent that the default max capacity of a partition is set to zero to 
avoid having to configure the queue.
IMHO it is much easier if we assume the max capacity is 100% and calculate the 
abs max based on the parent queue's max capacity, for the following reasons:
# It will have the same behavior as the default partition, hence less 
confusion.
# Maybe my understanding is wrong, but I feel it's easier to add new partitions 
without touching the CS.xml, as we can set the accessible node labels to * and 
assume 100% as the max capacity and 0% as the guaranteed capacity.

We also need to update the documentation with the default values.
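
For reference, a hedged sketch of the explicit per-partition configuration that 
avoids the zero default today; the queue path, label name, and values are 
illustrative, and the property names follow the CapacityScheduler node-label 
configuration:
{code}
import org.apache.hadoop.conf.Configuration;

public final class PartitionCapacitySketch {
  // Illustrative only: give the xxx partition an explicit max capacity on
  // the default queue instead of relying on the zero default.
  public static Configuration forDefaultQueue() {
    Configuration conf = new Configuration();
    conf.set("yarn.scheduler.capacity.root.default.accessible-node-labels", "*");
    conf.setFloat(
        "yarn.scheduler.capacity.root.default.accessible-node-labels.xxx.maximum-capacity",
        100.0f);
    return conf;
  }
}
{code}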


> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit 
> application doesnt get assigned
> 
>
> Key: YARN-4415
> URL: https://issues.apache.org/jira/browse/YARN-4415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: App info with diagnostics info.png, screenshot-1.png
>
>
> Steps to reproduce the issue:
> Scenario 1:
> # Configure a queue (default) with accessible node labels as *
> # Create an exclusive partition *xxx* and map an NM to it
> # Ensure no capacities are configured for default for label xxx
> # Start an RM app with queue as default and label as xxx
> # The application is stuck, but the scheduler UI shows 100% as max capacity 
> for that queue
> Scenario 2:
> # Create a non-exclusive partition *sharedPartition* and map an NM to it
> # Ensure no capacities are configured for the default queue
> # Start an RM app with queue as *default* and label as *sharedPartition*
> # The application is stuck, but the scheduler UI shows 100% as max capacity 
> for that queue for *sharedPartition*
> For both scenarios the cause is the same: the default max capacity and 
> absolute max capacity are set to zero percent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3946) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in CS

2015-12-03 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038381#comment-15038381
 ] 

Naganarasimha G R commented on YARN-3946:
-

Hi [~wangda],
Some of the test failures seem to be related to the patch. Also, I would merge 
{{checkAndUpdateAMContainerDiagnostics}} and {{updateAMContainerDiagnostics}} 
using an additional parameter rather than another method. Will upload a new 
patch at the earliest.

> Allow fetching exact reason as to why a submitted app is in ACCEPTED state in 
> CS
> 
>
> Key: YARN-3946
> URL: https://issues.apache.org/jira/browse/YARN-3946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.6.0
>Reporter: Sumit Nigam
>Assignee: Naganarasimha G R
> Attachments: 3946WebImages.zip, YARN-3946.v1.001.patch, 
> YARN-3946.v1.002.patch, YARN-3946.v1.003.Images.zip, YARN-3946.v1.003.patch, 
> YARN-3946.v1.004.patch, YARN-3946.v1.005.patch
>
>
> Currently there is no direct way to get the exact reason why a 
> submitted app is still in the ACCEPTED state. It should be possible to find 
> out through the RM REST API which aspect is not being met - say, queue limits 
> being reached, core/memory requirements not being met, or the AM limit being 
> reached, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2015-12-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038566#comment-15038566
 ] 

Varun Saxena commented on YARN-4238:


[~sjlee0], sorry, I had missed your comment. Will fix the checkstyle issues and 
update a patch shortly.

bq. I think it is reasonable to say that the clients are required to set 
creation time and modification time, or they will not be present in the data 
and things like sort will not work correctly on those records. What do you 
think?
I agree. The client sending created time is a reasonable assumption to make. We 
can mention this explicitly as well when we document ATSv2.
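
A small sketch of that contract from the writer's side, assuming the v2 
{{TimelineEntity}} setters on the feature branch; the type and id values are 
illustrative:
{code}
import org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntity;

public final class CreatedTimeSketch {
  // Sketch: whoever publishes the entity sets created time explicitly,
  // since the backend will not infer it and sorting would otherwise break.
  public static TimelineEntity newEntity(String id) {
    TimelineEntity entity = new TimelineEntity();
    entity.setType("YARN_APPLICATION"); // illustrative type
    entity.setId(id);
    entity.setCreatedTime(System.currentTimeMillis());
    return entity;
  }
}
{code}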


> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-feature-YARN-2928.02.patch
>
>
> While publishing entities from the RM and elsewhere, we are not sending 
> created time. For instance, created time in the TimelineServiceV2Publisher 
> class, and in other similar classes for other entities, is not updated. We 
> can easily set created time when sending the application-created event, and 
> likewise modification time on every write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4408) NodeManager still reports negative running containers

2015-12-03 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038649#comment-15038649
 ] 

Robert Kanter commented on YARN-4408:
-

Thanks [~djp]!

> NodeManager still reports negative running containers
> -
>
> Key: YARN-4408
> URL: https://issues.apache.org/jira/browse/YARN-4408
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.4.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: 2.8.0
>
> Attachments: YARN-4408.001.patch, YARN-4408.002.patch, 
> YARN-4408.003.patch
>
>
> YARN-1697 fixed a problem where the NodeManager metrics could report a 
> negative number of running containers.  However, it missed a rare case where 
> this can still happen.
> YARN-1697 added a flag to indicate whether the container was actually launched 
> ({{LOCALIZED}} to {{RUNNING}}) or not ({{LOCALIZED}} to {{KILLING}}), which 
> is then checked when transitioning from {{CONTAINER_CLEANEDUP_AFTER_KILL}} to 
> {{DONE}} and from {{EXITED_WITH_FAILURE}} to {{DONE}}, so that the gauge is 
> only decremented if we actually ran the container and incremented it.  
> However, this flag is not checked while transitioning from 
> {{EXITED_WITH_SUCCESS}} to {{DONE}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1974) add args for DistributedShell to specify a set of nodes on which the tasks run

2015-12-03 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1974:

Attachment: YARN-1974.1.patch

> add args for DistributedShell to specify a set of nodes on which the tasks run
> --
>
> Key: YARN-1974
> URL: https://issues.apache.org/jira/browse/YARN-1974
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications/distributed-shell
>Affects Versions: 2.7.0
>Reporter: Hong Zhiguo
>Assignee: Hong Zhiguo
>Priority: Minor
> Attachments: YARN-1974.1.patch, YARN-1974.patch
>
>
> It's very useful to execute a script on a specific set of machines, for both 
> testing and maintenance purposes.
> The args "--nodes" and "--relax_locality" are added to DistributedShell, 
> together with a unit test using miniCluster.
> It has also been tested on our real cluster with the Fair scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-1974) add args for DistributedShell to specify a set of nodes on which the tasks run

2015-12-03 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong reassigned YARN-1974:
---

Assignee: Xuan Gong  (was: Hong Zhiguo)

> add args for DistributedShell to specify a set of nodes on which the tasks run
> --
>
> Key: YARN-1974
> URL: https://issues.apache.org/jira/browse/YARN-1974
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications/distributed-shell
>Affects Versions: 2.7.0
>Reporter: Hong Zhiguo
>Assignee: Xuan Gong
>Priority: Minor
> Attachments: YARN-1974.1.patch, YARN-1974.patch
>
>
> It's very useful to execute a script on a specific set of machines, for both 
> testing and maintenance purposes.
> The args "--nodes" and "--relax_locality" are added to DistributedShell, 
> together with a unit test using miniCluster.
> It has also been tested on our real cluster with the Fair scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4292) ResourceUtilization should be a part of NodeInfo REST API

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038743#comment-15038743
 ] 

Hudson commented on YARN-4292:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8915 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8915/])
YARN-4292. ResourceUtilization should be a part of NodeInfo REST API. (wangda: 
rev a2c3bfc8c1349102a7f2bc4ea96b80b429ac227b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourceUtilizationInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodeInfo.java
* hadoop-yarn-project/CHANGES.txt


> ResourceUtilization should be a part of NodeInfo REST API
> -
>
> Key: YARN-4292
> URL: https://issues.apache.org/jira/browse/YARN-4292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4292.patch, 0002-YARN-4292.patch, 
> 0003-YARN-4292.patch, 0004-YARN-4292.patch, 0005-YARN-4292.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2885) Create AMRMProxy request interceptor for distributed scheduling decisions for queueable containers

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038769#comment-15038769
 ] 

Wangda Tan commented on YARN-2885:
--

[~kkaranasos], 
bq. ... That said, I am not sure if it is required to create a wrapper at this 
point for the AM protocol.
As suggested by [~asuresh], 
bq. Have an Distributed Scheduler AM Service running on the RM if DS is 
enabled. This will implement the new protocol (it will delegate all the 
AMProtocol stuff to the AMService and will handle DistScheduler specific stuff)
Do you think it's a good idea?
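
A hypothetical shape of that idea, not the patch's code: a thin service that 
implements the new protocol and delegates the plain AM calls to the existing 
AMService, layering the distributed-scheduling handling on top.
{code}
import java.io.IOException;
import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class DistSchedulerAMServiceSketch {
  private final ApplicationMasterProtocol amService; // the existing AMService

  public DistSchedulerAMServiceSketch(ApplicationMasterProtocol amService) {
    this.amService = amService;
  }

  public AllocateResponse allocate(AllocateRequest request)
      throws YarnException, IOException {
    // DistScheduler-specific pre-processing (e.g. top-k node list)
    // would go here, before delegating the AMProtocol part.
    return amService.allocate(request);
  }
}
{code}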

> Create AMRMProxy request interceptor for distributed scheduling decisions for 
> queueable containers
> --
>
> Key: YARN-2885
> URL: https://issues.apache.org/jira/browse/YARN-2885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
> Attachments: YARN-2885-yarn-2877.001.patch
>
>
> We propose to add a Local ResourceManager (LocalRM) to the NM in order to 
> support distributed scheduling decisions. 
> Architecturally we leverage the RMProxy, introduced in YARN-2884. 
> The LocalRM makes distributed decisions for queueable container requests. 
> Guaranteed-start requests are still handled by the central RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4225) Add preemption status to yarn queue -status for capacity scheduler

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038723#comment-15038723
 ] 

Wangda Tan commented on YARN-4225:
--

Thanks [~jlowe], I can understand the issue now.

I'm OK with both approaches - the existing one in the latest patch, or simply 
returning false if there's no such field in the proto.

> Add preemption status to yarn queue -status for capacity scheduler
> --
>
> Key: YARN-4225
> URL: https://issues.apache.org/jira/browse/YARN-4225
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.1
>Reporter: Eric Payne
>Assignee: Eric Payne
>Priority: Minor
> Attachments: YARN-4225.001.patch, YARN-4225.002.patch, 
> YARN-4225.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list

2015-12-03 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated YARN-4311:
--
Attachment: YARN-4311-v3.patch

Fixed one test failure in TestYarnConfigurationFields by adding the new configs 
to yarn-default. TestAMAuthorization and TestClientRMTokens are unrelated and 
fail as per YARN-4318 and YARN-4306.

Corrected checkstyle issues.

> Removing nodes from include and exclude lists will not remove them from 
> decommissioned nodes list
> -
>
> Key: YARN-4311
> URL: https://issues.apache.org/jira/browse/YARN-4311
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, 
> YARN-4311-v3.patch
>
>
> In order to fully forget about a node, removing the node from the include and 
> exclude lists is not sufficient; the RM still lists it under decommissioned 
> nodes. The tricky part that [~jlowe] pointed out is the case when include 
> lists are not used; in that case we don't want the nodes to fall off if they 
> are not active.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038760#comment-15038760
 ] 

Wangda Tan commented on YARN-2877:
--

Hi [~kkaranasos],
Thanks for the reply:
bq. We are planning to address this by having smaller heartbeat intervals in 
the AM-LocalRM communication when compared to the LocalRM-RM. For instance, the 
AM-LocalRM heartbeat interval can be set to 50ms, while the LocalRM-RM interval 
to 200ms (in other words, we will only propagate to the RM only one in every 
four heartbeats).
Maybe you could also take a look at HADOOP-11552, which could possibly achieve 
better latency and reduce heartbeat frequency.

bq. This is a valid concern. The best way to minimize preemption is through the 
"top-k node list" technique described above. As the LocalRM will be placing the 
QUEUEABLE containers to the least loaded nodes, preemption will be minimized.
I think the top-k node list technique cannot completely solve the 
oversubscription issue: in a production cluster, applications come in waves, 
and it is possible that a few large applications exhaust all resources in a 
cluster within a few seconds. Maybe another approach to mitigate the issue is 
propagating queue-able containers from the NM to the RM periodically, so the 
NM can still make decisions while the RM is also aware of these queue-able 
containers.

bq. That said, as you also mention, QUEUEABLE containers are more suitable for 
short-running tasks, where the probability of a container being preempted is 
smaller.
Ideally it's better to support all non-long-running-service tasks: the LocalRM 
could allocate short-running queue-able tasks and the RM could allocate other 
queue-able tasks.

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>Assignee: Konstantinos Karanasos
> Attachments: distributed-scheduling-design-doc_v1.pdf
>
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks otherwise 
> idle resources on individual machines.
> 2. Reduce allocation latency.  Tasks where the scheduling time dominates 
> (i.e., task execution time is much less compared to the time required for 
> obtaining a container from the RM).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2885) Create AMRMProxy request interceptor for distributed scheduling decisions for queueable containers

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038768#comment-15038768
 ] 

Wangda Tan commented on YARN-2885:
--

Hi [~asuresh],
Thanks for the reply,

bq. What we were aiming for is to not send any Queueable resource reqs to the 
RM...
After thinking about it, the RM could directly support queue-able container 
allocation. Since the queue-able/guaranteed executionType is part of the 
user-facing API, the scheduler can decide whether to allocate a queue-able 
container; the LocalRM is one way to allocate queue-able containers. But 
please make sure there's no assumption (hardcoded logic) that queue-able 
containers can only be allocated by the LocalRM.
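
Purely as an illustration of the execution type being a user-facing request 
attribute; the {{ExecutionType}} enum and setter below are assumed from this 
proposal, not a shipped API:
{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

// The execution type travels with the request itself, so either the
// central RM or the LocalRM may satisfy it.
ResourceRequest req = ResourceRequest.newInstance(
    Priority.newInstance(1), ResourceRequest.ANY,
    Resource.newInstance(1024, 1), 1);
req.setExecutionType(ExecutionType.QUEUEABLE); // assumed API, per proposal
{code}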

bq. I totally agree that the AM should not be bothered with this.. But if you 
notice, It is actually not set by the AM, it set by the 
DistSchedulerReqeustInterceptor when it proxies the AM calls...
Since you planned to have a LocalRM coordinator, I would prefer to add a 
separate Distributed Scheduler Coordinator service and protocols.

The other comments make sense to me.


> Create AMRMProxy request interceptor for distributed scheduling decisions for 
> queueable containers
> --
>
> Key: YARN-2885
> URL: https://issues.apache.org/jira/browse/YARN-2885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
> Attachments: YARN-2885-yarn-2877.001.patch
>
>
> We propose to add a Local ResourceManager (LocalRM) to the NM in order to 
> support distributed scheduling decisions. 
> Architecturally we leverage the RMProxy, introduced in YARN-2884. 
> The LocalRM makes distributed decisions for queueable container requests. 
> Guaranteed-start requests are still handled by the central RM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038763#comment-15038763
 ] 

Hadoop QA commented on YARN-4405:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
24s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 22s 
{color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 48s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 1s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 32s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 12s 
{color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 3s 
{color} | {color:red} Patch generated 7 new checkstyle issues in root (total 
was 264, now 267). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 47s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 13 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 22s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 45s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 2s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 1s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 49s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_85. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_85. 
{color} |
| {color:green}+1{color} | {color:green} unit 

[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038703#comment-15038703
 ] 

Wangda Tan commented on YARN-4304:
--

Hi [~sunilg],
Took a look at the REST API implementation in the latest patch; some comments:

By design, PartitionResourcesInfo/ResourcesInfo should be usable by 
user/queue/app, so we need to make the fields and usage generic to these 
components.
- amResourceLimit is meaningful to all components. App doesn't use that field 
for now, but we can keep it and set it to infinite.
- userAMResourceLimit is not meaningful to queue/app, and it overlaps with 
user.resourcesInfo.amResourceLimit; I suggest removing it, and we can use the 
amResourceLimit of the first user of the queue to show on the UI. Another 
reason is that in the future different users could have different 
amResourceLimits.

Also, ResourcesInfo is the RESTful mapping of ResourceUsage, so the necessary 
changes need to be made to ResourceUsage as well (maybe rename ResourceUsage 
to ResourcesInformation?). The renaming could be done in a separate JIRA, but 
I suggest changing the ResourceUsage implementation in this JIRA.

If you agree with the above, ResourcesInfo's constructor shouldn't depend on 
LeafQueue and considerAMUsage; it should simply copy fields from ResourceUsage.
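
A hypothetical sketch of that constructor shape; the class and field names are 
illustrative, not the patch's, and it assumes the existing {{ResourceUsage}} 
getters and the {{ResourceInfo}} DAO:
{code}
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceUsage;
import org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.ResourceInfo;

@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class ResourcesInfoSketch {
  protected ResourceInfo used;
  protected ResourceInfo amResourceLimit;

  public ResourcesInfoSketch() {} // required by JAXB

  // Copy fields straight from ResourceUsage; no LeafQueue, no
  // considerAMUsage flag.
  public ResourcesInfoSketch(ResourceUsage usage) {
    this.used = new ResourceInfo(usage.getUsed());
    this.amResourceLimit = new ResourceInfo(usage.getAMLimit());
  }
}
{code}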

> AM max resource configuration per partition to be displayed/updated correctly 
> in UI and in various partition related metrics
> 
>
> Key: YARN-4304
> URL: https://issues.apache.org/jira/browse/YARN-4304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp
>Affects Versions: 2.7.1
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, 
> 0003-YARN-4304.patch, 0004-YARN-4304.patch, REST_and_UI.zip
>
>
> As we are supporting per-partition level max AM resource percentage 
> configuration, UI and various metrics also need to display correct 
> configurations related to same. 
> For eg: Current UI still shows am-resource percentage per queue level. This 
> is to be updated correctly when label config is used.
> - Display max-am-percentage per-partition in Scheduler UI (label also) and in 
> ClusterMetrics page
> - Update queue/partition related metrics w.r.t per-partition 
> am-resource-percentage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1974) add args for DistributedShell to specify a set of nodes on which the tasks run

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038803#comment-15038803
 ] 

Hadoop QA commented on YARN-1974:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 11s 
{color} | {color:red} Patch generated 2 new checkstyle issues in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 (total was 148, now 148). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 21s {color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed 
with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 8s 
{color} | {color:green} hadoop-yarn-applications-distributedshell in the patch 
passed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 24s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 54s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.applications.distributedshell.TestDistributedShell |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12775669/YARN-1974.1.patch |
| JIRA Issue | YARN-1974 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 613c1f2f9c94 3.13.0-36-lowlatency #63-Ubuntu SMP 

[jira] [Commented] (YARN-4416) Deadlock due to synchronised get Methods in AbstractCSQueue

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038560#comment-15038560
 ] 

Wangda Tan commented on YARN-4416:
--

Thanks for reporting this issue, [~Naganarasimha].

Looked at the code; none of the methods used by 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue#toString
 need to be synchronized:
- queueCapacity and resource-usage have their own read/write locks.
- numContainers is volatile.
- A read/write lock could be added to OrderingPolicy; read operations don't 
need to be synchronized, so getNumApplications doesn't either.
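
A minimal sketch of the read/write-lock alternative; the structure is 
illustrative, not the actual AbstractCSQueue fields. Readers never block each 
other, so toString() during debugging cannot deadlock against another thread 
holding the queue's monitor:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CSQueueLockingSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private float absoluteUsedCapacity;

  // Getter takes only the read lock: concurrent reads proceed in parallel.
  public float getAbsoluteUsedCapacity() {
    lock.readLock().lock();
    try {
      return absoluteUsedCapacity;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Mutation takes the write lock, excluding readers only while updating.
  public void setAbsoluteUsedCapacity(float value) {
    lock.writeLock().lock();
    try {
      absoluteUsedCapacity = value;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}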

> Deadlock due to synchronised get Methods in AbstractCSQueue
> ---
>
> Key: YARN-4416
> URL: https://issues.apache.org/jira/browse/YARN-4416
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.1
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: deadlock.log
>
>
> While debugging in Eclipse, I came across a scenario where I had to find out 
> the name of a queue, but every time I tried to inspect the queue it hung. On 
> seeing the stack I realized there was a deadlock, but on analysis found it 
> was triggered only by *queue.toString()* during debugging, as 
> {{AbstractCSQueue.getAbsoluteUsedCapacity}} was synchronized.
> Still, I feel {{AbstractCSQueue}}'s getter methods need not be synchronized 
> and would be better handled through read and write locks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4238) createdTime and modifiedTime is not reported while publishing entities to ATSv2

2015-12-03 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4238:
---
Attachment: YARN-4238-feature-YARN-2928.03.patch

> createdTime and modifiedTime is not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-feature-YARN-2928.02.patch, YARN-4238-feature-YARN-2928.03.patch
>
>
> While publishing entities from the RM and elsewhere, we are not sending 
> created time. For instance, created time in the TimelineServiceV2Publisher 
> class, and in other similar classes for other entities, is not updated. We 
> can easily set created time when sending the application-created event, and 
> likewise modification time on every write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4340) Add "list" API to reservation system

2015-12-03 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-4340:
--
Attachment: YARN-4340.v5.patch

> Add "list" API to reservation system
> 
>
> Key: YARN-4340
> URL: https://issues.apache.org/jira/browse/YARN-4340
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
> Attachments: YARN-4340.v1.patch, YARN-4340.v2.patch, 
> YARN-4340.v3.patch, YARN-4340.v4.patch, YARN-4340.v5.patch
>
>
> This JIRA tracks changes to the APIs of the reservation system, and enables 
> querying the reservation system for existing reservations by "time-range, 
> reservation-id, username".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4340) Add "list" API to reservation system

2015-12-03 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-4340:
--
Attachment: (was: YARN-4340.v8.patch)

> Add "list" API to reservation system
> 
>
> Key: YARN-4340
> URL: https://issues.apache.org/jira/browse/YARN-4340
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
> Attachments: YARN-4340.v1.patch, YARN-4340.v2.patch, 
> YARN-4340.v3.patch, YARN-4340.v4.patch
>
>
> This JIRA tracks changes to the APIs of the reservation system, and enables 
> querying the reservation system for existing reservations by "time-range, 
> reservation-id, username".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4340) Add "list" API to reservation system

2015-12-03 Thread Sean Po (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Po updated YARN-4340:
--
Attachment: YARN-4340.v8.patch

> Add "list" API to reservation system
> 
>
> Key: YARN-4340
> URL: https://issues.apache.org/jira/browse/YARN-4340
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Sean Po
> Attachments: YARN-4340.v1.patch, YARN-4340.v2.patch, 
> YARN-4340.v3.patch, YARN-4340.v4.patch
>
>
> This JIRA tracks changes to the APIs of the reservation system, and enables 
> querying the reservation system for existing reservations by "time-range, 
> reservation-id, username".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4417) Make RM and Timeline-server REST APIs more consistent

2015-12-03 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-4417:


 Summary: Make RM and Timeline-server REST APIs more consistent
 Key: YARN-4417
 URL: https://issues.apache.org/jira/browse/YARN-4417
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wangda Tan
Assignee: Wangda Tan


There are some differences between the RM's and the timeline server's REST APIs; 
for example, the RM REST API doesn't support getting application-attempt info by 
app-id and attempt-id, but the timeline server does. We could make them more 
consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4403) (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating period

2015-12-03 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039502#comment-15039502
 ] 

Jian He commented on YARN-4403:
---

The patch looks good to me. 
One minor suggestion: do you think we can change the base 
AbstractLivelinessMonitor to have a default constructor with MonotonicClock, 
so that callers can use this constructor instead?
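
A hedged sketch of the suggested constructor chaining; the class shape below is 
simplified and is not the actual AbstractLivelinessMonitor source:

{code}
interface Clock { long getTime(); }

class MonotonicClock implements Clock {
  // Monotonic milliseconds, immune to settimeofday/wall-clock changes.
  public long getTime() { return System.nanoTime() / 1_000_000L; }
}

abstract class LivelinessMonitorSketch {
  private final Clock clock;

  // Existing style of constructor: the caller supplies a clock.
  protected LivelinessMonitorSketch(String name, Clock clock) {
    this.clock = clock;
  }

  // Proposed default constructor: monotonic time by default, so callers get
  // safe expiry calculations without passing a clock explicitly.
  protected LivelinessMonitorSketch(String name) {
    this(name, new MonotonicClock());
  }

  protected long now() { return clock.getTime(); }
}
{code}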

> (AM/NM/Container)LivelinessMonitor should use monotonic time when calculating 
> period
> 
>
> Key: YARN-4403
> URL: https://issues.apache.org/jira/browse/YARN-4403
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: YARN-4403.patch
>
>
> Currently, (AM/NM/Container)LivelinessMonitor uses the current system time to 
> calculate the expiry period, which could be broken by settimeofday. We 
> should use Time.monotonicNow() instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3542) Re-factor support for CPU as a resource using the new ResourceHandler mechanism

2015-12-03 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039519#comment-15039519
 ] 

Sidharta Seethana commented on YARN-3542:
-

hi [~vvasudev],

Thank you for the patch. I took a look, and it is a bit unclear how 
the new configs/resource handler are meant to interact with the existing 
{{CgroupsLCEResourcesHandler}}. IMO, one of the goals here is to deprecate 
{{CgroupsLCEResourcesHandler}} and use the new resource handler mechanism so 
that all resource handling/isolation is managed in a consistent manner. 

Could you please provide a description of the changes introduced in this patch 
and what the interaction would be with the existing CPU cgroups implementation 
(especially from a configuration perspective) ?

thanks,
-Sidharta

> Re-factor support for CPU as a resource using the new ResourceHandler 
> mechanism
> ---
>
> Key: YARN-3542
> URL: https://issues.apache.org/jira/browse/YARN-3542
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>Priority: Critical
> Attachments: YARN-3542.001.patch, YARN-3542.002.patch
>
>
> In YARN-3443 , a new ResourceHandler mechanism was added which enabled easier 
> addition of new resource types in the nodemanager (this was used for network 
> as a resource - See YARN-2140 ). We should refactor the existing CPU 
> implementation ( LinuxContainerExecutor/CgroupsLCEResourcesHandler ) using 
> the new ResourceHandler mechanism. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4417) Make RM and Timeline-server REST APIs more consistent

2015-12-03 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4417:
-
Attachment: YARN-4417.1.patch

Attached initial patch for review.

> Make RM and Timeline-server REST APIs more consistent
> -
>
> Key: YARN-4417
> URL: https://issues.apache.org/jira/browse/YARN-4417
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4417.1.patch
>
>
> There are some differences between the RM's and the timeline server's REST 
> APIs; for example, the RM REST API doesn't support getting application-attempt 
> info by app-id and attempt-id, but the timeline server does. We could make 
> them more consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3669) Attempt-failures validity interval should have a global admin configurable lower limit

2015-12-03 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-3669:

Attachment: YARN-3669.1.patch

> Attempt-failures validity interval should have a global admin configurable 
> lower limit
> ---
>
> Key: YARN-3669
> URL: https://issues.apache.org/jira/browse/YARN-3669
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>  Labels: newbie
> Attachments: YARN-3669.1.patch
>
>
> Found this while reviewing YARN-3480.
> bq. When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to 
> a small value, the number of retried attempts might be very large. So we need 
> to delete some attempts stored in the RMStateStore.
> I think we need to have a lower limit on the failure-validity interval to 
> avoid situations like this.
> Having this will avoid pardoning too many failures in too short a duration.
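
As a small illustration of the proposed lower limit, a clamp of this shape could 
be applied wherever the interval is read from configuration (the helper and its 
semantics are hypothetical):

{code}
final class ValidityIntervalLimiter {
  // A non-positive requested value means "validity interval disabled";
  // otherwise never pardon failures faster than the admin-configured floor.
  static long effectiveIntervalMs(long requestedMs, long adminLowerLimitMs) {
    return requestedMs <= 0
        ? requestedMs
        : Math.max(requestedMs, adminLowerLimitMs);
  }
}
{code}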



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3542) Re-factor support for CPU as a resource using the new ResourceHandler mechanism

2015-12-03 Thread Sidharta Seethana (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sidharta Seethana reassigned YARN-3542:
---

Assignee: Sidharta Seethana  (was: Varun Vasudev)

> Re-factor support for CPU as a resource using the new ResourceHandler 
> mechanism
> ---
>
> Key: YARN-3542
> URL: https://issues.apache.org/jira/browse/YARN-3542
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Sidharta Seethana
>Assignee: Sidharta Seethana
>Priority: Critical
> Attachments: YARN-3542.001.patch, YARN-3542.002.patch
>
>
> In YARN-3443 , a new ResourceHandler mechanism was added which enabled easier 
> addition of new resource types in the nodemanager (this was used for network 
> as a resource - See YARN-2140 ). We should refactor the existing CPU 
> implementation ( LinuxContainerExecutor/CgroupsLCEResourcesHandler ) using 
> the new ResourceHandler mechanism. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3669) Attempt-failures validity interval should have a global admin configurable lower limit

2015-12-03 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-3669:
--
Assignee: Xuan Gong  (was: Vinod Kumar Vavilapalli)

> Attempt-failures validity interval should have a global admin configurable 
> lower limit
> ---
>
> Key: YARN-3669
> URL: https://issues.apache.org/jira/browse/YARN-3669
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Xuan Gong
>  Labels: newbie
> Attachments: YARN-3669.1.patch
>
>
> Found this while reviewing YARN-3480.
> bq. When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to 
> a small value, the number of retried attempts might be very large. So we need 
> to delete some attempts stored in the RMStateStore.
> I think we need to have a lower limit on the failure-validity interval to 
> avoid situations like this.
> Having this will avoid pardoning too many failures in too short a duration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4309) Add debug information to application logs when a container fails

2015-12-03 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038231#comment-15038231
 ] 

Sidharta Seethana commented on YARN-4309:
-

Could you please edit the comment mentioned above to make it a bit clearer? 
Maybe include a note, both in yarn-default.xml's description of the config 
flag and in the code, that symlinks could be followed outside the current 
directory?

Also, there seem to be a few spurious empty lines introduced in 
DockerContainerExecutor. Apart from this, the latest patch looks good to me.



> Add debug information to application logs when a container fails
> 
>
> Key: YARN-4309
> URL: https://issues.apache.org/jira/browse/YARN-4309
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-4309.001.patch, YARN-4309.002.patch, 
> YARN-4309.003.patch, YARN-4309.004.patch, YARN-4309.005.patch
>
>
> Sometimes when a container fails, it can be pretty hard to figure out why it 
> failed.
> My proposal is that if a container fails, we collect information about the 
> container local dir and dump it into the container log dir. Ideally, I'd like 
> to tar up the directory entirely, but I'm not sure of the security and space 
> implications of such an approach. At the very least, we can list all the files 
> in the container local dir, and dump the contents of launch_container.sh (into 
> the container log dir).
> When log aggregation occurs, all this information will automatically get 
> collected and make debugging such failures much easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4413) Nodes in the includes list should not be listed as decommissioned in the UI

2015-12-03 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-4413:
---
Attachment: YARN-4413.001.patch

Here's the approach I propose.

I should note that the graceful decommission code allows an 
illegal state transition:

{code}
if (entry.getValue().getState() == NodeState.DECOMMISSIONING
|| entry.getValue().getState() == NodeState.DECOMMISSIONED) {
  this.rmContext.getDispatcher().getEventHandler()
  .handle(new RMNodeEvent(nodeId, RMNodeEventType.RECOMMISSION));
}
{code}

DECOMMISSIONED -> RECOMMISSION is not allowed.  This issue is coincidentally 
fixed by this patch.
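
A minimal sketch of the corrected check, reusing the names from the snippet 
above, so that only nodes still in DECOMMISSIONING are recommissioned:

{code}
// DECOMMISSIONED nodes are skipped, matching the allowed RMNode transitions.
if (entry.getValue().getState() == NodeState.DECOMMISSIONING) {
  this.rmContext.getDispatcher().getEventHandler()
      .handle(new RMNodeEvent(nodeId, RMNodeEventType.RECOMMISSION));
}
{code}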

> Nodes in the includes list should not be listed as decommissioned in the UI
> ---
>
> Key: YARN-4413
> URL: https://issues.apache.org/jira/browse/YARN-4413
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Attachments: YARN-4413.001.patch
>
>
> If I decommission a node and then move it from the excludes list back to the 
> includes list, but I don't restart the node, the node will still be listed by 
> the web UI as decommissioned until either the NM or RM is restarted.  Ideally, 
> removing the node from the excludes list and putting it back into the 
> includes list should cause the node to be reported as shutdown instead.
> CC [~kshukla]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4415) Scheduler Web Ui shows max capacity for the queue is 100% but when we submit, the application doesn't get assigned

2015-12-03 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4415:

Attachment: App info with diagnostics info.png

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit, 
> the application doesn't get assigned
> 
>
> Key: YARN-4415
> URL: https://issues.apache.org/jira/browse/YARN-4415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: App info with diagnostics info.png
>
>
> Steps to reproduce the issue :
> Scenario 1:
> # Configure a queue(default) with accessible node labels as *
> # create a exclusive partition *xxx* and map a NM to it
> # ensure no capacities are configured for default for label xxx
> # start an RM app with queue as default and label as xxx
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue
> Scenario 2:
> # create a nonexclusive partition *sharedPartition* and map a NM to it
> # ensure no capacities are configured for default queue
> # start an RM app with queue as *default* and label as *sharedPartition*
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue for *sharedPartition*
> For both scenarios the cause is the same: default max capacity and absolute 
> max capacity are set to zero percent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4413) Nodes in the includes list should not be listed as decommissioned in the UI

2015-12-03 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038265#comment-15038265
 ] 

Kuhu Shukla commented on YARN-4413:
---

YARN-4386 tracks the RECOMMISSION check. The current patch does not have a test 
since it's an invalid check.

> Nodes in the includes list should not be listed as decommissioned in the UI
> ---
>
> Key: YARN-4413
> URL: https://issues.apache.org/jira/browse/YARN-4413
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Attachments: YARN-4413.001.patch
>
>
> If I decommission a node and then move it from the excludes list back to the 
> includes list, but I don't restart the node, the node will still be listed by 
> the web UI as decommissioned until either the NM or RM is restarted.  Ideally, 
> removing the node from the excludes list and putting it back into the 
> includes list should cause the node to be reported as shutdown instead.
> CC [~kshukla]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4413) Nodes in the includes list should not be listed as decommissioned in the UI

2015-12-03 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038269#comment-15038269
 ] 

Kuhu Shukla commented on YARN-4413:
---

The current patch for YARN-4386*

> Nodes in the includes list should not be listed as decommissioned in the UI
> ---
>
> Key: YARN-4413
> URL: https://issues.apache.org/jira/browse/YARN-4413
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Attachments: YARN-4413.001.patch
>
>
> If I decommission a node and then move it from the excludes list back to the 
> includes list, but I don't restart the node, the node will still be listed by 
> the web UI as decommissioned until either the NM or RM is restarted.  Ideally, 
> removing the node from the excludes list and putting it back into the 
> includes list should cause the node to be reported as shutdown instead.
> CC [~kshukla]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4292) ResourceUtilization should be a part of NodeInfo REST API

2015-12-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039527#comment-15039527
 ] 

Sunil G commented on YARN-4292:
---

Thank you very much [~leftnoteasy]  for the review and commit. 

> ResourceUtilization should be a part of NodeInfo REST API
> -
>
> Key: YARN-4292
> URL: https://issues.apache.org/jira/browse/YARN-4292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-4292.patch, 0002-YARN-4292.patch, 
> 0003-YARN-4292.patch, 0004-YARN-4292.patch, 0005-YARN-4292.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039529#comment-15039529
 ] 

Hadoop QA commented on YARN-4311:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
8s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 58s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 5s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 2s 
{color} | {color:red} Patch generated 1 new checkstyle issues in root (total 
was 396, now 396). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 59s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 41s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 37s 
{color} | {color:green} hadoop-sls in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_85. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 14s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_85. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 33s 

[jira] [Updated] (YARN-3368) Improve YARN web UI

2015-12-03 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3368:
-
Attachment: (POC, Aug-2015)) yarn-ui-screenshots.zip

Archived and reattached old screenshots

> Improve YARN web UI
> ---
>
> Key: YARN-3368
> URL: https://issues.apache.org/jira/browse/YARN-3368
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
> Attachments: (Dec 3 2015) yarn-ui-screenshots.zip, (POC, Aug-2015)) 
> yarn-ui-screenshots.zip
>
>
> The goal is to improve YARN UI for better usability.
> We may take advantage of some existing front-end frameworks to build a 
> fancier, easier-to-use UI. 
> The old UI continues to exist until we feel it's ready to flip to the new UI.
> This serves as an umbrella JIRA to track the tasks; we can do this in a 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3368) Improve YARN web UI

2015-12-03 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3368:
-
Attachment: (was: Applications-table-Screenshot.png)

> Improve YARN web UI
> ---
>
> Key: YARN-3368
> URL: https://issues.apache.org/jira/browse/YARN-3368
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
> Attachments: (Dec 3 2015) yarn-ui-screenshots.zip
>
>
> The goal is to improve YARN UI for better usability.
> We may take advantage of some existing front-end frameworks to build a 
> fancier, easier-to-use UI. 
> The old UI continues to exist until we feel it's ready to flip to the new UI.
> This serves as an umbrella JIRA to track the tasks; we can do this in a 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3368) Improve YARN web UI

2015-12-03 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3368:
-
Attachment: (Dec 3 2015) yarn-ui-screenshots.zip

> Improve YARN web UI
> ---
>
> Key: YARN-3368
> URL: https://issues.apache.org/jira/browse/YARN-3368
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
> Attachments: (Dec 3 2015) yarn-ui-screenshots.zip
>
>
> The goal is to improve YARN UI for better usability.
> We may take advantage of some existing front-end frameworks to build a 
> fancier, easier-to-use UI. 
> The old UI continues to exist until we feel it's ready to flip to the new UI.
> This serves as an umbrella JIRA to track the tasks; we can do this in a 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3368) Improve YARN web UI

2015-12-03 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3368:
-
Attachment: (was: Queue-Hierarchy-Screenshot.png)

> Improve YARN web UI
> ---
>
> Key: YARN-3368
> URL: https://issues.apache.org/jira/browse/YARN-3368
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
> Attachments: (Dec 3 2015) yarn-ui-screenshots.zip
>
>
> The goal is to improve YARN UI for better usability.
> We may take advantage of some existing front-end frameworks to build a 
> fancier, easier-to-use UI. 
> The old UI continues to exist until we feel it's ready to flip to the new UI.
> This serves as an umbrella JIRA to track the tasks; we can do this in a 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics

2015-12-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039547#comment-15039547
 ] 

Sunil G commented on YARN-4304:
---

Thanks [~leftnoteasy] for the comments. 
Yes, it's fine and we can change those variables as per the comments. 
For {{ResourceUsage}}, since it's existing code, we need to add these new 
items and do the renaming as suggested. Both of these can be tracked together 
in another ticket and later used here. I'll create another ticket for that if 
you feel it's fine. Thank you. 

> AM max resource configuration per partition to be displayed/updated correctly 
> in UI and in various partition related metrics
> 
>
> Key: YARN-4304
> URL: https://issues.apache.org/jira/browse/YARN-4304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp
>Affects Versions: 2.7.1
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, 
> 0003-YARN-4304.patch, 0004-YARN-4304.patch, REST_and_UI.zip
>
>
> As we are supporting per-partition level max AM resource percentage 
> configuration, UI and various metrics also need to display correct 
> configurations related to same. 
> For example, the current UI still shows am-resource percentage at the queue 
> level. This is to be updated correctly when label config is used.
> - Display max-am-percentage per-partition in Scheduler UI (label also) and in 
> ClusterMetrics page
> - Update queue/partition related metrics w.r.t per-partition 
> am-resource-percentage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3480) Recovery may get very slow with lots of services with lots of app-attempts

2015-12-03 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039560#comment-15039560
 ] 

Jian He commented on YARN-3480:
---

[~hex108], I know it's been a long time; would you still like to work on this?
IMO, as a first step, we can do as in the previous 
[comment|https://issues.apache.org/jira/browse/YARN-3480?focusedCommentId=14533731=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14533731]
 and remove the apps beyond the validity interval, since those are mostly the 
apps users care about least. cc [~xgong]. 
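
As a hedged illustration of that first step, recovered attempts that finished 
outside the validity interval could be pruned roughly like this (the helper is 
hypothetical, not the actual RMStateStore API, and assumes attempt finish times 
are available in the recovered state):

{code}
import java.util.Iterator;
import java.util.List;

final class AttemptPruner {
  static void pruneExpired(List<Long> attemptFinishTimesMs,
                           long validityIntervalMs, long nowMs) {
    Iterator<Long> it = attemptFinishTimesMs.iterator();
    while (it.hasNext()) {
      long finishedMs = it.next();
      // Attempts that finished outside the validity window no longer count
      // toward max-attempts, so they can be dropped from recovered state.
      if (validityIntervalMs > 0 && nowMs - finishedMs > validityIntervalMs) {
        it.remove();
      }
    }
  }
}
{code}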

> Recovery may get very slow with lots of services with lots of app-attempts
> --
>
> Key: YARN-3480
> URL: https://issues.apache.org/jira/browse/YARN-3480
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Jun Gong
>Assignee: Jun Gong
> Attachments: YARN-3480.01.patch, YARN-3480.02.patch, 
> YARN-3480.03.patch, YARN-3480.04.patch
>
>
> When RM HA is enabled and running containers are kept across attempts, apps 
> are more likely to finish successfully with more retries (attempts), so it 
> is better to set 'yarn.resourcemanager.am.max-attempts' larger. However, 
> this makes the RMStateStore (FileSystem/HDFS/ZK) store more attempts and 
> makes the RM recovery process much slower. It might be better to cap the 
> number of attempts stored in the RMStateStore.
> BTW: When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to 
> a small value, the number of retried attempts might be very large. So we need 
> to delete some attempts stored in the RMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4309) Add debug information to application logs when a container fails

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039563#comment-15039563
 ] 

Wangda Tan commented on YARN-4309:
--

Hi [~vvasudev],

Thanks for working on this task, it's really useful to identify container 
launch issues, some questions/comments:
- Since debug information fetch script (like copy script and list files) is at 
the end of launch_container.sh, is it possible that a container is killed so 
such script cannot be executed?
- Do you think is it better to generate a separated script file to fetch debug 
information before launch user code? Which we can 
1. Guarantee it will be executed
2. It won't add debug information to normal launch_container.sh.
3. Return code of script won't affected by debug script.
- Is it possible to enable/disable this function while NM is running? 

+[~sidharta-s].
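
A rough sketch of the separate-script idea, assuming the NM can write one extra 
file into the container working directory (all paths and file names below are 
illustrative):

{code}
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;

final class DebugScriptWriter {
  static Path writeDebugScript(Path containerWorkDir, Path logDir)
      throws IOException {
    Path script = containerWorkDir.resolve("debug_info.sh");
    try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(script))) {
      out.println("#!/bin/bash");
      // Capture the directory listing and launch script for later triage.
      out.println("ls -lR " + containerWorkDir + " > "
          + logDir.resolve("directory.info"));
      out.println("cp " + containerWorkDir.resolve("launch_container.sh")
          + " " + logDir);
      out.println("exit 0"); // debug collection must never fail the container
    }
    script.toFile().setExecutable(true);
    return script;
  }
}
{code}

Running this file before the user command would keep the debug steps out of 
launch_container.sh and leave its exit code untouched.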

> Add debug information to application logs when a container fails
> 
>
> Key: YARN-4309
> URL: https://issues.apache.org/jira/browse/YARN-4309
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-4309.001.patch, YARN-4309.002.patch, 
> YARN-4309.003.patch, YARN-4309.004.patch, YARN-4309.005.patch
>
>
> Sometimes when a container fails, it can be pretty hard to figure out why it 
> failed.
> My proposal is that if a container fails, we collect information about the 
> container local dir and dump it into the container log dir. Ideally, I'd like 
> to tar up the directory entirely, but I'm not sure of the security and space 
> implications of such an approach. At the very least, we can list all the files 
> in the container local dir, and dump the contents of launch_container.sh (into 
> the container log dir).
> When log aggregation occurs, all this information will automatically get 
> collected and make debugging such failures much easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4238) createdTime and modifiedTime are not reported while publishing entities to ATSv2

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039522#comment-15039522
 ] 

Hadoop QA commented on YARN-4238:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
12s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 54s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 31s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
4s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 44s 
{color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
48s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s 
{color} | {color:green} feature-YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 26s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in feature-YARN-2928 
failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 14m 28s 
{color} | {color:red} root-jdk1.8.0_66 with JDK v1.8.0_66 generated 2 new 
issues (was 779, now 779). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 24m 22s 
{color} | {color:red} root-jdk1.7.0_85 with JDK v1.7.0_85 generated 2 new 
issues (was 772, now 772). {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 54s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 6s 
{color} | {color:red} Patch generated 2 new checkstyle issues in root (total 
was 401, now 339). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
31s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 25s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed 
with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 1s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 49s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 10m 50s {color} 
| {color:red} hadoop-mapreduce-client-app in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 0s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK 

[jira] [Commented] (YARN-4238) createdTime and modifiedTime are not reported while publishing entities to ATSv2

2015-12-03 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039534#comment-15039534
 ] 

Sangjin Lee commented on YARN-4238:
---

[~varun_saxena], it seems like it should be relatively easy to correct the 2 
checkstyle violations. Could you also take a look at the unit test failures, 
especially TestSystemMetricsPublisher, to see if they are related? Thanks!

> createdTime and modifiedTime are not reported while publishing entities to 
> ATSv2
> ---
>
> Key: YARN-4238
> URL: https://issues.apache.org/jira/browse/YARN-4238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4238-YARN-2928.01.patch, 
> YARN-4238-feature-YARN-2928.02.patch, YARN-4238-feature-YARN-2928.03.patch
>
>
> While publishing entities from the RM and elsewhere we are not sending created 
> time. For instance, created time in the TimelineServiceV2Publisher class and 
> for other entities in other similar classes is not updated. We can easily 
> update created time when sending the application-created event. Likewise for 
> modification time on every write.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3368) Improve YARN web UI

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039532#comment-15039532
 ] 

Wangda Tan commented on YARN-3368:
--

Hi guys,

Some updates of the new YARN web UI works:
- Rewrote the UI framework (still using Ember.js) so new pages/charts can be 
added more easily.
- Finished scheduler page, which contains a select-able queue hierarchy view 
and also related scheduler information.
- Finished apps table, app page, app-attempt page and also timeline view of 
app-attempts/containers.
- Finished cluster overview page, which contains several charts to show 
overview of the cluster (such as total memory, total apps, etc.)

Attached screenshots to demonstrate the latest changes.

There are still many pending tasks:
- Finalize design.
- Page of node managers' view.
- Bugs / hardcoded configurations.

In order to advance the work faster, I propose to create a new branch 
(YARN-3368), so more people can participate development/discussion. All 
UI-related code changes will be in a separated folder 
"hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui".

Thanks to [~Sreenath], [~vinodkv], [~jianhe], [~gtCarrera] for their support 
and suggestions.

Thoughts?

> Improve YARN web UI
> ---
>
> Key: YARN-3368
> URL: https://issues.apache.org/jira/browse/YARN-3368
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
> Attachments: (Dec 3 2015) yarn-ui-screenshots.zip
>
>
> The goal is to improve YARN UI for better usability.
> We may take advantage of some existing front-end frameworks to build a 
> fancier, easier-to-use UI. 
> The old UI continues to exist until we feel it's ready to flip to the new UI.
> This serves as an umbrella JIRA to track the tasks; we can do this in a 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3368) Improve YARN web UI

2015-12-03 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3368:
-
Attachment: (was: YARN-3368.poc.1.patch)

> Improve YARN web UI
> ---
>
> Key: YARN-3368
> URL: https://issues.apache.org/jira/browse/YARN-3368
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
> Attachments: (Dec 3 2015) yarn-ui-screenshots.zip
>
>
> The goal is to improve YARN UI for better usability.
> We may take advantage of some existing front-end frameworks to build a 
> fancier, easier-to-use UI. 
> The old UI continues to exist until we feel it's ready to flip to the new UI.
> This serves as an umbrella JIRA to track the tasks; we can do this in a 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039539#comment-15039539
 ] 

Wangda Tan commented on YARN-4405:
--

[~sunilg], please let me know if you have any other comments on latest patch.

Thanks,

> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> Existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation to support such non-appendable file systems as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3368) Improve YARN web UI

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039537#comment-15039537
 ] 

Wangda Tan commented on YARN-3368:
--

This task depends on some fixes of REST APIs, which is tracked by YARN-4417.

> Improve YARN web UI
> ---
>
> Key: YARN-3368
> URL: https://issues.apache.org/jira/browse/YARN-3368
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jian He
> Attachments: (Dec 3 2015) yarn-ui-screenshots.zip, (POC, Aug-2015)) 
> yarn-ui-screenshots.zip
>
>
> The goal is to improve YARN UI for better usability.
> We may take advantage of some existing front-end frameworks to build a 
> fancier, easier-to-use UI. 
> The old UI continues to exist until we feel it's ready to flip to the new UI.
> This serves as an umbrella JIRA to track the tasks; we can do this in a 
> branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039550#comment-15039550
 ] 

Sunil G commented on YARN-4405:
---

Thanks [~leftnoteasy]. Latest patch looks good.
 +1.

> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> Existing node label file system store implementation uses append to write 
> edit logs. However, some file systems don't support append, so we need to add 
> an implementation to support such non-appendable file systems as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics

2015-12-03 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039551#comment-15039551
 ] 

Wangda Tan commented on YARN-4304:
--

[~sunilg], 
I'm fine with changing ResourceUsage in a separated JIRA (logic change only, 
not renaming), but I think it's better to finish ResourceUsage change before 
this patch. Which we can have a more clear view of this patch once 
ResourceUsage changes are completed.

> AM max resource configuration per partition to be displayed/updated correctly 
> in UI and in various partition related metrics
> 
>
> Key: YARN-4304
> URL: https://issues.apache.org/jira/browse/YARN-4304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: webapp
>Affects Versions: 2.7.1
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, 
> 0003-YARN-4304.patch, 0004-YARN-4304.patch, REST_and_UI.zip
>
>
> As we are supporting per-partition level max AM resource percentage 
> configuration, UI and various metrics also need to display correct 
> configurations related to same. 
> For example, the current UI still shows am-resource percentage at the queue 
> level. This is to be updated correctly when label config is used.
> - Display max-am-percentage per-partition in Scheduler UI (label also) and in 
> ClusterMetrics page
> - Update queue/partition related metrics w.r.t per-partition 
> am-resource-percentage



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4418) AM Resource Limit per partition can be updated to ResourceUsage as well

2015-12-03 Thread Sunil G (JIRA)
Sunil G created YARN-4418:
-

 Summary: AM Resource Limit per partition can be updated to 
ResourceUsage as well
 Key: YARN-4418
 URL: https://issues.apache.org/jira/browse/YARN-4418
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.1
Reporter: Sunil G
Assignee: Sunil G


AMResourceLimit is now extended to all partitions after YARN-3216. It's also 
better to track this resource limit in the existing {{ResourceUsage}} so that 
the REST framework can easily expose this information. 
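
A simplified, standalone sketch of what tracking a per-partition AM resource 
limit could look like (the real {{ResourceUsage}} keys an internal enum by node 
label; the names here are illustrative only):

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class PartitionAmLimits {
  private final Map<String, Long> amLimitMbByPartition =
      new ConcurrentHashMap<>();

  void setAMLimit(String partition, long limitMb) {
    amLimitMbByPartition.put(partition, limitMb);
  }

  long getAMLimit(String partition) {
    // Default to 0 when the partition has no computed AM limit yet.
    return amLimitMbByPartition.getOrDefault(partition, 0L);
  }
}
{code}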



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4405) Support node label store in non-appendable file system

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039594#comment-15039594
 ] 

Hudson commented on YARN-4405:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8917 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8917/])
YARN-4405. Support node label store in non-appendable file system. (jianhe: rev 
755dda8dd8bb23864abc752bad506f223fcac010)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/DummyCommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfigurationFields.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/NullRMNodeLabelsManager.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfigurationFieldsBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/FileSystemNodeLabelsStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabelsStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml


> Support node label store in non-appendable file system
> --
>
> Key: YARN-4405
> URL: https://issues.apache.org/jira/browse/YARN-4405
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4405.1.patch, YARN-4405.2.patch, YARN-4405.3.patch, 
> YARN-4405.4.patch
>
>
> Existing node label file system store implementation uses append to write 
> edit logs. However, some file system doesn't support append, we need add an 
> implementation to support such non-appendable file systems as well. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4002) make ResourceTrackerService.nodeHeartbeat more concurrent

2015-12-03 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039597#comment-15039597
 ] 

Hong Zhiguo commented on YARN-4002:
---

I'm working on it. I've proposed 2 different solutions and am waiting for 
specific comments.

> make ResourceTrackerService.nodeHeartbeat more concurrent
> -
>
> Key: YARN-4002
> URL: https://issues.apache.org/jira/browse/YARN-4002
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Hong Zhiguo
>Assignee: Hong Zhiguo
>Priority: Critical
> Attachments: YARN-4002-v0.patch
>
>
> We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By 
> design the method ResourceTrackerService.nodeHeartbeat should be concurrent 
> enough to scale for large clusters.
> But we have a "BIG" lock in NodesListManager.isValidNode which I think is 
> unnecessary.
> First, the fields "includes" and "excludes" of HostsFileReader are only 
> updated on "refresh nodes".  All RPC threads handling node heartbeats are 
> only readers, so an RWLock could be used to allow concurrent access by RPC 
> threads.
> Second, since the fields "includes" and "excludes" of HostsFileReader are 
> always updated by reference assignment, which is atomic in Java, the 
> reader-side lock could just be skipped.
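
A hedged sketch of the second approach: publish immutable include/exclude 
snapshots through a volatile reference so heartbeat threads read without taking 
any lock (names are illustrative, not the HostsFileReader API):

{code}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class HostListsSketch {
  private static final class Snapshot {
    final Set<String> includes;
    final Set<String> excludes;
    Snapshot(Set<String> inc, Set<String> exc) {
      includes = inc;
      excludes = exc;
    }
  }

  private volatile Snapshot current =
      new Snapshot(Collections.<String>emptySet(),
                   Collections.<String>emptySet());

  // Called only on "refresh nodes": a single atomic reference assignment.
  void refresh(Set<String> includes, Set<String> excludes) {
    current = new Snapshot(
        Collections.unmodifiableSet(new HashSet<>(includes)),
        Collections.unmodifiableSet(new HashSet<>(excludes)));
  }

  // Called concurrently by RPC heartbeat handlers: lock-free read.
  boolean isValidNode(String host) {
    Snapshot s = current; // one volatile read yields a consistent pair
    return (s.includes.isEmpty() || s.includes.contains(host))
        && !s.excludes.contains(host);
  }
}
{code}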



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3669) Attempt-failures validatiy interval should have a global admin configurable lower limit

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039609#comment-15039609
 ] 

Hadoop QA commented on YARN-3669:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
| Vote | Subsystem | Runtime | Comment |
| +1 | … | … 38s | trunk passed |
| +1 | compile | 1m 51s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 2m 5s | trunk passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 27s | trunk passed |
| +1 | mvnsite | 1m 36s | trunk passed |
| +1 | mvneclipse | 0m 39s | trunk passed |
| +1 | findbugs | 3m 45s | trunk passed |
| +1 | javadoc | 1m 29s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 43s | trunk passed with JDK v1.7.0_85 |
| +1 | mvninstall | 1m 29s | the patch passed |
| +1 | compile | 1m 49s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 1m 49s | the patch passed |
| +1 | compile | 2m 5s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 2m 5s | the patch passed |
| -1 | checkstyle | 0m 41s | Patch generated 2 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 326, now 327). |
| +1 | mvnsite | 1m 45s | the patch passed |
| +1 | mvneclipse | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 0s | The patch has no ill-formed XML file. |
| +1 | findbugs | 4m 10s | the patch passed |
| +1 | javadoc | 1m 26s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 38s | the patch passed with JDK v1.7.0_85 |
| +1 | unit | 0m 22s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 1m 51s | hadoop-yarn-common in the patch passed with JDK v1.8.0_66. |
| -1 | unit | 59m 36s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. |
| +1 | unit | 0m 24s | hadoop-yarn-api in the patch passed with JDK v1.7.0_85. |
| +1 | unit | 2m 7s | hadoop-yarn-common in the patch passed with JDK v1.7.0_85. |
| -1 | unit | 60m 40s | … |

[jira] [Commented] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id > 9999)

2015-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15039614#comment-15039614
 ] 

Hudson commented on YARN-3840:
--

ABORTED: Integrated in Hadoop-Hdfs-trunk-Java8 #662 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/662/])
YARN-3840. Resource Manager web ui issue when sorting application by id 
(jianhe: rev 9f77ccad735f4843ce2c38355de9f434838d4507)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-sorting/natural.js
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TaskPage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/WebPageUtils.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AppPage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/AllApplicationsPage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/TestAHSWebApp.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AppAttemptPage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/AllContainersPage.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/TasksPage.java
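
The interesting piece of this change set is dt-sorting/natural.js: the id columns switch from plain string ordering to a natural (digit-aware) comparator, with JQueryUI.java and WebPageUtils.java doing the column wiring. Below is a minimal sketch of such a comparator and its registration. It assumes the legacy DataTables 1.x oSort extension point, and the chunking logic is illustrative, not the literal contents of natural.js:

{code:typescript}
// Sketch only: assumes the legacy DataTables 1.x oSort extension point;
// the splitting/comparison details are illustrative, not the shipped plugin.

// Split a string into alternating text/number chunks so that digit runs
// compare by numeric value rather than character-by-character.
function naturalChunks(s: string): (string | number)[] {
  return (s.match(/(\d+|\D+)/g) || []).map(c => (/^\d+$/.test(c) ? Number(c) : c));
}

function naturalCompare(a: string, b: string): number {
  const xs = naturalChunks(a), ys = naturalChunks(b);
  for (let i = 0; i < Math.max(xs.length, ys.length); i++) {
    const x = xs[i], y = ys[i];
    if (x === undefined) return -1; // shorter string (a prefix) sorts first
    if (y === undefined) return 1;
    if (x === y) continue;
    // Two digit runs compare numerically; anything else falls back to string order.
    if (typeof x === "number" && typeof y === "number") return x - y;
    return String(x) < String(y) ? -1 : 1;
  }
  return 0;
}

// Registration against the legacy DataTables extension point, so a column
// declared with sort type "natural" uses the comparator above.
declare const jQuery: any; // provided by the page at runtime
jQuery.fn.dataTableExt.oSort["natural-asc"] = (a: string, b: string) => naturalCompare(a, b);
jQuery.fn.dataTableExt.oSort["natural-desc"] = (a: string, b: string) => -naturalCompare(a, b);
{code}

The design point is that digit runs are compared as numbers, so a 5-digit id suffix no longer sorts ahead of a 4-digit one.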


> Resource Manager web ui issue when sorting application by id (with 
> application having id > 9999)
> --
>
> Key: YARN-3840
> URL: https://issues.apache.org/jira/browse/YARN-3840
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: LINTE
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.7.3
>
> Attachments: RMApps.png, RMApps_Sorted.png, YARN-3840-1.patch, 
> YARN-3840-2.patch, YARN-3840-3.patch, YARN-3840-4.patch, YARN-3840-5.patch, 
> YARN-3840-6.patch, YARN-3840.reopened.001.patch, yarn-3840-7.patch
>
>
> On the web UI, the global main view page 
> http://resourcemanager:8088/cluster/apps doesn't display applications with 
> an id over 9999.
> It works from the command line (# yarn application -list).
> Regards,
> Alexandre
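
For readers skimming the thread: the failure mode is plain lexicographic comparison. Once ids pass four digits, a character-by-character string sort places the 5-digit id before the 4-digit one ('1' < '9' at the first differing position), which the natural-order comparator sketched above avoids. A self-contained illustration with hypothetical ids:

{code:typescript}
// Hypothetical application ids; the cluster timestamp is made up.
const ids = ["application_1446103355322_9999", "application_1446103355322_10000"];

// Default Array.prototype.sort compares strings character-by-character,
// so '1' < '9' at the first differing position puts 10000 ahead of 9999.
ids.sort();
console.log(ids);
// -> ["application_1446103355322_10000", "application_1446103355322_9999"]  (wrong order)
{code}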



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

