[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365286#comment-14365286
 ] 

Hudson commented on YARN-3197:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-3197. Confusing log generated by CapacityScheduler. Contributed by 
(devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt


 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
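 For illustration only (not necessarily the committed patch), one way to make this
 less confusing is to say which container was skipped and to demote the repeated
 message, e.g. to DEBUG (assuming the method's rmContainer/containerStatus
 parameters):
 {code}
 // Hypothetical sketch inside CapacityScheduler#completedContainer:
 // identify the unknown container and avoid flooding the INFO log.
 if (rmContainer == null) {
   if (LOG.isDebugEnabled()) {
     LOG.debug("Skipping completedContainer: container "
         + containerStatus.getContainerId()
         + " is not known to the scheduler");
   }
   return;
 }
 {code}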



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365291#comment-14365291
 ] 

Hudson commented on YARN-2854:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-2854. Addendum patch to fix the minor issue in the timeline service 
documentation. (zjshen: rev ed4e72a20b75ffbd22deb0607dd8b94f6e437a84)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md


 The document about timeline service and generic service needs to be updated
 ---

 Key: YARN-2854
 URL: https://issues.apache.org/jira/browse/YARN-2854
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
Priority: Critical
 Fix For: 2.7.0

 Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, 
 YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, 
 YARN-2854.20150311-1.patch, YARN-2854.20150313-1.patch, 
 YARN-2854.20150314-1.patch, YARN-2854.20150314-1_branch2.patch, 
 YARN-2854.20150315-1_trunk_addendum.patch, timeline_structure.jpg






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3349) Treat all exceptions as failure in TestFSRMStateStore#testFSRMStateStoreClientRetry

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365290#comment-14365290
 ] 

Hudson commented on YARN-3349:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-3349. Treat all exceptions as failure in 
TestFSRMStateStore#testFSRMStateStoreClientRetry. Contributed by Zhihai Xu. 
(ozawa: rev 7522a643faeea2d8a8e2c7409ae60e0973e7cf38)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


 Treat all exceptions as failure in 
 TestFSRMStateStore#testFSRMStateStoreClientRetry
 ---

 Key: YARN-3349
 URL: https://issues.apache.org/jira/browse/YARN-3349
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: test
Affects Versions: 2.6.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
 Fix For: 2.7.0

 Attachments: YARN-3349.000.patch


 Treat all exceptions as failure in testFSRMStateStoreClientRetry.
 Currently the exception "could only be replicated to 0 nodes instead of 
 minReplication (=1)" is not treated as a failure in 
 testFSRMStateStoreClientRetry.
 {code}
 // TODO 0 datanode exception will not be retried by dfs client, fix
 // that separately.
 if (!e.getMessage().contains("could only be replicated" +
     " to 0 nodes instead of minReplication (=1)")) {
   assertionFailedInThread.set(true);
 }
 {code}
 With YARN-2820 (Retry in FileSystemRMStateStore), we needn't treat this 
 exception specially. We can remove the check and treat all exceptions as 
 failure in testFSRMStateStoreClientRetry.
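 As a minimal sketch of the simplification described above (hypothetical helper 
 name; not the exact test code), the per-thread catch block no longer whitelists 
 that message:
 {code}
 try {
   writeStateWhileNameNodeRestarts();   // hypothetical stand-in for the test body
 } catch (Exception e) {
   // With store-level retries (YARN-2820), any exception is now a test failure.
   assertionFailedInThread.set(true);
 }
 {code}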



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365295#comment-14365295
 ] 

Hudson commented on YARN-3339:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-3339. TestDockerContainerExecutor should pull a single image and not the 
entire centos repository. (Ravindra Kumar Naik via raviprak) (raviprak: rev 
56085203c43b8f2561bf3745910e03f8ac176a67)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java


 TestDockerContainerExecutor should pull a single image and not the entire 
 centos repository
 ---

 Key: YARN-3339
 URL: https://issues.apache.org/jira/browse/YARN-3339
 Project: Hadoop YARN
  Issue Type: Test
  Components: test
Affects Versions: 2.6.0
 Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3339-branch-2.6.0.001.patch, 
 YARN-3339-trunk.001.patch


 The TestDockerContainerExecutor test pulls the entire centos repository, which 
 is time consuming.
 Pulling a specific image (e.g. centos7) will be sufficient to run the test 
 successfully and will save time.
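 A hypothetical illustration of the change (the constant and helper names may 
 differ from the actual test): pin one tagged image instead of fetching every 
 tag in the repository.
 {code}
 // Pull a single tagged image rather than the whole "centos" repository.
 String testImage = "centos:latest";        // e.g. a specific tag
 shellExec("docker pull " + testImage);     // hypothetical helper in the test
 // slow variant being replaced:
 // shellExec("docker pull centos");
 {code}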



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365347#comment-14365347
 ] 

Hudson commented on YARN-3339:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/])
YARN-3339. TestDockerContainerExecutor should pull a single image and not the 
entire centos repository. (Ravindra Kumar Naik via raviprak) (raviprak: rev 
56085203c43b8f2561bf3745910e03f8ac176a67)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


 TestDockerContainerExecutor should pull a single image and not the entire 
 centos repository
 ---

 Key: YARN-3339
 URL: https://issues.apache.org/jira/browse/YARN-3339
 Project: Hadoop YARN
  Issue Type: Test
  Components: test
Affects Versions: 2.6.0
 Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3339-branch-2.6.0.001.patch, 
 YARN-3339-trunk.001.patch


 The TestDockerContainerExecutor test pulls the entire centos repository, which 
 is time consuming.
 Pulling a specific image (e.g. centos7) will be sufficient to run the test 
 successfully and will save time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365343#comment-14365343
 ] 

Hudson commented on YARN-2854:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/])
YARN-2854. Addendum patch to fix the minor issue in the timeline service 
documentation. (zjshen: rev ed4e72a20b75ffbd22deb0607dd8b94f6e437a84)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md


 The document about timeline service and generic service needs to be updated
 ---

 Key: YARN-2854
 URL: https://issues.apache.org/jira/browse/YARN-2854
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
Priority: Critical
 Fix For: 2.7.0

 Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, 
 YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, 
 YARN-2854.20150311-1.patch, YARN-2854.20150313-1.patch, 
 YARN-2854.20150314-1.patch, YARN-2854.20150314-1_branch2.patch, 
 YARN-2854.20150315-1_trunk_addendum.patch, timeline_structure.jpg






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365338#comment-14365338
 ] 

Hudson commented on YARN-3197:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/])
YARN-3197. Confusing log generated by CapacityScheduler. Contributed by 
(devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt


 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-03-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366608#comment-14366608
 ] 

Naganarasimha G R commented on YARN-3362:
-

Hi Wangda,
I would like to work on this issue, so I have assigned it to myself. If you have 
already started working on it, please feel free to reassign.

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R

 We don't show node label usage in the RM CapacityScheduler web UI now; without 
 this, it is hard for users to understand what is happening on nodes that have 
 labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-03-17 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-3362:
---

Assignee: Naganarasimha G R

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R

 We don't show node label usage in the RM CapacityScheduler web UI now; without 
 this, it is hard for users to understand what is happening on nodes that have 
 labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-17 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366614#comment-14366614
 ] 

Li Lu commented on YARN-3040:
-

Quick comment: in my understanding the flow-based API is used by multiple 
components, including but not limited to event producers (like distributed 
shell, the RM, and NMs), collectors (a.k.a. aggregators), and storage 
implementations. It's not specifically tied to the RM. 

 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 URL: https://issues.apache.org/jira/browse/YARN-3040
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter

 Per design in YARN-2928, implement client-side API for handling *flows*. 
 Frameworks should be able to define and pass in all attributes of flows and 
 flow runs to YARN, and they should be passed into ATS writers.
 YARN tags were discussed as a way to handle this piece of information.
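 As a minimal sketch (assuming, per the discussion above, that flow attributes 
 ride on YARN application tags; the tag prefixes are illustrative, not a 
 committed convention):
 {code}
 ApplicationSubmissionContext ctx =
     Records.newRecord(ApplicationSubmissionContext.class);
 Set<String> tags = new HashSet<String>();
 tags.add("TIMELINE_FLOW_NAME_TAG:wordcount-nightly");  // flow name
 tags.add("TIMELINE_FLOW_RUN_ID_TAG:1426550400000");    // flow run id
 ctx.setApplicationTags(tags);
 {code}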



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page

2015-03-17 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366621#comment-14366621
 ] 

Peng Zhang commented on YARN-3111:
--

Thanks for your advice.

For the 4 proposals listed above:
1 & 2 are already done in the patch.
3 is good, but one question: the parent queue has no tooltip now, although it 
has its own bar.
Thinking over 3 & 4, what about listing every resource's usage percentage in the 
text to the right of each bar? Maybe color the dominant resource red, or just 
infer it by comparing the percentage numbers?

Also, what do you think of the issue I mentioned above? I think it can still 
happen after 1 & 2, because for one queue the steady, fair, max, and usage 
resources may have different dominant resource types. If I am mistaken here, 
please let me know.
bq. queue's bar width is decided by (queue steady resource / cluster resource), 
and queue's usage width is decided by (queue's usage resource / cluster 
resource). For the above two percentage computations the dominant resource may 
differ, so the two percentages are still in different dimensions, which causes 
confusion.
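A small worked example of the concern above (numbers are illustrative): the bar 
and its fill can end up measuring different dominant resources.
{code}
// Cluster: 100 GB memory, 100 vcores.
float steadyMem = 40f, steadyCores = 10f;   // queue steady fair share
float usedMem   = 5f,  usedCores   = 30f;   // queue current usage
float barWidth  = Math.max(steadyMem / 100f, steadyCores / 100f); // 0.40 (memory-dominant)
float fillWidth = Math.max(usedMem / 100f,   usedCores / 100f);   // 0.30 (vcore-dominant)
// Drawing 30% of the vcore dimension inside 40% of the memory dimension is not
// a meaningful "usage of steady share" unless both use the same dimension.
{code}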

 Fix ratio problem on FairScheduler page
 ---

 Key: YARN-3111
 URL: https://issues.apache.org/jira/browse/YARN-3111
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Peng Zhang
Assignee: Peng Zhang
Priority: Minor
 Attachments: YARN-3111.1.patch, YARN-3111.png


 Found 3 problems on the FairScheduler page:
 1. Only memory is used to compute the ratio, even when the queue 
 schedulingPolicy is DRF.
 2. When min resources are configured larger than the real resources, the steady 
 fair share bar is so long that it runs off the page.
 3. When cluster resources are 0 (no NodeManager started), the ratio is displayed 
 as "NaN% used".
 The attached image shows a snapshot of the above problems. 
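 As a minimal sketch of avoiding the "NaN% used" display for problem 3 (variable 
 names are illustrative, not the actual patch):
 {code}
 float total = clusterResource.getMemory();   // 0 when no NodeManager has registered
 float used  = queueUsedResource.getMemory();
 float percent = (total <= 0f) ? 0f : Math.min(1f, used / total);  // never NaN
 {code}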



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-17 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366654#comment-14366654
 ] 

Zhijie Shen commented on YARN-3040:
---

[~Naganarasimha], thanks for your interest in this issue. I already have a 
WIP patch. If you don't mind, may I continue the work, and would you please 
help review it?

 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 URL: https://issues.apache.org/jira/browse/YARN-3040
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter

 Per design in YARN-2928, implement client-side API for handling *flows*. 
 Frameworks should be able to define and pass in all attributes of flows and 
 flow runs to YARN, and they should be passed into ATS writers.
 YARN tags were discussed as a way to handle this piece of information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery

2015-03-17 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen resolved YARN-3039.
---
   Resolution: Fixed
Fix Version/s: YARN-2928
 Hadoop Flags: Reviewed

Committed the patch to branch YARN-2928. Thanks for the patch, Junping! Thanks 
for review, Sangjin!

 [Aggregator wireup] Implement ATS app-appgregator service discovery
 ---

 Key: YARN-3039
 URL: https://issues.apache.org/jira/browse/YARN-3039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Junping Du
 Fix For: YARN-2928

 Attachments: Service Binding for applicationaggregator of ATS 
 (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, 
 YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, 
 YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, 
 YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch


 Per design in YARN-2928, implement ATS writer service discovery. This is 
 essential for off-node clients to send writes to the right ATS writer. This 
 should also handle the case of AM failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366593#comment-14366593
 ] 

Rohith commented on YARN-3273:
--

I am pretty confused by the Jenkins report: it says hadoop-yarn-server-common 
has failed tests, but the console log for this project shows no failures!! 
{{Tests run: 19, Failures: 0, Errors: 0, Skipped: 0}}

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons such as:
 - hitting queue capacity 
 - hitting user-limit, 
 - hitting AM-resource-percentage 
 The  first queueCapacity is already shown on the UI.
 We may surface things like:
 - what is user's current usage and user-limit; 
 - what is the AM resource usage and limit;
 - what is the application's current HeadRoom;
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366595#comment-14366595
 ] 

Rohith commented on YARN-3273:
--

TestAMRestart failure is unrelated to this patch

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons such as:
 - hitting queue capacity 
 - hitting user-limit, 
 - hitting AM-resource-percentage 
 The  first queueCapacity is already shown on the UI.
 We may surface things like:
 - what is user's current usage and user-limit; 
 - what is the AM resource usage and limit;
 - what is the application's current HeadRoom;
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-03-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366616#comment-14366616
 ] 

Wangda Tan commented on YARN-3362:
--

It's yours :). Looking forward to your patch. 

Thanks,

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R

 We don't show node label usage in the RM CapacityScheduler web UI now; without 
 this, it is hard for users to understand what is happening on nodes that have 
 labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366587#comment-14366587
 ] 

Naganarasimha G R commented on YARN-3040:
-

Hi [~rkanter] and [~zjshen], 
it seems the scope of this jira is small, and I need to make use of it in 
YARN-3044, so if both of you are OK with it, I would like to take this jira up.

 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 URL: https://issues.apache.org/jira/browse/YARN-3040
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter

 Per design in YARN-2928, implement client-side API for handling *flows*. 
 Frameworks should be able to define and pass in all attributes of flows and 
 flow runs to YARN, and they should be passed into ATS writers.
 YARN tags were discussed as a way to handle this piece of information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366636#comment-14366636
 ] 

Hudson commented on YARN-3273:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #7355 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7355/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and 
debugging. Contributed Rohith Sharmaks (jianhe: rev 
658097d6da1b1aac8e01db459f0c3b456e99652f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java


 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons 

[jira] [Updated] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3273:
-
Attachment: 0003-YARN-3273.patch

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons such as:
 - hitting queue capacity 
 - hitting user-limit, 
 - hitting AM-resource-percentage 
 The  first queueCapacity is already shown on the UI.
 We may surface things like:
 - what is user's current usage and user-limit; 
 - what is the AM resource usage and limit;
 - what is the application's current HeadRoom;
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3341) Fix findbugs warning:BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource

2015-03-17 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364786#comment-14364786
 ] 

Brahma Reddy Battula commented on YARN-3341:


This is a dupe of YARN-3204; can you please have a look at YARN-3204?

 Fix findbugs warning:BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource
 ---

 Key: YARN-3341
 URL: https://issues.apache.org/jira/browse/YARN-3341
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
  Labels: findbugs
 Attachments: YARN-3341.000.patch, YARN-3341.001.patch


 Fix findbugs warning:BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource
 The warning message is
 {code}
 Unchecked/unconfirmed cast from 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt
  to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt 
 in 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.reserveResource(SchedulerApplicationAttempt,
  Priority, RMContainer)
 {code}
 The code which causes the warning is
 {code}
 this.reservedAppSchedulable = (FSAppAttempt) application;
 {code}
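 One common way to satisfy BC_UNCONFIRMED_CAST is to confirm the runtime type 
 before the cast (a sketch only; an alternative is a findbugs exclusion):
 {code}
 if (application instanceof FSAppAttempt) {
   this.reservedAppSchedulable = (FSAppAttempt) application;
 } else {
   throw new IllegalArgumentException("Expected FSAppAttempt but got "
       + application.getClass().getName());
 }
 {code}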



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3358) Audit log not present while refreshing Service ACLs'

2015-03-17 Thread Varun Saxena (JIRA)
Varun Saxena created YARN-3358:
--

 Summary: Audit log not present while refreshing Service ACLs'
 Key: YARN-3358
 URL: https://issues.apache.org/jira/browse/YARN-3358
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Minor


There should be a success audit log in AdminService#refreshServiceAcls
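A minimal sketch, assuming the same RMAuditLogger pattern used by the other 
refresh* operations in AdminService (argument values are illustrative):
{code}
RMAuditLogger.logSuccess(user.getShortUserName(),
    "refreshServiceAcls", "AdminService");
{code}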



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364762#comment-14364762
 ] 

Hadoop QA commented on YARN-3273:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705019/0003-YARN-3273.patch
  against trunk revision ef9946c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
  
org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens
  
org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs
  
org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6995//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6995//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6995//console

This message is automatically generated.

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons such as:
 - hitting queue capacity 
 - hitting user-limit, 
 - hitting AM-resource-percentage 
 The  first queueCapacity is already shown on the UI.
 We may surface things like:
 - what is user's current usage and user-limit; 
 - what is the AM resource usage and limit;
 - what is the application's current HeadRoom;
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-17 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3241:

Attachment: (was: YARN-3241.000.patch)

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch


 Leading space, trailing space, and empty sub queue names may cause a 
 MetricsException ("Metrics source XXX already exists!") when adding an 
 application to the FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from the 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in sub queue names, and it also removes empty sub queue 
 names.
 {code}
   static final Splitter Q_SPLITTER =
       Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces, or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to get out of sync:
 QueueManager thinks the two queue names are different, so it tries to create 
 a new queue, 
 but FSQueueMetrics treats the two queue names as the same queue, which raises 
 the "Metrics source XXX already exists!" MetricsException.
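 A small illustration of the mismatch described above (using Guava directly; 
 the queue name is made up):
 {code}
 String name = "root. queueA ..queueB";   // leading/trailing spaces and an empty part
 Splitter splitter = Splitter.on('.').omitEmptyStrings().trimResults();
 // QueueMetrics effectively sees: [root, queueA, queueB]
 System.out.println(Lists.newArrayList(splitter.split(name)));
 // QueueManager keys the FSQueue by the raw string: "root. queueA ..queueB"
 System.out.println(name);
 {code}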



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364660#comment-14364660
 ] 

Hadoop QA commented on YARN-3273:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705000/0003-YARN-3273.patch
  against trunk revision 046521c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestNodesPage

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6994//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6994//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6994//console

This message is automatically generated.

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 
 YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons such as:
 - hitting queue capacity 
 - hitting user-limit, 
 - hitting AM-resource-percentage 
 The  first queueCapacity is already shown on the UI.
 We may surface things like:
 - what is user's current usage and user-limit; 
 - what is the AM resource usage and limit;
 - what is the application's current HeadRoom;
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-17 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3241:

Attachment: YARN-3241.000.patch

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch


 Leading space, trailing space, and empty sub queue names may cause a 
 MetricsException ("Metrics source XXX already exists!") when adding an 
 application to the FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from the 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in sub queue names, and it also removes empty sub queue 
 names.
 {code}
   static final Splitter Q_SPLITTER =
       Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces, or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to get out of sync:
 QueueManager thinks the two queue names are different, so it tries to create 
 a new queue, 
 but FSQueueMetrics treats the two queue names as the same queue, which raises 
 the "Metrics source XXX already exists!" MetricsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-17 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3241:

Attachment: (was: YARN-3241.000.patch)

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu

 Leading space, trailing space, and empty sub queue names may cause a 
 MetricsException ("Metrics source XXX already exists!") when adding an 
 application to the FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from the 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in sub queue names, and it also removes empty sub queue 
 names.
 {code}
   static final Splitter Q_SPLITTER =
       Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces, or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to get out of sync:
 QueueManager thinks the two queue names are different, so it tries to create 
 a new queue, 
 but FSQueueMetrics treats the two queue names as the same queue, which raises 
 the "Metrics source XXX already exists!" MetricsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository

2015-03-17 Thread Ravindra Kumar Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364662#comment-14364662
 ] 

Ravindra Kumar Naik commented on YARN-3339:
---

Thanks for the information.

 TestDockerContainerExecutor should pull a single image and not the entire 
 centos repository
 ---

 Key: YARN-3339
 URL: https://issues.apache.org/jira/browse/YARN-3339
 Project: Hadoop YARN
  Issue Type: Test
  Components: test
Affects Versions: 2.6.0
 Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3339-branch-2.6.0.001.patch, 
 YARN-3339-trunk.001.patch


 The TestDockerContainerExecutor test pulls the entire centos repository, which 
 is time consuming.
 Pulling a specific image (e.g. centos7) will be sufficient to run the test 
 successfully and will save time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-17 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3241:

Attachment: YARN-3241.000.patch

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch


 Leading space, trailing space, and empty sub queue names may cause a 
 MetricsException ("Metrics source XXX already exists!") when adding an 
 application to the FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from the 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in sub queue names, and it also removes empty sub queue 
 names.
 {code}
   static final Splitter Q_SPLITTER =
       Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces, or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to get out of sync:
 QueueManager thinks the two queue names are different, so it tries to create 
 a new queue, 
 but FSQueueMetrics treats the two queue names as the same queue, which raises 
 the "Metrics source XXX already exists!" MetricsException.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365972#comment-14365972
 ] 

Hudson commented on YARN-3181:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7351 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7351/])
Revert YARN-3181. FairScheduler: Fix up outdated findbugs issues. (kasha) 
(kasha: rev 32b43304563c2430c00bc3e142a962d2bc5f4d58)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSOpDurations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* hadoop-yarn-project/CHANGES.txt


 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page

2015-03-17 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366002#comment-14366002
 ] 

Ashwin Shankar commented on YARN-3111:
--

Sounds good to me.

 Fix ratio problem on FairScheduler page
 ---

 Key: YARN-3111
 URL: https://issues.apache.org/jira/browse/YARN-3111
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Peng Zhang
Assignee: Peng Zhang
Priority: Minor
 Attachments: YARN-3111.1.patch, YARN-3111.png


 Found 3 problems on the FairScheduler page:
 1. Only memory is used to compute the ratio, even when the queue 
 schedulingPolicy is DRF.
 2. When min resources are configured larger than the real resources, the steady 
 fair share bar is so long that it runs off the page.
 3. When cluster resources are 0 (no NodeManager started), the ratio is displayed 
 as "NaN% used".
 The attached image shows a snapshot of the above problems. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366106#comment-14366106
 ] 

Hadoop QA commented on YARN-3273:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705019/0003-YARN-3273.patch
  against trunk revision 968425e.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7004//console

This message is automatically generated.

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 Job may be stuck for reasons such as:
 - hitting queue capacity 
 - hitting user-limit, 
 - hitting AM-resource-percentage 
 The  first queueCapacity is already shown on the UI.
 We may surface things like:
 - what is user's current usage and user-limit; 
 - what is the AM resource usage and limit;
 - what is the application's current HeadRoom;
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3361) CapacityScheduler side changes to support non-exclusive node labels

2015-03-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3361:
-
Component/s: (was: api)
 (was: client)
 (was: resourcemanager)
 capacityscheduler

 CapacityScheduler side changes to support non-exclusive node labels
 ---

 Key: YARN-3361
 URL: https://issues.apache.org/jira/browse/YARN-3361
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Wangda Tan

 According to the design doc attached to YARN-3214, we need to implement the 
 following logic in CapacityScheduler:
 1) When allocating a resource request with no node label specified, it should 
 preferentially be allocated to nodes without labels.
 2) When there is available resource on a node with a label, it can be used by 
 applications in the following order:
 - Applications under queues which can access the label and ask for the same 
 labeled resource. 
 - Applications under queues which can access the label and ask for 
 non-labeled resource.
 - Applications under queues which cannot access the label and ask for 
 non-labeled resource.
 3) Expose the necessary information that can be used by the preemption policy 
 to make preemption decisions.
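 A minimal sketch (hypothetical names, not the committed design) of the ordering 
 described above for spare capacity on a labeled node; requests asking for a 
 label their queue cannot access are rejected elsewhere and not ranked here:
 {code}
 enum LabelMatch { EXACT_LABEL, ACCESSIBLE_UNLABELED, NON_ACCESSIBLE_UNLABELED }

 static LabelMatch classify(boolean queueCanAccessLabel, boolean requestHasLabel) {
   if (queueCanAccessLabel && requestHasLabel) {
     return LabelMatch.EXACT_LABEL;               // served first
   } else if (queueCanAccessLabel) {
     return LabelMatch.ACCESSIBLE_UNLABELED;      // served second
   } else {
     return LabelMatch.NON_ACCESSIBLE_UNLABELED;  // served last
   }
 }
 {code}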



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3361) CapacityScheduler side changes to support non-exclusive node labels

2015-03-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3361:
-
Description: 
According to the design doc attached to YARN-3214, we need to implement the 
following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should 
preferentially be allocated to nodes without labels.

2) When there is available resource on a node with a label, it can be used by 
applications in the following order:
- Applications under queues which can access the label and ask for the same 
labeled resource. 
- Applications under queues which can access the label and ask for non-labeled 
resource.
- Applications under queues which cannot access the label and ask for 
non-labeled resource.

3) Expose the necessary information that can be used by the preemption policy 
to make preemption decisions.

  was:Reference to design doc attached in YARN-3214, this is CapacityScheduler 
side changes to support non-exclusive node labels.


 CapacityScheduler side changes to support non-exclusive node labels
 ---

 Key: YARN-3361
 URL: https://issues.apache.org/jira/browse/YARN-3361
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan

 According to the design doc attached to YARN-3214, we need to implement the 
 following logic in CapacityScheduler:
 1) When allocating a resource request with no node label specified, it should 
 preferentially be allocated to nodes without labels.
 2) When there is available resource on a node with a label, it can be used by 
 applications in the following order:
 - Applications under queues which can access the label and ask for the same 
 labeled resource. 
 - Applications under queues which can access the label and ask for 
 non-labeled resource.
 - Applications under queues which cannot access the label and ask for 
 non-labeled resource.
 3) Expose the necessary information that can be used by the preemption policy 
 to make preemption decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3356:
-
Description: 
According to the design doc attached to YARN-3214, we need to implement the 
following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should 
preferentially be allocated to nodes without labels.

2) When there is available resource on a node with a label, it can be used by 
applications in the following order:
- Applications under queues which can access the label and ask for the same 
labeled resource. 
- Applications under queues which can access the label and ask for non-labeled 
resource.
- Applications under queues which cannot access the label and ask for 
non-labeled resource.

3) Expose the necessary information that can be used by the preemption policy 
to make preemption decisions.

  was:
Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
should use ResourceRequest to track resource-usage/pending by label for better 
resource tracking and preemption.

And also, when an application's pending resource changes (container allocated, 
app completed, moved, etc.), we need to update the ResourceUsage of the queue 
hierarchies.


 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3356.1.patch


 According to the design doc attached to YARN-3214, we need to implement the 
 following logic in CapacityScheduler:
 1) When allocating a resource request with no node label specified, it should 
 preferentially be allocated to nodes without labels.
 2) When there is available resource on a node with a label, it can be used by 
 applications in the following order:
 - Applications under queues which can access the label and ask for the same 
 labeled resource. 
 - Applications under queues which can access the label and ask for 
 non-labeled resource.
 - Applications under queues which cannot access the label and ask for 
 non-labeled resource.
 3) Expose the necessary information that can be used by the preemption policy 
 to make preemption decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3356:
-
Description: 
Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
should use ResourceUsage to track resource-usage/pending by label for better 
resource tracking and preemption.

Also, when an application's pending resource changes (container allocated, app 
completed, moved, etc.), we need to update the ResourceUsage of the queue 
hierarchy.

  was:
According to the design doc attached to YARN-3214, we need to implement the following 
logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should get 
preferentially allocated to nodes without labels.

2) When there is available resource on a node with a label, it can be used by 
applications in the following order:
- applications under queues which can access the label and ask for the same labeled 
resource;
- applications under queues which can access the label and ask for non-labeled 
resource;
- applications under queues which cannot access the label and ask for non-labeled 
resource.

3) Expose the information needed by the preemption policy to make preemption 
decisions.


 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3356.1.patch


 Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
 should use ResourceUsage to track resource-usage/pending by label for 
 better resource tracking and preemption.
 Also, when an application's pending resource changes (container allocated, 
 app completed, moved, etc.), we need to update the ResourceUsage of the 
 queue hierarchy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

2015-03-17 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-2556:
--
Attachment: YARN-2556.2.patch

 Tool to measure the performance of the timeline server
 --

 Key: YARN-2556
 URL: https://issues.apache.org/jira/browse/YARN-2556
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Chang Li
 Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
 YARN-2556.1.patch, YARN-2556.2.patch, YARN-2556.patch, yarn2556.patch, 
 yarn2556.patch, yarn2556_wip.patch


 We need to be able to understand the capacity model for the timeline server 
 to give users the tools they need to deploy a timeline server with the 
 correct capacity.
 I propose we create a mapreduce job that can measure timeline server write 
 and read performance. Transactions per second, I/O for both read and write 
 would be a good start.
 This could be done as an example or test job that could be tied into gridmix.
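 As a rough illustration of the write-side measurement being proposed (not the eventual tool itself), one could time putEntities calls through the existing TimelineClient API; the entity type, id and count below are made-up values:
{code}
// Rough sketch only: measure Timeline write throughput with TimelineClient.
// The entity type/id and the entity count are arbitrary illustration values,
// and the timeline service must already be configured and running.
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TimelineWritePerfSketch {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(conf);
    client.start();
    int numEntities = 1000;
    long start = System.nanoTime();
    for (int i = 0; i < numEntities; i++) {
      TimelineEntity entity = new TimelineEntity();
      entity.setEntityType("PERF_TEST");        // made-up type
      entity.setEntityId("entity_" + i);
      entity.setStartTime(System.currentTimeMillis());
      client.putEntities(entity);
    }
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println(numEntities * 1000.0 / elapsedMs + " entities/sec");
    client.stop();
  }
}
{code}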



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3326) ReST support for getLabelsToNodes

2015-03-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366120#comment-14366120
 ] 

Naganarasimha G R commented on YARN-3326:
-

Hi [~vvasudev], [~wangda], I was not able to come up with anything better than 
/label-mappings?label=label1,label2,.... Please let me know if this is OK and I 
will modify the patch; otherwise please suggest other options.
P.S. /nodes is already used for getNodes.


 ReST support for getLabelsToNodes 
 --

 Key: YARN-3326
 URL: https://issues.apache.org/jira/browse/YARN-3326
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
Priority: Minor
 Attachments: YARN-3326.20150310-1.patch


 REST to support to retrieve LabelsToNodes Mapping



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366121#comment-14366121
 ] 

Jian He commented on YARN-3273:
---

The patch actually applies; not sure why Jenkins complains. Re-submitting the same 
patch.

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 A job may be stuck for reasons such as:
 - hitting queue capacity
 - hitting the user limit
 - hitting the AM resource percentage
 Of these, queue capacity is already shown on the UI.
 We may surface things like:
 - the user's current usage and user limit;
 - the AM resource usage and limit;
 - the application's current headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-03-17 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reassigned YARN-3360:


Assignee: Jason Lowe

 Add JMX metrics to TimelineDataManager
 --

 Key: YARN-3360
 URL: https://issues.apache.org/jira/browse/YARN-3360
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe

 The TimelineDataManager currently has no metrics, outside of the standard JVM 
 metrics.  It would be very useful to at least log basic counts of method 
 calls, time spent in those calls, and number of entities/events involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3189) Yarn application usage command should not give -appstate and -apptype

2015-03-17 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3189:
---
Fix Version/s: (was: 3.0.0)

 Yarn application usage command should not give -appstate and -apptype
 -

 Key: YARN-3189
 URL: https://issues.apache.org/jira/browse/YARN-3189
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Anushri
Assignee: Anushri
Priority: Minor
 Attachments: YARN-3189.patch, YARN-3189.patch


 Yarn application usage command should not give -appstate and -apptype since 
 these two are applicable only to the --list command.
  *Can somebody please assign this issue to me* 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366060#comment-14366060
 ] 

Hadoop QA commented on YARN-3356:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705148/YARN-3356.1.patch
  against trunk revision d884670.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7001//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7001//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7001//console

This message is automatically generated.

 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3356.1.patch


 Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
 should use ResourceUsage to track resource-usage/pending by label for 
 better resource tracking and preemption.
 Also, when an application's pending resource changes (container allocated, 
 app completed, moved, etc.), we need to update the ResourceUsage of the 
 queue hierarchy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3034) [Aggregator wireup] Implement RM starting its ATS writer

2015-03-17 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-3034:

Attachment: YARN-3034.20150318-1.patch

Hi [~djp], I have updated the patch with the yarn-default.xml updates, please 
review.

 [Aggregator wireup] Implement RM starting its ATS writer
 

 Key: YARN-3034
 URL: https://issues.apache.org/jira/browse/YARN-3034
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
 Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, 
 YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch


 Per design in YARN-2928, implement resource managers starting their own ATS 
 writers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366093#comment-14366093
 ] 

Hudson commented on YARN-3305:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7352 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7352/])
YARN-3305. Normalize AM resource request on app submission. Contributed by 
Rohith Sharmaks (jianhe: rev 968425e9f7b850ff9c2ab8ca37a64c3fdbe77dbf)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


 AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is 
 less than minimumAllocation
 

 Key: YARN-3305
 URL: https://issues.apache.org/jira/browse/YARN-3305
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Rohith
Assignee: Rohith
 Fix For: 2.8.0

 Attachments: 0001-YARN-3305.patch, 0001-YARN-3305.patch, 
 0002-YARN-3305.patch, 0003-YARN-3305.patch


 For any given ResourceRequest, {{CS#allocate}} normalizes the request to 
 minimumAllocation if the requested memory is less than minimumAllocation.
 But the AM-used resource is updated with the actual ResourceRequest made by 
 the user. This can result in AM container allocation exceeding the Max 
 ApplicationMaster Resource.
 This is because AM-Used is updated with the actual ResourceRequest made by the 
 user while activating the application, but during container allocation the 
 ResourceRequest is normalized to minimumAllocation.
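 A tiny self-contained sketch of the mismatch described above; the normalize helper and the 1024 MB minimum are illustrative, not the scheduler's actual code:
{code}
// Illustrative only: AM-used accounting vs. the normalized allocation.
public class AmNormalizationSketch {
  // Round the requested memory up to the next multiple of minimumAllocation.
  static int normalize(int requestedMb, int minimumAllocationMb) {
    int multiples = (requestedMb + minimumAllocationMb - 1) / minimumAllocationMb;
    return Math.max(multiples, 1) * minimumAllocationMb;
  }

  public static void main(String[] args) {
    int minimumAllocationMb = 1024;
    int amRequestMb = 100;                       // user asks for less than the minimum
    int allocatedMb = normalize(amRequestMb, minimumAllocationMb);
    // Bug pattern described above: AM-used is charged the raw request (100 MB)
    // while the container actually allocated is 1024 MB.
    System.out.println("charged to AM-used: " + amRequestMb + " MB");
    System.out.println("actually allocated: " + allocatedMb + " MB");
  }
}
{code}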



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-03-17 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-3360:
-
Attachment: YARN-3360.001.patch

This adds basic, store-independent metrics to the TimelineDataManager to 
provide call counts, entity/event counts, and time-per-call averages. This 
also fixes a number of unit tests that weren't initializing the data manager 
properly.
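For readers unfamiliar with the metrics2 pattern such a change typically follows, a minimal hedged sketch; the class and metric names here are invented for illustration and are not the contents of YARN-3360.001.patch:
{code}
// Illustrative metrics2 sketch only; the real patch may name and register
// its metrics differently.
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;
import org.apache.hadoop.metrics2.lib.MutableRate;

@Metrics(about = "Timeline data manager metrics (sketch)", context = "yarn")
public class TimelineDataManagerMetricsSketch {
  @Metric("getEntities calls") MutableCounterLong getEntitiesOps;
  @Metric("getEntities call time") MutableRate getEntitiesTime;

  public static TimelineDataManagerMetricsSketch create() {
    // Register with the default metrics system so the values show up via JMX.
    return DefaultMetricsSystem.instance()
        .register("TimelineDataManagerMetricsSketch", null,
            new TimelineDataManagerMetricsSketch());
  }

  // Called around each getEntities request.
  public void recordGetEntities(long elapsedMillis) {
    getEntitiesOps.incr();
    getEntitiesTime.add(elapsedMillis);
  }
}
{code}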

 Add JMX metrics to TimelineDataManager
 --

 Key: YARN-3360
 URL: https://issues.apache.org/jira/browse/YARN-3360
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-3360.001.patch


 The TimelineDataManager currently has no metrics, outside of the standard JVM 
 metrics.  It would be very useful to at least log basic counts of method 
 calls, time spent in those calls, and number of entities/events involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3361) CapacityScheduler side changes to support non-exclusive node labels

2015-03-17 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-3361:


 Summary: CapacityScheduler side changes to support non-exclusive 
node labels
 Key: YARN-3361
 URL: https://issues.apache.org/jira/browse/YARN-3361
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Wangda Tan
Assignee: Wangda Tan


Reference to the design doc attached to YARN-3214; this covers the CapacityScheduler 
side changes to support non-exclusive node labels.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3345) Add non-exclusive node label RMAdmin CLI/API

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366119#comment-14366119
 ] 

Hadoop QA commented on YARN-3345:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705175/YARN-3345.4.patch
  against trunk revision 968425e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7002//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7002//console

This message is automatically generated.

 Add non-exclusive node label RMAdmin CLI/API
 

 Key: YARN-3345
 URL: https://issues.apache.org/jira/browse/YARN-3345
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3345.1.patch, YARN-3345.2.patch, YARN-3345.3.patch, 
 YARN-3345.4.patch


 As described in YARN-3214 (see the design doc attached to that JIRA), we need to 
 add the non-exclusive node label RMAdmin API and CLI implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-3273:
--
Attachment: 0004-YARN-3273.patch

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 A job may be stuck for reasons such as:
 - hitting queue capacity
 - hitting the user limit
 - hitting the AM resource percentage
 Of these, queue capacity is already shown on the UI.
 We may surface things like:
 - the user's current usage and user limit;
 - the AM resource usage and limit;
 - the application's current headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-03-17 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-3362:


 Summary: Add node label usage in RM CapacityScheduler web UI
 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan


We don't have node label usage in the RM CapacityScheduler web UI now. Without 
this, it is hard for users to understand what is happening on nodes that have 
labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-03-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366191#comment-14366191
 ] 

Wangda Tan commented on YARN-3362:
--

My proposal is:

Currently, the RM CapacityScheduler UI looks like:
{code}
+ root [==] 50% used
  + a  [===] 75% used
- a1 [=]  30% used
  -
  |  Queue Metrics Table   |
  ||
  |   metrics1   |value1   |
  |   metrics2   |value2   |
  |   metrics3   |value3   |
  |   metrics4   |value4   |
  --
  + b [...]
  + c [...]
{code}

We can add one more level above the queue hierarchy for the labels a queue can 
access and/or is currently using, which could look like:

{code}
+ label_x  [=] 30% used
+ root [=] 30% used
  + a  [===] 75% used
+ a1 [=]  30% used
  -
  |  Queue Metrics Table (For label_x) |
  ||
  |   metrics1   |value1   |
  |   metrics2   |value2   |
  |   metrics3   |value3   |
  |   metrics4   |value4   |
  --
+ label_y
+ root [...]
+ ...
+ label_z
+ root [...]
+ ...
+ no_label
+ root [...]
+ ...
{code}

To keep it backward compatible, when there are no labels in the system the 
label bar will not be shown and root is still the root queue.

Please feel free to share your ideas on this!

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan

 We don't have node label usage in the RM CapacityScheduler web UI now. Without 
 this, it is hard for users to understand what is happening on nodes that have 
 labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364856#comment-14364856
 ] 

Hadoop QA commented on YARN-3241:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705023/YARN-3241.000.patch
  against trunk revision 48c2db3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/6997//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/6997//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6997//console

This message is automatically generated.

 Leading space, trailing space and empty sub queue name may cause 
 MetricsException for fair scheduler
 

 Key: YARN-3241
 URL: https://issues.apache.org/jira/browse/YARN-3241
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3241.000.patch


 A leading space, trailing space, or empty sub queue name may cause a 
 MetricsException (Metrics source XXX already exists!) when adding an 
 application to the FairScheduler.
 The reason is that QueueMetrics parses the queue name differently from 
 QueueManager.
 QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and 
 trailing spaces in sub queue names and also drops empty sub queue names.
 {code}
   static final Splitter Q_SPLITTER =
   Splitter.on('.').omitEmptyStrings().trimResults(); 
 {code}
 But QueueManager won't remove leading spaces, trailing spaces, or empty sub 
 queue names.
 This causes FSQueue and FSQueueMetrics to get out of sync:
 QueueManager will treat the two queue names as different, so it will try to 
 create a new queue, but FSQueueMetrics will treat them as the same queue, 
 which triggers the Metrics source XXX already exists! MetricsException.
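 The mismatch is easy to reproduce with Guava's Splitter directly; a standalone illustration, not FairScheduler code:
{code}
// Shows how Q_SPLITTER trims/drops sub queue names that QueueManager keeps verbatim.
import com.google.common.base.Joiner;
import com.google.common.base.Splitter;

public class QueueNameSplitSketch {
  static final Splitter Q_SPLITTER =
      Splitter.on('.').omitEmptyStrings().trimResults();

  public static void main(String[] args) {
    String configured = "root. q1 ..q2";   // leading/trailing spaces and an empty part
    // QueueMetrics view: spaces trimmed, empty parts dropped -> root.q1.q2
    System.out.println(Joiner.on('.').join(Q_SPLITTER.split(configured)));
    // QueueManager view: the raw string, with " q1 " and "" kept as distinct parts
    System.out.println(configured);
  }
}
{code}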



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-3197:

Hadoop Flags: Reviewed

 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-17 Thread Devaraj K (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj K updated YARN-3197:

Target Version/s: 2.8.0  (was: 2.7.0)

+1, latest patch looks good to me, will commit it shortly.

 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1453) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364917#comment-14364917
 ] 

Hudson commented on YARN-1453:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #869 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/869/])
YARN-1453. [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc 
comments. Contributed by Akira AJISAKA, Andrew Purtell, and Allen Wittenauer. 
(ozawa: rev 3da9a97cfbcc3a1c50aaf85b1a129d4d269cd5fd)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationClientProtocol.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ReservationRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetContainerStatusesResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/binding/RegistryUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeHealthStatus.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/ProxyUriUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerStatus.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/FinishApplicationMasterResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetClusterMetricsResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/StartContainerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerReport.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/impl/RegistryOperationsClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetQueueInfoResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/RegisterApplicationMasterResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/PreemptionMessage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/StringHelper.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationBaseProtocol.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/NMTokenCache.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetQueueInfoRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/QueueInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AHSClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetApplicationsRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ReservationRequestInterpreter.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/AllocateRequest.java
* 

[jira] [Commented] (YARN-2777) Mark the end of individual log in aggregated log

2015-03-17 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364860#comment-14364860
 ] 

Varun Saxena commented on YARN-2777:


[~tedyu], previous test failures were unrelated.
Kindly review.

 Mark the end of individual log in aggregated log
 

 Key: YARN-2777
 URL: https://issues.apache.org/jira/browse/YARN-2777
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Varun Saxena
  Labels: log-aggregation
 Attachments: YARN-2777.001.patch, YARN-2777.02.patch


 Below is snippet of aggregated log showing hbase master log:
 {code}
 LogType: hbase-hbase-master-ip-172-31-34-167.log
 LogUploadTime: 29-Oct-2014 22:31:55
 LogLength: 24103045
 Log Contents:
 Wed Oct 29 15:43:57 UTC 2014 Starting master on ip-172-31-34-167
 ...
   at 
 org.apache.hadoop.hbase.master.cleaner.CleanerChore.chore(CleanerChore.java:124)
   at org.apache.hadoop.hbase.Chore.run(Chore.java:80)
   at java.lang.Thread.run(Thread.java:745)
 LogType: hbase-hbase-master-ip-172-31-34-167.out
 {code}
 Since logs from various daemons are aggregated in one log file, it would be 
 desirable to mark the end of one log before starting with the next.
 e.g. with such a line:
 {code}
 End of LogType: hbase-hbase-master-ip-172-31-34-167.log
 {code}
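 A trivial sketch of how a writer could emit such a marker; illustrative only, not the attached patch:
{code}
// Illustrative only: append an explicit end-of-log marker after each file's
// contents so readers can tell where one log ends and the next begins.
import java.io.PrintWriter;
import java.io.StringWriter;

public class EndOfLogMarkerSketch {
  static void writeLog(PrintWriter out, String logType, String contents) {
    out.println("LogType: " + logType);
    out.println("LogLength: " + contents.length());
    out.println("Log Contents:");
    out.println(contents);
    out.println("End of LogType: " + logType);   // the proposed marker
  }

  public static void main(String[] args) {
    StringWriter buf = new StringWriter();
    PrintWriter out = new PrintWriter(buf);
    writeLog(out, "hbase-master.log", "...");
    writeLog(out, "hbase-master.out", "...");
    out.flush();
    System.out.print(buf);
  }
}
{code}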



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page

2015-03-17 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364815#comment-14364815
 ] 

Peng Zhang commented on YARN-3111:
--

I think overlay is not a good choice.
Currently the scheduler bar is already an overlay of steady share, instantaneous 
share and max resources.
Overlaying two dimensions of resources would then generate 2 * 3 elements? If so, 
it would be too cluttered even before new resource types are added.

When testing this patch in our cluster, I found a new issue with an unusual 
configuration:
the queue's bar width is decided by (queue steady resource / cluster resource), and 
the queue's usage width is decided by (queue's usage resource / cluster resource).
For these two percentage computations, the dominant resource may be different, so 
the two percentage values are still in different dimensions, and that causes 
confusion.

To work around this, we made the queue's steady share proportional to the root 
queue's share in every resource dimension, so the first percentage value (queue 
steady resource / cluster resource) is the same across resources and no longer 
causes confusion.

I think the deeper problem is that FS can configure cpu and memory separately 
(e.g. min resource, max resource), which makes resources not proportional between 
queues while we still need a percentage view.
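A small worked example of why the two bar widths can end up measured against different dominant resources; the numbers are purely illustrative:
{code}
// Illustrative only: the steady-share bar and the usage bar may each pick a
// different dominant resource, so the two percentages are not comparable.
public class DominantShareSketch {
  static double dominantShare(int mem, int vcores, int clusterMem, int clusterVcores) {
    return Math.max((double) mem / clusterMem, (double) vcores / clusterVcores);
  }

  public static void main(String[] args) {
    int clusterMem = 1000, clusterVcores = 100;
    // Steady fair share of the queue: memory-dominant (40% vs 20%)
    double steady = dominantShare(400, 20, clusterMem, clusterVcores);
    // Current usage of the queue: vcore-dominant (10% vs 30%)
    double usage = dominantShare(100, 30, clusterMem, clusterVcores);
    System.out.printf("steady bar: %.0f%% (memory), usage bar: %.0f%% (vcores)%n",
        steady * 100, usage * 100);
  }
}
{code}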

 Fix ratio problem on FairScheduler page
 ---

 Key: YARN-3111
 URL: https://issues.apache.org/jira/browse/YARN-3111
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Peng Zhang
Assignee: Peng Zhang
Priority: Minor
 Attachments: YARN-3111.1.patch, YARN-3111.png


 Found 3 problems on the FairScheduler page:
 1. Only memory is used to compute the ratio, even when the queue 
 schedulingPolicy is DRF.
 2. When min resources are configured larger than the real resources, the steady 
 fair share bar is so long that it runs off the page.
 3. When cluster resources are 0 (no NodeManager started), the ratio is displayed 
 as NaN% used.
 The attached image shows a snapshot of the above problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-17 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364898#comment-14364898
 ] 

Varun Saxena commented on YARN-3197:


Thanks [~devaraj.k] for the commit and review. [~leftnoteasy] and [~vinodkv], 
thanks for your comments.


 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364906#comment-14364906
 ] 

Hudson commented on YARN-3197:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7347 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7347/])
YARN-3197. Confusing log generated by CapacityScheduler. Contributed by 
(devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java


 Confusing log generated by CapacityScheduler
 

 Key: YARN-3197
 URL: https://issues.apache.org/jira/browse/YARN-3197
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, 
 YARN-3197.003.patch, YARN-3197.004.patch


 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:39,968 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...
 2015-02-12 20:35:40,960 INFO  capacity.CapacityScheduler 
 (CapacityScheduler.java:completedContainer(1190)) - Null container 
 completed...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-03-17 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-2.6.0.001.patch)

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
 Attachments: YARN-3344-branch-trunk.001.patch, 
 YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.
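 For illustration, a pattern that tolerates spaces inside the parenthesized comm field; this is only a sketch of the parsing problem, not the regex used by any of the attached patches:
{code}
// Sketch only: /proc/<pid>/stat puts the command name in parentheses, and that
// name may contain spaces ("python2.6 /expo"), which breaks a "[^\s]+" match.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProcStatParseSketch {
  // Greedy (.*) anchors on the last ')', so spaces inside the comm field are harmless.
  static final Pattern STAT_LINE =
      Pattern.compile("^(\\d+) \\((.*)\\) (\\S) (.*)$");

  public static void main(String[] args) {
    String stat = "6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364";
    Matcher m = STAT_LINE.matcher(stat);
    if (m.matches()) {
      System.out.println("pid=" + m.group(1) + " comm=" + m.group(2)
          + " state=" + m.group(3));
    } else {
      System.out.println("not in the expected format");
    }
  }
}
{code}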



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-03-17 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-2.6.0.003.patch)

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
 Attachments: YARN-3344-branch-trunk.001.patch, 
 YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-03-17 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-2.6.0.002.patch)

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
 Attachments: YARN-3344-branch-trunk.001.patch, 
 YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-03-17 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364865#comment-14364865
 ] 

Rohith commented on YARN-3273:
--

All test failures are because of a BindException, which is unrelated to this 
patch. We may need to re-kick Jenkins.

 Improve web UI to facilitate scheduling analysis and debugging
 --

 Key: YARN-3273
 URL: https://issues.apache.org/jira/browse/YARN-3273
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
 Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 
 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 
 YARN-3273-am-resource-used-AND-User-limit-v2.PNG, 
 YARN-3273-am-resource-used-AND-User-limit.PNG, 
 YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG


 A job may be stuck for reasons such as:
 - hitting queue capacity
 - hitting the user limit
 - hitting the AM resource percentage
 Of these, queue capacity is already shown on the UI.
 We may surface things like:
 - the user's current usage and user limit;
 - the AM resource usage and limit;
 - the application's current headroom.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration

2015-03-17 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365604#comment-14365604
 ] 

Sunil G commented on YARN-3136:
---

Thank you [~jlowe] for pointing that out.
I will fix it and upload a new patch.


 getTransferredContainers can be a bottleneck during AM registration
 ---

 Key: YARN-3136
 URL: https://issues.apache.org/jira/browse/YARN-3136
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Sunil G
 Attachments: 0001-YARN-3136.patch, 0002-YARN-3136.patch, 
 0003-YARN-3136.patch, 0004-YARN-3136.patch, 0005-YARN-3136.patch, 
 0006-YARN-3136.patch


 While examining RM stack traces on a busy cluster I noticed a pattern of AMs 
 stuck waiting for the scheduler lock trying to call getTransferredContainers. 
  The scheduler lock is highly contended, especially on a large cluster with 
 many nodes heartbeating, and it would be nice if we could find a way to 
 eliminate the need to grab this lock during this call.  We've already done 
 similar work during AM allocate calls to make sure they don't needlessly grab 
 the scheduler lock, and it would be good to do so here as well, if possible.
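 One common way to avoid taking the big lock for such a read is to serve it from a concurrent map that other paths keep updated; a hedged sketch with made-up names, not the approach taken by the attached patches:
{code}
// Hypothetical sketch only: serve getTransferredContainers from a concurrent
// map instead of taking the scheduler lock; all names are illustrative.
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class TransferredContainersSketch {
  private final Map<String, List<String>> containersByApp = new ConcurrentHashMap<>();

  // Called from node update / recovery paths as containers are recorded.
  public void recordContainer(String appAttemptId, String containerId) {
    containersByApp
        .computeIfAbsent(appAttemptId, k -> new CopyOnWriteArrayList<>())
        .add(containerId);
  }

  // Called during AM registration; no scheduler-wide lock required.
  public List<String> getTransferredContainers(String appAttemptId) {
    return containersByApp.getOrDefault(appAttemptId, Collections.emptyList());
  }

  public static void main(String[] args) {
    TransferredContainersSketch s = new TransferredContainersSketch();
    s.recordContainer("appattempt_1", "container_1");
    System.out.println(s.getTransferredContainers("appattempt_1"));
  }
}
{code}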



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository

2015-03-17 Thread Ravindra Kumar Naik (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365626#comment-14365626
 ] 

Ravindra Kumar Naik commented on YARN-3339:
---

Thanks [~raviprakash] for reviewing and committing.

 TestDockerContainerExecutor should pull a single image and not the entire 
 centos repository
 ---

 Key: YARN-3339
 URL: https://issues.apache.org/jira/browse/YARN-3339
 Project: Hadoop YARN
  Issue Type: Test
  Components: test
Affects Versions: 2.6.0
 Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3339-branch-2.6.0.001.patch, 
 YARN-3339-trunk.001.patch


 The TestDockerContainerExecutor test pulls the entire centos repository, which 
 is time consuming.
 Pulling a specific image (e.g. centos7) is sufficient to run the test 
 successfully and will save time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-03-17 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-3360:


 Summary: Add JMX metrics to TimelineDataManager
 Key: YARN-3360
 URL: https://issues.apache.org/jira/browse/YARN-3360
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jason Lowe


The TimelineDataManager currently has no metrics, outside of the standard JVM 
metrics.  It would be very useful to at least log basic counts of method calls, 
time spent in those calls, and number of entities/events involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3110) Few issues in ApplicationHistory web ui

2015-03-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365736#comment-14365736
 ] 

Naganarasimha G R commented on YARN-3110:
-

Hi [~zjshen], [~xgong],
Can one of you please review this JIRA? If it looks fine, I will add some test 
cases for the 1st and 2nd issues listed above.

 Few issues in ApplicationHistory web ui
 ---

 Key: YARN-3110
 URL: https://issues.apache.org/jira/browse/YARN-3110
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: applications, timelineserver
Affects Versions: 2.6.0
Reporter: Bibin A Chundatt
Assignee: Naganarasimha G R
Priority: Minor
 Attachments: YARN-3110.20150209-1.patch, YARN-3110.20150315-1.patch


 Application state and History link are wrong when the application is in the 
 unassigned state.
 1. Configure the capacity scheduler with a queue size of 1 and Absolute Max 
 Capacity: 10.0%
 (the current application state is Accepted and Unassigned on the resource 
 manager side).
 2. Submit an application to the queue and check the state and link in 
 Application History.
 State = null and the History link is shown as N/A on the applicationhistory 
 page.
 Kill the same application. In the timeline server logs, the following is shown 
 when selecting the application link.
 {quote}
 2015-01-29 15:39:50,956 ERROR org.apache.hadoop.yarn.webapp.View: Failed to 
 read the AM container of the application attempt 
 appattempt_1422467063659_0007_01.
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainer(ApplicationHistoryManagerOnTimelineStore.java:162)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAMContainer(ApplicationHistoryManagerOnTimelineStore.java:184)
   at 
 org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:160)
   at 
 org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:157)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at 
 org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:156)
   at 
 org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
   at 
 org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
   at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
   at 
 org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
   at 
 org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
   at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
   at 
 org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56)
   at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
   at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
   at 
 org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.AHSController.app(AHSController.java:38)
   at sun.reflect.GeneratedMethodAccessor63.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at 
 com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
   at 
 com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
   at 
 com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
   at 
 com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
   at 
 com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
   at 
 com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
   at 
 com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
   at 
 com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
   at 
 com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
   at 
 com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
   at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at 
 

[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3356:
-
Attachment: YARN-3356.1.patch

Attached ver.1 patch.

 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3356.1.patch


 Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
 should use ResourceUsage to track resource-usage/pending by label for 
 better resource tracking and preemption.
 Also, when an application's pending resource changes (container allocated, 
 app completed, moved, etc.), we need to update the ResourceUsage of the 
 queue hierarchy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository

2015-03-17 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash closed YARN-3339.
--

 TestDockerContainerExecutor should pull a single image and not the entire 
 centos repository
 ---

 Key: YARN-3339
 URL: https://issues.apache.org/jira/browse/YARN-3339
 Project: Hadoop YARN
  Issue Type: Test
  Components: test
Affects Versions: 2.6.0
 Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3339-branch-2.6.0.001.patch, 
 YARN-3339-trunk.001.patch


 The TestDockerContainerExecutor test pulls the entire centos repository, which 
 is time consuming.
 Pulling a specific image (e.g. centos7) is sufficient to run the test 
 successfully and will save time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3204) Fix new findbug warnings in hadoop-yarn-server-resourcemanager(resourcemanager.scheduler.fair)

2015-03-17 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365821#comment-14365821
 ] 

zhihai xu commented on YARN-3204:
-

Some comments:
1. About the comment for the inconsistent-sync warning on fsOpDurations:
{code}
Inconsistent sync warning - callDurationMetrics is only initialized once and 
never changed
{code}
This does not look accurate. Each method of fsOpDurations is only called from 
one thread, and these methods access different fields, so they are independent.

2. Can we define reloadListener as volatile?
Since reloadListener is accessed by two threads, it is safer to make it 
volatile.

3. Can we move the check to the beginning of reserveResource?
It is better to detect the error early and avoid unnecessary work.
{code}
if (!(application instanceof FSAppAttempt)) {
{code}
Can we use YarnRuntimeException instead of IllegalArgumentException?
This looks like an unexpected runtime condition.

4. Adding a lock to getAllocationConfiguration is dangerous.
A lot of code (Queue, FairReservationSystem, ...) calls 
getAllocationConfiguration, which can introduce potential deadlocks and 
performance issues.
For example, QueueManager#getQueue locks queues, then calls 
QueueManager#createQueue, which calls scheduler.getAllocationConfiguration.
That would mean two layers of locks if we add a lock in getAllocationConfiguration.
Can we define allocConf as volatile instead (a rough sketch follows below)? 
allocConf is only updated by AllocationReloadListener.onReload, which is called 
from AllocationFileLoaderService#reloadThread after initialization.
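
To illustrate point 4, here is a minimal, hedged sketch of the volatile-publication idea; the class and field names below are simplified stand-ins rather than the actual FairScheduler code:
{code}
// Simplified stand-in classes; not the actual FairScheduler internals.
class FairSchedulerSketch {
  // Written only by the reload thread, read by many scheduler threads:
  // a volatile field gives safe publication without taking the scheduler lock.
  private volatile AllocationConfigurationSketch allocConf =
      new AllocationConfigurationSketch();

  AllocationConfigurationSketch getAllocationConfiguration() {
    return allocConf;            // plain read, no synchronization needed
  }

  void onReload(AllocationConfigurationSketch newConf) {
    allocConf = newConf;         // a single volatile write publishes the new config
  }
}

class AllocationConfigurationSketch {
}
{code}
Readers may briefly see the previous configuration after a reload, which is presumably acceptable for a periodically reloaded config; the point is only that the getter needs no lock.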


 Fix new findbug warnings in 
 hadoop-yarn-server-resourcemanager(resourcemanager.scheduler.fair)
 --

 Key: YARN-3204
 URL: https://issues.apache.org/jira/browse/YARN-3204
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker
 Attachments: YARN-3204-001.patch, YARN-3204-002.patch, 
 YARN-3204-003.patch


 Please check the following findbugs report:
 https://builds.apache.org/job/PreCommit-YARN-Build/6644//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365601#comment-14365601
 ] 

Hudson commented on YARN-3243:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7349 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7349/])
YARN-3243. CapacityScheduler should pass headroom from parent to children to 
make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. 
(jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues in making sure a ParentQueue always obeys 
 its capacity limits, for example:
 1) When allocating a container under a parent queue, it only checks 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a container of size 
 > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max 
 resource limit, as in the following example:
 {code}
         A (usage=54, max=55)
        / \
      A1   A2 (usage=1, max=55)
 (usage=53, max=53)
 {code}
 Queue A2 is able to allocate a container since its usage < max, but if we do 
 that, A's usage can exceed A.max.
 2) When doing the continuous reservation check, a parent queue will only tell its 
 children "you need to unreserve *some* resource, so that I will be less than my 
 maximum resource", but it will not tell them how much resource needs to be 
 unreserved. This can also lead to a parent queue exceeding its configured maximum 
 capacity.
 With YARN-3099/YARN-3124, we now have the {{ResourceUsage}} class in each queue, 
 *here is my proposal* (a rough numeric sketch follows below):
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (saying the parent's name is 
 qA): min(qA.headroom, qA.max - qA.used). This makes sure the capacity of qA's 
 ancestors is enforced as well (qA.headroom is set by qA's parent).
 - {{needToUnReserve}} is not necessary; instead, children can get how much 
 resource need 
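
As a side note on the headroom rule above, here is a minimal, hedged numeric sketch; it uses plain memory values instead of YARN's Resource objects and assumes the root gave queue A a headroom of 55:
{code}
// Hedged sketch of childHeadroom = min(parentHeadroom, parentMax - parentUsed).
class HeadroomSketch {
  static long childHeadroom(long parentHeadroom, long parentMax, long parentUsed) {
    return Math.min(parentHeadroom, Math.max(0, parentMax - parentUsed));
  }

  public static void main(String[] args) {
    // Example from the description: A has max=55 and usage=54, so even though
    // A2 (max=55, usage=1) is far from its own limit, its headroom is only 1.
    System.out.println(childHeadroom(55, 55, 54)); // prints 1
  }
}
{code}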

[jira] [Commented] (YARN-3341) Fix findbugs warning:BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource

2015-03-17 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365698#comment-14365698
 ] 

zhihai xu commented on YARN-3341:
-

Sorry, I missed YARN-3204; I resolved this issue as a duplicate.
I will review the patch at YARN-3204. [~brahmareddy], thanks for pointing this out.

 Fix findbugs warning:BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource
 ---

 Key: YARN-3341
 URL: https://issues.apache.org/jira/browse/YARN-3341
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
  Labels: findbugs
 Attachments: YARN-3341.000.patch, YARN-3341.001.patch


 Fix findbugs warning:BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource
 The warning message is
 {code}
 Unchecked/unconfirmed cast from 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt
  to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt 
 in 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.reserveResource(SchedulerApplicationAttempt,
  Priority, RMContainer)
 {code}
 The code which causes the warning is
 {code}
 this.reservedAppSchedulable = (FSAppAttempt) application;
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3356:
-
Description: 
Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
should use ResourceRequest to track resource-usage/pending by label for better 
resource tracking and preemption.

Also, when an application's pending resource changes (container allocated, app 
completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies.

  was:Simliar to YARN-3099, Capacity Scheduler's 
LeafQueue.User/FiCaSchedulerApp should use ResourceRequest to track 
resource-usage/pending by label for better resource tracking and preemption.


 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan

 Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
 should use ResourceRequest to track resource-usage/pending by label for 
 better resource tracking and preemption.
 Also, when an application's pending resource changes (container allocated, 
 app completed, moved, etc.), we need to update the ResourceUsage of the queue 
 hierarchies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page

2015-03-17 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365722#comment-14365722
 ] 

Ashwin Shankar commented on YARN-3111:
--

bq. What do you guys think of overlaying CPU and memory usage, the way steady 
and instantaneous fairshares are laid out today?

I agree with Peng; it is going to become pretty cluttered displaying the shares/max 
of each resource on the same bar.
Also, with this approach it would be ambiguous which resource the usage bar is 
representing, since the usage bar shows usage of the dominant resource, i.e. 
max(memoryRatio, vCoresRatio). The usage bar turns orange when it is above the 
fair share; if we represent all the resources in one bar, how do we
know whether we are above fair share due to memory, CPU, or disk?

Here is my proposal (a rough sketch of the dominant-resource ratio follows below):
For each queue bar:
1. Represent steady/instant/max of only the dominant resource in the bar.
2. Usage, as in the patch, will again be the usage of the dominant resource.
3. In the tooltip, mention which dominant resource we are representing for 
that queue ([memory, cpu]). Note that the dominant resource displayed can be 
memory in one queue and something else in another.
4. We already display the steady/instant memory percentage in the tooltip; we 
could add the steady/instant/max CPU % there as well, so that we know the details
of each resource.

Thoughts?
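
For clarity, here is a minimal, hedged sketch of the dominant-resource ratio described above; plain longs stand in for the scheduler's Resource objects, and the zero-total guard addresses the NaN problem listed in the issue description below:
{code}
class DominantResourceSketch {
  static double ratio(long used, long total) {
    return total == 0 ? 0.0 : (double) used / total;   // avoid NaN when the cluster is empty
  }

  /** Usage-bar value: the larger of the memory and vcore ratios. */
  static double dominantUsage(long usedMem, long totalMem, long usedVcores, long totalVcores) {
    return Math.max(ratio(usedMem, totalMem), ratio(usedVcores, totalVcores));
  }

  public static void main(String[] args) {
    // 40% of memory but 70% of vcores used: the bar would show 0.7 and the
    // tooltip would name CPU as the dominant resource for this queue.
    System.out.println(dominantUsage(40, 100, 7, 10));
  }
}
{code}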

 Fix ratio problem on FairScheduler page
 ---

 Key: YARN-3111
 URL: https://issues.apache.org/jira/browse/YARN-3111
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Peng Zhang
Assignee: Peng Zhang
Priority: Minor
 Attachments: YARN-3111.1.patch, YARN-3111.png


 Found 3 problems on the FairScheduler page:
 1. Only memory is computed for the ratio, even when the queue schedulingPolicy is DRF.
 2. When min resources are configured larger than the real resources, the steady 
 fair share bar is so long that it runs off the page.
 3. When cluster resources are 0 (no NodeManager started), the ratio is displayed as 
 NaN% used.
 The attached image shows a snapshot of the above problems. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3293) Track and display capacity scheduler health metrics in web UI

2015-03-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365765#comment-14365765
 ] 

Vinod Kumar Vavilapalli commented on YARN-3293:
---

+1 for doing this.

We do not need to be specific to any scheduler.

Had an offline discussion with [~vvasudev] and [~venkateshrin]. Here are a few 
things worth highlighting (a rough data sketch follows the list):
 - Last Scheduling Cycle: Timestamp, Allocated Resource (MB, VCores), Reserved 
Resource, Released Resource
 - Number of Allocations so far
 - Last Allocation: Timestamp, Node, Container, Queue
 - Number of Resource releases so far
 - Last Resource Release: Timestamp, Container, Queue
 - Number of Preemptions so far
 - Last Preemption: Timestamp, Container, Queue
 - Number of fulfilled reservations so far. Fulfilled reservations do not 
include our internal (unreserve + reserve) operations
 - Last Reservation: Timestamp, Node, Container, Queue
 - Others: Maximum current node size (MB, Cores) in the cluster
 - Configuration: Cluster level Minimum Allocation MB, Maximum Allocation MB, 
similar values for cores
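
As a rough data sketch of the items above (field names are illustrative only, not the final UI or REST fields):
{code}
class SchedulerHealthSketch {
  /** One "last activity" record: allocation, release, preemption or reservation. */
  static final class LastActivity {
    final long timestampMs;
    final String nodeId;       // may be null for releases/preemptions
    final String containerId;
    final String queueName;

    LastActivity(long timestampMs, String nodeId, String containerId, String queueName) {
      this.timestampMs = timestampMs;
      this.nodeId = nodeId;
      this.containerId = containerId;
      this.queueName = queueName;
    }
  }

  private long allocationCount;
  private long releaseCount;
  private volatile LastActivity lastAllocation;
  private volatile LastActivity lastRelease;

  synchronized void recordAllocation(LastActivity a) { allocationCount++; lastAllocation = a; }
  synchronized void recordRelease(LastActivity r)    { releaseCount++;    lastRelease = r; }
}
{code}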

 Track and display capacity scheduler health metrics in web UI
 -

 Key: YARN-3293
 URL: https://issues.apache.org/jira/browse/YARN-3293
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Reporter: Varun Vasudev
Assignee: Varun Vasudev

 It would be good to display metrics that let users know about the health of 
 the capacity scheduler in the web UI. Today it is hard to get an idea if the 
 capacity scheduler is functioning correctly. Metrics such as the time for the 
 last allocation, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery

2015-03-17 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-3039:
-
Attachment: YARN-3039-v8.patch

Incorporated [~zjshen]'s comments in the v8 patch. For TestRPC, let's keep it there 
given it works fine in yarn-server-common.
Also verified that the end-to-end TestDistributedShell test passes.

 [Aggregator wireup] Implement ATS app-appgregator service discovery
 ---

 Key: YARN-3039
 URL: https://issues.apache.org/jira/browse/YARN-3039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Junping Du
 Attachments: Service Binding for applicationaggregator of ATS 
 (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, 
 YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, 
 YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, 
 YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch


 Per design in YARN-2928, implement ATS writer service discovery. This is 
 essential for off-node clients to send writes to the right ATS writer. This 
 should also handle the case of AM failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-17 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated YARN-3205:
-
Affects Version/s: 2.7.0

 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 of the customized DFS_CLIENT configurations for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-17 Thread Tsuyoshi Ozawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi Ozawa updated YARN-3205:
-
Fix Version/s: 2.8.0

 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 of the customized DFS_CLIENT configurations for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3356:
-
Attachment: YARN-3356.2.patch

 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3356.1.patch, YARN-3356.2.patch


 Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
 should use ResourceRequest to track resource-usage/pending by label for 
 better resource tracking and preemption. 
 Also, when an application's pending resource changes (container allocated, 
 app completed, moved, etc.), we need to update the ResourceUsage of the queue 
 hierarchies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-17 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366438#comment-14366438
 ] 

zhihai xu commented on YARN-3205:
-

[~ozawa], thanks for the review. That is a very good suggestion.
I uploaded a new patch, YARN-3205.001.patch, which addresses your comment.
The only difference is that I didn't call stop, because stop would call closeInternal 
to close the fs, which would make the test pass even without the change.
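
For context, the description below proposes bypassing the FileSystem cache so the store always sees its customized configuration. A hedged sketch of two common ways to do that in Hadoop (not the actual patch) follows; the scheme-derived config key and the helper name are illustrative:
{code}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

class NonCachedFsSketch {
  static FileSystem open(URI storeUri, Configuration conf) throws IOException {
    // Option 1: disable the cache for this scheme, then go through get().
    conf.setBoolean("fs." + storeUri.getScheme() + ".impl.disable.cache", true);
    return FileSystem.get(storeUri, conf);

    // Option 2 (alternative): FileSystem.newInstance(storeUri, conf) always
    // returns a non-cached instance, regardless of the disable.cache setting.
  }
}
{code}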

 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 of the customized DFS_CLIENT configurations for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366516#comment-14366516
 ] 

Hadoop QA commented on YARN-3205:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705236/YARN-3205.001.patch
  against trunk revision 968425e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7010//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7010//console

This message is automatically generated.

 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 of the customized DFS_CLIENT configurations for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-17 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366542#comment-14366542
 ] 

Tsuyoshi Ozawa commented on YARN-3205:
--

+1, committing this shortly.

 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: zhihai xu
Assignee: zhihai xu
 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 of the customized DFS_CLIENT configurations for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery

2015-03-17 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-3039:
--
Attachment: YARN-3039.9.patch

Thanks for addressing the comments. I made some minor touches on the patch to fix 
some method signatures and to make the old put method still use the retry filter. I'll 
commit the patch a bit later in case [~sjlee0] wants to take a look at it too.

 [Aggregator wireup] Implement ATS app-appgregator service discovery
 ---

 Key: YARN-3039
 URL: https://issues.apache.org/jira/browse/YARN-3039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Junping Du
 Attachments: Service Binding for applicationaggregator of ATS 
 (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, 
 YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, 
 YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, 
 YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch


 Per design in YARN-2928, implement ATS writer service discovery. This is 
 essential for off-node clients to send writes to the right ATS writer. This 
 should also handle the case of AM failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366553#comment-14366553
 ] 

Hudson commented on YARN-3205:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #7354 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7354/])
YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid get 
a Filesystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 
3bc72cc16d8c7b8addd8f565523001dfcc32b891)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 of the customized DFS_CLIENT configurations for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid get a Filesystem with an old configuration.

2015-03-17 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366554#comment-14366554
 ] 

zhihai xu commented on YARN-3205:
-

Thanks [~ozawa] for valuable feedback and committing the patch!

 FileSystemRMStateStore should disable FileSystem Cache to avoid get a 
 Filesystem with an old configuration.
 ---

 Key: YARN-3205
 URL: https://issues.apache.org/jira/browse/YARN-3205
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
 Fix For: 2.8.0

 Attachments: YARN-3205.000.patch, YARN-3205.001.patch


 FileSystemRMStateStore should disable the FileSystem cache to avoid getting a 
 FileSystem with an old configuration. The old configuration may not have all 
 of the customized DFS_CLIENT configurations for FileSystemRMStateStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-17 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366560#comment-14366560
 ] 

Brahma Reddy Battula commented on YARN-3181:


I will take up this issue, since I worked on YARN-3204...

 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3181) FairScheduler: Fix up outdated findbugs issues

2015-03-17 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned YARN-3181:
--

Assignee: Brahma Reddy Battula  (was: Karthik Kambatla)

 FairScheduler: Fix up outdated findbugs issues
 --

 Key: YARN-3181
 URL: https://issues.apache.org/jira/browse/YARN-3181
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Brahma Reddy Battula
 Attachments: yarn-3181-1.patch


 In FairScheduler, we have excluded some findbugs-reported errors. Some of 
 them aren't applicable anymore, and there are a few that can be easily fixed 
 without needing an exclusion. It would be nice to fix them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366507#comment-14366507
 ] 

Hadoop QA commented on YARN-3356:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705234/YARN-3356.2.patch
  against trunk revision 968425e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7009//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/7009//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7009//console

This message is automatically generated.

 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track 
 used-resources-by-label.
 --

 Key: YARN-3356
 URL: https://issues.apache.org/jira/browse/YARN-3356
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3356.1.patch, YARN-3356.2.patch


 Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp 
 should use ResourceRequest to track resource-usage/pending by label for 
 better resource tracking and preemption. 
 Also, when an application's pending resource changes (container allocated, 
 app completed, moved, etc.), we need to update the ResourceUsage of the queue 
 hierarchies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle

2015-03-17 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366514#comment-14366514
 ] 

Zhijie Shen commented on YARN-3047:
---

Some comments about the patch:

1. No need to change {{timeline/TimelineEvents.java}}.

2. In YarnConfiguration, how about we keep reusing the existing timeline 
service config? I propose config reuse because there is no use case 
where we start the old timeline server and the new timeline reader server together. 
The change in WebAppUtils should not be necessary either.

3. NameValuePair is for internal usage only. Let's keep it in the timeline 
service module?

4. Rename TimelineReaderStore to TimelineReader. I think we don't need to have 
NullTimelineReader. Instead, we should have a POC implementation based on the local 
FS, like FileSystemTimelineWriterImpl. But we can defer this work to a separate 
jira if the implementation is not straightforward.

5. TimelineReaderServer -> TimelineWebServer? For startTimelineReaderWebApp, 
can we do something similar to TimelineAggregatorsCollection#startWebApp?

6. Add the command in yarn and yarn.cmd to start the server.

 [Data Serving] Set up ATS reader with basic request serving structure and 
 lifecycle
 ---

 Key: YARN-3047
 URL: https://issues.apache.org/jira/browse/YARN-3047
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Varun Saxena
 Attachments: YARN-3047.001.patch, YARN-3047.02.patch


 Per design in YARN-2938, set up the ATS reader as a service and implement the 
 basic structure as a service. It includes lifecycle management, request 
 serving, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-appgregator service discovery

2015-03-17 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366558#comment-14366558
 ] 

Sangjin Lee commented on YARN-3039:
---

I took a look at patch v.8. LGTM. Thanks much for your work [~djp]!

 [Aggregator wireup] Implement ATS app-appgregator service discovery
 ---

 Key: YARN-3039
 URL: https://issues.apache.org/jira/browse/YARN-3039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Junping Du
 Attachments: Service Binding for applicationaggregator of ATS 
 (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, 
 YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, 
 YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, 
 YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch


 Per design in YARN-2928, implement ATS writer service discovery. This is 
 essential for off-node clients to send writes to the right ATS writer. This 
 should also handle the case of AM failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366562#comment-14366562
 ] 

Naganarasimha G R commented on YARN-2495:
-

1. Well, I think I am missing something, or you have misread my patch.
* In both request protos I have used NodeIdToLabelsProto ({{optional 
NodeIdToLabelsProto nodeLabels = 8;}} and {{optional NodeIdToLabelsProto 
nodeLabels = 4;}}) and did not use {{repeat string nodeLabels}} directly.
* The test cases TestYarnServerApiClasses.testNodeHeartbeatRequestPBImpl, 
testNodeHeartbeatRequestPBImplWithNullLabels, 
testRegisterNodeManagerRequestWithNullLabels and 
testRegisterNodeManagerRequestWithValidLabels validate that the approach in 
the patch supports null and filled label sets; I manually tested the empty set and, 
as expected, that worked too. I will get this test case added in the next patch 
(a rough sketch of the null/empty/non-empty distinction follows at the end of this 
comment).
* I thought of creating a new proto like {{StringSetProto}}, but felt there was no 
need to create another proto class just for this purpose, and you too had mentioned 
using {{NodeIdToLabelsProto}}, hence I made use of the existing proto class.


2. {{Typo, lable -> label}}: oops, because of the typo in the proto, the generated 
methods also had issues, hence the proto and the places accessing these methods 
(6 instances) have this error. Will get it corrected in the next patch.

3. ??optional bool areNodeLablesAcceptedByRM = 7 \[default = false\], I think 
default should be true.??
Personally I felt it should not matter, as I am explicitly handling it in the code. 
But consider the case where the NM gets upgraded first: it should not be that the 
NM sends labels, the older RM ignores the additional labels, and yet the response 
by default says the labels were accepted. I also felt that, by name/functionality, 
the flag should be set to true only after the RM accepts the labels.

4, 5, 6
-- will get these corrected as part of the next patch.

Also one favor: it would be helpful if you could review the test cases and give 
feedback on them too, as it will reduce my effort in creating multiple patches. I 
understand that it is a huge patch, but I feel the major 
aspects/functionality seem to be stable with the last patch.
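
To make the null/empty/non-empty point above concrete, here is a hedged, self-contained sketch; it only simulates the generated-proto hasX() behaviour, and the real message and field names in the patch differ:
{code}
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class NodeLabelsPbSketch {
  // hasNodeLabels() is only true when the wrapper message was set, so
  // "labels never sent" (null) stays distinguishable from "empty label set".
  static Set<String> decode(boolean hasNodeLabels, Set<String> wrapped) {
    if (!hasNodeLabels) {
      return null;                       // labels were not sent at all
    }
    return new HashSet<>(wrapped);       // possibly empty, but explicitly sent
  }

  public static void main(String[] args) {
    System.out.println(decode(false, Collections.<String>emptySet())); // null -> not sent
    System.out.println(decode(true, Collections.<String>emptySet()));  // []   -> sent empty
    System.out.println(decode(true, Collections.singleton("GPU")));    // [GPU]
  }
}
{code}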


 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
 YARN-2495.20150318-1.patch, YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admins to specify labels from each NM; this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366312#comment-14366312
 ] 

Wangda Tan commented on YARN-2495:
--

Thanks for updating.

1. I saw you're trying to make {{repeat string nodeLabels}} carry 3 different 
values: null, empty, and non-empty. But I'm not sure if this works in PB; could 
you write a test to verify that? (Not-set/Set-nothing/Set-value on 
RegisterNodeManagerRequestPBImpl.nodeLabels, create a new 
RegisterNodeManagerRequestPBImpl from old.getProto(), and call getNodeLabels() to 
see if it works.)

If this doesn't work, you can create a PB message like StringSetProto and use it in 
messages like RegisterNodeManagerRequest, which can support null/empty/non-empty.

2. Typo, lable -> label; I found several in your patch.

3. optional bool areNodeLablesAcceptedByRM = 7 \[default = false\], I think 
default should be true.

4. NodeStatusUpdaterImpl: no need to call nodeLabelsProvider.getNodeLabels() 
twice when register/heartbeat

5. HeartBeat -> Heartbeat

6. NodeStatusUpdaterImpl: when labels are rejected by the RM, you should log it 
with a diag message.

Will include test review in next round.


 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
 YARN-2495.20150318-1.patch, YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admins to specify labels from each NM; this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366357#comment-14366357
 ] 

Hadoop QA commented on YARN-2556:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705194/YARN-2556.2.patch
  against trunk revision 968425e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7005//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7005//console

This message is automatically generated.

 Tool to measure the performance of the timeline server
 --

 Key: YARN-2556
 URL: https://issues.apache.org/jira/browse/YARN-2556
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Chang Li
 Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
 YARN-2556.1.patch, YARN-2556.2.patch, YARN-2556.patch, yarn2556.patch, 
 yarn2556.patch, yarn2556_wip.patch


 We need to be able to understand the capacity model for the timeline server 
 to give users the tools they need to deploy a timeline server with the 
 correct capacity.
 I propose we create a mapreduce job that can measure timeline server write 
 and read performance. Transactions per second, I/O for both read and write 
 would be a good start.
 This could be done as an example or test job that could be tied into gridmix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366288#comment-14366288
 ] 

Hadoop QA commented on YARN-3360:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12705199/YARN-3360.001.patch
  against trunk revision 968425e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7008//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7008//console

This message is automatically generated.

 Add JMX metrics to TimelineDataManager
 --

 Key: YARN-3360
 URL: https://issues.apache.org/jira/browse/YARN-3360
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-3360.001.patch


 The TimelineDataManager currently has no metrics, outside of the standard JVM 
 metrics.  It would be very useful to at least log basic counts of method 
 calls, time spent in those calls, and number of entities/events involved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue

2015-03-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366194#comment-14366194
 ] 

Wangda Tan commented on YARN-1963:
--

bq. I feel we can make such label config in a common place which can be 
accessible for any schedulers.
Agreed, this should be part of the YARN configuration. I put it as part of the queue 
config just for readability in the proposal :).

 Support priorities across applications within the same queue 
 -

 Key: YARN-1963
 URL: https://issues.apache.org/jira/browse/YARN-1963
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: api, resourcemanager
Reporter: Arun C Murthy
Assignee: Sunil G
 Attachments: 0001-YARN-1963-prototype.patch, YARN Application 
 Priorities Design.pdf, YARN Application Priorities Design_01.pdf


 It will be very useful to support priorities among applications within the 
 same queue, particularly in production scenarios. It allows for finer-grained 
 controls without having to force admins to create a multitude of queues, plus 
 allows existing applications to continue using existing queues which are 
 usually part of institutional memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows

2015-03-17 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366338#comment-14366338
 ] 

Zhijie Shen commented on YARN-3040:
---

[~rkanter], would you mind my taking over this jira to move it forward?

 [Data Model] Implement client-side API for handling flows
 -

 Key: YARN-3040
 URL: https://issues.apache.org/jira/browse/YARN-3040
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter

 Per design in YARN-2928, implement client-side API for handling *flows*. 
 Frameworks should be able to define and pass in all attributes of flows and 
 flow runs to YARN, and they should be passed into ATS writers.
 YARN tags were discussed as a way to handle this piece of information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2003) Support to process Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side]

2015-03-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366273#comment-14366273
 ] 

Wangda Tan commented on YARN-2003:
--

Hi [~sunilg],
Thanks for working on this. I took a quick look at YARN-2003 / YARN-2004. Some 
overall comments.

First I want to describe my thinking on what the RM-side workflow should look like 
(a rough sketch of the normalize step follows at the end of this comment):
- When an application is submitted to RMAppManager, it will simply pass the priority 
set by the user to RMApp (see (1)).
- RMApp will finally create APP_ATTEMPT_ADDED; the queue itself will normalize the 
priority (reject it / convert it from a label to a number, etc., and set the new 
priority on RMApp), then set the priority on SchedulerApplicationAttempt.
- The scheduler uses the priority in SchedulerApplicationAttempt and Queue to make 
scheduling decisions.
- If a user asks to change an application's priority, or an admin changes the 
priority configuration, an event may need to be sent to the scheduler to update the 
inner applications/queues.
- When a user requests an application's priority via the CLI/Web UI, 
ApplicationPriorityManager will convert the number to a label (if possible) and show 
it to the user.

(1). We don't have to add too much logic here; if we can simply handle it 
inside the scheduler, then when the configuration changes (like the label-integer 
priority mapping), it can be handled by the scheduler itself.

Back to your patches, the major differences are:
- RMAppManager takes responsibility for checking the ACL for priority, which I think 
is neither proper (ACLs are always managed by scheduler queues) nor correct (when the 
queue configuration changes, you cannot recheck the application priority when a 
new app attempt is created).
- As above, queues should check ACLs for priority.
- You may not need to add priority to SchedulerApplication; it seems unnecessary 
to me.
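
To illustrate the "queue normalizes priority" step in the workflow above, here is a minimal, hedged sketch; the label mapping and the clamping rule are illustrative only, not actual YARN behaviour:
{code}
import java.util.Map;

class PriorityNormalizeSketch {
  /** Clamp a submitted priority into the queue's allowed range. */
  static int normalize(int requested, int queueMaxPriority) {
    if (requested < 0) {
      return 0;                            // treat negative values as lowest priority
    }
    return Math.min(requested, queueMaxPriority);
  }

  /** Resolve a label such as "high" to a number, then normalize it. */
  static int fromLabel(String label, Map<String, Integer> labelToPriority, int queueMax) {
    Integer value = labelToPriority.get(label);
    if (value == null) {
      throw new IllegalArgumentException("Unknown priority label: " + label);
    }
    return normalize(value, queueMax);
  }
}
{code}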

 Support to process Job priority from Submission Context in 
 AppAttemptAddedSchedulerEvent [RM side]
 --

 Key: YARN-2003
 URL: https://issues.apache.org/jira/browse/YARN-2003
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-2003.patch, 0002-YARN-2003.patch, 
 0003-YARN-2003.patch, 0004-YARN-2003.patch


 AppAttemptAddedSchedulerEvent should be able to receive the Job Priority from 
 the Submission Context and store it.
 Later this can be used by the Scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2003) Support to process Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side]

2015-03-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366282#comment-14366282
 ] 

Wangda Tan commented on YARN-2003:
--

Beyond that, I suggest dividing YARN-2003/YARN-2004 as follows:
YARN-2003:
- Changes on the RM side
- Changes to the scheduler Queue interface (may need some empty implementations in 
specific schedulers)
- New scheduler event

YARN-2003 should be able to compile/test without YARN-2004.

YARN-2004 tracks changes only on the CapacityScheduler side and needs YARN-2003 to 
compile/test.

With this, I can simply apply the two patches for review.

 Support to process Job priority from Submission Context in 
 AppAttemptAddedSchedulerEvent [RM side]
 --

 Key: YARN-2003
 URL: https://issues.apache.org/jira/browse/YARN-2003
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-2003.patch, 0002-YARN-2003.patch, 
 0003-YARN-2003.patch, 0004-YARN-2003.patch


 AppAttemptAddedSchedulerEvent should be able to receive the Job Priority from 
 the Submission Context and store it.
 Later this can be used by the Scheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3034) [Aggregator wireup] Implement RM starting its ATS writer

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366202#comment-14366202
 ] 

Hadoop QA commented on YARN-3034:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12705190/YARN-3034.20150318-1.patch
  against trunk revision 968425e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7006//console

This message is automatically generated.

 [Aggregator wireup] Implement RM starting its ATS writer
 

 Key: YARN-3034
 URL: https://issues.apache.org/jira/browse/YARN-3034
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
 Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, 
 YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch


 Per design in YARN-2928, implement resource managers starting their own ATS 
 writers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2693) Priority Label Manager in RM to manage application priority based on configuration

2015-03-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366225#comment-14366225
 ] 

Wangda Tan commented on YARN-2693:
--

Some overall suggestions:
1) Instead of ApplicationPriorityPerQueue, a queue's priority-related fields 
could be stored in scheduler.Queue directly (as methods of scheduler.Queue; the 
implementation for different schedulers could vary, since we have 
different scheduler configurations). Benefits of doing this:
- All other queue-specific configurations are in the schedulers' own configuration 
files; storing a queue's application-priority fields outside the queue means 
you have to sync them with the queue's configuration when you do refreshQueues, etc.
- Putting it in scheduler.Queue can make scheduler changes easier (no need to 
access ApplicationPriorityManager).

2) Methods of ApplicationPriorityManager:
- Since we're still discussing how to configure priority, I will review the 
ApplicationPriorityManager implementation once we close the design.
- ClusterPriorities should be a range (if we start from zero, a maxPriority 
will be enough).
- getApplicationPriorityFromQueue should not exist; all queue-related methods 
should be in scheduler.Queue.
- isPriorityExistsInCluster may not be needed; it should be something like 
"accepted".
- Can be reinitialized.
- Can convert between number and label.

 Priority Label Manager in RM to manage application priority based on 
 configuration
 --

 Key: YARN-2693
 URL: https://issues.apache.org/jira/browse/YARN-2693
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-2693.patch, 0002-YARN-2693.patch, 
 0003-YARN-2693.patch, 0004-YARN-2693.patch, 0005-YARN-2693.patch, 
 0006-YARN-2693.patch


 Focus of this JIRA is to have a centralized service to handle priority labels.
 Support operations such as
 * Add/Delete priority label to a specified queue
 * Manage integer mapping associated with each priority label
 * Support managing default priority label of a given queue
 * Expose interface to RM to validate priority label
 To have a simplified interface, the Priority Manager will support only a 
 configuration file, in contrast with the admin CLI and REST. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2015-03-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366236#comment-14366236
 ] 

Hadoop QA commented on YARN-2495:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12705179/YARN-2495.20150318-1.patch
  against trunk revision 968425e.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/7003//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7003//console

This message is automatically generated.

 Allow admin specify labels from each NM (Distributed configuration)
 ---

 Key: YARN-2495
 URL: https://issues.apache.org/jira/browse/YARN-2495
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
 YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
 YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, 
 YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, 
 YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, 
 YARN-2495.20150318-1.patch, YARN-2495_20141022.1.patch


 Target of this JIRA is to allow admins to specify labels from each NM; this covers
 - User can set labels in each NM (by setting yarn-site.xml (YARN-2923) or 
 using script suggested by [~aw] (YARN-2729) )
 - NM will send labels to RM via ResourceTracker API
 - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-03-17 Thread zhihai xu (JIRA)
zhihai xu created YARN-3363:
---

 Summary: add localization and container launch time to 
ContainerMetrics at NM to show these timing information for each active 
container.
 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu


Add localization and container launch time to ContainerMetrics at the NM to show 
this timing information for each active container.
Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
actual CPU usage (YARN-3122), and resource and pid (YARN-3022). It would be better to 
also have localization and container launch time in ContainerMetrics for each active 
container.
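
A minimal, hedged sketch of the proposed timing fields; this is a plain holder for illustration, not the metrics2-based ContainerMetrics implementation:
{code}
class ContainerTimingSketch {
  private long localizationStartMs;
  private long localizationMs = -1;   // -1 until localization completes
  private long launchDelayMs = -1;    // time from container start to process launch

  void onLocalizationStarted(long nowMs)  { localizationStartMs = nowMs; }
  void onLocalizationFinished(long nowMs) { localizationMs = nowMs - localizationStartMs; }
  void onLaunched(long containerStartMs, long nowMs) { launchDelayMs = nowMs - containerStartMs; }

  long getLocalizationMs() { return localizationMs; }
  long getLaunchDelayMs()  { return launchDelayMs; }
}
{code}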



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

