[jira] [Commented] (YARN-1001) YARN should provide per application-type and state statistics

2013-08-08 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733219#comment-13733219
 ] 

Zhijie Shen commented on YARN-1001:
---

[~srimanth.gunturi], let me reword the requirement. Given some application 
types and states, Ambari wants to categorize the applications into buckets 
for all combinations of these application types and states, and count the 
number of applications in each bucket. For example, users want to know the 
number of applications of three application types: type1, type2, and type3, and 
two states: state1 and state2. Assume the RM has 5 applications: app1(type1, 
state1), app2(type2, state1), app3(type2, state1), app4(type2, state2), 
app5(type3, state1). The users will get the following statistics:

[type1, state1]: 1
[type1, state2]: 0
[type2, state1]: 2
[type2, state2]: 1
[type3, state1]: 1
[type3, state2]: 0

Is this exactly what Ambari wants?
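For illustration only, a minimal client-side sketch of this bucketing; the class and method names here are made up for the example and are not a proposed YARN API:

{code}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class AppTypeStateBuckets {
  // Count applications per (type, state) bucket for the requested types and
  // states; combinations with no applications are reported as zero.
  public static Map<String, Integer> count(List<String[]> apps,  // each entry: {type, state}
                                           Set<String> types,
                                           Set<String> states) {
    Map<String, Integer> buckets = new LinkedHashMap<String, Integer>();
    for (String type : types) {
      for (String state : states) {
        buckets.put("[" + type + ", " + state + "]", 0);
      }
    }
    for (String[] app : apps) {
      String key = "[" + app[0] + ", " + app[1] + "]";
      if (buckets.containsKey(key)) {
        buckets.put(key, buckets.get(key) + 1);
      }
    }
    return buckets;
  }
}
{code}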


 YARN should provide per application-type and state statistics
 -

 Key: YARN-1001
 URL: https://issues.apache.org/jira/browse/YARN-1001
 Project: Hadoop YARN
  Issue Type: Task
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi

 In Ambari we plan to show for MR2 the number of applications finished, 
 running, waiting, etc. It would be efficient if YARN could provide per 
 application-type and state aggregated counts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests

2013-08-08 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733249#comment-13733249
 ] 

Junping Du commented on YARN-1042:
--

Yes. It is pretty useful in the cases specified in the description. With this 
affinity and anti-affinity info, the AM would have the knowledge to translate a 
container request into a resource request to ask the RM for.
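
As a rough illustration of that translation with today's API (no affinity support yet): an AM can only approximate rack affinity by naming a concrete rack in a ContainerRequest. The rack name below is a placeholder supplied by the caller; a real affinity/anti-affinity API would avoid having to know it up front.

{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class AffinitySketch {
  // Sketch: express "affinity to a rack" by naming the rack explicitly.
  // An affinity/anti-affinity API would let the AM say "same rack as X" or
  // "never on the same node as Y" without knowing concrete names.
  static ContainerRequest rackLocalRequest(String rackName) {
    Resource capability = Resource.newInstance(1024, 1);  // 1 GB, 1 vcore
    return new ContainerRequest(
        capability,
        null,                       // no specific hosts
        new String[] { rackName },  // placeholder rack name from the caller
        Priority.newInstance(1));
  }
}
{code}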

 add ability to specify affinity/anti-affinity in container requests
 ---

 Key: YARN-1042
 URL: https://issues.apache.org/jira/browse/YARN-1042
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Steve Loughran

 Container requests to the AM should be able to request anti-affinity to 
 ensure that things like Region Servers don't come up in the same failure 
 zones. 
 Similarly, you may want to be able to specify affinity to the same host or rack 
 without specifying which specific host/rack. Example: bringing up a small 
 giraph cluster in a large YARN cluster would benefit from having the 
 processes in the same rack purely for bandwidth reasons.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore

2013-08-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733576#comment-13733576
 ] 

Hitesh Shah commented on YARN-353:
--

bq. For deleteWithRetries, the return code of exists() could be checked if a 
delete is required or not. this depends on whether RM wants to know the delete 
operation succeeds or not.

I am not sure I understand. If the RM is trying to delete something and the 
node does not exist, is there a situation where the RM wants to know that the 
node didn't exist and fail when attempting to delete a non-existent node?

 Add Zookeeper-based store implementation for RMStateStore
 -

 Key: YARN-353
 URL: https://issues.apache.org/jira/browse/YARN-353
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Hitesh Shah
Assignee: Bikas Saha
 Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, 
 YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, 
 YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch


 Add a store that writes RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-337) RM handles killed application tracking URL poorly

2013-08-08 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-337:


Attachment: YARN-337.patch

Patch that sets the tracking URL to the RM app page when an AM attempt is 
killed.  Also refactored the places where this was done for FAILED attempts to 
better cover all the various ways an AM attempt can fail.

As for the unregister attempt failure, I'm tempted to leave that as-is since 
there will always be races between YARN-level kill/fail and apps unregistering. 
 As long as we point to the RM app page when something goes wrong, at least the 
user has something to start with to diagnose the problem rather than a bad link 
to nowhere.

 RM handles killed application tracking URL poorly
 -

 Key: YARN-337
 URL: https://issues.apache.org/jira/browse/YARN-337
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
  Labels: usability
 Attachments: YARN-337.patch


 When the ResourceManager kills an application, it leaves the proxy URL 
 redirecting to the original tracking URL for the application even though the 
 ApplicationMaster is no longer there to service it.  It should redirect it 
 somewhere more useful, like the RM's web page for the application, where the 
 user can find that the application was killed and links to the AM logs.
 In addition, sometimes the AM during teardown from the kill can attempt to 
 unregister and provide an updated tracking URL, but unfortunately the RM has 
 forgotten the AM due to the kill and refuses to process the unregistration. 
  Instead it logs:
 {noformat}
 2013-01-09 17:37:49,671 [IPC Server handler 2 on 8030] ERROR
 org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
 AppAttemptId doesnt exist in cache appattempt_1357575694478_28614_01
 {noformat}
 It should go ahead and process the unregistration to update the tracking URL 
 since the application offered it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1001) YARN should provide per application-type and state statistics

2013-08-08 Thread Srimanth Gunturi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733588#comment-13733588
 ] 

Srimanth Gunturi commented on YARN-1001:


[~zjshen], yes.

 YARN should provide per application-type and state statistics
 -

 Key: YARN-1001
 URL: https://issues.apache.org/jira/browse/YARN-1001
 Project: Hadoop YARN
  Issue Type: Task
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi

 In Ambari we plan to show for MR2 the number of applications finished, 
 running, waiting, etc. It would be efficient if YARN could provide per 
 application-type and state aggregated counts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-337) RM handles killed application tracking URL poorly

2013-08-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733612#comment-13733612
 ] 

Hadoop QA commented on YARN-337:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596859/YARN-337.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1675//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1675//console

This message is automatically generated.

 RM handles killed application tracking URL poorly
 -

 Key: YARN-337
 URL: https://issues.apache.org/jira/browse/YARN-337
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
  Labels: usability
 Attachments: YARN-337.patch


 When the ResourceManager kills an application, it leaves the proxy URL 
 redirecting to the original tracking URL for the application even though the 
 ApplicationMaster is no longer there to service it.  It should redirect it 
 somewhere more useful, like the RM's web page for the application, where the 
 user can find that the application was killed and links to the AM logs.
 In addition, sometimes the AM during teardown from the kill can attempt to 
 unregister and provide an updated tracking URL, but unfortunately the RM has 
 forgotten the AM due to the kill and refuses to process the unregistration. 
  Instead it logs:
 {noformat}
 2013-01-09 17:37:49,671 [IPC Server handler 2 on 8030] ERROR
 org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
 AppAttemptId doesnt exist in cache appattempt_1357575694478_28614_01
 {noformat}
 It should go ahead and process the unregistration to update the tracking URL 
 since the application offered it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker

2013-08-08 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733639#comment-13733639
 ] 

Ravi Prakash commented on YARN-1036:


Hi Omkar!
Thanks a lot for pointing out the problem in the earlier patch. 

Regarding the changes you are proposing, I meant for this JIRA to simply be a 
backport of MAPREDUCE-4342. I wasn't able to re-open that JIRA because it has 
already been closed (hence I had to file this new JIRA).

If you have spotted a problem with the current patch, I would welcome your 
suggested changes. However, if you have an issue with the approach, I would 
request that you please pursue it in a separate JIRA, as it lies outside the 
scope of a simple backport. Most of this code is already in trunk as is.

Please let me know if this is acceptable to you.

 Distributed Cache gives inconsistent result if cache files get deleted from 
 task tracker 
 -

 Key: YARN-1036
 URL: https://issues.apache.org/jira/browse/YARN-1036
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.9
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: YARN-1036.branch-0.23.patch, YARN-1036.branch-0.23.patch


 This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because 
 that one had been closed. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1033) Expose RM active/standby state to web UI and metrics

2013-08-08 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733681#comment-13733681
 ] 

Bikas Saha commented on YARN-1033:
--

This would depend on the implementation choice for YARN-1027. If we don't start 
all/external-facing services in standby mode, then this JIRA will not work.

 Expose RM active/standby state to web UI and metrics
 

 Key: YARN-1033
 URL: https://issues.apache.org/jira/browse/YARN-1033
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: nemon lou
Assignee: nemon lou

 Both the active and standby RM shall expose their web server and show their 
 current state (active or standby) on the web page.
 Cluster metrics also need this state for monitoring.
 Standby RM web services shall refuse client requests unless they are querying 
 for the RM state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1001) YARN should provide per application-type and state statistics

2013-08-08 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733688#comment-13733688
 ] 

Zhijie Shen commented on YARN-1001:
---

Then the limitation of this requirement is that the number of params used to 
categorize the applications does not scale freely. If you want to add one more 
param, the number of buckets will increase exponentially. Therefore, I'm 
afraid this proposed API cannot support as many params as getApps() does.

@Srimanth Gunturi, do you think application type and state are enough for 
Ambari?

 YARN should provide per application-type and state statistics
 -

 Key: YARN-1001
 URL: https://issues.apache.org/jira/browse/YARN-1001
 Project: Hadoop YARN
  Issue Type: Task
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi

 In Ambari we plan to show for MR2 the number of applications finished, 
 running, waiting, etc. It would be efficient if YARN could provide per 
 application-type and state aggregated counts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore

2013-08-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733684#comment-13733684
 ] 

Jian He commented on YARN-353:
--

bq. I am not sure I understand. If the RM is trying to delete something and the 
node does not exist, is there a situation where the RM wants to know that the 
node didn't exist and fail if a non-existent node was tried to be deleted?
Agreed. We should specifically check whether the node exists or not. Otherwise 
the ZK delete() API will throw an exception if the node doesn't exist, which we 
don't want.
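
A minimal sketch of the guarded delete being discussed; the helper name is illustrative, not the actual patch:

{code}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

class ZkDeleteSketch {
  // Sketch: only issue the delete if the znode exists, so a missing node is
  // not treated as a failure; a concurrent removal is also ignored.
  static void deleteIfExists(ZooKeeper zkClient, String path) throws Exception {
    if (zkClient.exists(path, false) != null) {
      try {
        zkClient.delete(path, -1);  // -1 matches any znode version
      } catch (KeeperException.NoNodeException e) {
        // Node was removed between exists() and delete(); treat as success.
      }
    }
  }
}
{code}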

 Add Zookeeper-based store implementation for RMStateStore
 -

 Key: YARN-353
 URL: https://issues.apache.org/jira/browse/YARN-353
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Hitesh Shah
Assignee: Bikas Saha
 Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, 
 YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, 
 YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch


 Add a store that writes RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation

2013-08-08 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733736#comment-13733736
 ] 

Zhijie Shen commented on YARN-978:
--

bq. any reason why we need final application status and the tracker url in the 
report?

Same as why we need this info in the application report: it should be important 
to users.

bq. aren't these available in the overall application report?

Yes, the application report contains this info, which is extracted from the 
current attempt. We'd like to keep the info of all the attempts, including the 
failed ones.

bq. what is meant to be retrieved from the attempt report as compared to the 
app report?

We hope users can get as complete info about the attempts as possible.
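
For context, a hypothetical sketch of the per-attempt fields being discussed in this thread; the exact field set and names are assumptions, not the actual patch:

{code}
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;

// Hypothetical sketch only: per-attempt info kept for every attempt,
// including failed ones, not just the current attempt.
public abstract class AttemptReportSketch {
  public abstract ApplicationAttemptId getApplicationAttemptId();
  public abstract String getTrackingUrl();
  public abstract FinalApplicationStatus getFinalApplicationStatus();
  public abstract String getDiagnostics();
}
{code}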

 [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
 --

 Key: YARN-978
 URL: https://issues.apache.org/jira/browse/YARN-978
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Xuan Gong
 Fix For: YARN-321

 Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch


 We don't have an ApplicationAttemptReport and its Protobuf implementation.
 Adding that.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore

2013-08-08 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733760#comment-13733760
 ] 

Karthik Kambatla commented on YARN-353:
---

Looking into this now. Will hopefully have an update (patch + replies) sometime 
today.

 Add Zookeeper-based store implementation for RMStateStore
 -

 Key: YARN-353
 URL: https://issues.apache.org/jira/browse/YARN-353
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Hitesh Shah
Assignee: Bikas Saha
 Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, 
 YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, 
 YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch


 Add a store that writes RM state data to ZK

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1046) TestDistributedShell fails intermittently

2013-08-08 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1046:
---

Attachment: yarn-1046-1.patch

Uploading a patch that might help.

Haven't been able to validate this because I was unable to consistently 
reproduce the issue. Verified the test passes locally with the patch, so at 
least this is not a regression.


 TestDistributedShell fails intermittently
 -

 Key: YARN-1046
 URL: https://issues.apache.org/jira/browse/YARN-1046
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1046-1.patch


 Have been running into this frequently in spite of MAPREDUCE-3709 on centos6 
 machines. However, when I try to run it independently on the machines, I have 
 not been able to reproduce it.
 {noformat}
 2013-08-07 19:17:35,048 WARN  [Container Monitor] 
 monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - 
 Container [pid=16556,containerID=container_1375928243488_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB 
 physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container.
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1046) TestDistributedShell fails intermittently

2013-08-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733805#comment-13733805
 ] 

Hadoop QA commented on YARN-1046:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596899/yarn-1046-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1676//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1676//console

This message is automatically generated.

 TestDistributedShell fails intermittently
 -

 Key: YARN-1046
 URL: https://issues.apache.org/jira/browse/YARN-1046
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1046-1.patch


 Have been running into this frequently in spite of MAPREDUCE-3709 on centos6 
 machines. However, when I try to run it independently on the machines, I have 
 not been able to reproduce it.
 {noformat}
 2013-08-07 19:17:35,048 WARN  [Container Monitor] 
 monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - 
 Container [pid=16556,containerID=container_1375928243488_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB 
 physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container.
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-901) Active users field in Resourcemanager scheduler UI gives negative values

2013-08-08 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733831#comment-13733831
 ] 

Jian He commented on YARN-901:
--

Hi [~nishan], are you running 2.0.5-alpha? I couldn't reproduce this on 
2.1.0-beta. Can you give some steps to reproduce and attach a screenshot of the 
web UI? Thanks.
 

 Active users field in Resourcemanager scheduler UI gives negative values
 --

 Key: YARN-901
 URL: https://issues.apache.org/jira/browse/YARN-901
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Affects Versions: 2.0.5-alpha
Reporter: Nishan Shetty
Priority: Minor

 Active users field in Resourcemanager scheduler UI gives negative values on 
 Resourcemanager restart when job is in progress

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-1045) Improve toString implementation for PBImpls

2013-08-08 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned YARN-1045:
-

Assignee: Jian He

 Improve toString implementation for PBImpls
 ---

 Key: YARN-1045
 URL: https://issues.apache.org/jira/browse/YARN-1045
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Jian He
 Attachments: YARN-1045.patch


 The generic toString implementation that is used in most of the PBImpls 
 {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");{code} 
 is rather inefficient - replacing \n and \s to generate a one-line string. 
 Instead, we can use 
 {code}TextFormat.shortDebugString(getProto());{code}.
 If we can get this into 2.1.0 - great, otherwise the next release.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1045) Improve toString implementation for PBImpls

2013-08-08 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1045:
--

Attachment: YARN-1045.patch

Uploaded a trivial patch that replaces the PBImpl toString() as described; no 
test case.
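
As a rough sketch of what the change amounts to inside a PBImpl (not the patch itself, just the pattern named in the description):

{code}
import com.google.protobuf.TextFormat;

// Sketch of the replacement inside a PBImpl, where getProto() returns the
// backing protobuf message: a single-line string with no regex post-processing,
// instead of replaceAll("\\n", ", ").replaceAll("\\s+", " ").
@Override
public String toString() {
  return TextFormat.shortDebugString(getProto());
}
{code}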

 Improve toString implementation for PBImpls
 ---

 Key: YARN-1045
 URL: https://issues.apache.org/jira/browse/YARN-1045
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
 Attachments: YARN-1045.patch


 The generic toString implementation that is used in most of the PBImpls 
 {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");{code} 
 is rather inefficient - replacing \n and \s to generate a one-line string. 
 Instead, we can use 
 {code}TextFormat.shortDebugString(getProto());{code}.
 If we can get this into 2.1.0 - great, otherwise the next release.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1045) Improve toString implementation for PBImpls

2013-08-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733882#comment-13733882
 ] 

Hadoop QA commented on YARN-1045:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596908/YARN-1045.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1677//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1677//console

This message is automatically generated.

 Improve toString implementation for PBImpls
 ---

 Key: YARN-1045
 URL: https://issues.apache.org/jira/browse/YARN-1045
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Jian He
 Attachments: YARN-1045.patch


 The generic toString implementation that is used in most of the PBImpls 
 {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");{code} 
 is rather inefficient - replacing \n and \s to generate a one-line string. 
 Instead, we can use 
 {code}TextFormat.shortDebugString(getProto());{code}.
 If we can get this into 2.1.0 - great, otherwise the next release.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1045) Improve toString implementation for PBImpls

2013-08-08 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733934#comment-13733934
 ] 

Siddharth Seth commented on YARN-1045:
--

Thanks for taking this up, Jian. Did you get a chance to run all MR and YARN 
unit tests locally, in case we're relying on the toString format anywhere?

 Improve toString implementation for PBImpls
 ---

 Key: YARN-1045
 URL: https://issues.apache.org/jira/browse/YARN-1045
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Jian He
 Attachments: YARN-1045.patch


 The generic toString implementation that is used in most of the PBImpls 
 {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");{code} 
 is rather inefficient - replacing \n and \s to generate a one-line string. 
 Instead, we can use 
 {code}TextFormat.shortDebugString(getProto());{code}.
 If we can get this into 2.1.0 - great, otherwise the next release.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-899) Get queue administration ACLs working

2013-08-08 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734040#comment-13734040
 ] 

Siddharth Seth commented on YARN-899:
-

bq.  With this in mind, I think who has access should be based on a union of 
ACLs 
Agree. AMs get ACLs from the RM when they register. That could be a combined 
list along with the queue ACLs. It's up to the AMs to enforce these. Maybe the 
RM proxy could do some of this as well. The MR JobHistoryServer gets ACLs from 
the AM - again it's up to this to enforce them. The RM AppHistoryServer will 
need to do the union though.

Don't have experience with JT ACLs, but it does look like that's doing a union 
as well. View vs Modify ACLs for queues makes sense to me.

 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.queue-path.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-679) add an entry point that can start any Yarn service

2013-08-08 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734135#comment-13734135
 ] 

Steve Loughran commented on YARN-679:
-

I've been evolving this driven by the Hoya application; code is up online at 
[https://github.com/hortonworks/hoya/tree/master/src/main/java/org/apache/hadoop/yarn/service/launcher]

Some observations:
I added an interface {{GetExceptionExitCode}} to get an exit code off any 
Exception. 

{code}
public interface GetExceptionExitCode {

  int  getExitCode();
}
{code}

It'd be nice to have this interface implemented by {{Shell.ExitCodeException}} 
and {{ExitUtil.ExitException}} so that we have a consistent way to get exit 
codes from any exception willing to provide them.

PS: we could do with a more standardised set of error codes for YARN 
applications - convention rather than mandatory.

To make services executable, rather than just deployable, I added another 
interface.

{code}
public interface RunService {

  /**
   * Propagate the command line arguments
   * 
   * @param args argument list
   * @throws IOException any problem
   */
  void setArgs(String...args) throws Exception;
  
  /**
   * Run a service
   * @return the exit code
   * @throws Throwable any exception to report
   */
  int runService() throws Throwable ;
}
{code}

Here {{setArgs}} passes down all the arguments *before* {{Service.init(Config)}} 
is called. This lets me tune the config passed to the superclass based on the 
supplied arguments.

{{runService()}} is called after {{Service.start()}}. 

The model here is that the main() thread goes:
# create service class
# {{setArgs(...)}}
# {{init(config)}}
# {{start()}}
# {{int exit=runService()}}
# {{stop}}

The service doesn't need to start its own worker thread, and the exit code from 
runService becomes the exit code of the app. Any of the service methods is also 
free to throw an exception; if it implements {{getExitCode()}}, that becomes 
the exit code of the app.

The code seems a bit over-complex, but it's evolved to be the entry point 
for tests too.
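
Putting the lifecycle together, a minimal sketch of the launch sequence described above; the launcher class is illustrative, and only the standard Service lifecycle calls plus the {{RunService}} interface quoted earlier are assumed:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.Service;

// Sketch of the main() flow: create, setArgs, init, start, runService, stop.
// RunService is the interface shown above; the returned exit code is what the
// launcher would eventually pass to System.exit().
public class LauncherSketch {
  public static int launch(Service service, Configuration conf, String... args)
      throws Throwable {
    int exitCode = 0;
    try {
      if (service instanceof RunService) {
        ((RunService) service).setArgs(args);  // before init(), so args can tune the conf
      }
      service.init(conf);
      service.start();
      if (service instanceof RunService) {
        exitCode = ((RunService) service).runService();
      } else {
        service.waitForServiceToStop(0);       // plain services just run until stopped
      }
    } finally {
      service.stop();
    }
    return exitCode;
  }
}
{code}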


 add an entry point that can start any Yarn service
 --

 Key: YARN-679
 URL: https://issues.apache.org/jira/browse/YARN-679
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Reporter: Steve Loughran
Priority: Minor
 Attachments: YARN-679-001.patch


 There's no need to write separate .main classes for every Yarn service, given 
 that the startup mechanism should be identical: create, init, start, wait for 
 stopped - with an interrupt handler to trigger a clean shutdown on a control-c 
 interrupt.
 Provide one that takes any classname, and a list of config files/options.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1043) YARN Queue metrics are getting pushed to neither file nor Ganglia

2013-08-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734152#comment-13734152
 ] 

Hudson commented on YARN-1043:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4230 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4230/])
YARN-1043. Push all metrics consistently. Contributed by Jian He. (acmurthy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1512081)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueueMetrics.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestQueueMetrics.java


 YARN Queue metrics are getting pushed to neither file nor Ganglia
 -

 Key: YARN-1043
 URL: https://issues.apache.org/jira/browse/YARN-1043
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Yusaku Sako
Assignee: Jian He
 Fix For: 2.1.0-beta

 Attachments: YARN-1043.1.patch, YARN-1043.patch


 YARN Queue metrics are not getting pushed to file or Ganglia via Hadoop 
 Metrics 2. 
 QueueMetrics are still accessible via JMX and RM REST API 
 (hostname:8088/ws/v1/cluster/scheduler).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-899) Get queue administration ACLs working

2013-08-08 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-899:
---

Attachment: YARN-899.2.patch

 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch, YARN-899.2.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.queue-path.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-899) Get queue administration ACLs working

2013-08-08 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734164#comment-13734164
 ] 

Xuan Gong commented on YARN-899:


Here is my proposal:
We can create a QueueACLsManager which encapsulates the ResourceScheduler, so 
whenever we need to check a user's permission, we can provide the UGI and 
queueName (this can be found from the RMApp) to the scheduler (no matter whether 
it is the CapacityScheduler, FairScheduler, or FifoScheduler), and let the 
scheduler help us make the decision.

For each scheduler, we may need to add a new interface: 
Scheduler#hasAccess(UGI, queueName). The queueName is used to find the correct 
queue. The reason why I send the information back to the scheduler and let the 
scheduler make the decision is:
a. I think all the QueueACLsInfos are collected by the scheduler and saved in 
its queues (I cannot find other places which save the QueueACLs),
b. even if the queue is re-initialized in the future, we do not need to worry 
about it.


Attached is a patch that implements this proposal. Please give me your 
suggestions.
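
A bare-bones sketch of the proposed delegation; the class name and the hasAccess() signature follow the proposal above (extended with the ACL being checked) and are not final:

{code}
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.QueueACL;

// Sketch only: the manager hides which scheduler is configured and forwards
// the ACL check to it, so callers never talk to a concrete scheduler class.
public class QueueACLsManagerSketch {

  /** What the sketch assumes each scheduler would expose. */
  public interface SchedulerAclCheck {
    boolean hasAccess(UserGroupInformation ugi, QueueACL acl, String queueName);
  }

  private final SchedulerAclCheck scheduler;

  public QueueACLsManagerSketch(SchedulerAclCheck scheduler) {
    this.scheduler = scheduler;
  }

  public boolean checkAccess(UserGroupInformation ugi, QueueACL acl,
      String queueName) {
    return scheduler.hasAccess(ugi, acl, queueName);
  }
}
{code}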

 Get queue administration ACLs working
 -

 Key: YARN-899
 URL: https://issues.apache.org/jira/browse/YARN-899
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Xuan Gong
 Attachments: YARN-899.1.patch, YARN-899.2.patch


 The Capacity Scheduler documents the 
 yarn.scheduler.capacity.root.queue-path.acl_administer_queue config option 
 for controlling who can administer a queue, but it is not hooked up to 
 anything.  The Fair Scheduler could make use of a similar option as well.  
 This is a feature-parity regression from MR1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1047) Expose # of pre-emptions as a queue counter

2013-08-08 Thread Philip Zeyliger (JIRA)
Philip Zeyliger created YARN-1047:
-

 Summary: Expose # of pre-emptions as a queue counter
 Key: YARN-1047
 URL: https://issues.apache.org/jira/browse/YARN-1047
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.2-alpha
Reporter: Philip Zeyliger


Since YARN supports pre-empting containers, a given queue should expose the 
number of containers it has had pre-empted as a metric.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-656) In scheduler UI, including reserved memory in Memory Total can make it exceed cluster capacity.

2013-08-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734186#comment-13734186
 ] 

Alejandro Abdelnur commented on YARN-656:
-

+1

 In scheduler UI, including reserved memory in Memory Total can make it 
 exceed cluster capacity.
 -

 Key: YARN-656
 URL: https://issues.apache.org/jira/browse/YARN-656
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-656-1.patch, YARN-656.patch


 Memory Total is currently a sum of availableMB, allocatedMB, and 
 reservedMB.  Including reservedMB in this sum can make the total exceed the 
 capacity of the cluster. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1046) TestDistributedShell fails intermittently

2013-08-08 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1046:
---

Attachment: yarn-1046-2.patch

Thanks Sandy. Totally agree with you, forgot about that JIRA.

Here is a patch that is very much along the lines of MAPREDUCE-5094, but for 
MiniYARNCluster.

I wonder if we need MAPREDUCE-5094 anymore?
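
For reference, a hedged sketch of the kind of configuration change a MAPREDUCE-5094-style fix implies, using the standard NodeManager keys; whether it lives inside MiniYARNCluster or in test setup is exactly what the patch decides:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TestConfSketch {
  // Sketch: keep test containers from being killed by the NodeManager's
  // virtual memory check, which is what the intermittent failure reports.
  static Configuration relaxedVmemConf() {
    YarnConfiguration conf = new YarnConfiguration();
    conf.setBoolean("yarn.nodemanager.vmem-check-enabled", false);
    // Alternatively, keep the check on but allow a larger vmem-to-pmem ratio:
    // conf.setFloat("yarn.nodemanager.vmem-pmem-ratio", 8.0f);
    return conf;
  }
}
{code}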

 TestDistributedShell fails intermittently
 -

 Key: YARN-1046
 URL: https://issues.apache.org/jira/browse/YARN-1046
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1046-1.patch, yarn-1046-2.patch


 Have been running into this frequently in spite of MAPREDUCE-3709 on centos6 
 machines. However, when I try to run it independently on the machines, I have 
 not been able to reproduce it.
 {noformat}
 2013-08-07 19:17:35,048 WARN  [Container Monitor] 
 monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - 
 Container [pid=16556,containerID=container_1375928243488_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB 
 physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container.
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-08-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734221#comment-13734221
 ] 

Alejandro Abdelnur commented on YARN-1021:
--

Wei, first of all, nice.

I'm not convinced (even if I suggested it to you offline) about having dirs 
under share/hadoop/tools/sls/conf be 'configurable' and added to the 
classpath.

Instead I would suggest the following:

The stuff under share/hadoop/tools/sls/ should be samples, i.e.: sample-conf/ 
and sample-data/

The runmen2sls.sh and slsrunner.sh scripts should not add the sample-conf dir to 
the classpath; they should just add the JARs.

And the documentation should state that sample-conf/ files should be copied to 
the hadoop conf/ directory to run the simulator.




 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., Fifo, Capacity and Fair  schedulers. Meanwhile, 
 several optimizations are also made to improve scheduler performance for 
 different scenarios and workload. Each scheduler algorithm has its own set of 
 features, and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantee, resource availability, etc. It is very important to 
 evaluate a scheduler algorithm very well before we deploy it in a production 
 cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling 
 algorithm. Evaluating in a real cluster is always time and cost consuming, 
 and it is also very hard to find a large-enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm works for some specific 
 workload would be quite useful.
 We want to build a Scheduler Load Simulator to simulate large-scale Yarn 
 clusters and application loads in a single machine. This would be invaluable 
 in furthering Yarn by providing a tool for researchers and developers to 
 prototype new scheduler features and predict their behavior and performance 
 with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager removing the 
 network factor by simulating NodeManagers and ApplicationMasters via handling 
 and dispatching NM/AMs heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real time metrics while executing, including:
 * Resource usages for whole cluster and each queue, which can be utilized to 
 configure cluster and queue's capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the  scheduler behavior 
 (individual jobs turn around time, throughput, fairness, capacity guarantee, 
 etc).
 * Several key metrics of scheduler algorithm, such as time cost of each 
 scheduler operation (allocate, handle, etc), which can be utilized by Hadoop 
 developers to find the code hot spots and scalability limits.
 The simulator will provide real time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
 how to use simulator to simulate Fair Scheduler and Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-1048) Add new AMRMClientAsync.getMatchingRequests method taking a Container as parameter

2013-08-08 Thread Alejandro Abdelnur (JIRA)
Alejandro Abdelnur created YARN-1048:


 Summary: Add new AMRMClientAsync.getMatchingRequests method taking 
a Container as parameter
 Key: YARN-1048
 URL: https://issues.apache.org/jira/browse/YARN-1048
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur


The current method signature {{getMatchingRequests(Priority priority, String 
resourceName, Resource resource)}} is cumbersome to use within 
{{onContainersAllocated(List<Container> containers)}}, as we have to deconstruct 
the info from the received containers.

A new signature, {{getMatchingRequests(Container container)}}, would simplify 
usage for clients.
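
To illustrate the difference, a sketch of the deconstruction an AM does today inside onContainersAllocated with the existing API; the proposed overload (shown as a comment) would collapse the first three lines:

{code}
import java.util.Collection;
import java.util.List;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

// Sketch: matching an allocated container back to outstanding requests.
public class MatchingSketch {
  static List<? extends Collection<ContainerRequest>> match(
      AMRMClientAsync<ContainerRequest> client, Container container) {
    Priority priority = container.getPriority();
    String host = container.getNodeId().getHost();
    Resource capability = container.getResource();
    // Existing API: the caller deconstructs the container first.
    return client.getMatchingRequests(priority, host, capability);
    // Proposed (this JIRA): return client.getMatchingRequests(container);
  }
}
{code}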


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1046) TestDistributedShell fails intermittently

2013-08-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734252#comment-13734252
 ] 

Hadoop QA commented on YARN-1046:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12596983/yarn-1046-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1678//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1678//console

This message is automatically generated.

 TestDistributedShell fails intermittently
 -

 Key: YARN-1046
 URL: https://issues.apache.org/jira/browse/YARN-1046
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1046-1.patch, yarn-1046-2.patch


 Have been running into this frequently in spite of MAPREDUCE-3709 on centos6 
 machines. However, when I try to run it independently on the machines, I have 
 not been able to reproduce it.
 {noformat}
 2013-08-07 19:17:35,048 WARN  [Container Monitor] 
 monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - 
 Container [pid=16556,containerID=container_1375928243488_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB 
 physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container.
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1046) TestDistributedShell fails intermittently

2013-08-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734264#comment-13734264
 ] 

Sandy Ryza commented on YARN-1046:
--

+1

 TestDistributedShell fails intermittently
 -

 Key: YARN-1046
 URL: https://issues.apache.org/jira/browse/YARN-1046
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1046-1.patch, yarn-1046-2.patch


 Have been running into this frequently in spite of MAPREDUCE-3709 on centos6 
 machines. However, when I try to run it independently on the machines, I have 
 not been able to reproduce it.
 {noformat}
 2013-08-07 19:17:35,048 WARN  [Container Monitor] 
 monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - 
 Container [pid=16556,containerID=container_1375928243488_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB 
 physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container.
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler

2013-08-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734267#comment-13734267
 ] 

Sandy Ryza commented on YARN-589:
-

Thanks for the review, Alejandro.  Committed to trunk, branch-2, and 
branch-2.1-beta.

 Expose a REST API for monitoring the fair scheduler
 ---

 Key: YARN-589
 URL: https://issues.apache.org/jira/browse/YARN-589
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, 
 YARN-589.patch


 The fair scheduler should have an HTTP interface that exposes information 
 such as applications per queue, fair shares, demands, current allocations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1046) TestDistributedShell fails intermittently

2013-08-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734265#comment-13734265
 ] 

Sandy Ryza commented on YARN-1046:
--

And yeah, MAPREDUCE-5094 may not be necessary now, though that's work for 
another JIRA.

 TestDistributedShell fails intermittently
 -

 Key: YARN-1046
 URL: https://issues.apache.org/jira/browse/YARN-1046
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-1046-1.patch, yarn-1046-2.patch


 Have been running into this frequently in spite of MAPREDUCE-3709 on centos6 
 machines. However, when I try to run it independently on the machines, I have 
 not been able to reproduce it.
 {noformat}
 2013-08-07 19:17:35,048 WARN  [Container Monitor] 
 monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - 
 Container [pid=16556,containerID=container_1375928243488_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB 
 physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container.
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler

2013-08-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13734283#comment-13734283
 ] 

Hudson commented on YARN-589:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #4233 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4233/])
Amending YARN-589.  Adding missing file from patch (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512112)
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java
YARN-589. Expose a REST API for monitoring the fair scheduler (Sandy Ryza). 
(sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1512111)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerLeafQueueInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ResourceInfo.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java


 Expose a REST API for monitoring the fair scheduler
 ---

 Key: YARN-589
 URL: https://issues.apache.org/jira/browse/YARN-589
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.1.1-beta

 Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, 
 YARN-589.patch


 The fair scheduler should have an HTTP interface that exposes information 
 such as applications per queue, fair shares, demands, current allocations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1045) Improve toString implementation for PBImpls

2013-08-08 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1045:
--

Attachment: YARN-1045.1.patch

Missed 5 PB classes in the previous patch.

 Improve toString implementation for PBImpls
 ---

 Key: YARN-1045
 URL: https://issues.apache.org/jira/browse/YARN-1045
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Jian He
 Attachments: YARN-1045.1.patch, YARN-1045.patch


 The generic toString implementation that is used in most of the PBImpls, 
 {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");{code} 
 is rather inefficient - replacing \n and \s to generate a one-line string. 
 Instead, we can use 
 {code}TextFormat.shortDebugString(getProto());{code}.
 If we can get this into 2.1.0 - great, otherwise the next release.
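
For clarity, a minimal side-by-side sketch of the two approaches; the class and method names are hypothetical, and only the two expressions come from the description above. shortDebugString avoids materializing the multi-line text form and the two regex passes over it.

{code}
import com.google.protobuf.Message;
import com.google.protobuf.TextFormat;

public final class ProtoToStringSketch {
  private ProtoToStringSketch() {}

  // Current pattern in many PBImpls: render the multi-line protobuf text form,
  // then flatten it with two regex passes over the whole string.
  public static String viaReplaceAll(Message proto) {
    return proto.toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");
  }

  // Proposed pattern: have protobuf emit a single-line form directly, avoiding
  // the intermediate multi-line string and the regex passes.
  public static String viaShortDebugString(Message proto) {
    return TextFormat.shortDebugString(proto);
  }
}
{code}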

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1045) Improve toString implementation for PBImpls

2013-08-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734325#comment-13734325
 ] 

Hadoop QA commented on YARN-1045:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12597006/YARN-1045.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1679//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1679//console

This message is automatically generated.

 Improve toString implementation for PBImpls
 ---

 Key: YARN-1045
 URL: https://issues.apache.org/jira/browse/YARN-1045
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Siddharth Seth
Assignee: Jian He
 Attachments: YARN-1045.1.patch, YARN-1045.patch


 The generic toString implementation that is used in most of the PBImpls, 
 {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");{code} 
 is rather inefficient - replacing \n and \s to generate a one-line string. 
 Instead, we can use 
 {code}TextFormat.shortDebugString(getProto());{code}.
 If we can get this into 2.1.0 - great, otherwise the next release.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator

2013-08-08 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1021:
--

Attachment: YARN-1021-demo.tar.gz
YARN-1021.pdf

Updated the patch and documents according to [~tucu00]'s suggestions.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., the Fifo, Capacity, and Fair schedulers. Meanwhile, 
 several optimizations have also been made to improve scheduler performance for 
 different scenarios and workloads. Each scheduler algorithm has its own set of 
 features and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantees, and resource availability. It is very important to 
 evaluate a scheduler algorithm thoroughly before deploying it in a production 
 cluster. Unfortunately, it is currently non-trivial to evaluate a scheduling 
 algorithm: evaluating in a real cluster is always time- and cost-consuming, 
 and it is also very hard to find a large enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm performs for some specific 
 workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale Yarn 
 clusters and application loads on a single machine. This would be invaluable 
 in furthering Yarn by giving researchers and developers a tool to 
 prototype new scheduler features and predict their behavior and performance 
 with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager, removing the 
 network factor by simulating NodeManagers and ApplicationMasters and handling 
 and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and each queue, which can be used to 
 configure cluster and queue capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the scheduler behavior 
 (individual jobs' turnaround time, throughput, fairness, capacity guarantees, 
 etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each 
 scheduler operation (allocate, handle, etc.), which can be used by Hadoop 
 developers to find code hot spots and scalability limits.
 The simulator will provide real-time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
 how to use the simulator to simulate the Fair Scheduler and Capacity Scheduler.
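
As an illustration of the scheduler-wrapper idea described above, here is a hypothetical sketch of a wrapper that delegates to the real scheduler and records the time cost of each allocate call. All names below (SimpleScheduler, AllocateRequest, Allocation, MetricsSink, TimingSchedulerWrapper) are placeholders, not the actual simulator classes.

{code}
// Hypothetical, minimal scheduler interface standing in for the real one.
interface SimpleScheduler {
  Allocation allocate(AllocateRequest request);
}

class AllocateRequest { /* resource asks carried by one AM heartbeat */ }

class Allocation { /* containers granted in response */ }

interface MetricsSink {
  void record(String operation, long nanos);
}

class TimingSchedulerWrapper implements SimpleScheduler {
  private final SimpleScheduler real;
  private final MetricsSink sink;

  TimingSchedulerWrapper(SimpleScheduler real, MetricsSink sink) {
    this.real = real;
    this.sink = sink;
  }

  public Allocation allocate(AllocateRequest request) {
    long start = System.nanoTime();
    try {
      // The real scheduler does the actual work; the wrapper only observes.
      return real.allocate(request);
    } finally {
      // Time cost per scheduler operation, one of the metrics listed above.
      sink.record("allocate", System.nanoTime() - start);
    }
  }
}
{code}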

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator

2013-08-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734347#comment-13734347
 ] 

Hadoop QA commented on YARN-1021:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12597013/YARN-1021.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1680//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1680//console

This message is automatically generated.

 Yarn Scheduler Load Simulator
 -

 Key: YARN-1021
 URL: https://issues.apache.org/jira/browse/YARN-1021
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, 
 YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, 
 YARN-1021.patch, YARN-1021.pdf


 The Yarn Scheduler is a fertile area of interest with different 
 implementations, e.g., the Fifo, Capacity, and Fair schedulers. Meanwhile, 
 several optimizations have also been made to improve scheduler performance for 
 different scenarios and workloads. Each scheduler algorithm has its own set of 
 features and drives scheduling decisions by many factors, such as fairness, 
 capacity guarantees, and resource availability. It is very important to 
 evaluate a scheduler algorithm thoroughly before deploying it in a production 
 cluster. Unfortunately, it is currently non-trivial to evaluate a scheduling 
 algorithm: evaluating in a real cluster is always time- and cost-consuming, 
 and it is also very hard to find a large enough cluster. Hence, a simulator 
 which can predict how well a scheduler algorithm performs for some specific 
 workload would be quite useful.
 We want to build a Scheduler Load Simulator that simulates large-scale Yarn 
 clusters and application loads on a single machine. This would be invaluable 
 in furthering Yarn by giving researchers and developers a tool to 
 prototype new scheduler features and predict their behavior and performance 
 with a reasonable amount of confidence, thereby aiding rapid innovation.
 The simulator will exercise the real Yarn ResourceManager, removing the 
 network factor by simulating NodeManagers and ApplicationMasters and handling 
 and dispatching NM/AM heartbeat events from within the same JVM.
 To keep track of scheduler behavior and performance, a scheduler wrapper 
 will wrap the real scheduler.
 The simulator will produce real-time metrics while executing, including:
 * Resource usage for the whole cluster and each queue, which can be used to 
 configure cluster and queue capacity.
 * The detailed application execution trace (recorded in relation to simulated 
 time), which can be analyzed to understand/validate the scheduler behavior 
 (individual jobs' turnaround time, throughput, fairness, capacity guarantees, 
 etc.).
 * Several key metrics of the scheduler algorithm, such as the time cost of each 
 scheduler operation (allocate, handle, etc.), which can be used by Hadoop 
 developers to find code hot spots and scalability limits.
 The simulator will provide real-time charts showing the behavior of the 
 scheduler and its performance.
 A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing 
 how to use the simulator to simulate the Fair Scheduler and Capacity Scheduler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira