[jira] [Commented] (YARN-2103) Fix code bug in SerializedExceptionPBImpl
[ https://issues.apache.org/jira/browse/YARN-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009347#comment-14009347 ] Binglin Chang commented on YARN-2103: - I plan to add a generic test that covers all PBImpls in YARN-2051, so separate tests are not needed. Fix code bug in SerializedExceptionPBImpl - Key: YARN-2103 URL: https://issues.apache.org/jira/browse/YARN-2103 Project: Hadoop YARN Issue Type: Bug Reporter: Binglin Chang Assignee: Binglin Chang Attachments: YARN-2103.v1.patch {code} SerializedExceptionProto proto = SerializedExceptionProto.getDefaultInstance(); SerializedExceptionProto.Builder builder = null; boolean viaProto = false; {code} Since viaProto is false, we should initialize the builder rather than the proto. -- This message was sent by Atlassian JIRA (v6.2#6252)
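A minimal sketch of the initialization pattern the comment points at, assuming the usual YARN PBImpl constructor layout; this illustrates the suggested direction rather than the committed patch:
{code}
// With viaProto == false, reads and writes go through the builder, so the default
// constructor should initialize the builder; the proto field is only used as-is when
// the object is constructed from an existing proto (viaProto == true).
public SerializedExceptionPBImpl() {
  builder = SerializedExceptionProto.newBuilder();
}

public SerializedExceptionPBImpl(SerializedExceptionProto proto) {
  this.proto = proto;
  viaProto = true;
}
{code}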
[jira] [Updated] (YARN-2075) TestRMAdminCLI consistently fail on trunk and branch-2
[ https://issues.apache.org/jira/browse/YARN-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-2075: Summary: TestRMAdminCLI consistently fail on trunk and branch-2 (was: TestRMAdminCLI consistently fail on trunk) TestRMAdminCLI consistently fail on trunk and branch-2 -- Key: YARN-2075 URL: https://issues.apache.org/jira/browse/YARN-2075 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Zhijie Shen Assignee: Kenji Kikushima Attachments: YARN-2075.patch {code} Running org.apache.hadoop.yarn.client.TestRMAdminCLI Tests run: 13, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 1.191 sec FAILURE! - in org.apache.hadoop.yarn.client.TestRMAdminCLI testTransitionToActive(org.apache.hadoop.yarn.client.TestRMAdminCLI) Time elapsed: 0.082 sec ERROR! java.lang.UnsupportedOperationException: null at java.util.AbstractList.remove(AbstractList.java:144) at java.util.AbstractList$Itr.remove(AbstractList.java:360) at java.util.AbstractCollection.remove(AbstractCollection.java:252) at org.apache.hadoop.ha.HAAdmin.isOtherTargetNodeActive(HAAdmin.java:173) at org.apache.hadoop.ha.HAAdmin.transitionToActive(HAAdmin.java:144) at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:447) at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:380) at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:318) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testTransitionToActive(TestRMAdminCLI.java:180) testHelp(org.apache.hadoop.yarn.client.TestRMAdminCLI) Time elapsed: 0.088 sec FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testError(TestRMAdminCLI.java:366) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:307) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
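The UnsupportedOperationException above is thrown from java.util.AbstractList.remove, the classic signature of calling remove() on a fixed-size list such as the one returned by Arrays.asList. A small self-contained reproduction of that failure mode (an assumption about the cause in HAAdmin.isOtherTargetNodeActive, not a patch):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

public class FixedSizeListDemo {
  public static void main(String[] args) {
    // Arrays.asList returns a fixed-size list backed by the array; its remove()
    // (reached via AbstractCollection.remove -> Iterator.remove) throws
    // UnsupportedOperationException, matching the stack trace above.
    Collection<String> ids = Arrays.asList("rm1", "rm2");
    // ids.remove("rm1");  // would throw java.lang.UnsupportedOperationException

    // Copying into a mutable collection is the usual fix.
    Collection<String> mutable = new ArrayList<String>(ids);
    mutable.remove("rm1");
    System.out.println(mutable);  // prints [rm2]
  }
}
{code}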
[jira] [Commented] (YARN-2075) TestRMAdminCLI consistently fail on trunk and branch-2
[ https://issues.apache.org/jira/browse/YARN-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009749#comment-14009749 ] Mit Desai commented on YARN-2075: - Hi Kenji, I applied the patch to trunk and branch-2. The tests still fail TestRMAdminCLI consistently fail on trunk and branch-2 -- Key: YARN-2075 URL: https://issues.apache.org/jira/browse/YARN-2075 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Zhijie Shen Assignee: Kenji Kikushima Attachments: YARN-2075.patch {code} Running org.apache.hadoop.yarn.client.TestRMAdminCLI Tests run: 13, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 1.191 sec FAILURE! - in org.apache.hadoop.yarn.client.TestRMAdminCLI testTransitionToActive(org.apache.hadoop.yarn.client.TestRMAdminCLI) Time elapsed: 0.082 sec ERROR! java.lang.UnsupportedOperationException: null at java.util.AbstractList.remove(AbstractList.java:144) at java.util.AbstractList$Itr.remove(AbstractList.java:360) at java.util.AbstractCollection.remove(AbstractCollection.java:252) at org.apache.hadoop.ha.HAAdmin.isOtherTargetNodeActive(HAAdmin.java:173) at org.apache.hadoop.ha.HAAdmin.transitionToActive(HAAdmin.java:144) at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:447) at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:380) at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:318) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testTransitionToActive(TestRMAdminCLI.java:180) testHelp(org.apache.hadoop.yarn.client.TestRMAdminCLI) Time elapsed: 0.088 sec FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testError(TestRMAdminCLI.java:366) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:307) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1728) History server doesn't understand percent encoded paths
[ https://issues.apache.org/jira/browse/YARN-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009795#comment-14009795 ] jay vyas commented on YARN-1728: Possible link to MAPREDUCE-5902, not sure exactly how this would pop up in two places, but it seems almost the exact same problem, just on the DFS side instead of on the web side. History server doesn't understand percent encoded paths --- Key: YARN-1728 URL: https://issues.apache.org/jira/browse/YARN-1728 Project: Hadoop YARN Issue Type: Bug Reporter: Abraham Elmahrek For example, going to the job history server page http://localhost:19888/jobhistory/logs/localhost%3A8041/container_1391466602060_0011_01_01/job_1391466602060_0011/admin/stderr results in the following error: {code} Cannot get container logs. Invalid nodeId: test-cdh5-hue.ent.cloudera.com%3A8041 {code} Whereas the URL-decoded version works: http://localhost:19888/jobhistory/logs/localhost:8041/container_1391466602060_0011_01_01/job_1391466602060_0011/admin/stderr It seems like both should be supported, as the former is simply percent encoding. -- This message was sent by Atlassian JIRA (v6.2#6252)
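A small illustration of the decoding that appears to be missing, assuming the nodeId is taken verbatim from the URL path segment; this is not the history server's actual code path, just a sketch of the expected normalization:
{code}
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class NodeIdDecodeDemo {
  public static void main(String[] args) throws UnsupportedEncodingException {
    // Decoding the percent-encoded path segment first would make
    // "localhost%3A8041" and "localhost:8041" resolve to the same nodeId.
    String rawNodeId = "localhost%3A8041";
    String decoded = URLDecoder.decode(rawNodeId, "UTF-8");
    System.out.println(decoded);  // prints localhost:8041
  }
}
{code}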
[jira] [Commented] (YARN-2096) Race in TestRMRestart#testQueueMetricsOnRMRestart
[ https://issues.apache.org/jira/browse/YARN-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009807#comment-14009807 ] Tsuyoshi OZAWA commented on YARN-2096: -- One piece of good news: TestRMRestart with Anubhav's patch works well - after running the tests hundreds of times, no failure. Good job :-) Race in TestRMRestart#testQueueMetricsOnRMRestart - Key: YARN-2096 URL: https://issues.apache.org/jira/browse/YARN-2096 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.5.0 Attachments: YARN-2096.patch org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testQueueMetricsOnRMRestart fails randomly because of a race condition. The test validates that metrics are incremented, but does not wait for all transitions to finish before checking the values. It also resets metrics after kicking off recovery of the second RM. The metrics that need to be incremented race with this reset, causing the test to fail randomly. We need to wait for the right transitions. -- This message was sent by Atlassian JIRA (v6.2#6252)
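A minimal sketch of the kind of wait the description calls for, assuming JUnit and the existing QueueMetrics#getAppsSubmitted getter; the helper name and the choice of metric are illustrative, not the actual patch:
{code}
// Poll the metric with a timeout instead of asserting immediately, so the
// asynchronous RMApp/RMAppAttempt transitions have time to finish.
private static void waitForAppsSubmitted(QueueMetrics metrics, int expected,
    long timeoutMs) throws InterruptedException {
  long deadline = System.currentTimeMillis() + timeoutMs;
  while (metrics.getAppsSubmitted() != expected
      && System.currentTimeMillis() < deadline) {
    Thread.sleep(50);
  }
  Assert.assertEquals(expected, metrics.getAppsSubmitted());
}
{code}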
[jira] [Updated] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.
[ https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated YARN-1680: -- Attachment: YARN-1680-v2.patch availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory. -- Key: YARN-1680 URL: https://issues.apache.org/jira/browse/YARN-1680 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0, 2.3.0 Environment: SuSE 11 SP2 + Hadoop-2.3 Reporter: Rohith Assignee: Chen He Attachments: YARN-1680-v2.patch, YARN-1680.patch There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster slow start is set to 1. A running job's reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4) became unstable (3 maps got killed), so the MRAppMaster blacklisted the unstable NodeManager (NM-4). All reducer tasks are running in the cluster now. The MRAppMaster does not preempt the reducers because, for the reducer preemption calculation, the headroom still counts the blacklisted node's memory. This makes jobs hang forever (the ResourceManager does not assign any new containers on blacklisted nodes, but the availableResources it returns still counts the cluster's free memory). -- This message was sent by Atlassian JIRA (v6.2#6252)
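A rough sketch of the adjustment the report asks for, under the assumption that the scheduler can enumerate a per-application blacklist; getBlacklistedNodes() and the surrounding variable names are hypothetical, not the actual scheduler API:
{code}
// Remove the free capacity of nodes this application has blacklisted from the
// headroom it is sent, since that capacity is unusable for its containers.
Resource headroom = Resources.clone(clusterAvailableResource);
for (SchedulerNode node : application.getBlacklistedNodes()) {  // hypothetical accessor
  Resources.subtractFrom(headroom, node.getAvailableResource());
}
application.setHeadroom(headroom);
{code}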
[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests
[ https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009893#comment-14009893 ] Jian Fang commented on YARN-796: Hi Bikas, I think it is better to have the node manager specify its own labels and then register them with the RM. Also, it would be great if YARN could provide an API to add/update labels on a node. This is based on the following scenario. Usually a Hadoop cluster in the cloud is elastic, that is to say, the cluster size can be automatically or manually expanded or shrunk based on the cluster situation, for example, idleness. When a node in a cluster is chosen to be shrunk, i.e., to be removed, we could call the API to label the node so that no more tasks would be assigned to it. We could use the decommission API to achieve this goal, but I think the label API may be more elegant. Allow for (admin) labels on nodes and resource-requests --- Key: YARN-796 URL: https://issues.apache.org/jira/browse/YARN-796 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun C Murthy Assignee: Wangda Tan Attachments: YARN-796.patch It will be useful for admins to specify labels for nodes. Examples of labels are OS, processor architecture etc. We should expose these labels and allow applications to specify labels on resource-requests. Obviously we need to support admin operations on adding/removing node labels. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1474) Make schedulers services
[ https://issues.apache.org/jira/browse/YARN-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009905#comment-14009905 ] Hadoop QA commented on YARN-1474: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646928/YARN-1474.17.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3833//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3833//console This message is automatically generated. Make schedulers services Key: YARN-1474 URL: https://issues.apache.org/jira/browse/YARN-1474 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Affects Versions: 2.3.0, 2.4.0 Reporter: Sandy Ryza Assignee: Tsuyoshi OZAWA Attachments: YARN-1474.1.patch, YARN-1474.10.patch, YARN-1474.11.patch, YARN-1474.12.patch, YARN-1474.13.patch, YARN-1474.14.patch, YARN-1474.15.patch, YARN-1474.16.patch, YARN-1474.17.patch, YARN-1474.2.patch, YARN-1474.3.patch, YARN-1474.4.patch, YARN-1474.5.patch, YARN-1474.6.patch, YARN-1474.7.patch, YARN-1474.8.patch, YARN-1474.9.patch Schedulers currently have a reinitialize but no start and stop. Fitting them into the YARN service model would make things more coherent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1474) Make schedulers services
[ https://issues.apache.org/jira/browse/YARN-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009928#comment-14009928 ] Tsuyoshi OZAWA commented on YARN-1474: -- The three test failures of TestFairScheduler are filed as YARN-2105. Make schedulers services Key: YARN-1474 URL: https://issues.apache.org/jira/browse/YARN-1474 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Affects Versions: 2.3.0, 2.4.0 Reporter: Sandy Ryza Assignee: Tsuyoshi OZAWA Attachments: YARN-1474.1.patch, YARN-1474.10.patch, YARN-1474.11.patch, YARN-1474.12.patch, YARN-1474.13.patch, YARN-1474.14.patch, YARN-1474.15.patch, YARN-1474.16.patch, YARN-1474.17.patch, YARN-1474.2.patch, YARN-1474.3.patch, YARN-1474.4.patch, YARN-1474.5.patch, YARN-1474.6.patch, YARN-1474.7.patch, YARN-1474.8.patch, YARN-1474.9.patch Schedulers currently have a reinitialize but no start and stop. Fitting them into the YARN service model would make things more coherent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.
[ https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009931#comment-14009931 ] Hadoop QA commented on YARN-1680: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646932/YARN-1680-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3834//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3834//console This message is automatically generated. availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory. -- Key: YARN-1680 URL: https://issues.apache.org/jira/browse/YARN-1680 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0, 2.3.0 Environment: SuSE 11 SP2 + Hadoop-2.3 Reporter: Rohith Assignee: Chen He Attachments: YARN-1680-v2.patch, YARN-1680.patch There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster slow start is set to 1. A running job's reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4) became unstable (3 maps got killed), so the MRAppMaster blacklisted the unstable NodeManager (NM-4). All reducer tasks are running in the cluster now. The MRAppMaster does not preempt the reducers because, for the reducer preemption calculation, the headroom still counts the blacklisted node's memory. This makes jobs hang forever (the ResourceManager does not assign any new containers on blacklisted nodes, but the availableResources it returns still counts the cluster's free memory). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2106) TestFairScheduler in trunk is failing
Wei Yan created YARN-2106: - Summary: TestFairScheduler in trunk is failing Key: YARN-2106 URL: https://issues.apache.org/jira/browse/YARN-2106 Project: Hadoop YARN Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Some issues due to the Queue Placement policy. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (YARN-2106) TestFairScheduler in trunk is failing
[ https://issues.apache.org/jira/browse/YARN-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved YARN-2106. -- Resolution: Duplicate TestFairScheduler in trunk is failing - Key: YARN-2106 URL: https://issues.apache.org/jira/browse/YARN-2106 Project: Hadoop YARN Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Some issues due to the Queue Placement policy. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.
[ https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010005#comment-14010005 ] Chen He commented on YARN-1680: --- These three errors are reported in [YARN-2105|https://issues.apache.org/jira/browse/YARN-2105] and are not related to this JIRA. availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory. -- Key: YARN-1680 URL: https://issues.apache.org/jira/browse/YARN-1680 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.2.0, 2.3.0 Environment: SuSE 11 SP2 + Hadoop-2.3 Reporter: Rohith Assignee: Chen He Attachments: YARN-1680-v2.patch, YARN-1680.patch There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster slow start is set to 1. A running job's reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4) became unstable (3 maps got killed), so the MRAppMaster blacklisted the unstable NodeManager (NM-4). All reducer tasks are running in the cluster now. The MRAppMaster does not preempt the reducers because, for the reducer preemption calculation, the headroom still counts the blacklisted node's memory. This makes jobs hang forever (the ResourceManager does not assign any new containers on blacklisted nodes, but the availableResources it returns still counts the cluster's free memory). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2105) Three TestFairScheduler tests fail in trunk
[ https://issues.apache.org/jira/browse/YARN-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010013#comment-14010013 ] Karthik Kambatla commented on YARN-2105: Looks good to me. I'll wait for Sandy to also take a look. Three TestFairScheduler tests fail in trunk --- Key: YARN-2105 URL: https://issues.apache.org/jira/browse/YARN-2105 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: Ashwin Shankar Attachments: YARN-2105-v1.txt The following tests fail in trunk: {code} Failed tests: TestFairScheduler.testDontAllowUndeclaredPools:2412 expected:1 but was:0 Tests in error: TestFairScheduler.testQueuePlacementWithPolicy:624 NullPointer TestFairScheduler.testNotUserAsDefaultQueue:530 » NullPointer {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2107) Refactor timeline classes into server.timeline package
[ https://issues.apache.org/jira/browse/YARN-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2107: -- Issue Type: Bug (was: Sub-task) Parent: (was: YARN-1530) Refactor timeline classes into server.timeline package -- Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-2107) Refactor timeline classes into server.timeline package
Vinod Kumar Vavilapalli created YARN-2107: - Summary: Refactor timeline classes into server.timeline package Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2105) Three TestFairScheduler tests fail in trunk
[ https://issues.apache.org/jira/browse/YARN-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010164#comment-14010164 ] Tsuyoshi OZAWA commented on YARN-2105: -- The patch works well on my local. Three TestFairScheduler tests fail in trunk --- Key: YARN-2105 URL: https://issues.apache.org/jira/browse/YARN-2105 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: Ashwin Shankar Attachments: YARN-2105-v1.txt The following tests fail in trunk: {code} Failed tests: TestFairScheduler.testDontAllowUndeclaredPools:2412 expected:1 but was:0 Tests in error: TestFairScheduler.testQueuePlacementWithPolicy:624 NullPointer TestFairScheduler.testNotUserAsDefaultQueue:530 » NullPointer {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2107) Refactor timeline classes into server.timeline package
[ https://issues.apache.org/jira/browse/YARN-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2107: -- Issue Type: Sub-task (was: Bug) Parent: YARN-1530 Refactor timeline classes into server.timeline package -- Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2107) Refactor timeline classes into server.timeline package
[ https://issues.apache.org/jira/browse/YARN-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2107: -- Attachment: YARN-2107.txt Here's a simple eclipse-refactor patch attached. The easiest way to review, if on git, is to apply the patch, git add the new files, and run git diff -M Refactor timeline classes into server.timeline package -- Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-2107.txt Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2107) Refactor timeline classes into server.timeline package
[ https://issues.apache.org/jira/browse/YARN-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010200#comment-14010200 ] Hadoop QA commented on YARN-2107: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646971/YARN-2107.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests: org.apache.hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer org.apache.hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryClientService {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3835//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3835//console This message is automatically generated. Refactor timeline classes into server.timeline package -- Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-2107.txt Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2108) Show minShare on RM Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2108: -- Description: Today the RM Scheduler page shows FairShare, Used, Used (over fair share) and MaxCapacity. It would be better to show MinShare, possibly with a different color code, so that we know when a queue is using more than its min share. Show minShare on RM Scheduler page -- Key: YARN-2108 URL: https://issues.apache.org/jira/browse/YARN-2108 Project: Hadoop YARN Issue Type: Task Reporter: Siqi Li Assignee: Siqi Li Today the RM Scheduler page shows FairShare, Used, Used (over fair share) and MaxCapacity. It would be better to show MinShare, possibly with a different color code, so that we know when a queue is using more than its min share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010234#comment-14010234 ] Wei Yan commented on YARN-1021: --- [~cristiana.voicu], the SLS directly supports rumen traces. In general, you need to have some existing workload traces (i.e., from some production clusters), and then use Rumen to generate workload traces. Then let the SLS load these traces. Or you can generate some traces randomly (random # of jobs, requests, lifetime, etc). Sorry that I don't have the traces used in that page right now. Yarn Scheduler Load Simulator - Key: YARN-1021 URL: https://issues.apache.org/jira/browse/YARN-1021 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Reporter: Wei Yan Assignee: Wei Yan Fix For: 2.3.0 Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.patch, YARN-1021.pdf The Yarn Scheduler is a fertile area of interest with different implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, several optimizations are also made to improve scheduler performance for different scenarios and workload. Each scheduler algorithm has its own set of features, and drives scheduling decisions by many factors, such as fairness, capacity guarantee, resource availability, etc. It is very important to evaluate a scheduler algorithm very well before we deploy it in a production cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling algorithm. Evaluating in a real cluster is always time and cost consuming, and it is also very hard to find a large-enough cluster. Hence, a simulator which can predict how well a scheduler algorithm for some specific workload would be quite useful. We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads in a single machine. This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with reasonable amount of confidence, there-by aiding rapid innovation. The simulator will exercise the real Yarn ResourceManager removing the network factor by simulating NodeManagers and ApplicationMasters via handling and dispatching NM/AMs heartbeat events from within the same JVM. To keep tracking of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler. The simulator will produce real time metrics while executing, including: * Resource usages for whole cluster and each queue, which can be utilized to configure cluster and queue's capacity. * The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate the scheduler behavior (individual jobs turn around time, throughput, fairness, capacity guarantee, etc). * Several key metrics of scheduler algorithm, such as time cost of each scheduler operation (allocate, handle, etc), which can be utilized by Hadoop developers to find the code spots and scalability limits. The simulator will provide real time charts showing the behavior of the scheduler and its performance. 
A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use simulator to simulate Fair Scheduler and Capacity Scheduler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2099) Preemption in fair scheduler should consider app priorities
[ https://issues.apache.org/jira/browse/YARN-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010244#comment-14010244 ] Ashwin Shankar commented on YARN-2099: -- Ah, I didn't know about YARN-596, this is very nice! I agree with Sandy's comment. Keeping app preemption based on the leaf queue's scheduling policy and having a separate policy which is purely based on priority makes sense to me. Preemption in fair scheduler should consider app priorities --- Key: YARN-2099 URL: https://issues.apache.org/jira/browse/YARN-2099 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.5.0 Reporter: Ashwin Shankar Fair scheduler should take app priorities into account while preempting containers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated YARN-1769: Attachment: YARN-1769.patch Upmerged to latest. CapacityScheduler: Improve reservations Key: YARN-1769 URL: https://issues.apache.org/jira/browse/YARN-1769 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.3.0 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch Currently the CapacityScheduler uses reservations in order to handle requests for large containers and the fact that there might not currently be enough space available on a single host. The current algorithm for reservations is to reserve as many containers as currently required and then it will start to reserve more above that after a certain number of re-reservations (currently biased against larger containers). Any time it hits the limit on the number reserved, it stops looking at any other nodes. This results in potentially missing nodes that have enough space to fulfill the request. The other place for improvement is that currently reservations count against your queue capacity. If you have reservations you could hit the various limits, which would then stop you from looking further at that node. The above 2 cases can cause an application requesting a larger container to take a long time to get its resources. We could improve upon both of those by simply continuing to look at incoming nodes to see if we could potentially swap out a reservation for an actual allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2099) Preemption in fair scheduler should consider app priorities
[ https://issues.apache.org/jira/browse/YARN-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010251#comment-14010251 ] Wei Yan commented on YARN-2099: --- Hey, [~ashwinshankar77], Are you working on this one? If not, I would like to take it. Preemption in fair scheduler should consider app priorities --- Key: YARN-2099 URL: https://issues.apache.org/jira/browse/YARN-2099 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.5.0 Reporter: Ashwin Shankar Fair scheduler should take app priorities into account while preempting containers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2099) Preemption in fair scheduler should consider app priorities
[ https://issues.apache.org/jira/browse/YARN-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010269#comment-14010269 ] Ashwin Shankar commented on YARN-2099: -- Hey [~ywskycn], please go ahead. Preemption in fair scheduler should consider app priorities --- Key: YARN-2099 URL: https://issues.apache.org/jira/browse/YARN-2099 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.5.0 Reporter: Ashwin Shankar Fair scheduler should take app priorities into account while preempting containers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2107) Refactor timeline classes into server.timeline package
[ https://issues.apache.org/jira/browse/YARN-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010277#comment-14010277 ] Zhijie Shen commented on YARN-2107: --- +1 for the new namespace. The test failure is caused by the defaults: {code} <property> <description>Store class name for timeline store.</description> <name>yarn.timeline-service.store-class</name> <value>org.apache.hadoop.yarn.server.applicationhistoryservice.timeline.LeveldbTimelineStore</value> </property> {code} We need to change yarn-default.xml accordingly. Refactor timeline classes into server.timeline package -- Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-2107.txt Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-2099) Preemption in fair scheduler should consider app priorities
[ https://issues.apache.org/jira/browse/YARN-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan reassigned YARN-2099: - Assignee: Wei Yan Preemption in fair scheduler should consider app priorities --- Key: YARN-2099 URL: https://issues.apache.org/jira/browse/YARN-2099 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.5.0 Reporter: Ashwin Shankar Assignee: Wei Yan Fair scheduler should take app priorities into account while preempting containers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010319#comment-14010319 ] Hadoop QA commented on YARN-1769: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646981/YARN-1769.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3836//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3836//console This message is automatically generated. CapacityScheduler: Improve reservations Key: YARN-1769 URL: https://issues.apache.org/jira/browse/YARN-1769 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.3.0 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch Currently the CapacityScheduler uses reservations in order to handle requests for large containers and the fact that there might not currently be enough space available on a single host. The current algorithm for reservations is to reserve as many containers as currently required and then it will start to reserve more above that after a certain number of re-reservations (currently biased against larger containers). Any time it hits the limit on the number reserved, it stops looking at any other nodes. This results in potentially missing nodes that have enough space to fulfill the request. The other place for improvement is that currently reservations count against your queue capacity. If you have reservations you could hit the various limits, which would then stop you from looking further at that node. The above 2 cases can cause an application requesting a larger container to take a long time to get its resources. We could improve upon both of those by simply continuing to look at incoming nodes to see if we could potentially swap out a reservation for an actual allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1769) CapacityScheduler: Improve reservations
[ https://issues.apache.org/jira/browse/YARN-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010358#comment-14010358 ] Thomas Graves commented on YARN-1769: - TestFairScheduler is failing for other reasons; see https://issues.apache.org/jira/browse/YARN-2105. CapacityScheduler: Improve reservations Key: YARN-1769 URL: https://issues.apache.org/jira/browse/YARN-1769 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.3.0 Reporter: Thomas Graves Assignee: Thomas Graves Attachments: YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch, YARN-1769.patch Currently the CapacityScheduler uses reservations in order to handle requests for large containers and the fact that there might not currently be enough space available on a single host. The current algorithm for reservations is to reserve as many containers as currently required and then it will start to reserve more above that after a certain number of re-reservations (currently biased against larger containers). Any time it hits the limit on the number reserved, it stops looking at any other nodes. This results in potentially missing nodes that have enough space to fulfill the request. The other place for improvement is that currently reservations count against your queue capacity. If you have reservations you could hit the various limits, which would then stop you from looking further at that node. The above 2 cases can cause an application requesting a larger container to take a long time to get its resources. We could improve upon both of those by simply continuing to look at incoming nodes to see if we could potentially swap out a reservation for an actual allocation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2107) Refactor timeline classes into server.timeline package
[ https://issues.apache.org/jira/browse/YARN-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-2107: -- Attachment: YARN-2107.1.txt Tx for the review and the tip, Zhijie. I fixed both yarn-default.xml and the documentation. Technically the rename is an incompatible change to the LevelDBStore impl. But the Timeline service wasn't 'declared' stable, so I am not creating any compatibility bridges. Refactor timeline classes into server.timeline package -- Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-2107.1.txt, YARN-2107.txt Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1474) Make schedulers services
[ https://issues.apache.org/jira/browse/YARN-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010372#comment-14010372 ] Tsuyoshi OZAWA commented on YARN-1474: -- [~kkambatl], v17 is ready for review. could you take a look? Make schedulers services Key: YARN-1474 URL: https://issues.apache.org/jira/browse/YARN-1474 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Affects Versions: 2.3.0, 2.4.0 Reporter: Sandy Ryza Assignee: Tsuyoshi OZAWA Attachments: YARN-1474.1.patch, YARN-1474.10.patch, YARN-1474.11.patch, YARN-1474.12.patch, YARN-1474.13.patch, YARN-1474.14.patch, YARN-1474.15.patch, YARN-1474.16.patch, YARN-1474.17.patch, YARN-1474.2.patch, YARN-1474.3.patch, YARN-1474.4.patch, YARN-1474.5.patch, YARN-1474.6.patch, YARN-1474.7.patch, YARN-1474.8.patch, YARN-1474.9.patch Schedulers currently have a reinitialize but no start and stop. Fitting them into the YARN service model would make things more coherent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2107) Refactor timeline classes into server.timeline package
[ https://issues.apache.org/jira/browse/YARN-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010448#comment-14010448 ] Hadoop QA commented on YARN-2107: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646997/YARN-2107.1.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3837//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3837//console This message is automatically generated. Refactor timeline classes into server.timeline package -- Key: YARN-2107 URL: https://issues.apache.org/jira/browse/YARN-2107 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-2107.1.txt, YARN-2107.txt Right now, most of timeline-server classes are present in an applicationhistoryserver package instead of a top level timeline package. This is one part of YARN-2043, there is more to do.. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-1961) Fair scheduler preemption doesn't work for non-leaf queues
[ https://issues.apache.org/jira/browse/YARN-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar reassigned YARN-1961: Assignee: Ashwin Shankar Fair scheduler preemption doesn't work for non-leaf queues -- Key: YARN-1961 URL: https://issues.apache.org/jira/browse/YARN-1961 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.4.0 Reporter: Ashwin Shankar Assignee: Ashwin Shankar Labels: scheduler Setting minResources and minSharePreemptionTimeout on a non-leaf queue doesn't cause preemption to happen when that non-leaf queue is below minResources and there are outstanding demands in that non-leaf queue. Here is an example fs allocation config (partial): {code:xml} <queue name="abc"> <minResources>3072 mb,0 vcores</minResources> <minSharePreemptionTimeout>30</minSharePreemptionTimeout> <queue name="childabc1"> </queue> <queue name="childabc2"> </queue> </queue> {code} With the above configs, preemption doesn't seem to happen if queue abc is below minShare and it has outstanding unsatisfied demands from apps in its child queues. Ideally in such cases we would like preemption to kick off and reclaim resources from other queues (not under queue abc). Looking at the code it seems like preemption checks for starvation only at the leaf queue level and not at the parent level. {code:title=FairScheduler.java|borderStyle=solid} boolean isStarvedForMinShare(FSLeafQueue sched) boolean isStarvedForFairShare(FSLeafQueue sched) {code} This affects our use case where we have a parent queue with probably 100 unconfigured leaf queues under it. We want to give a minShare to the parent queue to protect all the leaf queues under it, but we cannot do it due to this bug. -- This message was sent by Atlassian JIRA (v6.2#6252)
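A hedged sketch of the direction the report points at: generalize the starvation check from FSLeafQueue to any FSQueue so a parent queue's minResources and minSharePreemptionTimeout are honored. The body mirrors the existing leaf-queue min-share check (RESOURCE_CALCULATOR and clusterResource are the scheduler's existing fields); this is an illustration, not the eventual patch:
{code}
boolean isStarvedForMinShare(FSQueue sched) {
  // Same min-share test as the leaf-queue version, but applicable to parent queues:
  // a queue is starved when its current usage is below min(minShare, demand).
  Resource desiredShare = Resources.min(RESOURCE_CALCULATOR, clusterResource,
      sched.getMinShare(), sched.getDemand());
  return Resources.lessThan(RESOURCE_CALCULATOR, clusterResource,
      sched.getResourceUsage(), desiredShare);
}
{code}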
[jira] [Commented] (YARN-2105) Three TestFairScheduler tests fail in trunk
[ https://issues.apache.org/jira/browse/YARN-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010497#comment-14010497 ] Sandy Ryza commented on YARN-2105: -- +1. Thanks for the quick turnaround on this Ashwin. Three TestFairScheduler tests fail in trunk --- Key: YARN-2105 URL: https://issues.apache.org/jira/browse/YARN-2105 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: Ashwin Shankar Attachments: YARN-2105-v1.txt The following tests fail in trunk: {code} Failed tests: TestFairScheduler.testDontAllowUndeclaredPools:2412 expected:1 but was:0 Tests in error: TestFairScheduler.testQueuePlacementWithPolicy:624 NullPointer TestFairScheduler.testNotUserAsDefaultQueue:530 » NullPointer {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2108) Show minShare on RM Fair Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-2108: - Summary: Show minShare on RM Fair Scheduler page (was: Show minShare on RM Scheduler page) Show minShare on RM Fair Scheduler page --- Key: YARN-2108 URL: https://issues.apache.org/jira/browse/YARN-2108 Project: Hadoop YARN Issue Type: Task Reporter: Siqi Li Assignee: Siqi Li Today the RM Scheduler page shows FairShare, Used, Used (over fair share) and MaxCapacity. It would be better to show MinShare, possibly with a different color code, so that we know when a queue is using more than its min share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2108) Show minShare on RM Fair Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2108: -- Attachment: YARN-2108.v1.patch Show minShare on RM Fair Scheduler page --- Key: YARN-2108 URL: https://issues.apache.org/jira/browse/YARN-2108 Project: Hadoop YARN Issue Type: Task Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2108.v1.patch Today the RM Scheduler page shows FairShare, Used, Used (over fair share) and MaxCapacity. It would be better to show MinShare, possibly with a different color code, so that we know when a queue is using more than its min share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1801) NPE in public localizer
[ https://issues.apache.org/jira/browse/YARN-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010507#comment-14010507 ] Tsuyoshi OZAWA commented on YARN-1801: -- Looks good to me(non-binding). [~jlowe], can you take a look please? NPE in public localizer --- Key: YARN-1801 URL: https://issues.apache.org/jira/browse/YARN-1801 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Assignee: Hong Zhiguo Priority: Critical Attachments: YARN-1801.patch While investigating YARN-1800 found this in the NM logs that caused the public localizer to shutdown: {noformat} 2014-01-23 01:26:38,655 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(651)) - Downloading public rsrc:{ hdfs://colo-2:8020/user/fertrist/oozie-oozi/601-140114233013619-oozie-oozi-W/aggregator--map-reduce/map-reduce-launcher.jar, 1390440382009, FILE, null } 2014-01-23 01:26:38,656 FATAL localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(726)) - Error: Shutting down java.lang.NullPointerException at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712) 2014-01-23 01:26:38,656 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(728)) - Public cache exiting {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2105) Fix TestFairScheduler after YARN-2012
[ https://issues.apache.org/jira/browse/YARN-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-2105: - Summary: Fix TestFairScheduler after YARN-2012 (was: Three TestFairScheduler tests fail in trunk) Fix TestFairScheduler after YARN-2012 - Key: YARN-2105 URL: https://issues.apache.org/jira/browse/YARN-2105 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: Ashwin Shankar Fix For: 2.5.0 Attachments: YARN-2105-v1.txt The following tests fail in trunk: {code} Failed tests: TestFairScheduler.testDontAllowUndeclaredPools:2412 expected:1 but was:0 Tests in error: TestFairScheduler.testQueuePlacementWithPolicy:624 NullPointer TestFairScheduler.testNotUserAsDefaultQueue:530 » NullPointer {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2108) Show minShare on RM Fair Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2108: -- Attachment: YARN-2108.v2.patch Show minShare on RM Fair Scheduler page --- Key: YARN-2108 URL: https://issues.apache.org/jira/browse/YARN-2108 Project: Hadoop YARN Issue Type: Task Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2108.v1.patch, YARN-2108.v2.patch Today the RM Scheduler page shows FairShare, Used, Used (over fair share) and MaxCapacity. It would be better to show MinShare, possibly with a different color code, so that we know when a queue is using more than its min share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1801) NPE in public localizer
[ https://issues.apache.org/jira/browse/YARN-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010560#comment-14010560 ] Hadoop QA commented on YARN-1801: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646195/YARN-1801.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3839//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3839//console This message is automatically generated. NPE in public localizer --- Key: YARN-1801 URL: https://issues.apache.org/jira/browse/YARN-1801 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Jason Lowe Assignee: Hong Zhiguo Priority: Critical Attachments: YARN-1801.patch While investigating YARN-1800 found this in the NM logs that caused the public localizer to shutdown: {noformat} 2014-01-23 01:26:38,655 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:addResource(651)) - Downloading public rsrc:{ hdfs://colo-2:8020/user/fertrist/oozie-oozi/601-140114233013619-oozie-oozi-W/aggregator--map-reduce/map-reduce-launcher.jar, 1390440382009, FILE, null } 2014-01-23 01:26:38,656 FATAL localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(726)) - Error: Shutting down java.lang.NullPointerException at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:712) 2014-01-23 01:26:38,656 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(728)) - Public cache exiting {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA reassigned YARN-2091: Assignee: Tsuyoshi OZAWA Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters --- Key: YARN-2091 URL: https://issues.apache.org/jira/browse/YARN-2091 Project: Hadoop YARN Issue Type: Task Reporter: Bikas Saha Assignee: Tsuyoshi OZAWA Currently, the AM cannot programmatically determine if the task was killed due to using excessive memory. The NM kills it without passing this information in the container status back to the RM. So the AM cannot take any action here. The jira tracks adding this exit status and passing it from the NM to the RM and then the AM. In general, there may be other such actions taken by YARN that are currently opaque to the AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010580#comment-14010580 ] Tsuyoshi OZAWA commented on YARN-2091: -- ContainerManagerImpl cannot currently distinguish the exit reason because ContainersMonitorImpl dispatches ContainerKillEvent without it. I plan to add the exit reason to ContainerKillEvent. Please let me know if you have a better idea. Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters --- Key: YARN-2091 URL: https://issues.apache.org/jira/browse/YARN-2091 Project: Hadoop YARN Issue Type: Task Reporter: Bikas Saha Assignee: Tsuyoshi OZAWA Currently, the AM cannot programmatically determine if the task was killed due to using excessive memory. The NM kills it without passing this information in the container status back to the RM. So the AM cannot take any action here. The jira tracks adding this exit status and passing it from the NM to the RM and then the AM. In general, there may be other such actions taken by YARN that are currently opaque to the AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010585#comment-14010585 ] Bikas Saha commented on YARN-2091: -- That's the missing piece AFAIK. That exit reason needs to be passed along internally through the NM and then on to the RM and AM. Maybe simply directly use ContainerExitStatus instead of a new reason object inside ContainerKillEvent. Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters --- Key: YARN-2091 URL: https://issues.apache.org/jira/browse/YARN-2091 Project: Hadoop YARN Issue Type: Task Reporter: Bikas Saha Assignee: Tsuyoshi OZAWA Currently, the AM cannot programmatically determine if the task was killed due to using excessive memory. The NM kills it without passing this information in the container status back to the RM. So the AM cannot take any action here. The jira tracks adding this exit status and passing it from the NM to the RM and then the AM. In general, there may be other such actions taken by YARN that are currently opaque to the AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-596: - Attachment: YARN-596.patch Uploaded a new patch now that YARN-2105 is in. In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting it's containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over it's fair share is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2108) Show minShare on RM Fair Scheduler page
[ https://issues.apache.org/jira/browse/YARN-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010611#comment-14010611 ] Hadoop QA commented on YARN-2108: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647016/YARN-2108.v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3840//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3840//console This message is automatically generated. Show minShare on RM Fair Scheduler page --- Key: YARN-2108 URL: https://issues.apache.org/jira/browse/YARN-2108 Project: Hadoop YARN Issue Type: Task Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2108.v1.patch, YARN-2108.v2.patch Today RM Scheduler page shows FairShare, Used, Used (over fair share) and MaxCapacity. It would be better to show MinShare with possibly different color code, so that we know queue is running more than its min share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010614#comment-14010614 ] Tsuyoshi OZAWA commented on YARN-2091: -- Hi Bikas, let me clarify what "simply directly use" means. I meant to pass the exit reason via ContainerKillEvent, like {{ContainerKillEvent(containerId, ContainerExitStatus.KILL_EXCEEDED_MEMORY, msg)}}. Is this off the mark? Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters --- Key: YARN-2091 URL: https://issues.apache.org/jira/browse/YARN-2091 Project: Hadoop YARN Issue Type: Task Reporter: Bikas Saha Assignee: Tsuyoshi OZAWA Currently, the AM cannot programmatically determine if the task was killed due to using excessive memory. The NM kills it without passing this information in the container status back to the RM. So the AM cannot take any action here. The jira tracks adding this exit status and passing it from the NM to the RM and then the AM. In general, there may be other such actions taken by YARN that are currently opaque to the AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1913: -- Attachment: YARN-1913.patch With Fair Scheduler, cluster can logjam when all resources are consumed by AMs -- Key: YARN-1913 URL: https://issues.apache.org/jira/browse/YARN-1913 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.3.0 Reporter: bc Wong Assignee: Wei Yan Labels: easyfix Attachments: YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch It's possible to deadlock a cluster by submitting many applications at once, and have all cluster resources taken up by AMs. One solution is for the scheduler to limit resources taken up by AMs, as a percentage of total cluster resources, via a maxApplicationMasterShare config. -- This message was sent by Atlassian JIRA (v6.2#6252)
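As a rough illustration of the mitigation named in the description, here is a self-contained sketch of an AM-share admission check. The names (maxAMShare, canRunAppAM) and the memory-only accounting are assumptions made for the example, not the FairScheduler code in the attached patches.
{code}
// Toy admission check: only launch another AM container while AMs stay under a
// configured fraction of cluster memory. Illustrative only.
public class AmShareLimitSketch {
  private final double maxAMShare;    // e.g. 0.5 => AMs may hold at most half the cluster
  private final long clusterMemoryMB; // total schedulable memory
  private long amUsedMemoryMB;        // memory currently held by AM containers

  public AmShareLimitSketch(double maxAMShare, long clusterMemoryMB) {
    this.maxAMShare = maxAMShare;
    this.clusterMemoryMB = clusterMemoryMB;
  }

  /** Returns true if launching one more AM container of the given size is allowed. */
  public boolean canRunAppAM(long amContainerMemoryMB) {
    if (maxAMShare < 0) {
      return true; // treat a negative value as "no limit"
    }
    return amUsedMemoryMB + amContainerMemoryMB <= maxAMShare * clusterMemoryMB;
  }

  public void amContainerStarted(long memoryMB) { amUsedMemoryMB += memoryMB; }
  public void amContainerFinished(long memoryMB) { amUsedMemoryMB -= memoryMB; }
}
{code}
With maxAMShare = 0.5 on a 10240 MB cluster, a sixth 1024 MB AM would be held back until an earlier AM completes, which is exactly the logjam the description wants to avoid.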
[jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters
[ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010630#comment-14010630 ] Bikas Saha commented on YARN-2091: -- We are on the same page. The kill reason is directly a ContainerExitStatus. Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters --- Key: YARN-2091 URL: https://issues.apache.org/jira/browse/YARN-2091 Project: Hadoop YARN Issue Type: Task Reporter: Bikas Saha Assignee: Tsuyoshi OZAWA Currently, the AM cannot programmatically determine if the task was killed due to using excessive memory. The NM kills it without passing this information in the container status back to the RM. So the AM cannot take any action here. The jira tracks adding this exit status and passing it from the NM to the RM and then the AM. In general, there may be other such actions taken by YARN that are currently opaque to the AM. -- This message was sent by Atlassian JIRA (v6.2#6252)
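To make the agreement above concrete, here is a minimal sketch of a kill event that carries an exit-status code alongside the diagnostic message. The class and enum are simplified stand-ins for the NM's ContainerKillEvent and ContainerExitStatus, sketched from the discussion rather than taken from a patch.
{code}
// Simplified sketch: the kill reason travels with the event so the container's
// kill transition can copy it into the status reported back to the RM and AM.
public class ContainerKillEventSketch {
  public enum ExitStatus { INVALID, KILLED_BY_RESOURCEMANAGER, KILLED_EXCEEDED_MEMORY }

  private final String containerId;
  private final ExitStatus exitStatus;
  private final String diagnostic;

  public ContainerKillEventSketch(String containerId, ExitStatus exitStatus, String diagnostic) {
    this.containerId = containerId;
    this.exitStatus = exitStatus;
    this.diagnostic = diagnostic;
  }

  public String getContainerId() { return containerId; }
  public ExitStatus getContainerExitStatus() { return exitStatus; }
  public String getDiagnostic() { return diagnostic; }
}
{code}
The containers monitor would then dispatch something like new ContainerKillEventSketch(id, ExitStatus.KILLED_EXCEEDED_MEMORY, msg) when a container exceeds its memory limit, matching the constructor shape proposed earlier in the thread.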
[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010643#comment-14010643 ] Hadoop QA commented on YARN-596: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647026/YARN-596.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3841//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3841//console This message is automatically generated. In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting it's containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over it's fair share is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010659#comment-14010659 ] Hadoop QA commented on YARN-1913: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647029/YARN-1913.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3842//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3842//console This message is automatically generated. With Fair Scheduler, cluster can logjam when all resources are consumed by AMs -- Key: YARN-1913 URL: https://issues.apache.org/jira/browse/YARN-1913 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.3.0 Reporter: bc Wong Assignee: Wei Yan Labels: easyfix Attachments: YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch It's possible to deadlock a cluster by submitting many applications at once, and have all cluster resources taken up by AMs. One solution is for the scheduler to limit resources taken up by AMs, as a percentage of total cluster resources, via a maxApplicationMasterShare config. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2103) Inconsistency between viaProto flag and initial value of SerializedExceptionProto.Builder
[ https://issues.apache.org/jira/browse/YARN-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010700#comment-14010700 ] Binglin Chang commented on YARN-2103: - Hi [~ozawa], thanks for reviewing the patch and the comments. I used the original title because the bug isn't just about the inconsistent viaProto flag; it is also about the lack of equals() and hashCode() methods (which will affect other records that use SerializedException). I guess I should point out all the bugs in the JIRA. About code format: most PBImpl classes use this common code: {code} private void maybeInitBuilder() { if (viaProto || builder == null) { builder = GetApplicationsRequestProto.newBuilder(proto); } viaProto = false; } @Override public int hashCode() { return getProto().hashCode(); } @Override public boolean equals(Object other) { if (other == null) return false; if (other.getClass().isAssignableFrom(this.getClass())) { return this.getProto().equals(this.getClass().cast(other).getProto()); } return false; } {code} You can see GetApplicationsRequestPBImpl/GetApplicationsResponsePBImpl; I just followed those patterns. Maybe we can change them all in another JIRA; changing them may not fit into this JIRA. bq. How about adding concrete tests as a first step of generic tests on YARN-2051. After the generic tests are added, those old tests will probably be redundant and can be removed. I guess we can discuss this in the future; I can provide a separate test for now. Inconsistency between viaProto flag and initial value of SerializedExceptionProto.Builder - Key: YARN-2103 URL: https://issues.apache.org/jira/browse/YARN-2103 Project: Hadoop YARN Issue Type: Bug Reporter: Binglin Chang Assignee: Binglin Chang Attachments: YARN-2103.v1.patch {code} SerializedExceptionProto proto = SerializedExceptionProto .getDefaultInstance(); SerializedExceptionProto.Builder builder = null; boolean viaProto = false; {code} Since viaProto is false, we should initiate build rather than proto -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-2103) Inconsistency between viaProto flag and initial value of SerializedExceptionProto.Builder
[ https://issues.apache.org/jira/browse/YARN-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Binglin Chang updated YARN-2103: Description: Bug 1: {code} SerializedExceptionProto proto = SerializedExceptionProto .getDefaultInstance(); SerializedExceptionProto.Builder builder = null; boolean viaProto = false; {code} Since viaProto is false, we should initiate build rather than proto Bug 2: the class does not provide hashcode() and equals() like other PBImpl records, this class is used in other records, it may affect other records' behavior. was: {code} SerializedExceptionProto proto = SerializedExceptionProto .getDefaultInstance(); SerializedExceptionProto.Builder builder = null; boolean viaProto = false; {code} Since viaProto is false, we should initiate build rather than proto Inconsistency between viaProto flag and initial value of SerializedExceptionProto.Builder - Key: YARN-2103 URL: https://issues.apache.org/jira/browse/YARN-2103 Project: Hadoop YARN Issue Type: Bug Reporter: Binglin Chang Assignee: Binglin Chang Attachments: YARN-2103.v1.patch Bug 1: {code} SerializedExceptionProto proto = SerializedExceptionProto .getDefaultInstance(); SerializedExceptionProto.Builder builder = null; boolean viaProto = false; {code} Since viaProto is false, we should initiate build rather than proto Bug 2: the class does not provide hashcode() and equals() like other PBImpl records, this class is used in other records, it may affect other records' behavior. -- This message was sent by Atlassian JIRA (v6.2#6252)
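Following the description's own suggestion, a minimal sketch of field initialization that is consistent with viaProto == false would look like the fragment below: the builder is initialized instead of the proto, and getProto() then follows the common PBImpl flow quoted in the earlier comment. This is a sketch of the direction described in the JIRA, not the attached patch.
{code}
// Sketch: keep the fields consistent with viaProto == false by starting from a
// builder rather than the default proto instance.
SerializedExceptionProto proto = null;
SerializedExceptionProto.Builder builder = SerializedExceptionProto.newBuilder();
boolean viaProto = false;

public SerializedExceptionProto getProto() {
  // Build from the builder the first time, then reuse the built proto.
  proto = viaProto ? proto : builder.build();
  viaProto = true;
  return proto;
}
{code}
Pairing this with equals() and hashCode() along the lines of the GetApplicationsRequestPBImpl snippet quoted above would cover both bugs listed in the description.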
[jira] [Commented] (YARN-1474) Make schedulers services
[ https://issues.apache.org/jira/browse/YARN-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010728#comment-14010728 ] Karthik Kambatla commented on YARN-1474: Thanks Tsuyoshi. We are very close to getting this in. A few minor comments: # In each of the schedulers, I don't think we need the following snippet, or for that matter the variable {{initialized}}, at all. {{reinitialize()}} would have just the contents of the else-block. When using the scheduler, one should call setRMContext(), init(), and then reinitialize() thereafter. {code} if (!initialized) { this.rmContext = rmContext; initScheduler(configuration); startSchedulerThreads(); } else { {code} # ResourceSchedulerWrapper should override the serviceInit, serviceStart and serviceStop methods, not init, start and stop. # I have a feeling we'll have to update some tests, including the ones modified in the latest patch, to call scheduler.init() right after scheduler.setRMContext(), if we are not using the scheduler from a MockRM or ResourceManager instance. Make schedulers services Key: YARN-1474 URL: https://issues.apache.org/jira/browse/YARN-1474 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Affects Versions: 2.3.0, 2.4.0 Reporter: Sandy Ryza Assignee: Tsuyoshi OZAWA Attachments: YARN-1474.1.patch, YARN-1474.10.patch, YARN-1474.11.patch, YARN-1474.12.patch, YARN-1474.13.patch, YARN-1474.14.patch, YARN-1474.15.patch, YARN-1474.16.patch, YARN-1474.17.patch, YARN-1474.2.patch, YARN-1474.3.patch, YARN-1474.4.patch, YARN-1474.5.patch, YARN-1474.6.patch, YARN-1474.7.patch, YARN-1474.8.patch, YARN-1474.9.patch Schedulers currently have a reinitialize but no start and stop. Fitting them into the YARN service model would make things more coherent. -- This message was sent by Atlassian JIRA (v6.2#6252)
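For context, a compact sketch of the lifecycle described in the comment (setRMContext(), then init(), with reinitialize() reduced to the old else-block) is shown below. It extends the real org.apache.hadoop.service.AbstractService, but the scheduler method names and the RMContext stand-in are illustrative, not the patch.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.AbstractService;

// Illustrative scheduler-as-a-service skeleton; private method bodies are stubs.
public class SchedulerServiceSketch extends AbstractService {
  private Object rmContext; // stand-in for RMContext

  public SchedulerServiceSketch() {
    super(SchedulerServiceSketch.class.getName());
  }

  public void setRMContext(Object rmContext) {
    this.rmContext = rmContext;
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    initScheduler(conf); // one-time setup, formerly guarded by the 'initialized' flag
    super.serviceInit(conf);
  }

  @Override
  protected void serviceStart() throws Exception {
    startSchedulerThreads();
    super.serviceStart();
  }

  @Override
  protected void serviceStop() throws Exception {
    stopSchedulerThreads();
    super.serviceStop();
  }

  /** reinitialize() now only refreshes configuration; there is no first-time branch. */
  public void reinitialize(Configuration conf) {
    refreshConfiguration(conf);
  }

  private void initScheduler(Configuration conf) { /* ... */ }
  private void startSchedulerThreads() { /* ... */ }
  private void stopSchedulerThreads() { /* ... */ }
  private void refreshConfiguration(Configuration conf) { /* ... */ }
}
{code}
A caller or test would then do scheduler.setRMContext(ctx); scheduler.init(conf); scheduler.start(); and later scheduler.reinitialize(newConf), which is the ordering the comment asks the tests to follow.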
[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010791#comment-14010791 ] Sandy Ryza commented on YARN-596: - Thanks Wei. Getting close - a few more comments. {code} + private static final ResourceCalculator RESOURCE_CALCULATOR = + new DefaultResourceCalculator(); {code} This is no longer needed in FSQueue, right? FIFOPolicy should throw an unsupported operation exception if its checkIfUsageOverFairShare is called. fairshare should be fairShare In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting it's containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over it's fair share is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
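On the FIFOPolicy point, the intended behavior is presumably along the lines of the fragment below; the method signature is inferred from the comment and the fair-scheduler policy style, not copied from the patch.
{code}
// FIFO ordering has no per-application fair-share semantics, so failing loudly is
// safer than silently returning a guess when preemption logic asks the question.
@Override
public boolean checkIfUsageOverFairShare(Resource usage, Resource fairShare) {
  throw new UnsupportedOperationException(
      "checkIfUsageOverFairShare is not supported by FIFO policy");
}
{code}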
[jira] [Commented] (YARN-2075) TestRMAdminCLI consistently fail on trunk and branch-2
[ https://issues.apache.org/jira/browse/YARN-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010804#comment-14010804 ] Kenji Kikushima commented on YARN-2075: --- Hi [~mitdesai ], thanks for your testing. I also tested patch on trunk locally, and confirmed TestRMAdminCLI passed. This patch contains modification for HAAdmin.java. Please refresh if you didn't refresh o.a.h.ha yet. {noformat} $ mvn test -Dtest=org.apache.hadoop.yarn.client.TestRMAdminCLI [INFO] Scanning for projects... [INFO] [INFO] [INFO] Building hadoop-yarn-client 3.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-yarn-client --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-resources-plugin:2.2:resources (default-resources) @ hadoop-yarn-client --- [INFO] Using default encoding to copy filtered resources. [INFO] [INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ hadoop-yarn-client --- [INFO] Nothing to compile - all classes are up to date [INFO] [INFO] --- maven-resources-plugin:2.2:testResources (default-testResources) @ hadoop-yarn-client --- [INFO] Using default encoding to copy filtered resources. [INFO] [INFO] --- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ hadoop-yarn-client --- [INFO] Nothing to compile - all classes are up to date [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hadoop-yarn-client --- [INFO] Surefire report directory: /home/user/hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/target/surefire-reports --- T E S T S --- --- T E S T S --- Running org.apache.hadoop.yarn.client.TestRMAdminCLI Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.308 sec - in org.apache.hadoop.yarn.client.TestRMAdminCLI Results : Tests run: 13, Failures: 0, Errors: 0, Skipped: 0 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 4.266s [INFO] Finished at: Wed May 28 13:33:31 UTC 2014 [INFO] Final Memory: 17M/268M [INFO] {noformat} TestRMAdminCLI consistently fail on trunk and branch-2 -- Key: YARN-2075 URL: https://issues.apache.org/jira/browse/YARN-2075 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.5.0 Reporter: Zhijie Shen Assignee: Kenji Kikushima Attachments: YARN-2075.patch {code} Running org.apache.hadoop.yarn.client.TestRMAdminCLI Tests run: 13, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 1.191 sec FAILURE! - in org.apache.hadoop.yarn.client.TestRMAdminCLI testTransitionToActive(org.apache.hadoop.yarn.client.TestRMAdminCLI) Time elapsed: 0.082 sec ERROR! java.lang.UnsupportedOperationException: null at java.util.AbstractList.remove(AbstractList.java:144) at java.util.AbstractList$Itr.remove(AbstractList.java:360) at java.util.AbstractCollection.remove(AbstractCollection.java:252) at org.apache.hadoop.ha.HAAdmin.isOtherTargetNodeActive(HAAdmin.java:173) at org.apache.hadoop.ha.HAAdmin.transitionToActive(HAAdmin.java:144) at org.apache.hadoop.ha.HAAdmin.runCmd(HAAdmin.java:447) at org.apache.hadoop.ha.HAAdmin.run(HAAdmin.java:380) at org.apache.hadoop.yarn.client.cli.RMAdminCLI.run(RMAdminCLI.java:318) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testTransitionToActive(TestRMAdminCLI.java:180) testHelp(org.apache.hadoop.yarn.client.TestRMAdminCLI) Time elapsed: 0.088 sec FAILURE! 
java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testError(TestRMAdminCLI.java:366) at org.apache.hadoop.yarn.client.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:307) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)