[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so

2014-11-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221859#comment-14221859
 ] 

Varun Saxena commented on YARN-2890:


Oh, I saw it unassigned for several hours, so I assigned it to myself. You can 
assign it back to yourself if you want.

> MiniMRYarnCluster should turn on timeline service if configured to do so
> 
>
> Key: YARN-2890
> URL: https://issues.apache.org/jira/browse/YARN-2890
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Varun Saxena
> Fix For: 2.6.1
>
>
> Currently the MiniMRYarnCluster does not consider the configuration value for 
> enabling the timeline service before starting. The MiniYarnCluster should only 
> start the timeline service if it is configured to do so.
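
A minimal sketch of the check implied above, assuming the standard 
YarnConfiguration flag is the deciding setting (not necessarily the actual fix):

{code}
// Hypothetical sketch: gate timeline-service startup on the standard flag.
if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
  // start the timeline service (ApplicationHistoryServer)
}
{code}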



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221783#comment-14221783
 ] 

Karthik Kambatla commented on YARN-2139:


Valid points, Bikas. [~ywskycn] and I will spend some time and propose a design 
that allows plugging in these multiple dimensions.

> [Umbrella] Support for Disk as a Resource in YARN 
> --
>
> Key: YARN-2139
> URL: https://issues.apache.org/jira/browse/YARN-2139
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wei Yan
> Attachments: Disk_IO_Scheduling_Design_1.pdf, 
> Disk_IO_Scheduling_Design_2.pdf, YARN-2139-prototype-2.patch, 
> YARN-2139-prototype.patch
>
>
> YARN should consider disk as another resource for (1) scheduling tasks on 
> nodes, (2) isolation at runtime, (3) spindle locality. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221745#comment-14221745
 ] 

Wangda Tan commented on YARN-2877:
--

Thanks [~sriramsrao] for bringing up this great idea, and 
[~kkaranasos]/[~curino] for the explanations. We definitely need such mechanisms 
for low-latency container launching to support millisecond-latency tasks.

Some questions about this:
# Since the LocalRMs will be fully distributed, is it still possible to 
enforce capacity between queues?
# Will such opportunistic containers be visible to the central RM (which is 
used to schedule CONSERVATIVE containers)?
## If yes, can the central RM decide whether an opportunistic container is 
valid (say, when the number of containers exceeds the app's limit)? And will 
preemption still work for opportunistic containers?
## If not, should some component coordinate such containers?
# Will the central scheduler's state (perhaps not all of it, but important info 
such as queues' used resources) be broadcast to the distributed LocalRMs? It 
might be useful for the LocalRMs to decide which opportunistic container should 
go first.

Thanks in advance!

Wangda

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., the task execution time is small compared to the time required to 
> obtain a container from the RM).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler

2014-11-21 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221698#comment-14221698
 ] 

Subru Krishnan commented on YARN-2881:
--

I meant YARN-2738 :).

> Implement PlanFollower for FairScheduler
> 
>
> Key: YARN-2881
> URL: https://issues.apache.org/jira/browse/YARN-2881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-2881.prelim.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler

2014-11-21 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221695#comment-14221695
 ] 

Subru Krishnan commented on YARN-2881:
--

[~adhoot], thanks for the patch. It's good to see that the majority of the code 
can be reused between the Fair and Capacity Schedulers.

A few comments:
  * Are you assuming that parent queue names are unique in FS?
  * _run()_ need not be synchronized. I know this is from previous code, but it 
would be good to clean it up since we are refactoring the code.
  * Could _getChildReservationQueues()_ be implemented by 
_AbstractSchedulerPlanFollower_ using _Queue::getQueueInfo_?
  * I think we can add a _getResourceCalculator_ to _YarnScheduler_ as it makes 
sense (see the sketch after this list). Then we need not override 
_calculateTargetCapacity()_ and _isPlanResourcesLessThanReservations()_.
  * Minor: spurious blank lines in the imports of _CapacitySchedulerPlanFollower_ 
and _FairSchedulerPlanFollower_.
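
A minimal sketch of the _getResourceCalculator_ suggestion above (hypothetical 
signature; not the actual interface change):

{code}
// Hypothetical sketch: expose the scheduler's calculator on YarnScheduler so
// AbstractSchedulerPlanFollower can use it directly instead of requiring
// per-scheduler overrides.
ResourceCalculator getResourceCalculator();
{code}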

We should be able to see the reservation system running end-to-end with this 
patch in conjunction with YARN-2378.


> Implement PlanFollower for FairScheduler
> 
>
> Key: YARN-2881
> URL: https://issues.apache.org/jira/browse/YARN-2881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-2881.prelim.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queue names shouldn't allow periods

2014-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221671#comment-14221671
 ] 

Hadoop QA commented on YARN-2669:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12682990/YARN-2669-5.patch
  against trunk revision 1e9a3f4.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5904//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5904//console

This message is automatically generated.

> FairScheduler: queue names shouldn't allow periods
> --
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
> Fix For: 2.7.0
>
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch, YARN-2669-5.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221669#comment-14221669
 ] 

Wangda Tan commented on YARN-2139:
--

Thanks [~bikassaha] and [~kasha],

+1 for working on a branch; there may be a great number of changes across all 
the major modules, and frequent rebasing could be an issue if this is based on 
trunk.
Also, I totally agree about having an abstract policy to wrap disk affinity / 
IOPS / bandwidth, etc.

> [Umbrella] Support for Disk as a Resource in YARN 
> --
>
> Key: YARN-2139
> URL: https://issues.apache.org/jira/browse/YARN-2139
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wei Yan
> Attachments: Disk_IO_Scheduling_Design_1.pdf, 
> Disk_IO_Scheduling_Design_2.pdf, YARN-2139-prototype-2.patch, 
> YARN-2139-prototype.patch
>
>
> YARN should consider disk as another resource for (1) scheduling tasks on 
> nodes, (2) isolation at runtime, (3) spindle locality. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2014-11-21 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221662#comment-14221662
 ] 

Gera Shegalov commented on YARN-2893:
-

Here is the stack trace:
{code}
 Got exception: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at 
org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:189)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.setupTokens(AMLauncher.java:225)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.createAMContainerLaunchContext(AMLauncher.java:196)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:107)
at 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:250)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}

Since the launch context is corrupt, all subsequent app attempts (up to the max 
attempts) fail as well. This is a non-deterministic Heisenbug that does not 
reproduce on job re-submission.
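
For reference, a minimal sketch of the deserialization path named in the stack 
trace, assuming the token buffer arrives truncated (class and method names are 
taken from the trace; the exact code may differ):

{code}
// Sketch of the failing path in AMLauncher.setupTokens (per the trace above).
// Credentials.readTokenStorageStream uses DataInputStream.readFully, so a
// truncated/corrupt token ByteBuffer triggers java.io.EOFException mid-record.
Credentials credentials = new Credentials();
DataInputByteBuffer dibb = new DataInputByteBuffer();
dibb.reset(container.getTokens());        // corrupt/truncated ByteBuffer
credentials.readTokenStorageStream(dibb); // throws java.io.EOFException
{code}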

> AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream
> --
>
> Key: YARN-2893
> URL: https://issues.apache.org/jira/browse/YARN-2893
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Gera Shegalov
>
> MapReduce jobs on our clusters experience sporadic failures due to corrupt 
> tokens in the AM launch context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2056) Disable preemption at Queue level

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221659#comment-14221659
 ] 

Wangda Tan commented on YARN-2056:
--

Thanks for [~jlowe]'s review. [~curino], want to take a look?


> Disable preemption at Queue level
> -
>
> Key: YARN-2056
> URL: https://issues.apache.org/jira/browse/YARN-2056
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Mayank Bansal
>Assignee: Eric Payne
> Attachments: YARN-2056.201408202039.txt, YARN-2056.201408260128.txt, 
> YARN-2056.201408310117.txt, YARN-2056.201409022208.txt, 
> YARN-2056.201409181916.txt, YARN-2056.201409210049.txt, 
> YARN-2056.201409232329.txt, YARN-2056.201409242210.txt, 
> YARN-2056.201410132225.txt, YARN-2056.201410141330.txt, 
> YARN-2056.201410232244.txt, YARN-2056.201410311746.txt, 
> YARN-2056.201411041635.txt, YARN-2056.201411072153.txt, 
> YARN-2056.201411122305.txt, YARN-2056.201411132215.txt, 
> YARN-2056.201411142002.txt
>
>
> We need to be able to disable preemption at individual queue level



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2893) AMLaucher: sporadic job failures due to EOFException in readTokenStorageStream

2014-11-21 Thread Gera Shegalov (JIRA)
Gera Shegalov created YARN-2893:
---

 Summary: AMLaucher: sporadic job failures due to EOFException in 
readTokenStorageStream
 Key: YARN-2893
 URL: https://issues.apache.org/jira/browse/YARN-2893
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Gera Shegalov


MapReduce jobs on our clusters experience sporadic failures due to corrupt 
tokens in the AM launch context.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queue names shouldn't allow periods

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221627#comment-14221627
 ] 

Hudson commented on YARN-2669:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6589 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6589/])
YARN-2669. FairScheduler: queue names shouldn't allow periods (Wei Yan via 
Sandy Ryza) (sandy: rev a128cca305cecb215a2eef2ef543d1bf9b23a41b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueuePlacementPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/PeriodGroupsMapping.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueuePlacementRule.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


> FairScheduler: queue names shouldn't allow periods
> --
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
> Fix For: 2.7.0
>
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch, YARN-2669-5.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2669) FairScheduler: queue names shouldn't allow periods

2014-11-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2669:
-
Priority: Major  (was: Minor)

> FairScheduler: queue names shouldn't allow periods
> --
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch, YARN-2669-5.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2669) FairScheduler: queue names shouldn't allow periods

2014-11-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2669:
-
Summary: FairScheduler: queue names shouldn't allow periods  (was: 
FairScheduler: queueName shouldn't allow periods the allocation.xml)

> FairScheduler: queue names shouldn't allow periods
> --
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch, YARN-2669-5.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml

2014-11-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221604#comment-14221604
 ] 

Sandy Ryza commented on YARN-2669:
--

+1

> FairScheduler: queueName shouldn't allow periods the allocation.xml
> ---
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch, YARN-2669-5.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221591#comment-14221591
 ] 

Bikas Saha commented on YARN-2139:
--

Given that this design and possible implementation might go through unstable 
rounds and are currently not abstracted enough in the core code, doing this on 
a branch seems prudent.

Given that SSDs are becoming common, thinking of storage as only spinning disks 
may be limiting. Multiple writers may affect each other more negatively on 
spinning disks than on SSDs. It may be useful to see if the consideration of 
storage could be abstracted into a plugin, so that storage could have a 
different resource allocation policy by storage type (e.g. allocate/share by 
spindle for spinning-disk storage, by IOPS on SSD storage, or by network 
bandwidth for non-DAS storage). If we can abstract the policy into a plugin on 
trunk itself, then perhaps we would not need a branch. Secondly, it will 
probably take a long time to agree on what a common policy should be, and the 
consensus decision will probably not be a good fit for a large percentage of 
real clusters because of hardware variety. So making this a plugin would enable 
quicker development, trial, and usage of disk-based allocation, compared to 
arriving at a grand unified allocation model for storage.
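
To make the plugin idea concrete, a purely illustrative shape for such a 
per-storage-type policy (hypothetical; no such interface exists in YARN today):

{code}
// Purely illustrative: one policy implementation per storage type, so HDDs
// can be shared by spindle, SSDs by IOPS, and non-DAS storage by bandwidth.
public interface DiskAllocationPolicy {
  // Decide whether a container's disk request fits on this node's storage.
  boolean canAllocate(Resource available, Resource requested);

  // Account for a granted allocation under this policy's dimension
  // (spindles, IOPS, or network bandwidth).
  void allocate(Resource available, Resource requested);
}
{code}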

> [Umbrella] Support for Disk as a Resource in YARN 
> --
>
> Key: YARN-2139
> URL: https://issues.apache.org/jira/browse/YARN-2139
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wei Yan
> Attachments: Disk_IO_Scheduling_Design_1.pdf, 
> Disk_IO_Scheduling_Design_2.pdf, YARN-2139-prototype-2.patch, 
> YARN-2139-prototype.patch
>
>
> YARN should consider disk as another resource for (1) scheduling tasks on 
> nodes, (2) isolation at runtime, (3) spindle locality. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2056) Disable preemption at Queue level

2014-11-21 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221583#comment-14221583
 ] 

Jason Lowe commented on YARN-2056:
--

I'm +1 on the latest patch as well.  I'll commit this sometime early next week 
unless there are objections.

> Disable preemption at Queue level
> -
>
> Key: YARN-2056
> URL: https://issues.apache.org/jira/browse/YARN-2056
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Mayank Bansal
>Assignee: Eric Payne
> Attachments: YARN-2056.201408202039.txt, YARN-2056.201408260128.txt, 
> YARN-2056.201408310117.txt, YARN-2056.201409022208.txt, 
> YARN-2056.201409181916.txt, YARN-2056.201409210049.txt, 
> YARN-2056.201409232329.txt, YARN-2056.201409242210.txt, 
> YARN-2056.201410132225.txt, YARN-2056.201410141330.txt, 
> YARN-2056.201410232244.txt, YARN-2056.201410311746.txt, 
> YARN-2056.201411041635.txt, YARN-2056.201411072153.txt, 
> YARN-2056.201411122305.txt, YARN-2056.201411132215.txt, 
> YARN-2056.201411142002.txt
>
>
> We need to be able to disable preemption at individual queue level



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml

2014-11-21 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2669:
--
Attachment: YARN-2669-5.patch

Thanks, [~sandyr]. A new patch is uploaded.

> FairScheduler: queueName shouldn't allow periods the allocation.xml
> ---
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch, YARN-2669-5.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml

2014-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221573#comment-14221573
 ] 

Hadoop QA commented on YARN-2669:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12682965/YARN-2669-4.patch
  against trunk revision 23dacb3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5903//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5903//console

This message is automatically generated.

> FairScheduler: queueName shouldn't allow periods the allocation.xml
> ---
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2801) Documentation development for Node labels requirment

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221571#comment-14221571
 ] 

Wangda Tan commented on YARN-2801:
--

Also added this as a sub-task of YARN-2492.

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2801) Documentation development for Node labels requirment

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221568#comment-14221568
 ] 

Wangda Tan commented on YARN-2801:
--

[~gururaj],
Thanks for volunteering to do this, but I already have a WIP patch for it. 
Would you mind if I take over this task?

Wangda

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: documentation
>Reporter: Gururaj Shetty
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2801) Documentation development for Node labels requirment

2014-11-21 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2801:
-
Issue Type: Sub-task  (was: New Feature)
Parent: YARN-2492

> Documentation development for Node labels requirment
> 
>
> Key: YARN-2801
> URL: https://issues.apache.org/jira/browse/YARN-2801
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Gururaj Shetty
>
> Documentation needs to be developed for the node label requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221561#comment-14221561
 ] 

Karthik Kambatla commented on YARN-2139:


[~leftnoteasy] - completely agree with both Arun and you on the 
spindle-locality/affinity front. The design doc hints at it, but doesn't cover 
it in as much detail as it should. I am all for accomplishing that here too; I 
can work on fleshing out the locality/affinity pieces as we start getting the 
remaining parts in.

I am considering starting the development on a feature branch so we have a 
chance to change things before merging into trunk and branch-2. Are people 
okay with that?

> [Umbrella] Support for Disk as a Resource in YARN 
> --
>
> Key: YARN-2139
> URL: https://issues.apache.org/jira/browse/YARN-2139
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wei Yan
> Attachments: Disk_IO_Scheduling_Design_1.pdf, 
> Disk_IO_Scheduling_Design_2.pdf, YARN-2139-prototype-2.patch, 
> YARN-2139-prototype.patch
>
>
> YARN should consider disk as another resource for (1) scheduling tasks on 
> nodes, (2) isolation at runtime, (3) spindle locality. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221556#comment-14221556
 ] 

Wangda Tan commented on YARN-2139:
--

Thanks [~ywskycn] for the design doc and prototype.

I have a similar feeling to what [~acmurthy] commented: the disk resource is a 
little different from vcores. CPU is a shared resource; processes/threads can 
occupy CPU cores and can easily be switched to other cores. Disks are not like 
that (RAID aside): if a process writes to a file on a local disk (as Kafka 
does), you cannot easily move the file being written to another disk.

Also, we need to consider that if multiple containers are scheduled to the same 
physical disk, the total bandwidth of these containers may drop very quickly.

So I think scheduling for disks is more like *affinity* to disks (e.g., give 
disks #1, #2, #4 to the container) instead of just limiting the number of 
processes on each node.

Any thoughts? Please feel free to correct me if I'm wrong.

Thanks,
Wangda

> [Umbrella] Support for Disk as a Resource in YARN 
> --
>
> Key: YARN-2139
> URL: https://issues.apache.org/jira/browse/YARN-2139
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wei Yan
> Attachments: Disk_IO_Scheduling_Design_1.pdf, 
> Disk_IO_Scheduling_Design_2.pdf, YARN-2139-prototype-2.patch, 
> YARN-2139-prototype.patch
>
>
> YARN should consider disk as another resource for (1) scheduling tasks on 
> nodes, (2) isolation at runtime, (3) spindle locality. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml

2014-11-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221513#comment-14221513
 ] 

Sandy Ryza commented on YARN-2669:
--

This is looking good.  A few comments.

Can we add documentation for this behavior in FairScheduler.apt.vm?

We should be doing the same conversion for group names, right?

{code}
+  + " submitted by user " + user + " with an illegal queue name ("
+  + queueName + "). "
{code}
Nit: I think it's better not to surround the queue name with parentheses.

{code}
+return queueName + "." + convertUsername(user);
{code}
Can we call convertUsername something like cleanUsername to be a little more 
descriptive?

> FairScheduler: queueName shouldn't allow periods the allocation.xml
> ---
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2679) Add metric for container launch duration

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221514#comment-14221514
 ] 

Hudson commented on YARN-2679:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6587 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6587/])
YARN-2679. Add metric for container launch duration. (Zhihai Xu via kasha) 
(kasha: rev 233b61e495e136a843dabb7315bbb9ea37e7adce)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/metrics/TestNodeManagerMetrics.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/metrics/NodeManagerMetrics.java


> Add metric for container launch duration
> 
>
> Key: YARN-2679
> URL: https://issues.apache.org/jira/browse/YARN-2679
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.7.0
>
> Attachments: YARN-2679.000.patch, YARN-2679.001.patch, 
> YARN-2679.002.patch
>
>
> Add a metric in NodeManagerMetrics to get the prepare time to launch a 
> container. The prepare time is the duration between sending the 
> ContainersLauncherEventType.LAUNCH_CONTAINER event and receiving the 
> ContainerEventType.CONTAINER_LAUNCHED event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2664) Improve RM webapp to expose info about reservations.

2014-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221512#comment-14221512
 ] 

Hadoop QA commented on YARN-2664:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12682956/YARN-2664.5.patch
  against trunk revision 23dacb3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 5 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebApp
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebAppFairScheduler
  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5902//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/5902//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5902//console

This message is automatically generated.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Matteo Mazzucchelli
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, 
> YARN-2664.patch
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2675) the containersKilled metrics is not updated when the container is killed during localization.

2014-11-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221509#comment-14221509
 ] 

Karthik Kambatla commented on YARN-2675:


Looks good. [~vinodkv] - do you want to take a look as well? 

> the containersKilled metrics is not updated when the container is killed 
> during localization.
> -
>
> Key: YARN-2675
> URL: https://issues.apache.org/jira/browse/YARN-2675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-2675.000.patch, YARN-2675.001.patch, 
> YARN-2675.002.patch, YARN-2675.003.patch, YARN-2675.004.patch
>
>
> The containersKilled metric is not updated when the container is killed 
> during localization. We should add the KILLING state to finished() in 
> ContainerImpl.java to update killedContainer.
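
A minimal sketch of the described fix, assuming the metric update happens in a 
switch over the container state in _ContainerImpl.finished()_ (structure 
assumed; not the actual patch):

{code}
// Hypothetical sketch: count containers leaving the KILLING state (e.g.
// killed during localization) toward the killed-containers metric.
case KILLING:
  metrics.killedContainer();
  break;
{code}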



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2892) Unable to get AMRMToken in unmanaged AM when using a secure cluster

2014-11-21 Thread Sevada Abraamyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sevada Abraamyan updated YARN-2892:
---
Description: 
An AMRMToken is retrieved from the ApplicationReport by the YarnClient. 
When the RM creates the ApplicationReport and sends it back to the client, it 
makes a simple security check on whether it should include the AMRMToken in the 
report (see createAndGetApplicationReport in RMAppImpl). This security check 
verifies that the user who submitted the original application is the same user 
who is requesting the ApplicationReport. If they are indeed the same user, it 
includes the AMRMToken; otherwise it does not.

The problem arises from the fact that when an application is submitted, the RM 
saves the short username of the user who created the application (see 
submitApplication in ClientRMService). Afterwards, when the ApplicationReport is 
requested, the system tries to match the full username of the requester against 
the previously stored short username.

In a secure cluster using Kerberos this check fails because the realm is 
stripped from the Kerberos principal when deriving the short username. So, for 
example, the short username might be "Foo" whereas the full username is 
"f...@company.com".

Note: A very similar problem has been previously reported 
([Yarn-2232|https://issues.apache.org/jira/browse/YARN-2232])

  was:
An AMRMToken is retrieved from the ApplicationReport by the YarnClient. 
When the RM creates the ApplicationReport and sends it back to the client it 
makes a simple security check whether it should include the AMRMToken in the 
report (See createAndGetApplicationReport in RMAppImpl).This security check 
verifies that the user who submitted the original application is the same user 
who is requesting the ApplicationReport. If they are indeed the same user then 
it includes the AMRMToken, otherwise it does not include it.

The problem arises from the fact that when an application is submitted, the RM  
saves the short username of the user who created the application (See 
submitApplication in ClientRmService). Afterwards when the ApplicationReport is 
requested, the system tries to match the full username of the requester against 
the previously stored short username. 

In a secure cluster using Kerberos this check fails because the principle is 
stripped from the username when we request a short username. So for example the 
short username might be "Foo" whereas the full username is "f...@company.com"

Note: A very similar problem has been previously reported in the past in 
[Yarn-2232|https://issues.apache.org/jira/browse/YARN-2232]. 


> Unable to get AMRMToken in unmanaged AM when using a secure cluster
> ---
>
> Key: YARN-2892
> URL: https://issues.apache.org/jira/browse/YARN-2892
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Sevada Abraamyan
>
> An AMRMToken is retrieved from the ApplicationReport by the YarnClient. 
> When the RM creates the ApplicationReport and sends it back to the client, it 
> makes a simple security check on whether it should include the AMRMToken in 
> the report (see createAndGetApplicationReport in RMAppImpl). This security 
> check verifies that the user who submitted the original application is the 
> same user who is requesting the ApplicationReport. If they are indeed the same 
> user, it includes the AMRMToken; otherwise it does not.
> The problem arises from the fact that when an application is submitted, the 
> RM saves the short username of the user who created the application (see 
> submitApplication in ClientRMService). Afterwards, when the ApplicationReport 
> is requested, the system tries to match the full username of the requester 
> against the previously stored short username. 
> In a secure cluster using Kerberos this check fails because the realm is 
> stripped from the Kerberos principal when deriving the short username. So, for 
> example, the short username might be "Foo" whereas the full username is 
> "f...@company.com".
> Note: A very similar problem has been previously reported 
> ([Yarn-2232|https://issues.apache.org/jira/browse/YARN-2232])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2892) Unable to get AMRMToken in unmanaged AM when using a secure cluster

2014-11-21 Thread Sevada Abraamyan (JIRA)
Sevada Abraamyan created YARN-2892:
--

 Summary: Unable to get AMRMToken in unmanaged AM when using a 
secure cluster
 Key: YARN-2892
 URL: https://issues.apache.org/jira/browse/YARN-2892
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Sevada Abraamyan


An AMRMToken is retrieved from the ApplicationReport by the YarnClient. 
When the RM creates the ApplicationReport and sends it back to the client, it 
makes a simple security check on whether it should include the AMRMToken in the 
report (see createAndGetApplicationReport in RMAppImpl). This security check 
verifies that the user who submitted the original application is the same user 
who is requesting the ApplicationReport. If they are indeed the same user, it 
includes the AMRMToken; otherwise it does not.

The problem arises from the fact that when an application is submitted, the RM 
saves the short username of the user who created the application (see 
submitApplication in ClientRMService). Afterwards, when the ApplicationReport is 
requested, the system tries to match the full username of the requester against 
the previously stored short username.

In a secure cluster using Kerberos this check fails because the realm is 
stripped from the Kerberos principal when deriving the short username. So, for 
example, the short username might be "Foo" whereas the full username is 
"f...@company.com".

Note: A very similar problem has been reported in the past 
([Yarn-2232|https://issues.apache.org/jira/browse/YARN-2232]).
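
For illustration, a sketch of the short-vs-full name mismatch using 
UserGroupInformation (the values are hypothetical; the API calls are real):

{code}
// Hadoop derives the short name from the Kerberos principal via
// auth_to_local rules, so the two strings differ on a secure cluster.
UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
String fullName = ugi.getUserName();        // e.g. "foo@COMPANY.COM"
String shortName = ugi.getShortUserName();  // e.g. "foo"
// The RM stores shortName at submit time but compares the caller's full
// name when building the ApplicationReport, so the check fails.
{code}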



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2679) Add metric for container launch duration

2014-11-21 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2679:
---
Summary: Add metric for container launch duration  (was: add container 
launch prepare time metrics to NM.)

> Add metric for container launch duration
> 
>
> Key: YARN-2679
> URL: https://issues.apache.org/jira/browse/YARN-2679
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-2679.000.patch, YARN-2679.001.patch, 
> YARN-2679.002.patch
>
>
> Add a metric in NodeManagerMetrics to get the prepare time to launch a 
> container. The prepare time is the duration between sending the 
> ContainersLauncherEventType.LAUNCH_CONTAINER event and receiving the 
> ContainerEventType.CONTAINER_LAUNCHED event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml

2014-11-21 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-2669:
--
Attachment: YARN-2669-4.patch

Updated the patch to handle periods in the user-specified queue name.
For a queue name like ".A" or "A.", the scheduler will reject the job and print 
a message to the user. A queue name like "A.B" will be accepted.
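
A minimal sketch of the kind of check described above (hypothetical method and 
names; not the actual patch):

{code}
// Hypothetical sketch: reject queue names with empty dot-separated components,
// e.g. ".A", "A." or "A..B"; a name like "A.B" passes.
static boolean isQueueNameLegal(String queueName) {
  if (queueName.startsWith(".") || queueName.endsWith(".")) {
    return false;
  }
  for (String part : queueName.split("\\.")) {
    if (part.isEmpty()) {
      return false;
    }
  }
  return true;
}
{code}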

> FairScheduler: queueName shouldn't allow periods the allocation.xml
> ---
>
> Key: YARN-2669
> URL: https://issues.apache.org/jira/browse/YARN-2669
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Wei Yan
>Priority: Minor
> Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
> YARN-2669-4.patch
>
>
> For an allocation file like:
> {noformat}
> <allocations>
>   <queue name="root.q1">
>     <minResources>4096mb,4vcores</minResources>
>   </queue>
> </allocations>
> {noformat}
> Users may wish to configure minResources for a queue with the full path 
> "root.q1". However, right now the fair scheduler will treat this configuration 
> as belonging to the queue with the full name "root.root.q1". We need to print 
> a warning message to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2765) Add leveldb-based implementation for RMStateStore

2014-11-21 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221448#comment-14221448
 ] 

Jason Lowe commented on YARN-2765:
--

bq. Can't we do one "create if missing"?

This is to distinguish a state store that wasn't there (and thus needs to be 
created) from an empty, existing state store.  We log different messages during 
startup so it's easy to distinguish between these cases.  IMHO it's important 
to know when the state store wasn't there on startup and needed to be created.
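
For illustration, a sketch of the two-step open this implies, assuming the 
leveldbjni API used elsewhere in Hadoop (not necessarily the exact patch code):

{code}
// Try to open an existing store first; only create it (and log that we did)
// if leveldb reports the database is missing.
Options options = new Options();
options.createIfMissing(false);
DB db;
try {
  db = JniDBFactory.factory.open(storeDir, options);
  LOG.info("Using existing state database at " + storeDir);
} catch (NativeDB.DBException e) {
  if (e.isNotFound()) {
    LOG.info("Creating state database at " + storeDir);
    options.createIfMissing(true);
    db = JniDBFactory.factory.open(storeDir, options);
  } else {
    throw e;
  }
}
{code}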

> Add leveldb-based implementation for RMStateStore
> -
>
> Key: YARN-2765
> URL: https://issues.apache.org/jira/browse/YARN-2765
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-2765.patch, YARN-2765v2.patch
>
>
> It would be nice to have a leveldb option to the resourcemanager recovery 
> store. Leveldb would provide some benefits over the existing filesystem store 
> such as better support for atomic operations, fewer I/O ops per state update, 
> and far fewer total files on the filesystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2014-11-21 Thread Matteo Mazzucchelli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Mazzucchelli updated YARN-2664:
--
Attachment: YARN-2664.5.patch

Hi Carlo, you are right.
I have included the wrong library in the patch.

> Improve RM webapp to expose info about reservations.
> 
>
> Key: YARN-2664
> URL: https://issues.apache.org/jira/browse/YARN-2664
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Carlo Curino
>Assignee: Matteo Mazzucchelli
> Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
> YARN-2664.2.patch, YARN-2664.3.patch, YARN-2664.4.patch, YARN-2664.5.patch, 
> YARN-2664.patch
>
>
> YARN-1051 provides a new functionality in the RM to ask for reservation on 
> resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2679) add container launch prepare time metrics to NM.

2014-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221405#comment-14221405
 ] 

Hadoop QA commented on YARN-2679:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12682947/YARN-2679.002.patch
  against trunk revision 23dacb3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5901//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5901//console

This message is automatically generated.

> add container launch prepare time metrics to NM.
> 
>
> Key: YARN-2679
> URL: https://issues.apache.org/jira/browse/YARN-2679
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-2679.000.patch, YARN-2679.001.patch, 
> YARN-2679.002.patch
>
>
> Add a metric in NodeManagerMetrics to get the prepare time to launch a 
> container. The prepare time is the duration between sending the 
> ContainersLauncherEventType.LAUNCH_CONTAINER event and receiving the 
> ContainerEventType.CONTAINER_LAUNCHED event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2679) add container launch prepare time metrics to NM.

2014-11-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221382#comment-14221382
 ] 

Karthik Kambatla commented on YARN-2679:


+1, pending Jenkins. 

> add container launch prepare time metrics to NM.
> 
>
> Key: YARN-2679
> URL: https://issues.apache.org/jira/browse/YARN-2679
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-2679.000.patch, YARN-2679.001.patch, 
> YARN-2679.002.patch
>
>
> Add a metric in NodeManagerMetrics to get the prepare time to launch a 
> container. The prepare time is the duration between sending the 
> ContainersLauncherEventType.LAUNCH_CONTAINER event and receiving the 
> ContainerEventType.CONTAINER_LAUNCHED event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2679) add container launch prepare time metrics to NM.

2014-11-21 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221354#comment-14221354
 ] 

zhihai xu commented on YARN-2679:
-

Uploaded a new patch, YARN-2679.002.patch, to change the metric description to 
"Container launch duration".

> add container launch prepare time metrics to NM.
> 
>
> Key: YARN-2679
> URL: https://issues.apache.org/jira/browse/YARN-2679
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-2679.000.patch, YARN-2679.001.patch, 
> YARN-2679.002.patch
>
>
> Add a metric in NodeManagerMetrics to get the prepare time to launch a 
> container. The prepare time is the duration between sending the 
> ContainersLauncherEventType.LAUNCH_CONTAINER event and receiving the 
> ContainerEventType.CONTAINER_LAUNCHED event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2679) add container launch prepare time metrics to NM.

2014-11-21 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-2679:

Attachment: YARN-2679.002.patch

> add container launch prepare time metrics to NM.
> 
>
> Key: YARN-2679
> URL: https://issues.apache.org/jira/browse/YARN-2679
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: YARN-2679.000.patch, YARN-2679.001.patch, 
> YARN-2679.002.patch
>
>
> Add a metric in NodeManagerMetrics to get the prepare time to launch a 
> container. The prepare time is the duration between sending the 
> ContainersLauncherEventType.LAUNCH_CONTAINER event and receiving the 
> ContainerEventType.CONTAINER_LAUNCHED event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2495) Allow admin specify labels from each NM (Distributed configuration)

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221342#comment-14221342
 ] 

Wangda Tan commented on YARN-2495:
--

Hi [~Naganarasimha],
Thanks for the update.
Several minor comments:
1) Same as in ResourceTrackerService, it's better to have a field like 
isDecentralizedNodeLabelConfigurationEnabled (or some other name you like) in 
NodeStatusUpdaterImpl. That would be clearer than a statement like
{code}
+if (nodeLabelsProvider!=null) {
{code}

2) In ResourceTrackerService,
The message:
{code}
 String message =
 "NodeManager from node " + host + "(cmPort: " + cmPort + " httpPort: "
 + httpPort + ") " + "registered with capability: " + capability
-+ ", assigned nodeId " + nodeId;
++ ", assigned nodeId " + nodeId + ", node labels { "
++ StringUtils.join(",", nodeLabels) + " } ";
{code}
We should add a check so that the node-labels part is only logged when the 
replace succeeds. Ideally you should use a StringBuilder to build this; see the 
sketch below.
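For instance, a minimal sketch of what I mean (the {{nodeLabelsReplaced}} flag 
is illustrative, not something in the current patch):
{code}
StringBuilder message = new StringBuilder();
message.append("NodeManager from node ").append(host)
    .append("(cmPort: ").append(cmPort)
    .append(" httpPort: ").append(httpPort).append(") ")
    .append("registered with capability: ").append(capability)
    .append(", assigned nodeId ").append(nodeId);
// Only mention node labels when the replace actually succeeded.
if (nodeLabelsReplaced) {
  message.append(", node labels { ")
      .append(StringUtils.join(",", nodeLabels)).append(" } ");
}
LOG.info(message.toString());
{code}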

3) A style suggestion: by convention, binary operators like "=", "!=", "+", 
etc. should have a space before and after them. I can see several occurrences 
in the patch.

Wangda

> Allow admin specify labels from each NM (Distributed configuration)
> ---
>
> Key: YARN-2495
> URL: https://issues.apache.org/jira/browse/YARN-2495
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, 
> YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, 
> YARN-2495.20141119-1.patch, YARN-2495_20141022.1.patch
>
>
> Target of this JIRA is to allow admin specify labels in each NM, this covers
> - User can set labels in each NM (by setting yarn-site.xml or using script 
> suggested by [~aw])
> - NM will send labels to RM via ResourceTracker API
> - RM will set labels in NodeLabelManager when NM register/update labels



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2404) Remove ApplicationAttemptState and ApplicationState class in RMStateStore class

2014-11-21 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221334#comment-14221334
 ] 

Jian He commented on YARN-2404:
---

Tsuyoshi, thanks for updating the patch! Looks good overall; some minor 
comments:
- remove the following in loadApplicationAttemptState
{code}
ApplicationAttemptId attemptId =
ConverterUtils.toApplicationAttemptId(attemptIDStr);
{code}

- we may change the attemptTokens type to Credentials and do the conversion 
from/to ByteBuffer inside the method instead of in the caller; a sketch follows 
the code below
{code}
  public abstract ByteBuffer getAppAttemptTokens();
  public abstract void setAppAttemptTokens(ByteBuffer attemptTokens);
{code}
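i.e., a rough sketch of the suggested shape (the signatures are illustrative 
only, not the final API):
{code}
// Hypothetical: callers deal with Credentials directly; the ByteBuffer
// (de)serialization happens inside the state-data implementation.
public abstract Credentials getAppAttemptTokens();
public abstract void setAppAttemptTokens(Credentials attemptTokens);
{code}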
- the following assert is always true
{code}
ApplicationId appId =
appState.getApplicationSubmissionContext().getApplicationId();
// assert child node name is same as actual applicationId
assert appId.equals(
appState.getApplicationSubmissionContext().getApplicationId());
{code}
- the credentials object is not used.
{code}
Credentials credentials = null;
if (attemptState.getAppAttemptTokens() != null) {
  credentials = new Credentials();
  DataInputByteBuffer dibb = new DataInputByteBuffer();
  dibb.reset(attemptState.getAppAttemptTokens());
  credentials.readTokenStorageStream(dibb);
}
{code}


> Remove ApplicationAttemptState and ApplicationState class in RMStateStore 
> class 
> 
>
> Key: YARN-2404
> URL: https://issues.apache.org/jira/browse/YARN-2404
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-2404.1.patch, YARN-2404.2.patch, YARN-2404.3.patch, 
> YARN-2404.4.patch, YARN-2404.5.patch, YARN-2404.6.patch
>
>
> We can remove ApplicationState and ApplicationAttemptState class in 
> RMStateStore, given that we already have ApplicationStateData and 
> ApplicationAttemptStateData records. we may just replace ApplicationState 
> with ApplicationStateData, similarly for ApplicationAttemptState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2891) Failed Container Executor does not provide a clear error message

2014-11-21 Thread Dustin Cote (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Cote updated YARN-2891:
--
Issue Type: Improvement  (was: Bug)

> Failed Container Executor does not provide a clear error message
> 
>
> Key: YARN-2891
> URL: https://issues.apache.org/jira/browse/YARN-2891
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.5.1
> Environment: any
>Reporter: Dustin Cote
>Priority: Minor
>
> When checking access to directories, the container executor does not provide 
> clear information on which directory actually could not be accessed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2891) Failed Container Executor does not provide a clear error message

2014-11-21 Thread Dustin Cote (JIRA)
Dustin Cote created YARN-2891:
-

 Summary: Failed Container Executor does not provide a clear error 
message
 Key: YARN-2891
 URL: https://issues.apache.org/jira/browse/YARN-2891
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.1
 Environment: any
Reporter: Dustin Cote
Priority: Minor


When checking access to directories, the container executor does not provide 
clear information on which directory actually could not be accessed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2765) Add leveldb-based implementation for RMStateStore

2014-11-21 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221326#comment-14221326
 ] 

Zhijie Shen commented on YARN-2765:
---

bq. I knew rocksdb could be used as a cache of data that came from HDFS or 
could be backed-up to HDFS, but I didn't think it could read/write directly to 
it as part of normal operations.

Hm... I must have misunderstood the feature. Thanks for the correction.

One question about the patch: why is it necessary to try to create the DB with 
{{options.createIfMissing(false);}} and then {{options.createIfMissing(true);}} 
if the first attempt fails? Can't we do a single "create if missing"?
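i.e., something like this minimal sketch (untested; it assumes the 
org.iq80.leveldb / leveldbjni API that Hadoop already uses, and {{storePath}} 
is just an illustrative variable):
{code}
// Open the store, creating it only if it does not already exist.
Options options = new Options();
options.createIfMissing(true);
DB db = JniDBFactory.factory.open(new File(storePath), options);
{code}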

> Add leveldb-based implementation for RMStateStore
> -
>
> Key: YARN-2765
> URL: https://issues.apache.org/jira/browse/YARN-2765
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-2765.patch, YARN-2765v2.patch
>
>
> It would be nice to have a leveldb option to the resourcemanager recovery 
> store. Leveldb would provide some benefits over the existing filesystem store 
> such as better support for atomic operations, fewer I/O ops per state update, 
> and far fewer total files on the filesystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221310#comment-14221310
 ] 

Wangda Tan commented on YARN-2762:
--

[~rohithsharma],
Thanks for the patch. I've updated the title a little bit to better describe 
what we want.

Even though with YARN-2843 all hosts/labels will be trimmed to ensure 
correctness, it is also good to have this checking on the CLI side.

Some suggestions:
1) Every label should be trimmed before sending to the RM (a sketch covering 
points 1 and 2 follows below).
2) When there are no labels left after trimming, we should use the same error 
message as
{code}
  else if ("-addToClusterNodeLabels".equals(cmd)) {
if (i >= args.length) {
  System.err.println("No cluster node-labels are specified");
  exitCode = -1;
} else {
  exitCode = addToClusterNodeLabels(args[i]);
}
  }
{code}
To make it consistent.
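A minimal sketch covering points 1 and 2 (variable names are illustrative):
{code}
// Trim each label and skip empty entries, e.g. for input "x,y,,z,".
Set<String> labels = new HashSet<String>();
for (String label : args[i].split(",")) {
  String trimmed = label.trim();
  if (!trimmed.isEmpty()) {
    labels.add(trimmed);
  }
}
if (labels.isEmpty()) {
  System.err.println("No cluster node-labels are specified");
  exitCode = -1;
}
{code}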
3) There's one error message that is not correct:
{code}
else if ("-replaceLabelsOnNode".equals(cmd)) {
if (i >= args.length) {
  System.err.println("No cluster node-labels are specified");
  exitCode = -1;
} else {
  exitCode = replaceLabelsOnNodes(args[i]);
}
{code}
It should be "no node-labels are specified when trying to replace labels on 
node" or something similar; I suggest addressing this together in your patch.

Thanks,
Wangda

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.patch
>
>
> All NodeLabel argument validations are done on the server side. The same can 
> be done in RMAdminCLI so that unnecessary RPC calls can be avoided.
> And for input such as "x,y,,z,", there is no need to add an empty string; it 
> can be skipped instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling

2014-11-21 Thread Sujeet Varakhedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221309#comment-14221309
 ] 

Sujeet Varakhedi commented on YARN-2877:


+1 for distributed scheduling; SQL engines for Hadoop can greatly benefit 
from it. We also need to look at a design where we can give AMs more control 
over scheduling policies: the RM just acts as a source of overall cluster 
state, NMs have local queues, and then, based on NM queue wait times, AMs can 
decide where to request tasks. This is similar to how Sparrow works. This kind 
of scheduling becomes important for services that need dedicated non-shared 
clusters, like HBASE and HAWQ.

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., the task execution time is much smaller than the time required to 
> obtain a container from the RM).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2762) RMAdminCLI node-labels-related args should be trimmed and checked before sending to RM

2014-11-21 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2762:
-
Summary: RMAdminCLI node-labels-related args should be trimmed and checked 
before sending to RM  (was: Provide RMAdminCLI args validation for 
NodeLabelManager operations)

> RMAdminCLI node-labels-related args should be trimmed and checked before 
> sending to RM
> --
>
> Key: YARN-2762
> URL: https://issues.apache.org/jira/browse/YARN-2762
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Rohith
>Priority: Minor
> Attachments: YARN-2762.patch
>
>
> All NodeLabel argument validations are done on the server side. The same can 
> be done in RMAdminCLI so that unnecessary RPC calls can be avoided.
> And for input such as "x,y,,z,", there is no need to add an empty string; it 
> can be skipped instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2882) Introducing container types

2014-11-21 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221288#comment-14221288
 ] 

Konstantinos Karanasos commented on YARN-2882:
--

[~Sujeet Varakhedi] The scope of pre-emption/killing is not only within an 
application. Whenever a guaranteed-start task arrives in an NM that cannot 
accommodate its execution due to running queueable tasks, it is allowed to 
pre-empt/kill one or more of those, even if they belong to another application.
Clearly there can be policies that decide which of the running queueable tasks 
to pre-empt/kill (and one of them could be to avoid pre-empting/killing a task 
of another application, if there is a good reason for that).

> Introducing container types
> ---
>
> Key: YARN-2882
> URL: https://issues.apache.org/jira/browse/YARN-2882
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>
> This JIRA introduces the notion of container types.
> We propose two initial types of containers: guaranteed-start and queueable 
> containers.
> Guaranteed-start are the existing containers, which are allocated by the 
> central RM and are instantaneously started, once allocated.
> Queueable is a new type of container, which allows containers to be queued in 
> the NM, thus their execution may be arbitrarily delayed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2727) In RMAdminCLI usage display, instead of "yarn.node-labels.fs-store.root-dir", "yarn.node-labels.fs-store.uri" is being displayed

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221281#comment-14221281
 ] 

Wangda Tan commented on YARN-2727:
--

[~Naganarasimha], I'll close this as a duplicate; thanks for pointing out this 
issue.

> In RMAdminCLI usage display, instead of "yarn.node-labels.fs-store.root-dir", 
> "yarn.node-labels.fs-store.uri" is being displayed
> 
>
> Key: YARN-2727
> URL: https://issues.apache.org/jira/browse/YARN-2727
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: YARN-2727.20141023.1.patch
>
>
> In the org.apache.hadoop.yarn.client.cli.RMAdminCLI usage display, 
> "yarn.node-labels.fs-store.uri" is being used instead of 
> "yarn.node-labels.fs-store.root-dir".
> Some modifications to the description are also needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2880) Add a test in TestRMRestart to make sure node labels will be recovered if it is enabled

2014-11-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221271#comment-14221271
 ] 

Wangda Tan commented on YARN-2880:
--

[~rohithsharma],
Thanks for taking up this JIRA,
bq. IIUC, as of now recovery is not yet supported till YARN-2800 is committed.
Not really; YARN-2800 is just to improve the user experience. Recovery is 
already supported now.

bq. Any document available?
Not yet, documentation work is still in progress

bq. How can I configure Nodelabels? Is it only rmadmin as of now?
You can use rmadmin or the REST API to configure node labels.

bq. I set labels to NM from rmadmin,but how do I make use of these labels?
Until the documentation is available, you can take a look at 
testQueueParsing...Label... You can also take a look at 
TestContainerAllocation#test..Labels; they're integration tests on the RM side. 
For an end-to-end test, take a look at TestDistributedShellWithNodeLabels.

Please let me know if you have any other questions.

Thanks,
Wangda

> Add a test in TestRMRestart to make sure node labels will be recovered if it 
> is enabled
> ---
>
> Key: YARN-2880
> URL: https://issues.apache.org/jira/browse/YARN-2880
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Wangda Tan
>Assignee: Rohith
>
> As suggested by [~ozawa], 
> [link|https://issues.apache.org/jira/browse/YARN-2800?focusedCommentId=14217569&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14217569].
>  We should have such a test to make sure there will be no regression



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2517) Implement TimelineClientAsync

2014-11-21 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-2517:
--
Target Version/s: 2.7.0

> Implement TimelineClientAsync
> -
>
> Key: YARN-2517
> URL: https://issues.apache.org/jira/browse/YARN-2517
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-2517.1.patch, YARN-2517.2.patch
>
>
> In some scenarios, we'd like to put timeline entities in another thread so 
> as not to block the current one.
> It would be good to have a TimelineClientAsync like AMRMClientAsync and 
> NMClientAsync. It can buffer entities, put them in a separate thread, and 
> have callbacks to handle the responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2517) Implement TimelineClientAsync

2014-11-21 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221249#comment-14221249
 ] 

Zhijie Shen commented on YARN-2517:
---

I'm not sure Future is going to help the use case. Say I want to use 
TimelineClientAsync to do async put-entity operations. With Future, 
putEntitiesAsync returns immediately. However, to know whether my put-entity 
operation was successful or not, I still have to block on Future#get(), or 
create a separate thread to wait for the response. But IMHO, one goal of 
TimelineClientAsync is to relieve users from multithreading details, such that 
Type (1) sounds better to me.
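To make Type (1) concrete, here is a rough sketch of a callback-style API (the 
names are purely illustrative, not from any patch):
{code}
// The client invokes the handler from its own worker thread, so the
// caller never blocks on the put.
public abstract void putEntitiesAsync(CallbackHandler handler,
    TimelineEntity... entities);

public interface CallbackHandler {
  void onPutResponse(TimelinePutResponse response);
  void onError(Throwable t);
}
{code}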

Rethinking whether we should create a separate TimelineClientAsync or add 
async methods to TimelineClient: we have putEntities and putDomain, and in the 
future we will have more get APIs. For now, the API of most concern is 
putEntities, as we don't want it to block the normal execution logic of an 
app. Maybe the compromise for now is to add putEntitiesAsync to 
TimelineClient. In the future, let's see if we want a separate 
TimelineClientAsync that contains a bunch of async APIs.

Thoughts?

> Implement TimelineClientAsync
> -
>
> Key: YARN-2517
> URL: https://issues.apache.org/jira/browse/YARN-2517
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-2517.1.patch, YARN-2517.2.patch
>
>
> In some scenarios, we'd like to put timeline entities in another thread so 
> as not to block the current one.
> It would be good to have a TimelineClientAsync like AMRMClientAsync and 
> NMClientAsync. It can buffer entities, put them in a separate thread, and 
> have callbacks to handle the responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2604) Scheduler should consider max-allocation-* in conjunction with the largest node

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221245#comment-14221245
 ] 

Hudson commented on YARN-2604:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6585 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6585/])
YARN-2604. Scheduler should consider max-allocation-* in conjunction with the 
largest node. (Robert Kanter via kasha) (kasha: rev 
3114d4731dcca7cb6c16aaa7c7a6550b7dd7dccb)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


> Scheduler should consider max-allocation-* in conjunction with the largest 
> node
> ---
>
> Key: YARN-2604
> URL: https://issues.apache.org/jira/browse/YARN-2604
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.5.1
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
> Attachments: YARN-2604.patch, YARN-2604.patch, YARN-2604.patch, 
> YARN-2604.patch, YARN-2604.patch, YARN-2604.patch
>
>
> If the scheduler max-allocation-* values are larger than the resources 
> available on the largest node in the cluster, an application requesting 
> resources between the two values will be accepted by the scheduler but the 
> requests will never be satisfied. The app essentially hangs forever. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2882) Introducing container types

2014-11-21 Thread Sujeet Varakhedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221232#comment-14221232
 ] 

Sujeet Varakhedi commented on YARN-2882:


Is the preemption idea only within the scope of an application? Can a 
guaranteed-start task result in the preemption of a queued task of another 
application?

> Introducing container types
> ---
>
> Key: YARN-2882
> URL: https://issues.apache.org/jira/browse/YARN-2882
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>
> This JIRA introduces the notion of container types.
> We propose two initial types of containers: guaranteed-start and queueable 
> containers.
> Guaranteed-start are the existing containers, which are allocated by the 
> central RM and are instantaneously started, once allocated.
> Queueable is a new type of container, which allows containers to be queued in 
> the NM, thus their execution may be arbitrarily delayed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2869) CapacityScheduler should trim sub queue names when parse configuration

2014-11-21 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2869:
-
Attachment: YARN-2869-2.patch

[~vinodkv], thanks for the review. The "mvn eclipse:eclipse" failure is not 
related. I also added a test that covers trimming of nested queue names.

> CapacityScheduler should trim sub queue names when parse configuration
> --
>
> Key: YARN-2869
> URL: https://issues.apache.org/jira/browse/YARN-2869
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-2869-1.patch, YARN-2869-2.patch
>
>
> Currently, the capacity scheduler doesn't trim sub queue names when parsing 
> queue names. For example, the configuration
> {code}
> <configuration>
>   <property>
>     <name>...root.queues</name>
>     <value>a, b  , c</value>
>   </property>
>   <property>
>     <name>...root.b.capacity</name>
>     <value>100</value>
>   </property>
>   ...
> </configuration>
> {code}
> Will fail with error: 
> {code}
> java.lang.IllegalArgumentException: Illegal capacity of -1.0 for queue root. 
> a 
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getCapacity(CapacitySchedulerConfiguration.java:332)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.getCapacityFromConf(LeafQueue.java:196)
> 
> {code}
> It will try to find queues with the names " a", " b  ", and " c", which is 
> apparently wrong; we should trim these sub queue names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2604) Scheduler should consider max-allocation-* in conjunction with the largest node

2014-11-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221211#comment-14221211
 ] 

Karthik Kambatla commented on YARN-2604:


Looks good. Thanks for your patience through the reviews, Robert. 

+1, checking this in. 

> Scheduler should consider max-allocation-* in conjunction with the largest 
> node
> ---
>
> Key: YARN-2604
> URL: https://issues.apache.org/jira/browse/YARN-2604
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.5.1
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
> Attachments: YARN-2604.patch, YARN-2604.patch, YARN-2604.patch, 
> YARN-2604.patch, YARN-2604.patch, YARN-2604.patch
>
>
> If the scheduler max-allocation-* values are larger than the resources 
> available on the largest node in the cluster, an application requesting 
> resources between the two values will be accepted by the scheduler but the 
> requests will never be satisfied. The app essentially hangs forever. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling

2014-11-21 Thread Sriram Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221203#comment-14221203
 ] 

Sriram Rao commented on YARN-2877:
--

[~airbots]  The number of AMs running on any machine is configurable and 
small (on the order of a few tens), so the overhead on the LocalRM should be 
negligible.

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., the task execution time is much smaller than the time required to 
> obtain a container from the RM).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling

2014-11-21 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221174#comment-14221174
 ] 

Chen He commented on YARN-2877:
---

This is an interesting idea. Distributed scheduling and global scheduling have 
their own pros and cons. In short, global scheduling can achieve optimal 
matching between tasks and resources but may have scalability problems as the 
system grows larger. Distributed scheduling is scalable but may reach 
sub-optimal decisions if there is no communication between the distributed 
schedulers. 

The LocalRM can reduce the RM's burden by handling communication with local 
AMs, which is a good idea. IMHO, worker nodes are becoming increasingly 
powerful and large (more memory and cores). Is it possible that the LocalRM 
affects the NM's performance if there are many AMs running on a single server?

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks on 
> otherwise idle resources on individual machines.
> 2. Reduce allocation latency for tasks where the scheduling time dominates 
> (i.e., the task execution time is much smaller than the time required to 
> obtain a container from the RM).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2604) Scheduler should consider max-allocation-* in conjunction with the largest node

2014-11-21 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221169#comment-14221169
 ] 

Robert Kanter commented on YARN-2604:
-

My last comment should have said "vcores", not "scores". Apple 
_autocorrected_ it :)

> Scheduler should consider max-allocation-* in conjunction with the largest 
> node
> ---
>
> Key: YARN-2604
> URL: https://issues.apache.org/jira/browse/YARN-2604
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.5.1
>Reporter: Karthik Kambatla
>Assignee: Robert Kanter
> Attachments: YARN-2604.patch, YARN-2604.patch, YARN-2604.patch, 
> YARN-2604.patch, YARN-2604.patch, YARN-2604.patch
>
>
> If the scheduler max-allocation-* values are larger than the resources 
> available on the largest node in the cluster, an application requesting 
> resources between the two values will be accepted by the scheduler but the 
> requests will never be satisfied. The app essentially hangs forever. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221052#comment-14221052
 ] 

Hudson commented on YARN-2375:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #12 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/12/])
YARN-2375. Allow enabling/disabling timeline server per framework. (Mit Desai 
via jeagles) (jeagles: rev c298a9a845f89317eb9efad332e6657c56736a4d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java


> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0, 2.6.1
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2087) YARN proxy doesn't relay verbs other than GET

2014-11-21 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved YARN-2087.
--
Resolution: Duplicate

> YARN proxy doesn't relay verbs other than GET
> -
>
> Key: YARN-2087
> URL: https://issues.apache.org/jira/browse/YARN-2087
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>
> The {{WebAppProxy}} class only proxies GET requests; the REST verbs PUT, 
> DELETE, and POST aren't handled. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs

2014-11-21 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated YARN-2031:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-2084

> YARN Proxy model doesn't support REST APIs in AMs
> -
>
> Key: YARN-2031
> URL: https://issues.apache.org/jira/browse/YARN-2031
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> AMs can't support REST APIs because
> # the AM filter redirects all requests to the proxy with a 302 response (not 
> 307)
> # the proxy doesn't forward PUT/POST/DELETE verbs
> Either the AM filter needs to return 307 and the proxy needs to forward the 
> verbs, or the AM filter should not filter the REST part of the web site



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2087) YARN proxy doesn't relay verbs other than GET

2014-11-21 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221037#comment-14221037
 ] 

Steve Loughran commented on YARN-2087:
--

Filed (and forgotten about) as YARN-2031

> YARN proxy doesn't relay verbs other than GET
> -
>
> Key: YARN-2087
> URL: https://issues.apache.org/jira/browse/YARN-2087
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, webapp
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>
> The {{WebAppProxy}} class only proxies GET requests; the REST verbs PUT, 
> DELETE, and POST aren't handled. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2031) YARN Proxy model doesn't support REST APIs in AMs

2014-11-21 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned YARN-2031:


Assignee: Steve Loughran

> YARN Proxy model doesn't support REST APIs in AMs
> -
>
> Key: YARN-2031
> URL: https://issues.apache.org/jira/browse/YARN-2031
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>
> AMs can't support REST APIs because
> # the AM filter redirects all requests to the proxy with a 302 response (not 
> 307)
> # the proxy doesn't forward PUT/POST/DELETE verbs
> Either the AM filter needs to return 307 and the proxy needs to forward the 
> verbs, or the AM filter should not filter the REST part of the web site



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221031#comment-14221031
 ] 

Hudson commented on YARN-2375:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1964 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1964/])
YARN-2375. Allow enabling/disabling timeline server per framework. (Mit Desai 
via jeagles) (jeagles: rev c298a9a845f89317eb9efad332e6657c56736a4d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java


> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0, 2.6.1
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2517) Implement TimelineClientAsync

2014-11-21 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221027#comment-14221027
 ] 

Mit Desai commented on YARN-2517:
-

Yes, I have been working on the timeline service recently. I will take a look.

> Implement TimelineClientAsync
> -
>
> Key: YARN-2517
> URL: https://issues.apache.org/jira/browse/YARN-2517
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhijie Shen
>Assignee: Tsuyoshi OZAWA
> Attachments: YARN-2517.1.patch, YARN-2517.2.patch
>
>
> In some scenarios, we'd like to put timeline entities in another thread so 
> as not to block the current one.
> It would be good to have a TimelineClientAsync like AMRMClientAsync and 
> NMClientAsync. It can buffer entities, put them in a separate thread, and 
> have callbacks to handle the responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2884) Proxying all AM-RM communications

2014-11-21 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221025#comment-14221025
 ] 

Konstantinos Karanasos commented on YARN-2884:
--

[~kasha], [~curino], [~subru], given that this proxy/agent will only focus on 
the AM-RM communication, we may also explicitly call it AMRMProxy or AMRMAgent 
(following the naming convention of the already existing AMRMClient* classes).

[~djp] I just added a comment in the umbrella JIRA (YARN-2877), trying to give 
some more details.
We are not proposing to substitute all scheduling decisions with distributed 
ones. The guaranteed-start containers will continue to be scheduled by the 
central RM. However, the queueable ones will be scheduled in a distributed 
fashion. 
The first candidate for queueable containers is short-running tasks, for 
which the overhead of contacting the central RM is a significant part of the 
overall task execution time. Scheduling these requests without contacting the 
central RM will reduce their latency and increase the utilization of the 
cluster (no idle resources waiting to contact the RM), while also offloading 
the central RM (which is good for scaling in big clusters).

> Proxying all AM-RM communications
> -
>
> Key: YARN-2884
> URL: https://issues.apache.org/jira/browse/YARN-2884
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Carlo Curino
>
> We introduce the notion of an RMProxy, running on each node (or once per 
> rack). Upon start the AM is forced (via tokens and configuration) to direct 
> all its requests to a new services running on the NM that provide a proxy to 
> the central RM. 
> This give us a place to:
> 1) perform distributed scheduling decisions
> 2) throttling mis-behaving AMs
> 3) mask the access to a federation of RMs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling

2014-11-21 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221012#comment-14221012
 ] 

Konstantinos Karanasos commented on YARN-2877:
--

Adding some more details, now that we have added the first sub-tasks.

In YARN-2882 we introduce two *types of containers*: guaranteed-start and 
queueable. The former are the ones existing in YARN today (they are allocated 
from the central RM and, once allocated, are guaranteed to start). The latter 
make
it possible to queue container requests in the NMs and will be used for 
distributed scheduling.
The *queuing of (queueable) container requests* in the NMs is proposed in 
YARN-2883.

Each NM will now also have a *LocalRM* (Local ResourceManager) that will 
receive all container requests from the AMs running on the same machine (a 
rough sketch of this routing follows the list below):
- For the guaranteed-start container requests, the LocalRM acts as a proxy 
(YARN-2884), forwarding them to the central RM. 
- For the queueable container requests, the LocalRM is responsible for sending 
them directly to the NM queues (bypassing the central RM). Deciding the NMs 
where these requests are queued is based on the estimated waiting time in the 
NM queues, as discussed in YARN-2886.
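As a rough illustration of this routing (purely hypothetical method names, not 
from any patch):
{code}
// Sketch of how a LocalRM might dispatch incoming container requests.
for (ResourceRequest request : allocateRequest.getAskList()) {
  if (isGuaranteedStart(request)) {
    // Proxy path (YARN-2884): forward to the central RM.
    forwardToCentralRM(request);
  } else {
    // Queueable path (YARN-2883): pick an NM queue based on the
    // estimated waiting time (YARN-2886) and enqueue directly.
    NodeId target = pickNodeWithShortestQueue(request);
    enqueueAtNode(target, request);
  }
}
{code}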

Based on some policy (YARN-2887), each AM will determine *what type of 
containers to ask for*: only guaranteed-start, only queueable, or a mix 
thereof. For instance, an AM may request guaranteed-start containers for its 
tasks that are expected to be long-running, whereas it may ask for queueable 
containers for its short tasks (for which the back-and-forth with the central 
RM may take longer than the task execution itself). This way we reduce the 
scheduling latency, while increasing the utilization of the cluster (if we had 
to go to the central RM for all these short tasks, some resources of the 
cluster might remain idle in the meantime).

To ensure the NM queues remain balanced, we propose *corrective mechanisms for 
NM queue rebalancing* in YARN-2888.
Moreover, to ensure no AM abuses the system by asking for too many queueable 
containers, we can impose a limit on the *number of queueable containers* that 
each AM can receive (YARN-2889).

> Extend YARN to support distributed scheduling
> -
>
> Key: YARN-2877
> URL: https://issues.apache.org/jira/browse/YARN-2877
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>
> This is an umbrella JIRA that proposes to extend YARN to support distributed 
> scheduling.  Briefly, some of the motivations for distributed scheduling are 
> the following:
> 1. Improve cluster utilization by opportunistically executing tasks otherwise 
> idle resources on individual machines.
> 2. Reduce allocation latency.  Tasks where the scheduling time dominates 
> (i.e., task execution time is much less compared to the time required for 
> obtaining a container from the RM).
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-2375:

Fix Version/s: 2.6.1

> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0, 2.6.1
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220995#comment-14220995
 ] 

Mit Desai commented on YARN-2375:
-

Thanks for the quick reviews [~jeagles] and [~zjshen].


> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so

2014-11-21 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220971#comment-14220971
 ] 

Mit Desai commented on YARN-2890:
-

Forgot to assign it to myself. :P
That's fine. You can carry on.

> MiniMRYarnCluster should turn on timeline service if configured to do so
> 
>
> Key: YARN-2890
> URL: https://issues.apache.org/jira/browse/YARN-2890
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Varun Saxena
> Fix For: 2.6.1
>
>
> Currently the MiniMRYarnCluster does not consider the configuration value for 
> enabling timeline service before starting. The MiniYarnCluster should only 
> start the timeline service if it is configured to do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so

2014-11-21 Thread Mit Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220970#comment-14220970
 ] 

Mit Desai commented on YARN-2890:
-

[~varun_saxena] I was already working on the issue.

> MiniMRYarnCluster should turn on timeline service if configured to do so
> 
>
> Key: YARN-2890
> URL: https://issues.apache.org/jira/browse/YARN-2890
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Varun Saxena
> Fix For: 2.6.1
>
>
> Currently the MiniMRYarnCluster does not consider the configuration value for 
> enabling timeline service before starting. The MiniYarnCluster should only 
> start the timeline service if it is configured to do so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220965#comment-14220965
 ] 

Hudson commented on YARN-2375:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1940 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1940/])
YARN-2375. Allow enabling/disabling timeline server per framework. (Mit Desai 
via jeagles) (jeagles: rev c298a9a845f89317eb9efad332e6657c56736a4d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java
* hadoop-yarn-project/CHANGES.txt


> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220940#comment-14220940
 ] 

Hudson commented on YARN-2375:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #12 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/12/])
YARN-2375. Allow enabling/disabling timeline server per framework. (Mit Desai 
via jeagles) (jeagles: rev c298a9a845f89317eb9efad332e6657c56736a4d)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java


> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220818#comment-14220818
 ] 

Hudson commented on YARN-2375:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #750 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/750/])
YARN-2375. Allow enabling/disabling timeline server per framework. (Mit Desai 
via jeagles) (jeagles: rev c298a9a845f89317eb9efad332e6657c56736a4d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java


> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework

2014-11-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220807#comment-14220807
 ] 

Hudson commented on YARN-2375:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #12 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/12/])
YARN-2375. Allow enabling/disabling timeline server per framework. (Mit Desai 
via jeagles) (jeagles: rev c298a9a845f89317eb9efad332e6657c56736a4d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* hadoop-yarn-project/CHANGES.txt


> Allow enabling/disabling timeline server per framework
> --
>
> Key: YARN-2375
> URL: https://issues.apache.org/jira/browse/YARN-2375
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Mit Desai
> Fix For: 2.7.0
>
> Attachments: YARN-2375.1.patch, YARN-2375.patch, YARN-2375.patch, 
> YARN-2375.patch, YARN-2375.patch
>
>
> This JIRA is to remove the ATS enabled-flag check within the 
> TimelineClientImpl. An example where this fails: while running a secure 
> timeline server with the ATS flag set to disabled on the resource manager, 
> the timeline delegation token renewer throws an NPE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is > minimumAllocation

2014-11-21 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220784#comment-14220784
 ] 

Junping Du commented on YARN-2637:
--

Hi [~cwelch], thanks for the patch update. Could you please check whether the 
failed tests are related to your latest patch? Thanks!

> maximum-am-resource-percent could be violated when resource of AM is > 
> minimumAllocation
> 
>
> Key: YARN-2637
> URL: https://issues.apache.org/jira/browse/YARN-2637
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Wangda Tan
>Assignee: Craig Welch
>Priority: Critical
> Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.2.patch, 
> YARN-2637.6.patch
>
>
> Currently, number of AM in leaf queue will be calculated in following way:
> {code}
> max_am_resource = queue_max_capacity * maximum_am_resource_percent
> #max_am_number = max_am_resource / minimum_allocation
> #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
> {code}
> And when submit new application to RM, it will check if an app can be 
> activated in following way:
> {code}
> for (Iterator i=pendingApplications.iterator(); 
>  i.hasNext(); ) {
>   FiCaSchedulerApp application = i.next();
>   
>   // Check queue limit
>   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
> break;
>   }
>   
>   // Check user limit
>   User user = getUser(application.getUser());
>   if (user.getActiveApplications() < 
> getMaximumActiveApplicationsPerUser()) {
> user.activateApplication();
> activeApplications.add(application);
> i.remove();
> LOG.info("Application " + application.getApplicationId() +
> " from user: " + application.getUser() + 
> " activated in queue: " + getQueueName());
>   }
> }
> {code}
> An example: if a queue has capacity = 1G and maximum_am_resource_percent = 0.2, 
> the maximum resource that AMs can use is 200M. Assuming minimum_allocation = 1M, 
> up to 200 AMs can be launched. If each AM actually uses 5M 
> (> minimum_allocation), all 200 apps can still be activated, and together they 
> occupy the entire resource of the queue instead of only 
> maximum_am_resource_percent of it.
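
For reference, a hedged sketch of one possible fix direction (getAMResource() 
and the surrounding names are illustrative assumptions, not the committed 
patch): charge each application's actual AM resource against the queue's 
AM-resource limit at activation time, instead of deriving a fixed app count 
from minimum_allocation.

{code}
// Sketch only: assumes the surrounding queue state (queueMaxCapacity,
// maxAMResourcePercent, activeApplications, pendingApplications) and that
// each FiCaSchedulerApp can report its real AM resource.
Resource amLimit =
    Resources.multiply(queueMaxCapacity, maxAMResourcePercent);
Resource amUsed = Resource.newInstance(0, 0);
for (FiCaSchedulerApp active : activeApplications) {
  Resources.addTo(amUsed, active.getAMResource());
}
for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator();
     i.hasNext();) {
  FiCaSchedulerApp application = i.next();
  Resource withNextAM = Resources.add(amUsed, application.getAMResource());
  if (!Resources.fitsIn(withNextAM, amLimit)) {
    break; // activating this AM would exceed the AM-resource limit
  }
  amUsed = withNextAM;
  activeApplications.add(application);
  i.remove();
}
{code}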



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2884) Proxying all AM-RM communications

2014-11-21 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220780#comment-14220780
 ] 

Junping Du commented on YARN-2884:
--

I don't think the name matters too much ...
IMO, this sounds like a complicated effort. Before we go ahead, maybe we 
should analyze the motivation for "distributed scheduling decisions": what 
could we gain there, and what could we potentially lose?

> Proxying all AM-RM communications
> -
>
> Key: YARN-2884
> URL: https://issues.apache.org/jira/browse/YARN-2884
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Carlo Curino
>
> We introduce the notion of an RMProxy, running on each node (or once per 
> rack). Upon start, the AM is forced (via tokens and configuration) to direct 
> all its requests to a new service running on the NM that provides a proxy to 
> the central RM. 
> This gives us a place to:
> 1) perform distributed scheduling decisions
> 2) throttle misbehaving AMs
> 3) mask the access to a federation of RMs
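
As a rough illustration of the proxy idea, a minimal sketch assuming the proxy 
simply implements ApplicationMasterProtocol and forwards every AM call to the 
central RM (none of this is the actual patch):

{code}
import java.io.IOException;

import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterResponse;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterRequest;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Hypothetical per-node proxy: since every AM request passes through it,
// it is exactly the hook needed for points 1)-3) above.
public class RMProxyService implements ApplicationMasterProtocol {
  private final ApplicationMasterProtocol centralRM;

  public RMProxyService(ApplicationMasterProtocol centralRM) {
    this.centralRM = centralRM;
  }

  @Override
  public RegisterApplicationMasterResponse registerApplicationMaster(
      RegisterApplicationMasterRequest request)
      throws YarnException, IOException {
    return centralRM.registerApplicationMaster(request);
  }

  @Override
  public AllocateResponse allocate(AllocateRequest request)
      throws YarnException, IOException {
    // Hook point: throttle the ask, make a local scheduling decision, or
    // route the request to one RM of a federation.
    return centralRM.allocate(request);
  }

  @Override
  public FinishApplicationMasterResponse finishApplicationMaster(
      FinishApplicationMasterRequest request)
      throws YarnException, IOException {
    return centralRM.finishApplicationMaster(request);
  }
}
{code}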



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is > minimumAllocation

2014-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220663#comment-14220663
 ] 

Hadoop QA commented on YARN-2637:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12682825/YARN-2637.6.patch
  against trunk revision c298a9a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens
  
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.TestRMContainerImpl
  
org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens
  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
  
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCResponseId
  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager
  
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerUtils
  
org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
  
org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher
  
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates
  
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacitySchedulerPlanFollower
  
org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
  org.apache.hadoop.yarn.server.resourcemanager.TestRM

  The following test timeouts occurred in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5900//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5900//console

This message is automatically generated.

> maximum-am-resource-percent could be violated when resource of AM is > 
> minimumAllocation
> 
>
> Key: YARN-2637
> URL: https://issues.apache.org/jira/browse/YARN-2637
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Wangda Tan
>Assignee: Craig Welch
>Priority: Critical
> Attachments: YARN-2637.0.patch, YARN-2637.1.patch, YARN-2637.2.patch, 
> YARN-2637.6.patch
>
>
> Currently, the number of AMs in a leaf queue is calculated in the following way:
> {code}
> max_am_resource = queue_max_capacity * maximum_am_resource_percent
> #max_am_number = max_am_resource / minimum_allocation
> #max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
> {code}
> And when a new application is submitted to the RM, it checks whether the app 
> can be activated in the following way:
> {code}
> for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); 
>  i.hasNext(); ) {
>   FiCaSchedulerApp application = i.next();
>   
>   // Check queue limit
>   if (getNumActiveApplications() >= getMaximumActiveApplications()) {
> break;
>   }
>   
>   // Check user limit
>   User user = getUser(application.getUser());
>   if (user.getActiveApplications() < 
> getMaximumActiveApplicationsPerUser()) {
> user.activateApplication();
> activeApplications.add(application);
> i.remove();
> LOG.info("Application " + application.getApplicationId() +
> " from user: " + application.getUser() + 
> " activated in queue: " + getQueueName());
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so

2014-11-21 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena reassigned YARN-2890:
--

Assignee: Varun Saxena

> MiniMRYarnCluster should turn on timeline service if configured to do so
> 
>
> Key: YARN-2890
> URL: https://issues.apache.org/jira/browse/YARN-2890
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Mit Desai
>Assignee: Varun Saxena
> Fix For: 2.6.1
>
>
> Currently the MiniMRYarnCluster does not consider the configuration value for 
> enabling the timeline service before starting. The MiniYARNCluster should only 
> start the timeline service if it is configured to do so.
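
A minimal sketch of the gating this asks for (the service wiring is an 
assumption for illustration, not the actual fix):

{code}
// Hypothetically inside MiniYARNCluster.serviceInit(): only add the
// timeline service when the configuration asks for it.
boolean timelineServiceEnabled = conf.getBoolean(
    YarnConfiguration.TIMELINE_SERVICE_ENABLED,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED);
if (timelineServiceEnabled) {
  addService(new ApplicationHistoryServerWrapper()); // assumed helper service
}
{code}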



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2188) Client service for cache manager

2014-11-21 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220639#comment-14220639
 ] 

Chris Trezzo commented on YARN-2188:


Thanks for the comments! I will post an updated patch.

> Client service for cache manager
> 
>
> Key: YARN-2188
> URL: https://issues.apache.org/jira/browse/YARN-2188
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
> Attachments: YARN-2188-trunk-v1.patch, YARN-2188-trunk-v2.patch, 
> YARN-2188-trunk-v3.patch, YARN-2188-trunk-v4.patch
>
>
> Implement the client service for the shared cache manager. This service is 
> responsible for handling client requests to use and release resources.
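
For a sense of the surface area, a hypothetical sketch of the client-facing 
operations such a service could expose (all names are assumptions, not the 
actual API from the attached patches):

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;

// Hypothetical protocol: clients claim ("use") a cached resource by its
// checksum and later release it so the cache manager can clean it up.
public interface SharedCacheClientProtocol {

  // Returns the path of the resource in the shared cache, or null if it is
  // not cached and the client should upload it instead.
  String use(ApplicationId appId, String resourceChecksum);

  // Tells the cache manager this application no longer needs the resource.
  void release(ApplicationId appId, String resourceChecksum);
}
{code}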



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)