[jira] [Assigned] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-07-15 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza reassigned YARN-3635:


Assignee: Sandy Ryza  (was: Wangda Tan)

 Get-queue-mapping should be a common interface of YarnScheduler
 ---

 Key: YARN-3635
 URL: https://issues.apache.org/jira/browse/YARN-3635
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Wangda Tan
Assignee: Sandy Ryza
 Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
 YARN-3635.4.patch, YARN-3635.5.patch, YARN-3635.6.patch


 Currently, both the Fair and Capacity Schedulers support queue mapping, which 
 lets the scheduler change an application's queue after it has been submitted.
 One issue with doing this inside a specific scheduler is that if the queue 
 after mapping has a different maximum_allocation/default-node-label-expression 
 than the original queue, {{validateAndCreateResourceRequest}} in RMAppManager 
 checks the wrong queue.
 I propose making queue mapping a common interface of the scheduler, so that 
 RMAppManager can set the post-mapping queue before doing validations.
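 For example (a rough sketch only, not a final API), the common interface could 
 look something like:
 {code}
 // Hypothetical addition to the scheduler interface; names are illustrative.
 public interface QueueMappingResolver {
   /**
    * @param requestedQueue the queue named in the application submission
    * @param user           the submitting user
    * @return the queue the scheduler would actually place the application in
    */
   String resolveQueue(String requestedQueue, String user);
 }
 {code}
 RMAppManager could call this before {{validateAndCreateResourceRequest}}, so 
 that the maximum_allocation/default-node-label-expression checks run against 
 the post-mapping queue.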



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-07-15 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-3635:
-
Assignee: Tan, Wangda  (was: Sandy Ryza)

 Get-queue-mapping should be a common interface of YarnScheduler
 ---

 Key: YARN-3635
 URL: https://issues.apache.org/jira/browse/YARN-3635
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Wangda Tan
Assignee: Tan, Wangda
 Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
 YARN-3635.4.patch, YARN-3635.5.patch, YARN-3635.6.patch


 Currently, both the Fair and Capacity Schedulers support queue mapping, which 
 lets the scheduler change an application's queue after it has been submitted.
 One issue with doing this inside a specific scheduler is that if the queue 
 after mapping has a different maximum_allocation/default-node-label-expression 
 than the original queue, {{validateAndCreateResourceRequest}} in RMAppManager 
 checks the wrong queue.
 I propose making queue mapping a common interface of the scheduler, so that 
 RMAppManager can set the post-mapping queue before doing validations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-07-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629141#comment-14629141
 ] 

Sandy Ryza commented on YARN-3635:
--

BTW I got all this from QueuePlacementPolicy and QueuePlacementRule, which are 
pretty quick reads if you want to take a look.

 Get-queue-mapping should be a common interface of YarnScheduler
 ---

 Key: YARN-3635
 URL: https://issues.apache.org/jira/browse/YARN-3635
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Wangda Tan
Assignee: Tan, Wangda
 Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
 YARN-3635.4.patch, YARN-3635.5.patch, YARN-3635.6.patch


 Currently, both the Fair and Capacity Schedulers support queue mapping, which 
 lets the scheduler change an application's queue after it has been submitted.
 One issue with doing this inside a specific scheduler is that if the queue 
 after mapping has a different maximum_allocation/default-node-label-expression 
 than the original queue, {{validateAndCreateResourceRequest}} in RMAppManager 
 checks the wrong queue.
 I propose making queue mapping a common interface of the scheduler, so that 
 RMAppManager can set the post-mapping queue before doing validations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3635) Get-queue-mapping should be a common interface of YarnScheduler

2015-07-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629139#comment-14629139
 ] 

Sandy Ryza commented on YARN-3635:
--

[~leftnoteasy], apologies for this quick drive-by review - I am currently 
traveling.

The JIRA appears to be lacking a design-doc and I wasn't able to find 
documentation in the patch.  The patch should ultimately include some detailed 
documentation, but I don't want to ask this of you before OKing the approach.  
In light of this, a few questions:
* What steps are required for the Fair Scheduler to integrate with this?
* Is a common way of configuration proposed?
* How does this differ from the current Fair Scheduler model?  To summarize:
** The FS model consists of a sequence of placement rules that the app is 
passed through.
** Each placement rule gets the chance to assign the app to a queue, reject the 
app, or pass.  If it passes, the next rule gets a chance.
** A placement rule can base its decision on:
*** The submitting user.
*** The set of groups the submitting user belongs to.
*** The queue requested in the app submission.
*** A set of configuration options that are specific to the rule.
*** The set of queues given in the Fair Scheduler configuration.
** Rules are marked as terminal if they will never pass.  This helps to avoid 
misconfigurations where users place rules after terminal rules.
** Rules have a create attribute which determines whether they can create a 
new queue or whether they must assign to existing queues.
** Currently the set of placement rules is limited to what's implemented in 
YARN.  I.e. there's no public pluggable rule support.
I noticed from Vinod's comment that this patch follows a similar structure.  
Are there places where my summary would not describe what's going on in this 
patch?
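For concreteness, the model above boils down to roughly the following (a rough 
sketch only, not the actual QueuePlacementPolicy/QueuePlacementRule API):
{code}
import java.util.List;
import java.util.Set;

// Illustrative only -- names and signatures are hypothetical.
interface PlacementRuleSketch {
  // Returns a queue name, or null to "pass" to the next rule; a rule may also
  // throw to reject the application outright.
  String assignApp(String user, Set<String> groups, String requestedQueue,
      Set<String> configuredQueues);

  // True if the rule can never pass (used to flag rules configured after it).
  boolean isTerminal();

  // Whether the rule may create a queue that doesn't already exist.
  boolean canCreate();
}

class PlacementPolicySketch {
  private final List<PlacementRuleSketch> rules;

  PlacementPolicySketch(List<PlacementRuleSketch> rules) { this.rules = rules; }

  String placeApp(String user, Set<String> groups, String requestedQueue,
      Set<String> configuredQueues) {
    for (PlacementRuleSketch rule : rules) {
      String queue = rule.assignApp(user, groups, requestedQueue, configuredQueues);
      if (queue != null) {
        return queue;     // this rule made the decision
      }                   // otherwise the app is passed to the next rule
    }
    throw new IllegalStateException("No rule placed the app; the last rule should be terminal");
  }
}
{code}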



 Get-queue-mapping should be a common interface of YarnScheduler
 ---

 Key: YARN-3635
 URL: https://issues.apache.org/jira/browse/YARN-3635
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Wangda Tan
Assignee: Tan, Wangda
 Attachments: YARN-3635.1.patch, YARN-3635.2.patch, YARN-3635.3.patch, 
 YARN-3635.4.patch, YARN-3635.5.patch, YARN-3635.6.patch


 Currently, both the Fair and Capacity Schedulers support queue mapping, which 
 lets the scheduler change an application's queue after it has been submitted.
 One issue with doing this inside a specific scheduler is that if the queue 
 after mapping has a different maximum_allocation/default-node-label-expression 
 than the original queue, {{validateAndCreateResourceRequest}} in RMAppManager 
 checks the wrong queue.
 I propose making queue mapping a common interface of the scheduler, so that 
 RMAppManager can set the post-mapping queue before doing validations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3866) AM-RM protocol changes to support container resizing

2015-07-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623047#comment-14623047
 ] 

Sandy Ryza commented on YARN-3866:
--

Hi [~jianhe].  Most application writers should be using AMRMClient, so they 
won't be dealing with this interface directly.  That said, given that increase 
and decrease are separate data types, I think two different methods would be 
preferable.
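Roughly what I have in mind (a sketch only; type and method names are 
illustrative, not the real AllocateRequest API):
{code}
import java.util.List;

// Separate accessors for the two directions, since they are distinct types
// and an AM will often send only one kind in a given allocate() call.
interface ResizeRequestsSketch {
  void setIncreaseRequests(List<IncreaseRequestSketch> increases);
  void setDecreaseRequests(List<DecreaseRequestSketch> decreases);
}

class IncreaseRequestSketch { String containerId; int memoryMb; int vcores; }
class DecreaseRequestSketch { String containerId; int memoryMb; int vcores; }
{code}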

 AM-RM protocol changes to support container resizing
 

 Key: YARN-3866
 URL: https://issues.apache.org/jira/browse/YARN-3866
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Reporter: MENG DING
Assignee: MENG DING
 Attachments: YARN-3866.1.patch, YARN-3866.2.patch


 YARN-1447 and YARN-1448 are outdated. 
 This ticket covers the AM-RM protocol changes needed to support container 
 resizing, according to the latest design in YARN-1197:
 1) Add increase/decrease requests in AllocateRequest
 2) Get approved increase/decrease requests from the RM in AllocateResponse
 3) Add relevant test cases



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-17 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591055#comment-14591055
 ] 

Sandy Ryza commented on YARN-1197:
--

The latest proposal makes sense to me as well.  Thanks [~wangda] and [~mding]!

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588478#comment-14588478
 ] 

Sandy Ryza commented on YARN-1197:
--

bq. I think this assumes cluster is quite idle, I understand the low latency 
could be achieved, but it's not guaranteed since we don't support 
oversubscribing, etc.
If the cluster is fully contended we certainly won't get this performance.  But 
as long as there is a decent chunk of space, which is common in many settings, 
we can.  The cluster doesn't need to be fully idle by any means.

More broadly, just because YARN is not good at hitting sub-second latencies 
doesn't mean that it isn't a design goal.  I strongly oppose any argument that 
uses the current slowness of YARN as a justification for why we should make 
architectural decisions that could compromise latencies.

That said, I still don't have a strong grasp on the kind of complexity we're 
introducing in the AM, so would like to try to understand that before arguing 
against you further.

Is the main problem we're grappling with still the one Meng brought up here:
https://issues.apache.org/jira/browse/YARN-1197?focusedCommentId=14556803&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14556803?
I.e. that an AM can receive an increase from the RM, then issue a decrease to 
the NM, and then use its increase to get resources it doesn't deserve?

Or is the idea that, even if we didn't have this JIRA, NMClient is too 
complicated, and we'd like to reduce that?

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586249#comment-14586249
 ] 

Sandy Ryza commented on YARN-1197:
--

Sorry, I've been quiet here for a while, but I'd be concerned about a design 
that requires going through the ResourceManager for decreases.  If I understand 
correctly, this would be a considerable hit to performance, which could be 
prohibitive for frameworks like Spark that might use container resizing for 
allocating per-task resources.

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586687#comment-14586687
 ] 

Sandy Ryza commented on YARN-1197:
--

bq. Going through RM directly is better as the RM will immediately know that 
the resource is available for future allocations
Is the idea that the RM would make allocations using the space before receiving 
acknowledgement from the NodeManager that it has resized the container 
(adjusted cgroups)? 

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587127#comment-14587127
 ] 

Sandy Ryza commented on YARN-1197:
--

Option (a) can complete in the low hundreds of milliseconds if the cluster is 
tuned properly, independent of cluster size.
1) Submit increase request to RM.  Poll RM 100 milliseconds later after 
continuous scheduling thread has run in order to pick up the increase token.
2) Send increase token to NM.

Why does the AM need to poll the NM about increase status before taking action? 
 Does the NM need to do anything other than update its tracking of the 
resources allotted to the container?

Also, it's not unlikely that schedulers will be improved to return the increase 
token on the same heartbeat that requests it.  So this could all happen in 2 
RPCs + a scheduler decision, with no additional wait time.  Anything more than 
that would probably be prohibitively expensive for a framework like Spark that 
wants to submit an increase request before running each task.

Would option (b) ever be able to achieve this kind of latency?
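In code, the option (a) flow I'm describing is roughly the following 
(self-contained sketch; every name is hypothetical and only illustrates the 
two-RPC sequence, not an existing client API):
{code}
import java.util.concurrent.TimeUnit;

interface RmClientSketch {
  void requestIncrease(String containerId, int memoryMb, int vcores); // RPC 1
  String pollIncreaseToken(String containerId);  // null until the RM has decided
}

interface NmClientSketch {
  void increaseContainerResource(String increaseToken);               // RPC 2
}

class IncreaseFlowSketch {
  static void increase(RmClientSketch rm, NmClientSketch nm, String containerId)
      throws InterruptedException {
    rm.requestIncrease(containerId, 4096, 2);
    String token = rm.pollIncreaseToken(containerId);
    while (token == null) {
      // continuous scheduling runs in the meantime; ~100ms with proper tuning
      TimeUnit.MILLISECONDS.sleep(100);
      token = rm.pollIncreaseToken(containerId);
    }
    // NM adjusts cgroups and its own bookkeeping for the container
    nm.increaseContainerResource(token);
  }
}
{code}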

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587072#comment-14587072
 ] 

Sandy Ryza commented on YARN-1197:
--

bq. RM still needs to wait for an acknowledgement from NM to confirm that the 
increase is done before sending out response to AM. This will take two 
heartbeat cycles, but this is not much worse than giving out a token to AM 
first, and then letting AM initiating the increase.

I would argue that waiting for an NM-RM heartbeat is much worse than waiting 
for an AM-RM heartbeat.  With continuous scheduling, the RM can make decisions 
in millisecond time, and the AM can regulate its heartbeats according to the 
application's needs to get fast responses.  If an NM-RM heartbeat is involved, 
the application is at the mercy of the cluster settings, which should be in the 
multi-second range for large clusters.

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587067#comment-14587067
 ] 

Sandy Ryza commented on YARN-1197:
--

Is my understanding correct that the broader plan is to move stopping 
containers out of the AM-NM protocol? 

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587168#comment-14587168
 ] 

Sandy Ryza commented on YARN-1197:
--

bq. If you consider all now/future optimizations, such as continous-scheduling 
/ scheduler make decision at same AM-RM heart-beat. (b) needs one more NM-RM 
heart-beat interval. I agree with you, it could be hundreds of milli-seconds 
(a) vs. multi-seconds (b). when the cluster is idle.

To clarify: with proper tuning, we can currently get low hundreds of 
milliseconds without adding any new scheduler features.  With the new scheduler 
feature I'm imagining, we'd only be limited by the RPC + scheduler time, so we 
could get 10s of milliseconds with proper tuning.

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1197) Support changing resources of an allocated container

2015-06-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587174#comment-14587174
 ] 

Sandy Ryza commented on YARN-1197:
--

Regarding complexity in the AM, the NMClient utility so far has been an API 
that's fairly easy for app developers to interact with.  I've used it more than 
once and had no issues.  Would we not be able to handle most of the additional 
complexity behind it?

 Support changing resources of an allocated container
 

 Key: YARN-1197
 URL: https://issues.apache.org/jira/browse/YARN-1197
 Project: Hadoop YARN
  Issue Type: Task
  Components: api, nodemanager, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Wangda Tan
 Attachments: YARN-1197 old-design-docs-patches-for-reference.zip, 
 YARN-1197_Design.pdf


 The current YARN resource management logic assumes the resources allocated to 
 a container are fixed for its lifetime. When users want to change the 
 resources of an allocated container, the only way is to release it and 
 allocate a new container with the expected size.
 Allowing run-time changes to the resources of an allocated container would 
 give applications better control over resource usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location

2015-05-24 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557912#comment-14557912
 ] 

Sandy Ryza commented on YARN-314:
-

Do we have applications that need this capability?

 Schedulers should allow resource requests of different sizes at the same 
 priority and location
 --

 Key: YARN-314
 URL: https://issues.apache.org/jira/browse/YARN-314
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
 Attachments: yarn-314-prelim.patch


 Currently, resource requests for the same container and locality are expected 
 to all be the same size.
 While it doesn't look like it's needed for apps currently, and can be 
 circumvented by specifying different priorities if absolutely necessary, it 
 seems to me that the ability to request containers with different resource 
 requirements at the same priority level should be there for the future and 
 for completeness' sake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues

2015-05-13 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542897#comment-14542897
 ] 

Sandy Ryza commented on YARN-3633:
--

Another thought is that we could say the max AM share only applies after the 
first AM.
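To make the scenario in the description concrete (plain arithmetic, numbers 
taken from the report):
{code}
public class LogjamCheck {
  public static void main(String[] args) {
    double clusterMb   = 20 * 1024;          // 20GB cluster
    double fairShareMb = clusterMb / 4;      // 4 queues -> 5120 MB each
    double amLimitMb   = fairShareMb * 0.5;  // maxAMShare = 0.5 -> 2560 MB of AMs
    double amSizeMb    = 3 * 1024;           // every user asks for a 3GB AM
    // false for every queue, so no AM ever starts even on an idle cluster:
    System.out.println("AM fits: " + (amSizeMb <= amLimitMb));
  }
}
{code}
Exempting the first AM in each queue from the check would let each queue start 
one 3GB AM and break the deadlock.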

 With Fair Scheduler, cluster can logjam when there are too many queues
 --

 Key: YARN-3633
 URL: https://issues.apache.org/jira/browse/YARN-3633
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: Rohit Agarwal
Priority: Critical

 It's possible to logjam a cluster by submitting many applications at once in 
 different queues.
 For example, let's say there is a cluster with 20GB of total memory. Let's 
 say 4 users submit applications at the same time. The fair share of each 
 queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 
 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the 
 cluster logjams. Nothing gets scheduled even when 20GB of resources are 
 available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-5) Add support for FifoScheduler to schedule CPU along with memory.

2015-05-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535008#comment-14535008
 ] 

Sandy Ryza commented on YARN-5:
---

+1 to Vinod's point

 Add support for FifoScheduler to schedule CPU along with memory.
 

 Key: YARN-5
 URL: https://issues.apache.org/jira/browse/YARN-5
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Arun C Murthy





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-810) Support CGroup ceiling enforcement on CPU

2015-05-06 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-810:

Assignee: (was: Sandy Ryza)

 Support CGroup ceiling enforcement on CPU
 -

 Key: YARN-810
 URL: https://issues.apache.org/jira/browse/YARN-810
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.1.0-beta, 2.0.5-alpha
Reporter: Chris Riccomini
  Labels: BB2015-05-TBR
 Attachments: YARN-810-3.patch, YARN-810-4.patch, YARN-810-5.patch, 
 YARN-810-6.patch, YARN-810.patch, YARN-810.patch


 Problem statement:
 YARN currently lets you define an NM's pcore count, and a pcore:vcore ratio. 
 Containers are then allowed to request vcores between the minimum and maximum 
 defined in the yarn-site.xml.
 In the case where a single-threaded container requests 1 vcore, with a 
 pcore:vcore ratio of 1:4, the container is still allowed to use up to 100% of 
 the core it's using, provided that no other container is also using it. This 
 happens, even though the only guarantee that YARN/CGroups is making is that 
 the container will get at least 1/4th of the core.
 If a second container then comes along, the second container can take 
 resources from the first, provided that the first container is still getting 
 at least its fair share (1/4th).
 There are certain cases where this is desirable. There are also certain cases 
 where it might be desirable to have a hard limit on CPU usage, and not allow 
 the process to go above the specified resource requirement, even if it's 
 available.
 Here's an RFC that describes the problem in more detail:
 http://lwn.net/Articles/336127/
 Solution:
 As it happens, when CFS is used in combination with CGroups, you can enforce 
 a ceiling using two files in cgroups:
 {noformat}
 cpu.cfs_quota_us
 cpu.cfs_period_us
 {noformat}
 The usage of these two files is documented in more detail here:
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html
 Testing:
 I have tested YARN CGroups using the 2.0.5-alpha implementation. By default, 
 it behaves as described above (it is a soft cap, and allows containers to use 
 more than they asked for). I then tested CFS CPU quotas manually with YARN.
 First, you can see that CFS is in use in the CGroup, based on the file names:
 {noformat}
 [criccomi@eat1-qa464 ~]$ sudo -u app ls -l /cgroup/cpu/hadoop-yarn/
 total 0
 -r--r--r-- 1 app app 0 Jun 13 16:46 cgroup.procs
 drwxr-xr-x 2 app app 0 Jun 13 17:08 container_1371141151815_0004_01_02
 -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.cfs_period_us
 -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.cfs_quota_us
 -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.rt_period_us
 -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.rt_runtime_us
 -rw-r--r-- 1 app app 0 Jun 13 16:46 cpu.shares
 -r--r--r-- 1 app app 0 Jun 13 16:46 cpu.stat
 -rw-r--r-- 1 app app 0 Jun 13 16:46 notify_on_release
 -rw-r--r-- 1 app app 0 Jun 13 16:46 tasks
 [criccomi@eat1-qa464 ~]$ sudo -u app cat
 /cgroup/cpu/hadoop-yarn/cpu.cfs_period_us
 100000
 [criccomi@eat1-qa464 ~]$ sudo -u app cat
 /cgroup/cpu/hadoop-yarn/cpu.cfs_quota_us
 -1
 {noformat}
 Oddly, it appears that the cfs_period_us is set to .1s, not 1s.
 We can place processes in hard limits. I have process 4370 running YARN 
 container container_1371141151815_0003_01_03 on a host. By default, it's 
 running at ~300% cpu usage.
 {noformat}
 CPU
 4370 criccomi  20   0 1157m 551m  14m S 240.3  0.8  87:10.91 ...
 {noformat}
 When I set the CFS quota:
 {noformat}
 echo 1000 > /cgroup/cpu/hadoop-yarn/container_1371141151815_0003_01_03/cpu.cfs_quota_us
  CPU
 4370 criccomi  20   0 1157m 563m  14m S  1.0  0.8  90:08.39 ...
 {noformat}
 It drops to 1% usage, and you can see the box has room to spare:
 {noformat}
 Cpu(s):  2.4%us,  1.0%sy,  0.0%ni, 92.2%id,  4.2%wa,  0.0%hi,  0.1%si, 
 0.0%st
 {noformat}
 Turning the quota back to -1:
 {noformat}
 echo -1 > /cgroup/cpu/hadoop-yarn/container_1371141151815_0003_01_03/cpu.cfs_quota_us
 {noformat}
 Burns the cores again:
 {noformat}
 Cpu(s): 11.1%us,  1.7%sy,  0.0%ni, 83.9%id,  3.1%wa,  0.0%hi,  0.2%si, 
 0.0%st
 CPU
 4370 criccomi  20   0 1157m 563m  14m S 253.9  0.8  89:32.31 ...
 {noformat}
 On my dev box, I was testing CGroups by running a python process eight times, 
 to burn through all the cores, since it was doing as described above (giving 
 extra CPU to the process, even with a cpu.shares limit). Toggling the 
 cfs_quota_us seems to enforce a hard limit.
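 As a rough illustration of the quota math (a sketch only; quota/period is the 
 number of cores' worth of CPU time the cgroup may use per period):
 {code}
 public class CfsQuotaSketch {
   // With a pcore:vcore ratio of 1:4, a 1-vcore container is guaranteed 1/4 of
   // a physical core; a hard ceiling at that guarantee would be
   //   cpu.cfs_quota_us = cpu.cfs_period_us * vcores / vcoresPerPcore
   static long quotaUs(long periodUs, int containerVcores, int vcoresPerPcore) {
     return periodUs * containerVcores / vcoresPerPcore;
   }

   public static void main(String[] args) {
     // 100000us period, 1 vcore, 1:4 ratio -> 25000us = 25% of one core
     System.out.println(quotaUs(100000, 1, 4));
   }
 }
 {code}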
 Implementation:
 What do you guys 

[jira] [Commented] (YARN-3485) FairScheduler headroom calculation doesn't consider maxResources for Fifo and FairShare policies

2015-04-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518099#comment-14518099
 ] 

Sandy Ryza commented on YARN-3485:
--

It looks like the patch computes the headroom as min(cluster total - cluster 
consumed, queue max resource).  Do we not want it to be min(cluster total - 
cluster consumed, queue max resource - queue consumed)?
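In other words, something like this (a sketch with plain longs just to show the 
shape; the real calculation is componentwise over memory and vcores):
{code}
public class HeadroomSketch {
  // headroom = min(cluster total - cluster consumed,
  //                queue max     - queue consumed)
  static long headroomMb(long clusterTotalMb, long clusterConsumedMb,
                         long queueMaxMb, long queueConsumedMb) {
    return Math.min(clusterTotalMb - clusterConsumedMb,
                    queueMaxMb - queueConsumedMb);
  }
}
{code}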

 FairScheduler headroom calculation doesn't consider maxResources for Fifo and 
 FairShare policies
 

 Key: YARN-3485
 URL: https://issues.apache.org/jira/browse/YARN-3485
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
 Attachments: yarn-3485-1.patch, yarn-3485-prelim.patch


 FairScheduler's headroom calculation considers the fair share and the 
 cluster's available resources, and the fair share takes maxResources into 
 account. However, for the Fifo and FairShare policies, the fair share is used 
 only for memory and not CPU. So the scheduler ends up showing a higher 
 headroom than is actually available. This can lead to applications waiting 
 for resources far longer than they intend to, e.g. MAPREDUCE-6302.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3485) FairScheduler headroom calculation doesn't consider maxResources for Fifo and FairShare policies

2015-04-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518222#comment-14518222
 ] 

Sandy Ryza commented on YARN-3485:
--

One nit:
{code}
+return Math.min( Math.min(value1, value2), value3);
{code}
has an extra space.

Otherwise +1.

 FairScheduler headroom calculation doesn't consider maxResources for Fifo and 
 FairShare policies
 

 Key: YARN-3485
 URL: https://issues.apache.org/jira/browse/YARN-3485
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
 Attachments: yarn-3485-1.patch, yarn-3485-2.patch, 
 yarn-3485-prelim.patch


 FairScheduler's headroom calculation considers the fair share and the 
 cluster's available resources, and the fair share takes maxResources into 
 account. However, for the Fifo and FairShare policies, the fair share is used 
 only for memory and not CPU. So the scheduler ends up showing a higher 
 headroom than is actually available. This can lead to applications waiting 
 for resources far longer than they intend to, e.g. MAPREDUCE-6302.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler queue

2015-04-02 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-3415:
-
Summary: Non-AM containers can be counted towards amResourceUsage of a Fair 
Scheduler queue  (was: Non-AM containers can be counted towards amResourceUsage 
of a fairscheduler queue)

 Non-AM containers can be counted towards amResourceUsage of a Fair Scheduler 
 queue
 --

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3415.000.patch, YARN-3415.001.patch, 
 YARN-3415.002.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high, and then the cluster 
 got deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue

2015-04-01 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391218#comment-14391218
 ] 

Sandy Ryza commented on YARN-3415:
--

+1

 Non-AM containers can be counted towards amResourceUsage of a fairscheduler 
 queue
 -

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3415.000.patch, YARN-3415.001.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high, and then the cluster 
 got deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue

2015-04-01 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391624#comment-14391624
 ] 

Sandy Ryza commented on YARN-3415:
--

[~ragarwal] did you have any more comments before I commit this?

 Non-AM containers can be counted towards amResourceUsage of a fairscheduler 
 queue
 -

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3415.000.patch, YARN-3415.001.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high, and then the cluster 
 got deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue

2015-03-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387425#comment-14387425
 ] 

Sandy Ryza commented on YARN-3415:
--

This looks mostly reasonable.  A few comments:
* In FSAppAttempt, can we change the "If this container is used to run AM" 
comment to "If not running unmanaged, the first container we allocate is 
always the AM. Update the leaf queue's AM usage"?
* The four lines of comment in FSLeafQueue could be reduced to "If isAMRunning 
is true, we're not running an unmanaged AM."
* Would it make sense to move the call to setAMResource that's currently in 
FairScheduler next to the call to getQueue().addAMResourceUsage(), so that the 
queue and attempt resource usage get updated at the same time?


 Non-AM containers can be counted towards amResourceUsage of a fairscheduler 
 queue
 -

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3415.000.patch


 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high, and then the cluster 
 got deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue

2015-03-28 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385347#comment-14385347
 ] 

Sandy Ryza commented on YARN-3415:
--

Thanks for filing this [~ragarwal] and for taking this up [~zxu].  This seems 
like a fairly serious issue.

 Non-AM containers can be counted towards amResourceUsage of a fairscheduler 
 queue
 -

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu

 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high, and then the cluster 
 got deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue

2015-03-28 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-3415:
-
Target Version/s: 2.7.0, 2.6.1

 Non-AM containers can be counted towards amResourceUsage of a fairscheduler 
 queue
 -

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical

 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high, and then the cluster 
 got deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue

2015-03-28 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-3415:
-
Priority: Critical  (was: Major)

 Non-AM containers can be counted towards amResourceUsage of a fairscheduler 
 queue
 -

 Key: YARN-3415
 URL: https://issues.apache.org/jira/browse/YARN-3415
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: zhihai xu
Priority: Critical

 We encountered this problem while running a Spark cluster. The 
 amResourceUsage for a queue became artificially high, and then the cluster 
 got deadlocked because the maxAMShare constraint kicked in and no new AM got 
 admitted to the cluster.
 I have described the problem in detail here: 
 https://github.com/apache/spark/pull/5233#issuecomment-87160289
 In summary, the condition for adding a container's memory towards 
 amResourceUsage is fragile: it depends on the number of live containers 
 belonging to the app. We saw that the Spark AM went down without explicitly 
 releasing its requested containers, and then one of those containers' memory 
 was counted towards amResourceUsage.
 cc - [~sandyr]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2990) FairScheduler's delay-scheduling always waits for node-local and rack-local delays, even for off-rack-only requests

2015-02-06 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310034#comment-14310034
 ] 

Sandy Ryza commented on YARN-2990:
--

+1.  Sorry for the delay in getting to this.

 FairScheduler's delay-scheduling always waits for node-local and rack-local 
 delays, even for off-rack-only requests
 ---

 Key: YARN-2990
 URL: https://issues.apache.org/jira/browse/YARN-2990
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-2990-0.patch, yarn-2990-1.patch, yarn-2990-2.patch, 
 yarn-2990-test.patch


 Looking at the FairScheduler, it appears the node/rack locality delays are 
 used for all requests, even those that are only off-rack. 
 More details in comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-02-05 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14307602#comment-14307602
 ] 

Sandy Ryza commented on YARN-3101:
--

+1

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
 YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two mistakes cancelled each 
 other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3101) In Fair Scheduler, fix canceling of reservations for exceeding max share

2015-02-05 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-3101:
-
Summary: In Fair Scheduler, fix canceling of reservations for exceeding max 
share  (was: Fix canceling of reservations for exceeding max share)

 In Fair Scheduler, fix canceling of reservations for exceeding max share
 

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
 YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two mistakes cancelled each 
 other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3101) Fix canceling of reservations for exceeding max share

2015-02-05 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-3101:
-
Summary: Fix canceling of reservations for exceeding max share  (was: 
FairScheduler#fitInMaxShare was added to validate reservations but it does not 
consider it )

 Fix canceling of reservations for exceeding max share
 -

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch, YARN-3101.003.patch, 
 YARN-3101.003.patch, YARN-3101.004.patch, YARN-3101.004.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two mistakes cancelled each 
 other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298887#comment-14298887
 ] 

Sandy Ryza commented on YARN-3101:
--

[~adhoot] is this the same condition that's evaluated when reserving a resource 
in the first place?  I.e. might we ever make a reservation and then immediately 
end up canceling it?

Also, I believe [~l201514] is correct that 
reservedAppSchedulable.getResource(reservedPriority))) will not return the 
right quantity and node.getReservedContainer().getReservedResource() is 
correct. 

Last of all, while we're at it, can we rename fitInMaxShare to 
fitsInMaxShare?
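For reference, the check I'd expect is along these lines (a sketch with plain 
longs; the real code works on Resource objects, and whether usage is measured 
before or after adding the reserved container determines if a queue can go one 
container over its max, per the follow-up comment):
{code}
public class FitsInMaxShareSketch {
  // Keep a reservation only if the queue's current usage plus the reserved
  // container still fits under the queue's max share.
  static boolean fitsInMaxShare(long queueUsageMb, long reservedMb, long queueMaxMb) {
    return queueUsageMb + reservedMb <= queueMaxMb;
  }
}
{code}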

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101.001.patch, 
 YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two mistakes cancelled each 
 other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3101) FairScheduler#fitInMaxShare was added to validate reservations but it does not consider it

2015-01-30 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299428#comment-14299428
 ] 

Sandy Ryza commented on YARN-3101:
--

In that case it sounds like the behavior is that we can go one container over 
the max resources.  While this might be worth changing in a separate JIRA, we 
should maintain that behavior with the reservations.

 FairScheduler#fitInMaxShare was added to validate reservations but it does 
 not consider it 
 ---

 Key: YARN-3101
 URL: https://issues.apache.org/jira/browse/YARN-3101
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3101-Siqi.v1.patch, YARN-3101-Siqi.v2.patch, 
 YARN-3101.001.patch, YARN-3101.002.patch


 YARN-2811 added fitInMaxShare to validate reservations on a queue, but did 
 not count the reservation in its calculations. It also had the condition 
 reversed, so the test still passed because the two mistakes cancelled each 
 other out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2990) FairScheduler's delay-scheduling always waits for node-local and rack-local delays, even for off-rack-only requests

2015-01-22 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287871#comment-14287871
 ] 

Sandy Ryza commented on YARN-2990:
--

Other than the addition of the anyLocalRequests check here:
{code}
+  if (offSwitchRequest.getNumContainers() > 0 &&
+  (!anyLocalRequests(priority)
+  || allowedLocality.equals(NodeType.OFF_SWITCH))) {
{code}
are the other changes core to the fix?  If not, given that this is touchy code, 
can we leave things the way they are or make the changes in a separate cleanup 
JIRA?

Also, a couple nits:
* Need some extra indentation in the snippet above
* anyLocalRequests is kind of a confusing name for that method, because any 
often means off-switch when thinking about locality.  Maybe 
hasNodeOrRackRequests.

 FairScheduler's delay-scheduling always waits for node-local and rack-local 
 delays, even for off-rack-only requests
 ---

 Key: YARN-2990
 URL: https://issues.apache.org/jira/browse/YARN-2990
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-2990-0.patch, yarn-2990-1.patch, 
 yarn-2990-test.patch


 Looking at the FairScheduler, it appears the node/rack locality delays are 
 used for all requests, even those that are only off-rack. 
 More details in comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-04 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234035#comment-14234035
 ] 

Sandy Ryza commented on YARN-2910:
--

Using a CopyOnWriteArrayList would make adding an application an O(n) 
operation.  On many clusters, this happens quite frequently.  Acquiring a lock 
is cheap when there is no contention.  If app submissions are frequent, I'd 
rather slow down requests for queue info than the submissions themselves.  
Otherwise, the former shouldn't have a large effect on the performance of the 
latter.
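To illustrate the trade-off (a sketch, not the actual FSLeafQueue code):
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Guarding the plain ArrayList with a read/write lock keeps addApp() cheap and
// only serializes writers; a CopyOnWriteArrayList would instead copy the whole
// list on every app submission.
class RunnableAppsSketch {
  private final List<String> runnableApps = new ArrayList<String>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  void addApp(String appId) {
    lock.writeLock().lock();
    try {
      runnableApps.add(appId);
    } finally {
      lock.writeLock().unlock();
    }
  }

  int getNumRunnableApps() {
    lock.readLock().lock();
    try {
      return runnableApps.size();   // iteration/aggregation would go here
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}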


 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch


 The lists that maintain the runnable and the non-runnable apps are standard 
 ArrayLists, but there is no guarantee that they will only be manipulated by 
 one thread in the system. This can lead to the following exception:
 {noformat}
 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
 CONTACTING RM.
 java.util.ConcurrentModificationException: 
 java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
 at java.util.ArrayList$Itr.next(ArrayList.java:831)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
 {noformat}
 Full stack trace in the attached file.
 We should guard against that by using a thread safe version from 
 java.util.concurrent.CopyOnWriteArrayList



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-11-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2910:
-
Assignee: Wilfred Spiegelenburg

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch


 The lists that maintain the runnable and the non-runnable apps are standard 
 ArrayLists, but there is no guarantee that they will only be manipulated by one 
 thread in the system. This can lead to the following exception:
 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN 
 CONTACTING RM.
 java.util.ConcurrentModificationException: 
 java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
 at java.util.ArrayList$Itr.next(ArrayList.java:831)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
 at 
 org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
 Full stack trace in the attached file.
 We should guard against that by using a thread safe version from 
 java.util.concurrent.CopyOnWriteArrayList



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml

2014-11-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221513#comment-14221513
 ] 

Sandy Ryza commented on YARN-2669:
--

This is looking good.  A few comments.

Can we add documentation for this behavior in FairScheduler.apt.vm?

We should be doing the same conversion for group names, right?

{code}
+  + " submitted by user " + user + " with an illegal queue name ("
+  + queueName + "). "
{code}
Nit: I think it's better not to surround the queue name with parentheses.

{code}
+    return queueName + "." + convertUsername(user);
{code}
Can we call convertUsername something like cleanUsername to be a little more 
descriptive?
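To illustrate, the helper I'm picturing is roughly the following (the name and 
the exact replacement rule are assumptions for discussion, not the patch):
{code}
// Hedged sketch: strip periods from the user name so a user like "first.last"
// doesn't get interpreted as a nested queue path when building the queue name.
private String cleanUsername(String username) {
  return username.replaceAll("\\.", "_dot_");
}
{code}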

 FairScheduler: queueName shouldn't allow periods the allocation.xml
 ---

 Key: YARN-2669
 URL: https://issues.apache.org/jira/browse/YARN-2669
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
 YARN-2669-4.patch


 For an allocation file like:
 {noformat}
 <allocations>
   <queue name="root.q1">
     <minResources>4096mb,4vcores</minResources>
   </queue>
 </allocations>
 {noformat}
 Users may wish to configure minResources for a queue with the full path 
 "root.q1". However, right now, the fair scheduler will treat this 
 configuration as applying to the queue with the full name "root.root.q1". We 
 need to print out a warning msg to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml

2014-11-21 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221604#comment-14221604
 ] 

Sandy Ryza commented on YARN-2669:
--

+1

 FairScheduler: queueName shouldn't allow periods the allocation.xml
 ---

 Key: YARN-2669
 URL: https://issues.apache.org/jira/browse/YARN-2669
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
 YARN-2669-4.patch, YARN-2669-5.patch


 For an allocation file like:
 {noformat}
 <allocations>
   <queue name="root.q1">
     <minResources>4096mb,4vcores</minResources>
   </queue>
 </allocations>
 {noformat}
 Users may wish to configure minResources for a queue with the full path 
 "root.q1". However, right now, the fair scheduler will treat this 
 configuration as applying to the queue with the full name "root.root.q1". We 
 need to print out a warning msg to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2669) FairScheduler: queue names shouldn't allow periods

2014-11-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2669:
-
Summary: FairScheduler: queue names shouldn't allow periods  (was: 
FairScheduler: queueName shouldn't allow periods the allocation.xml)

 FairScheduler: queue names shouldn't allow periods
 --

 Key: YARN-2669
 URL: https://issues.apache.org/jira/browse/YARN-2669
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
 YARN-2669-4.patch, YARN-2669-5.patch


 For an allocation file like:
 {noformat}
 <allocations>
   <queue name="root.q1">
     <minResources>4096mb,4vcores</minResources>
   </queue>
 </allocations>
 {noformat}
 Users may wish to configure minResources for a queue with the full path 
 "root.q1". However, right now, the fair scheduler will treat this 
 configuration as applying to the queue with the full name "root.root.q1". We 
 need to print out a warning msg to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2669) FairScheduler: queue names shouldn't allow periods

2014-11-21 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2669:
-
Priority: Major  (was: Minor)

 FairScheduler: queue names shouldn't allow periods
 --

 Key: YARN-2669
 URL: https://issues.apache.org/jira/browse/YARN-2669
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-2669-1.patch, YARN-2669-2.patch, YARN-2669-3.patch, 
 YARN-2669-4.patch, YARN-2669-5.patch


 For an allocation file like:
 {noformat}
 <allocations>
   <queue name="root.q1">
     <minResources>4096mb,4vcores</minResources>
   </queue>
 </allocations>
 {noformat}
 Users may wish to configure minResources for a queue with the full path 
 "root.q1". However, right now, the fair scheduler will treat this 
 configuration as applying to the queue with the full name "root.root.q1". We 
 need to print out a warning msg to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2811) Fair Scheduler is violating max memory settings in 2.4

2014-11-14 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212015#comment-14212015
 ] 

Sandy Ryza commented on YARN-2811:
--

This looks almost good to go - the last thing is that we should use 
Resources.fitsIn instead of Resources.lessThanOrEqual(RESOURCE_CALCULATOR...), 
as the latter will only consider memory.
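To spell out the difference (usage and maxShare below are placeholder variables):
{code}
// Checks every resource dimension (memory and vcores):
Resources.fitsIn(usage, maxShare);

// With FairScheduler's RESOURCE_CALCULATOR (a DefaultResourceCalculator), this
// effectively compares memory only:
Resources.lessThanOrEqual(RESOURCE_CALCULATOR, clusterResource, usage, maxShare);
{code}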

 Fair Scheduler is violating max memory settings in 2.4
 --

 Key: YARN-2811
 URL: https://issues.apache.org/jira/browse/YARN-2811
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2811.v1.patch, YARN-2811.v2.patch, 
 YARN-2811.v3.patch, YARN-2811.v4.patch, YARN-2811.v5.patch, 
 YARN-2811.v6.patch, YARN-2811.v7.patch


 This has been seen on several queues showing the allocated MB going 
 significantly above the max MB and it appears to have started with the 2.4 
 upgrade. It could be a regression bug from 2.0 to 2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2811) Fair Scheduler is violating max memory settings in 2.4

2014-11-14 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14213045#comment-14213045
 ] 

Sandy Ryza commented on YARN-2811:
--

+1

 Fair Scheduler is violating max memory settings in 2.4
 --

 Key: YARN-2811
 URL: https://issues.apache.org/jira/browse/YARN-2811
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2811.v1.patch, YARN-2811.v2.patch, 
 YARN-2811.v3.patch, YARN-2811.v4.patch, YARN-2811.v5.patch, 
 YARN-2811.v6.patch, YARN-2811.v7.patch, YARN-2811.v8.patch, YARN-2811.v9.patch


 This has been seen on several queues showing the allocated MB going 
 significantly above the max MB and it appears to have started with the 2.4 
 upgrade. It could be a regression bug from 2.0 to 2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2811) In Fair Scheduler, reservation fulfillments shouldn't ignore max share

2014-11-14 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2811:
-
Summary: In Fair Scheduler, reservation fulfillments shouldn't ignore max 
share  (was: Fair Scheduler is violating max memory settings in 2.4)

 In Fair Scheduler, reservation fulfillments shouldn't ignore max share
 --

 Key: YARN-2811
 URL: https://issues.apache.org/jira/browse/YARN-2811
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2811.v1.patch, YARN-2811.v2.patch, 
 YARN-2811.v3.patch, YARN-2811.v4.patch, YARN-2811.v5.patch, 
 YARN-2811.v6.patch, YARN-2811.v7.patch, YARN-2811.v8.patch, YARN-2811.v9.patch


 This has been seen on several queues showing the allocated MB going 
 significantly above the max MB and it appears to have started with the 2.4 
 upgrade. It could be a regression bug from 2.0 to 2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2811) Fair Scheduler is violating max memory settings in 2.4

2014-11-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208403#comment-14208403
 ] 

Sandy Ryza commented on YARN-2811:
--

IIUC, this looks like it will check the immediate parent of the queue, but 
won't go any farther up in the hierarchy.

Can fitsIn be given a more descriptive name, like fitsInMaxShares?

Last, to avoid code duplication, can the check be moved into this same if 
statement:
{code}
if (!reservedAppSchedulable.hasContainerForNode(reservedPriority, node)) {
{code}
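For illustration, the shape I'd expect is something like this recursive check 
(fitsInMaxShare and the traversal are assumptions about the patch, not existing 
code):
{code}
// Hedged sketch: check the leaf queue and every ancestor's max share.
private boolean fitsInMaxShare(FSQueue queue, Resource additionalResource) {
  Resource usagePlusAddition =
      Resources.add(queue.getResourceUsage(), additionalResource);
  if (!Resources.fitsIn(usagePlusAddition, queue.getMaxShare())) {
    return false;
  }
  FSQueue parentQueue = queue.getParent();
  return parentQueue == null || fitsInMaxShare(parentQueue, additionalResource);
}
{code}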

 Fair Scheduler is violating max memory settings in 2.4
 --

 Key: YARN-2811
 URL: https://issues.apache.org/jira/browse/YARN-2811
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2811.v1.patch, YARN-2811.v2.patch, 
 YARN-2811.v3.patch, YARN-2811.v4.patch, YARN-2811.v5.patch


 This has been seen on several queues showing the allocated MB going 
 significantly above the max MB and it appears to have started with the 2.4 
 upgrade. It could be a regression bug from 2.0 to 2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2811) Fair Scheduler is violating max memory settings in 2.4

2014-11-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203685#comment-14203685
 ] 

Sandy Ryza commented on YARN-2811:
--

I just realized an issue with this.  maxResources can be set on parent queues 
as well, so checking the maxResources of the leaf queue that the app is part of 
is not enough.  Sorry for not catching this earlier.

A couple more style nitpicks: remember to keep lines close to 80 characters and 
to put a space after the double slashes that initiate a comment.  Also, FSQueue 
has a getMaxShare method, so you don't need to go to the trouble of getting the 
name and passing it to the map in the allocation configuration.

 Fair Scheduler is violating max memory settings in 2.4
 --

 Key: YARN-2811
 URL: https://issues.apache.org/jira/browse/YARN-2811
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2811.v1.patch, YARN-2811.v2.patch, 
 YARN-2811.v3.patch, YARN-2811.v4.patch


 This has been seen on several queues showing the allocated MB going 
 significantly above the max MB and it appears to have started with the 2.4 
 upgrade. It could be a regression bug from 2.0 to 2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2811) Fair Scheduler is violating max memory settings in 2.4

2014-11-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201884#comment-14201884
 ] 

Sandy Ryza commented on YARN-2811:
--

Cool, thanks for the updated patch.  Are you able to add a test to verify the 
behavior?  A couple nits:

{code}
+    if (Resources.fitsIn(queue.getResourceUsage(), queue.scheduler
+        .getAllocationConfiguration().getMaxResources(queue.getName()))) {
{code}
Since we're in FairScheduler, can we just access the allocation configuration 
directly?

{code}
//Don't hold the reservation if queue reaches its maximum
{code}
Double slashes should have a space after them.

 Fair Scheduler is violating max memory settings in 2.4
 --

 Key: YARN-2811
 URL: https://issues.apache.org/jira/browse/YARN-2811
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2811.v1.patch, YARN-2811.v2.patch, 
 YARN-2811.v3.patch


 This has been seen on several queues showing the allocated MB going 
 significantly above the max MB and it appears to have started with the 2.4 
 upgrade. It could be a regression bug from 2.0 to 2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2811) Fair Scheduler is violating max memory settings in 2.4

2014-11-06 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14199980#comment-14199980
 ] 

Sandy Ryza commented on YARN-2811:
--

Thanks for uncovering this [~l201514].

I think that in this case, in addition to not assigning the container, the 
application should release the reservation so that other apps can get to the 
node.

 Fair Scheduler is violating max memory settings in 2.4
 --

 Key: YARN-2811
 URL: https://issues.apache.org/jira/browse/YARN-2811
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-2811.v1.patch, YARN-2811.v2.patch


 This has been seen on several queues showing the allocated MB going 
 significantly above the max MB and it appears to have started with the 2.4 
 upgrade. It could be a regression bug from 2.0 to 2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: print out a warning log when users provider a queueName starting with root. in the allocation.xml

2014-10-09 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165496#comment-14165496
 ] 

Sandy Ryza commented on YARN-2669:
--

Might it make more sense to just throw a validation error and crash?  Users 
usually don't look in the RM logs unless something is wrong.

 FairScheduler: print out a warning log when users provider a queueName 
 starting with root. in the allocation.xml
 --

 Key: YARN-2669
 URL: https://issues.apache.org/jira/browse/YARN-2669
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor

 For an allocation file like:
 {noformat}
 <allocations>
   <queue name="root.q1">
     <minResources>4096mb,4vcores</minResources>
   </queue>
 </allocations>
 {noformat}
 Users may wish to configure minResources for a queue with the full path 
 "root.q1". However, right now, the fair scheduler will treat this 
 configuration as applying to the queue with the full name "root.root.q1". We 
 need to print out a warning msg to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2669) FairScheduler: print out a warning log when users provider a queueName starting with root. in the allocation.xml

2014-10-09 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165642#comment-14165642
 ] 

Sandy Ryza commented on YARN-2669:
--

We shouldn't allow configured queue names to have periods in them.  I believe 
we already don't accept queues named "root", but if we do, we shouldn't.

 FairScheduler: print out a warning log when users provider a queueName 
 starting with root. in the allocation.xml
 --

 Key: YARN-2669
 URL: https://issues.apache.org/jira/browse/YARN-2669
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor

 For an allocation file like:
 {noformat}
 <allocations>
   <queue name="root.q1">
     <minResources>4096mb,4vcores</minResources>
   </queue>
 </allocations>
 {noformat}
 Users may wish to configure minResources for a queue with the full path 
 "root.q1". However, right now, the fair scheduler will treat this 
 configuration as applying to the queue with the full name "root.root.q1". We 
 need to print out a warning msg to notify users about this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2635) TestRM, TestRMRestart, TestClientToAMTokens should run with both CS and FS

2014-10-03 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14158328#comment-14158328
 ] 

Sandy Ryza commented on YARN-2635:
--

Parametrized should be spelled Paramet *e* rized.  Can you fix that on commit?

Otherwise, +1.

 

 TestRM, TestRMRestart, TestClientToAMTokens should run with both CS and FS
 --

 Key: YARN-2635
 URL: https://issues.apache.org/jira/browse/YARN-2635
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-2635-1.patch, YARN-2635-2.patch, yarn-2635-3.patch, 
 yarn-2635-4.patch


 If we change the scheduler from Capacity Scheduler to Fair Scheduler, the 
 TestRMRestart would fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-2643) Don't create a new DominantResourceCalculator on every FairScheduler.allocate call

2014-10-03 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-2643:


 Summary: Don't create a new DominantResourceCalculator on every 
FairScheduler.allocate call
 Key: YARN-2643
 URL: https://issues.apache.org/jira/browse/YARN-2643
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Sandy Ryza
Assignee: Karthik Kambatla
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1414) with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs

2014-10-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157336#comment-14157336
 ] 

Sandy Ryza commented on YARN-1414:
--

Awesome

 with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs
 -

 Key: YARN-1414
 URL: https://issues.apache.org/jira/browse/YARN-1414
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager, scheduler
Affects Versions: 2.0.5-alpha
Reporter: Siqi Li
Assignee: Siqi Li
 Attachments: YARN-1221-subtask.v1.patch.txt, YARN-1221-v2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2635) TestRMRestart should run with all schedulers

2014-10-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157624#comment-14157624
 ] 

Sandy Ryza commented on YARN-2635:
--

This seems like a good idea.  A few stylistic comments.

Can we rename RMSchedulerParametrizedTestBase to 
ParameterizedSchedulerTestBase?  The former confuses me a little because it 
reads like something that happened, rather than a noun, and RM doesn't seem 
necessary.  Also, Parameterized as spelled in the JUnit class name has three 
e's.  Lastly, can the class include some header comments on what it's doing?

{code}
+  protected void configScheduler(YarnConfiguration conf) throws IOException {
+    // Configure scheduler
{code}
Just name the method configureScheduler instead of an abbreviation then comment.

{code}
+  private void configFifoScheduler(YarnConfiguration conf) {
+    conf.set(YarnConfiguration.RM_SCHEDULER, FifoScheduler.class.getName());
+  }
+
+  private void configCapacityScheduler(YarnConfiguration conf) {
+    conf.set(YarnConfiguration.RM_SCHEDULER,
+        CapacityScheduler.class.getName());
+  }
{code}
These are only one line - can we just inline them?

{code}
+  protected YarnConfiguration conf = null;
{code}
I think better to make this private and expose it through a getConfig method.

Running the tests without FIFO seems reasonable to me.

One last thought - not sure how feasible this is, but the code might be simpler 
if we get rid of SchedulerType and just have the parameters be Configuration 
objects?
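To sketch the Configuration-parameter idea (illustrative only; the Fair 
Scheduler case would also need its allocation file wired up):
{code}
@RunWith(Parameterized.class)
public abstract class ParameterizedSchedulerTestBase {
  @Parameterized.Parameters
  public static Collection<Object[]> schedulerConfs() {
    YarnConfiguration capacity = new YarnConfiguration();
    capacity.set(YarnConfiguration.RM_SCHEDULER,
        CapacityScheduler.class.getName());
    YarnConfiguration fair = new YarnConfiguration();
    fair.set(YarnConfiguration.RM_SCHEDULER, FairScheduler.class.getName());
    return Arrays.asList(new Object[][] {{capacity}, {fair}});
  }

  private final YarnConfiguration conf;

  public ParameterizedSchedulerTestBase(YarnConfiguration conf) {
    this.conf = conf;
  }

  protected YarnConfiguration getConf() {
    return conf;
  }
}
{code}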

 TestRMRestart should run with all schedulers
 

 Key: YARN-2635
 URL: https://issues.apache.org/jira/browse/YARN-2635
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-2635-1.patch, YARN-2635-2.patch, yarn-2635-3.patch


 If we change the scheduler from Capacity Scheduler to Fair Scheduler, the 
 TestRMRestart would fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1414) with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs

2014-10-01 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14156020#comment-14156020
 ] 

Sandy Ryza commented on YARN-1414:
--

[~jrottinghuis] I will take a look. [~l201514] mind rebasing so that the patch 
will apply?

 with Fair Scheduler reserved MB in WebUI is leaking when killing waiting jobs
 -

 Key: YARN-1414
 URL: https://issues.apache.org/jira/browse/YARN-1414
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager, scheduler
Affects Versions: 2.0.5-alpha
Reporter: Siqi Li
Assignee: Siqi Li
 Fix For: 2.2.0

 Attachments: YARN-1221-subtask.v1.patch.txt, YARN-1221-v2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2596) TestWorkPreservingRMRestart for FairScheduler failed on trunk

2014-09-24 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14146921#comment-14146921
 ] 

Sandy Ryza commented on YARN-2596:
--

+1 pending jenkins

 TestWorkPreservingRMRestart for FairScheduler failed on trunk
 -

 Key: YARN-2596
 URL: https://issues.apache.org/jira/browse/YARN-2596
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Junping Du
Assignee: Karthik Kambatla
 Attachments: yarn-2596-1.patch


 As the test result from YARN-668 shows, the test failure can be reproduced 
 locally without applying the new patch to trunk. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2252) Intermittent failure for testcase TestFairScheduler.testContinuousScheduling

2014-09-23 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144424#comment-14144424
 ] 

Sandy Ryza commented on YARN-2252:
--

+1

 Intermittent failure for testcase TestFairScheduler.testContinuousScheduling
 

 Key: YARN-2252
 URL: https://issues.apache.org/jira/browse/YARN-2252
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: trunk-win
Reporter: Ratandeep Ratti
  Labels: hadoop2, scheduler, yarn
 Attachments: YARN-2252-1.patch, yarn-2252-2.patch


 This test-case is failing sporadically on my machine. I think I have a 
 plausible explanation  for this.
 It seems that when the Scheduler is being asked for resources, the resource 
 requests that are being constructed have no preference for the hosts (nodes).
 The two mock hosts constructed, both have a memory of 8192 mb.
 The containers(resources) being requested each require a memory of 1024mb, 
 hence a single node can execute both the resource requests for the 
 application.
 In the end of the test-case it is being asserted that the containers 
 (resource requests) be executed on different nodes, but since we haven't 
 specified any preferences for nodes when requesting the resources, the 
 scheduler (at times) executes both the containers (requests) on the same node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2555) Effective max-allocation-* should consider biggest node

2014-09-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14135013#comment-14135013
 ] 

Sandy Ryza commented on YARN-2555:
--

[~gp.leftnoteasy], this isn't the same as having an NM variable affect the RM 
conf.  Considering the effective max allocation as the biggest node means 
rejecting requests that won't fit on any node, which I believe is the correct 
behavior.  The issue I had with YARN-2422 was handling this at the 
configuration level, rather than properly handling this for heterogeneous 
clusters.

Thanks for pointing that out [~agentvindo.dev] - agreed that this duplicates 
YARN-56.  I think something like the approach outlined here probably makes the 
most sense for that JIRA.
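Roughly what I mean by that approach, with names invented for illustration:
{code}
// Hedged sketch: track the biggest registered node and clamp the effective
// max allocation to it, so oversize requests can be rejected up front.
private Resource maxNodeCapability = Resource.newInstance(0, 0);

void nodeAdded(Resource nodeCapability) {
  maxNodeCapability = Resource.newInstance(
      Math.max(maxNodeCapability.getMemory(), nodeCapability.getMemory()),
      Math.max(maxNodeCapability.getVirtualCores(),
          nodeCapability.getVirtualCores()));
}

Resource getEffectiveMaxAllocation(Resource configuredMax) {
  return Resource.newInstance(
      Math.min(configuredMax.getMemory(), maxNodeCapability.getMemory()),
      Math.min(configuredMax.getVirtualCores(),
          maxNodeCapability.getVirtualCores()));
}
{code}
Node removal would also need to be handled, since the effective max can shrink, 
which is part of why this belongs in the RM rather than in configuration.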

 Effective max-allocation-* should consider biggest node
 ---

 Key: YARN-2555
 URL: https://issues.apache.org/jira/browse/YARN-2555
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.5.1
Reporter: Karthik Kambatla

 The effective max-allocation-mb should be 
 min(admin-configured-max-allocation-mb, max-mb-on-one-node), so we can reject 
 container requests for resources larger than any node. Today, these requests 
 wait forever. 
 We should do this for all resources and update the effective value on node 
 updates. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-415) Capture aggregate memory allocation at the app-level for chargeback

2014-09-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131908#comment-14131908
 ] 

Sandy Ryza commented on YARN-415:
-

Awesome to see this go in!

 Capture aggregate memory allocation at the app-level for chargeback
 ---

 Key: YARN-415
 URL: https://issues.apache.org/jira/browse/YARN-415
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 2.5.0
Reporter: Kendall Thrapp
Assignee: Eric Payne
 Fix For: 2.6.0

 Attachments: YARN-415--n10.patch, YARN-415--n2.patch, 
 YARN-415--n3.patch, YARN-415--n4.patch, YARN-415--n5.patch, 
 YARN-415--n6.patch, YARN-415--n7.patch, YARN-415--n8.patch, 
 YARN-415--n9.patch, YARN-415.201405311749.txt, YARN-415.201406031616.txt, 
 YARN-415.201406262136.txt, YARN-415.201407042037.txt, 
 YARN-415.201407071542.txt, YARN-415.201407171553.txt, 
 YARN-415.201407172144.txt, YARN-415.201407232237.txt, 
 YARN-415.201407242148.txt, YARN-415.201407281816.txt, 
 YARN-415.201408062232.txt, YARN-415.201408080204.txt, 
 YARN-415.201408092006.txt, YARN-415.201408132109.txt, 
 YARN-415.201408150030.txt, YARN-415.201408181938.txt, 
 YARN-415.201408181938.txt, YARN-415.201408212033.txt, 
 YARN-415.201409040036.txt, YARN-415.201409092204.txt, 
 YARN-415.201409102216.txt, YARN-415.patch


 For the purpose of chargeback, I'd like to be able to compute the cost of an
 application in terms of cluster resource usage.  To start out, I'd like to 
 get the memory utilization of an application.  The unit should be MB-seconds 
 or something similar and, from a chargeback perspective, the memory amount 
 should be the memory reserved for the application, as even if the app didn't 
 use all that memory, no one else was able to use it.
 (reserved ram for container 1 * lifetime of container 1) + (reserved ram for
 container 2 * lifetime of container 2) + ... + (reserved ram for container n 
 * lifetime of container n)
 It'd be nice to have this at the app level instead of the job level because:
 1. We'd still be able to get memory usage for jobs that crashed (and wouldn't 
 appear on the job history server).
 2. We'd be able to get memory usage for future non-MR jobs (e.g. Storm).
 This new metric should be available both through the RM UI and RM Web 
 Services REST API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2154) FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2014-09-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14126005#comment-14126005
 ] 

Sandy Ryza commented on YARN-2154:
--

I'd like to add another constraint that I've been thinking about into the mix.  
We don't necessarily need to implement it in this JIRA, but I think it's worth 
considering how it would affect the approach.

A queue should only be able to preempt a container from another queue if every 
queue between the starved queue and their least common ancestor is starved.  
This essentially means that we consider preemption and fairness hierarchically. 
 If the marketing and engineering queues are square in terms of resources, 
starved teams in engineering shouldn't be able to take resources from queues in 
marketing - they should only be able to preempt from queues within engineering.
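In pseudocode, the constraint would be roughly the following (leastCommonAncestor 
and isStarved are assumed helpers, not existing methods):
{code}
// Hedged sketch of the hierarchical preemption constraint described above.
boolean mayPreemptFrom(FSQueue starvedQueue, FSQueue victimQueue) {
  FSQueue lca = leastCommonAncestor(starvedQueue, victimQueue);
  // Every queue on the path from the starved queue up to (but excluding) the
  // least common ancestor must itself be starved.
  for (FSQueue q = starvedQueue; q != lca; q = q.getParent()) {
    if (!isStarved(q)) {
      return false;
    }
  }
  return true;
}
{code}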



 FairScheduler: Improve preemption to preempt only those containers that would 
 satisfy the incoming request
 --

 Key: YARN-2154
 URL: https://issues.apache.org/jira/browse/YARN-2154
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical

 Today, FairScheduler uses a spray-gun approach to preemption. Instead, it 
 should only preempt resources that would satisfy the incoming request. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2486) FileSystem counters can overflow for large number of readOps, largeReadOps, writeOps

2014-09-01 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14117800#comment-14117800
 ] 

Sandy Ryza commented on YARN-2486:
--

Unfortunately these methods were made public in 2.5, so we can't change their 
signatures.  We can, however, add versions with new names that return longs.
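The compatible route would look roughly like this (the Long-suffixed name is 
made up here, and it assumes the underlying counter is widened to a long):
{code}
public int getReadOps() {          // existing public signature, unchanged
  return (int) getReadOpsLong();
}

public long getReadOpsLong() {     // hypothetical new accessor
  return readOps;
}
{code}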

 FileSystem counters can overflow for large number of readOps, largeReadOps, 
 writeOps
 

 Key: YARN-2486
 URL: https://issues.apache.org/jira/browse/YARN-2486
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.5.0, 2.4.1
Reporter: Swapnil Daingade
Priority: Minor

 The org.apache.hadoop.fs.FileSystem.Statistics.StatisticsData class defines 
 readOps, largeReadOps, writeOps as int. Also, the 
 org.apache.hadoop.fs.FileSystem.Statistics class has methods like 
 getReadOps(), getLargeReadOps() and getWriteOps() that return int. These int 
 values can overflow if they exceed 2^31-1, showing negative values. It would be 
 nice if these could be changed to long.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2448) RM should expose the name of the ResourceCalculator being used when AMs register

2014-08-31 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14116828#comment-14116828
 ] 

Sandy Ryza commented on YARN-2448:
--

As Karthik mentioned, the ResourceCalculator is an abstraction used by the 
Capacity Scheduler that isn't a great fit for the Fair Scheduler, which always 
enforces CPU limits but can be configured with a different fairness policy at 
each queue in the hierarchy.  If this is necessary, can we provide a narrower 
interface such as a boolean indicating whether the scheduler considers CPU in 
its decisions?

 RM should expose the name of the ResourceCalculator being used when AMs 
 register
 

 Key: YARN-2448
 URL: https://issues.apache.org/jira/browse/YARN-2448
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: apache-yarn-2448.0.patch, apache-yarn-2448.1.patch


 The RM should expose the name of the ResourceCalculator being used when AMs 
 register, as part of the RegisterApplicationMasterResponse.
 This will allow applications to make better decisions when scheduling. 
 MapReduce for example, only looks at memory when deciding it's scheduling, 
 even though the RM could potentially be using the DominantResourceCalculator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2422) yarn.scheduler.maximum-allocation-mb should not be hard-coded in yarn-default.xml

2014-08-18 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101274#comment-14101274
 ] 

Sandy Ryza commented on YARN-2422:
--

I think it's weird to have a nodemanager property impact what goes on in the 
ResourceManager. Using this property would be especially weird on heterogeneous 
clusters where resources vary from node to node.  Preferable would be to, 
independently of yarn.scheduler.maximum-allocation-mb, make the ResourceManager 
reject any requests that are larger than the largest node in the cluster.  And 
then default yarn.scheduler.maximum-allocation-mb to infinite. 

 yarn.scheduler.maximum-allocation-mb should not be hard-coded in 
 yarn-default.xml
 -

 Key: YARN-2422
 URL: https://issues.apache.org/jira/browse/YARN-2422
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.6.0
Reporter: Gopal V
Priority: Minor
 Attachments: YARN-2422.1.patch


 Cluster with 40Gb NM refuses to run containers > 8Gb.
 It was finally tracked down to yarn-default.xml hard-coding it to 8Gb.
 In case of lack of a better override, it should default to - 
 ${yarn.nodemanager.resource.memory-mb} instead of a hard-coded 8Gb.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2430) FairShareComparator: cache the results of getResourceUsage()

2014-08-18 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101732#comment-14101732
 ] 

Sandy Ryza commented on YARN-2430:
--

I believe #3 is the best approach, as it's more performant than #1, and #2 has 
correctness issues.  I actually implemented it a little while ago as part of 
YARN-1297 and will try to get that in.
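For reference, option #3 as I picture it (field and method names assumed):
{code}
// Hedged sketch: FSLeafQueue caches its usage and the update thread refreshes
// it, so FairShareComparator reads a precomputed value instead of re-summing.
private volatile Resource cachedResourceUsage = Resources.createResource(0);

void recomputeResourceUsage() {    // called from the scheduler's update()
  Resource usage = Resources.createResource(0);
  for (FSAppAttempt app : runnableApps) {
    Resources.addTo(usage, app.getResourceUsage());
  }
  cachedResourceUsage = usage;
}

@Override
public Resource getResourceUsage() {
  return cachedResourceUsage;
}
{code}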

 FairShareComparator: cache the results of getResourceUsage()
 

 Key: YARN-2430
 URL: https://issues.apache.org/jira/browse/YARN-2430
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh

 The compare method of FairShareComparator has 3 invocations of getResourceUsage 
 per comparable object. In the case of queues, the implementation of 
 getResourceUsage requires iterating over the apps and adding up their current 
 usage. The compare method can reuse the result of getResourceUsage to reduce 
 the load by a third. However, to further reduce the load, the result of 
 getResourceUsage can be cached in FSLeafQueue. This would be more efficient 
 since the number of compare invocations on each Comparable object is >= 1.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2420) Fair Scheduler: change yarn.scheduler.fair.assignmultiple from boolean to integer

2014-08-14 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097937#comment-14097937
 ] 

Sandy Ryza commented on YARN-2420:
--

Does yarn.scheduler.fair.max.assign satisfy what you're looking for?

 Fair Scheduler: change yarn.scheduler.fair.assignmultiple from boolean to 
 integer
 -

 Key: YARN-2420
 URL: https://issues.apache.org/jira/browse/YARN-2420
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2420) Fair Scheduler: dynamically update yarn.scheduler.fair.max.assign based on cluster load

2014-08-14 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14097961#comment-14097961
 ] 

Sandy Ryza commented on YARN-2420:
--

Cool.

Regarding adjusting maxassign dynamically, my view has been that this isn't 
needed when continuous scheduling is turned on, and eventually we expect 
everyone to switch over to continuous scheduling.  Thoughts?

 Fair Scheduler: dynamically update yarn.scheduler.fair.max.assign based on 
 cluster load
 ---

 Key: YARN-2420
 URL: https://issues.apache.org/jira/browse/YARN-2420
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2399) FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt

2014-08-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094598#comment-14094598
 ] 

Sandy Ryza commented on YARN-2399:
--

+1

 FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt
 

 Key: YARN-2399
 URL: https://issues.apache.org/jira/browse/YARN-2399
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-2399-1.patch, yarn-2399-2.patch, yarn-2399-3.patch


 FairScheduler has two data structures for an application, making the code 
 hard to track. We should merge these for better maintainability in the 
 long-term. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2413) capacity scheduler will overallocate vcores

2014-08-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094885#comment-14094885
 ] 

Sandy Ryza commented on YARN-2413:
--

The capacity scheduler truncates all vcore requests to 0 if the 
DominantResourceCalculator is not used. I think in this case it also doesn't 
make an effort to respect node vcore capacities at all.

 capacity scheduler will overallocate vcores
 ---

 Key: YARN-2413
 URL: https://issues.apache.org/jira/browse/YARN-2413
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 3.0.0, 2.2.0
Reporter: Allen Wittenauer
Priority: Critical

 It doesn't appear that the capacity scheduler is properly allocating vcores 
 when making scheduling decisions, which may result in overallocation of CPU 
 resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2413) capacity scheduler will overallocate vcores

2014-08-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094899#comment-14094899
 ] 

Sandy Ryza commented on YARN-2413:
--

I believe this is the expected behavior (i.e. Capacity Scheduler by default 
doesn't use vcores in scheduling).

 capacity scheduler will overallocate vcores
 ---

 Key: YARN-2413
 URL: https://issues.apache.org/jira/browse/YARN-2413
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 3.0.0, 2.2.0
Reporter: Allen Wittenauer
Priority: Critical

 It doesn't appear that the capacity scheduler is properly allocating vcores 
 when making scheduling decisions, which may result in overallocation of CPU 
 resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2413) capacity scheduler will overallocate vcores

2014-08-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14094903#comment-14094903
 ] 

Sandy Ryza commented on YARN-2413:
--

I don't have an opinion on whether we should keep this as the default behavior, 
just wanted to clear up that it's what's expected.

 capacity scheduler will overallocate vcores
 ---

 Key: YARN-2413
 URL: https://issues.apache.org/jira/browse/YARN-2413
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 3.0.0, 2.2.0
Reporter: Allen Wittenauer
Priority: Critical

 It doesn't appear that the capacity scheduler is properly allocating vcores 
 when making scheduling decisions, which may result in overallocation of CPU 
 resources.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2399) FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt

2014-08-11 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14093592#comment-14093592
 ] 

Sandy Ryza commented on YARN-2399:
--

I noticed in FSAppAttempt there are some instance variables mixed in with the 
functions.  Not sure if it was like this already, but can we move them up to 
the top?

 FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt
 

 Key: YARN-2399
 URL: https://issues.apache.org/jira/browse/YARN-2399
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-2399-1.patch, yarn-2399-2.patch


 FairScheduler has two data structures for an application, making the code 
 hard to track. We should merge these for better maintainability in the 
 long-term. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2399) FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt

2014-08-11 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14093597#comment-14093597
 ] 

Sandy Ryza commented on YARN-2399:
--

Also, can we move all the methods that implement methods in Schedulable 
together?

{code}
+  // TODO (KK): Rename these
{code}
Rename these?

{code}
-new 
ConcurrentHashMap<ApplicationId,SchedulerApplication<FSSchedulerApp>>();
+new 
ConcurrentHashMap<ApplicationId,SchedulerApplication<FSAppAttempt>>();
{code}
Mind adding a space here after ApplicationId because you're fixing this line 
anyway?

{code}
+  private FSAppAttempt mockAppSched(long startTime) {
+    FSAppAttempt schedApp = mock(FSAppAttempt.class);
+    when(schedApp.getStartTime()).thenReturn(startTime);
+    return schedApp;
   }
{code}
Call this mockAppAttempt?

Otherwise, LGTM

 FairScheduler: Merge AppSchedulable and FSSchedulerApp into FSAppAttempt
 

 Key: YARN-2399
 URL: https://issues.apache.org/jira/browse/YARN-2399
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: yarn-2399-1.patch, yarn-2399-2.patch


 FairScheduler has two data structures for an application, making the code 
 hard to track. We should merge these for better maintainability in the 
 long-term. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-807) When querying apps by queue, iterating over all apps is inefficient and limiting

2014-08-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14090420#comment-14090420
 ] 

Sandy Ryza commented on YARN-807:
-

bq. If you think it's a bug, we can resolve it in YARN-2385. 

bq. We may need to create a Map<queue-name, app-id> in RMContext.
It's also worth considering only holding this map for completed applications, 
so we don't need to keep two maps for running applications.

 When querying apps by queue, iterating over all apps is inefficient and 
 limiting 
 -

 Key: YARN-807
 URL: https://issues.apache.org/jira/browse/YARN-807
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.3.0

 Attachments: YARN-807-1.patch, YARN-807-2.patch, YARN-807-3.patch, 
 YARN-807-4.patch, YARN-807.patch


 The question "which apps are in queue x" can be asked via the RM REST APIs, 
 through the ClientRMService, and through the command line.  In all these 
 cases, the question is answered by scanning through every RMApp and filtering 
 by the app's queue name.
 All schedulers maintain a mapping of queues to applications.  I think it 
 would make more sense to ask the schedulers which applications are in a given 
 queue. This is what was done in MR1. This would also have the advantage of 
 allowing a parent queue to return all the applications on leaf queues under 
 it, and allow queue name aliases, as in the way that "root.default" and 
 "default" refer to the same queue in the fair scheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-807) When querying apps by queue, iterating over all apps is inefficient and limiting

2014-08-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14090447#comment-14090447
 ] 

Sandy Ryza commented on YARN-807:
-

I just remembered a couple reasons why it's important that we go through the 
scheduler:
* *Getting all the apps underneath a parent queue* - the scheduler holds queue 
hierarchy information that allows us to return applications in all leaf queues 
underneath a parent queue.
* *Aliases* - In the Fair Scheduler, "default" is shorthand for 
"root.default", so querying on either of these names should return applications 
in that queue.

I'm open to approaches that don't require going through the scheduler, but I 
think we should make sure they keep supporting these capabilities.

 When querying apps by queue, iterating over all apps is inefficient and 
 limiting 
 -

 Key: YARN-807
 URL: https://issues.apache.org/jira/browse/YARN-807
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.3.0

 Attachments: YARN-807-1.patch, YARN-807-2.patch, YARN-807-3.patch, 
 YARN-807-4.patch, YARN-807.patch


 The question "which apps are in queue x" can be asked via the RM REST APIs, 
 through the ClientRMService, and through the command line.  In all these 
 cases, the question is answered by scanning through every RMApp and filtering 
 by the app's queue name.
 All schedulers maintain a mapping of queues to applications.  I think it 
 would make more sense to ask the schedulers which applications are in a given 
 queue. This is what was done in MR1. This would also have the advantage of 
 allowing a parent queue to return all the applications on leaf queues under 
 it, and allow queue name aliases, as in the way that "root.default" and 
 "default" refer to the same queue in the fair scheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2352) FairScheduler: Collect metrics on duration of critical methods that affect performance

2014-08-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089874#comment-14089874
 ] 

Sandy Ryza commented on YARN-2352:
--

My only comment is that I think it would make more sense to call these metrics 
FSOpDurations.  Otherwise LGTM.

 FairScheduler: Collect metrics on duration of critical methods that affect 
 performance
 --

 Key: YARN-2352
 URL: https://issues.apache.org/jira/browse/YARN-2352
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.4.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: fs-perf-metrics.png, yarn-2352-1.patch, 
 yarn-2352-2.patch, yarn-2352-2.patch, yarn-2352-3.patch, yarn-2352-4.patch


 We need more metrics for better visibility into FairScheduler performance. At 
 the least, we need to do this for (1) handle node events, (2) update, (3) 
 compute fairshares, (4) preemption.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2352) FairScheduler: Collect metrics on duration of critical methods that affect performance

2014-08-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089880#comment-14089880
 ] 

Sandy Ryza commented on YARN-2352:
--

And also - is there a reason we need to change all the clocks to 
getClock()s?

 FairScheduler: Collect metrics on duration of critical methods that affect 
 performance
 --

 Key: YARN-2352
 URL: https://issues.apache.org/jira/browse/YARN-2352
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.4.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: fs-perf-metrics.png, yarn-2352-1.patch, 
 yarn-2352-2.patch, yarn-2352-2.patch, yarn-2352-3.patch, yarn-2352-4.patch


 We need more metrics for better visibility into FairScheduler performance. At 
 the least, we need to do this for (1) handle node events, (2) update, (3) 
 compute fairshares, (4) preemption.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2352) FairScheduler: Collect metrics on duration of critical methods that affect performance

2014-08-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089997#comment-14089997
 ] 

Sandy Ryza commented on YARN-2352:
--

+1

 FairScheduler: Collect metrics on duration of critical methods that affect 
 performance
 --

 Key: YARN-2352
 URL: https://issues.apache.org/jira/browse/YARN-2352
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.4.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: fs-perf-metrics.png, yarn-2352-1.patch, 
 yarn-2352-2.patch, yarn-2352-2.patch, yarn-2352-3.patch, yarn-2352-4.patch, 
 yarn-2352-5.patch


 We need more metrics for better visibility into FairScheduler performance. At 
 the least, we need to do this for (1) handle node events, (2) update, (3) 
 compute fairshares, (4) preemption.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-807) When querying apps by queue, iterating over all apps is inefficient and limiting

2014-08-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14090172#comment-14090172
 ] 

Sandy Ryza commented on YARN-807:
-

Hi [~leftnoteasy],

I think the expected behavior should be to include both active and pending 
apps.  If that changed with this patch, then I introduced a bug.  Perhaps more 
worryingly, it appears that this patch makes it so that completed apps aren't 
returned when querying by queue, which I don't think is necessarily desirable 
behavior.

 When querying apps by queue, iterating over all apps is inefficient and 
 limiting 
 -

 Key: YARN-807
 URL: https://issues.apache.org/jira/browse/YARN-807
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.3.0

 Attachments: YARN-807-1.patch, YARN-807-2.patch, YARN-807-3.patch, 
 YARN-807-4.patch, YARN-807.patch


 The question "which apps are in queue x" can be asked via the RM REST APIs, 
 through the ClientRMService, and through the command line.  In all these 
 cases, the question is answered by scanning through every RMApp and filtering 
 by the app's queue name.
 All schedulers maintain a mapping of queues to applications.  I think it 
 would make more sense to ask the schedulers which applications are in a given 
 queue. This is what was done in MR1. This would also have the advantage of 
 allowing a parent queue to return all the applications on leaf queues under 
 it, and allow queue name aliases, as in the way that "root.default" and 
 "default" refer to the same queue in the fair scheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2352) FairScheduler: Collect metrics on duration of critical methods that affect performance

2014-08-06 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14087887#comment-14087887
 ] 

Sandy Ryza commented on YARN-2352:
--

IIUC, this patch will only record the duration.  If we go that route, I think 
we should call these metrics lastNodeUpdateDuration etc..  However, would it 
make sense to go with an approach that records more historical information?  
For example, RPCMetrics uses a MutableRate to keep stats on the processing time 
for RPCs, and I think a similar model could work here.

Last, is there any need to make the FSPerfMetrics instance static?  Right now I 
think the Fair Scheduler has managed to avoid any mutable static variables. 
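For example, something in the spirit of the following (using MutableRate, as 
RPCMetrics does; field names here are placeholders):
{code}
@Metrics(context = "yarn")
class FSPerfMetrics {
  @Metric("Duration of handling node update events (ms)")
  MutableRate nodeUpdateCall;

  void addNodeUpdateDuration(long durationMs) {
    nodeUpdateCall.add(durationMs);   // keeps count and average, not just the last value
  }
}
{code}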

 FairScheduler: Collect metrics on duration of critical methods that affect 
 performance
 --

 Key: YARN-2352
 URL: https://issues.apache.org/jira/browse/YARN-2352
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.4.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
 Attachments: fs-perf-metrics.png, yarn-2352-1.patch, 
 yarn-2352-2.patch, yarn-2352-2.patch


 We need more metrics for better visibility into FairScheduler performance. At 
 the least, we need to do this for (1) handle node events, (2) update, (3) 
 compute fairshares, (4) preemption.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (YARN-2367) Make ResourceCalculator configurable for FairScheduler and FifoScheduler like CapacityScheduler

2014-07-29 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza resolved YARN-2367.
--

Resolution: Not a Problem

Hi Swapnil,
The Fair Scheduler supports this through a different interface.  Scheduling 
policies can be configured at any queue level in the hierarchy.

In general, the FIFO scheduler lacks most of the advanced functionality of the 
Fair and Capacity schedulers.  My opinion is that achieving parity is a 
non-goal.  If you think this shouldn't be the case, feel free to reopen this 
JIRA under a name like "Support multi-resource scheduling in the FIFO 
scheduler" and we can discuss whether that's worth embarking on.

 Make ResourceCalculator configurable for FairScheduler and FifoScheduler like 
 CapacityScheduler
 ---

 Key: YARN-2367
 URL: https://issues.apache.org/jira/browse/YARN-2367
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.2.0, 2.3.0, 2.4.1
Reporter: Swapnil Daingade
Priority: Minor

 The ResourceCalculator used by CapacityScheduler is read from a configuration 
 file entry (capacity-scheduler.xml: 
 yarn.scheduler.capacity.resource-calculator). This allows for custom 
 implementations that implement the ResourceCalculator interface to be plugged 
 in. It would be nice to have the same functionality in FairScheduler and 
 FifoScheduler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2328) FairScheduler: Verify update and continuous scheduling threads are stopped when the scheduler is stopped

2014-07-22 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071361#comment-14071361
 ] 

Sandy Ryza commented on YARN-2328:
--

{code}
-if (node != null && Resources.fitsIn(minimumAllocation,
-node.getAvailableResource())) {
+if (node != null &&
+Resources.fitsIn(minimumAllocation, node.getAvailableResource())) {
{code}
This looks unrelated.

+1 otherwise.

 FairScheduler: Verify update and continuous scheduling threads are stopped 
 when the scheduler is stopped
 

 Key: YARN-2328
 URL: https://issues.apache.org/jira/browse/YARN-2328
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.4.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Minor
 Attachments: yarn-2328-1.patch


 FairScheduler threads can use a little cleanup and tests. To begin with, the 
 update and continuous-scheduling threads should extend Thread and handle 
 being interrupted. We should have tests for starting and stopping them as 
 well. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2313) Livelock can occur in FairScheduler when there are lots of running apps

2014-07-22 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2313:
-

Summary: Livelock can occur in FairScheduler when there are lots of running 
apps  (was: Livelock can occur on FairScheduler when there are lots of running 
apps)

 Livelock can occur in FairScheduler when there are lots of running apps
 ---

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, YARN-2313.2.patch, YARN-2313.3.patch, 
 YARN-2313.4.patch, rm-stack-trace.txt


 Observed livelock in FairScheduler when there are lots of entries in the 
 queue. After investigating the code, the following case can occur:
 1. {{update()}} called by UpdateThread takes longer than 
 UPDATE_INTERVAL (500ms) if there are lots of queues.
 2. UpdateThread goes into a busy loop.
 3. Other threads (AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2313) Livelock can occur on FairScheduler when there are lots entry in queue

2014-07-20 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14067998#comment-14067998
 ] 

Sandy Ryza commented on YARN-2313:
--

Thanks for reporting this [~ozawa].

A couple of nits:
* The new configuration should be defined in FairSchedulerConfiguration like 
the other fair scheduler props (see the sketch below)
* If I understand correctly, the race described in the findbugs warning could 
never actually happen.  For code readability, I think it's better to add a 
findbugs exclude than an unnecessary synchronization.
* In the warning message, replace "use" with "using"
* Extra space after DEFAULT_RM_SCHEDULER_FS_UPDATE_INTERVAL_MS
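
For the first nit, a sketch of what defining it in FairSchedulerConfiguration could look like (the key name and default here are illustrative, assuming the existing CONF_PREFIX constant; the actual values come from the patch):
{code}
// Sketch (illustrative names): define the new knob next to the other fair
// scheduler properties in FairSchedulerConfiguration.
protected static final String UPDATE_INTERVAL_MS =
    CONF_PREFIX + "update-interval-ms";
protected static final long DEFAULT_UPDATE_INTERVAL_MS = 500L;
{code}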

Eventually, I think we should try to be smarter about the work that goes on in 
update().  In most cases, the fair shares will stay the same, or will only 
change for apps in a particular queue, so we can avoid recomputation.

 Livelock can occur on FairScheduler when there are lots entry in queue
 --

 Key: YARN-2313
 URL: https://issues.apache.org/jira/browse/YARN-2313
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: YARN-2313.1.patch, YARN-2313.2.patch, YARN-2313.3.patch, 
 rm-stack-trace.txt


 Observed livelock in FairScheduler when there are lots of entries in the 
 queue. After investigating the code, the following case can occur:
 1. {{update()}} called by UpdateThread takes longer than 
 UPDATE_INTERVAL (500ms) if there are lots of queues.
 2. UpdateThread goes into a busy loop.
 3. Other threads (AllocationFileReloader, 
 ResourceManager$SchedulerEventDispatcher) can wait forever.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2014-07-20 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068017#comment-14068017
 ] 

Sandy Ryza commented on YARN-796:
-

I'm worried that the proposal is becoming too complex.  Can we try to whittle 
the proposal down to a minimum viable feature?  I'm not necessarily opposed to 
the more advanced parts of it like queue label policies and updating labels on 
the fly, and the design should aim to make them possible in the future, but I 
don't think they need to be part of the initial implementation.

To me it seems like the essential requirements here are:
* A way for nodes to be tagged with labels
* A way to make scheduling requests based on these labels

I'm also skeptical about the need for adding/removing labels dynamically.  Do 
we have concrete use cases for this?

Lastly, as BC and Sunil have pointed out, specifying the labels in the 
NodeManager confs greatly simplifies configuration when nodes are being added.  
Are there advantages to a centralized configuration?



 Allow for (admin) labels on nodes and resource-requests
 ---

 Key: YARN-796
 URL: https://issues.apache.org/jira/browse/YARN-796
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Wangda Tan
 Attachments: LabelBasedScheduling.pdf, 
 Node-labels-Requirements-Design-doc-V1.pdf, YARN-796.patch


 It will be useful for admins to specify labels for nodes. Examples of labels 
 are OS, processor architecture etc.
 We should expose these labels and allow applications to specify labels on 
 resource-requests.
 Obviously we need to support admin operations on adding/removing node labels.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2323) FairShareComparator creates too much Resource object

2014-07-20 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068143#comment-14068143
 ] 

Sandy Ryza commented on YARN-2323:
--

As it's a static final variable, ONE should be all caps.  Otherwise, LGTM.
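
In other words, something along these lines (a sketch of the suggested naming, assuming the constant lives in the comparator class):
{code}
// Hoist the per-comparison allocation into a shared constant; it is read-only,
// so sharing it across comparator calls is safe.
private static final Resource ONE = Resources.createResource(1);
{code}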

 FairShareComparator creates too much Resource object
 

 Key: YARN-2323
 URL: https://issues.apache.org/jira/browse/YARN-2323
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor
 Attachments: YARN-2323.patch


 Each call of {{FairShareComparator}} creates a new Resource object, {{one}}:
 {code}
 Resource one = Resources.createResource(1);
 {code}
 At the volume of 1000 nodes and 1000 apps, the comparator will be called more 
 than 10 million times per second, thus creating more than 10 million {{one}} 
 objects, which is unnecessary.
 Since the object {{one}} is read-only and is never referenced outside of the 
 comparator, we could make it static.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2323) FairShareComparator creates too many Resource objects

2014-07-20 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2323:
-

Summary: FairShareComparator creates too many Resource objects  (was: 
FairShareComparator creates too much Resource object)

 FairShareComparator creates too many Resource objects
 -

 Key: YARN-2323
 URL: https://issues.apache.org/jira/browse/YARN-2323
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor
 Attachments: YARN-2323-2.patch, YARN-2323.patch


 Each call of {{FairShareComparator}} creates a new Resource object, {{one}}:
 {code}
 Resource one = Resources.createResource(1);
 {code}
 At the volume of 1000 nodes and 1000 apps, the comparator will be called more 
 than 10 million times per second, thus creating more than 10 million {{one}} 
 objects, which is unnecessary.
 Since the object {{one}} is read-only and is never referenced outside of the 
 comparator, we could make it static.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2257) Add user to queue mappings to automatically place users' apps into specific queues

2014-07-16 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063749#comment-14063749
 ] 

Sandy Ryza commented on YARN-2257:
--

[~wangda] I agree with you that expecting admins to recompile Hadoop is 
unreasonable.  I don't think we would expect admins to add rules.  The idea is 
more to have a small library of rules we provide that fit into a common 
configuration framework.

If we were to add a QueuePlacementRule that accepts a list of user-to-queue 
mappings, and wanted to express "accept the user's queue if they specify it in 
the ApplicationSubmissionContext; otherwise look for a user-queue mapping; if 
none is found, use the default queue", configuring it according to the current 
Fair Scheduler format would look something like this:
{code}
<queuePlacementPolicy>
  <rule name="specified" />
  <rule name="userToQueue">
    <mapping name="sally" queue="queue1" />
    <mapping name="emilio" queue="queue2" />
  </rule>
  <rule name="default" />
</queuePlacementPolicy>
{code}
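
For illustration, the rule body itself could be as simple as the following (a hypothetical sketch, not the actual QueuePlacementRule API; returning null means fall through to the next rule in the policy):
{code}
import java.util.Map;

// Hypothetical sketch of a user-to-queue rule: return the mapped queue for the
// submitting user, or null so the policy falls through to the next rule.
class UserToQueueRuleSketch {
  private final Map<String, String> userToQueue;

  UserToQueueRuleSketch(Map<String, String> userToQueue) {
    this.userToQueue = userToQueue;
  }

  String assignQueue(String user) {
    return userToQueue.get(user);
  }
}
{code}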

 Add user to queue mappings to automatically place users' apps into specific 
 queues
 --

 Key: YARN-2257
 URL: https://issues.apache.org/jira/browse/YARN-2257
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Patrick Liu
Assignee: Vinod Kumar Vavilapalli
  Labels: features

 Currently, the fair-scheduler supports two modes, default queue or individual 
 queue for each user.
 Apparently, the default queue is not a good option, because the resources 
 cannot be managed for each user or group.
 However, individual queue for each user is not good enough. Especially when 
 connecting yarn with hive. There will be increasing hive users in a corporate 
 environment. If we create a queue for a user, the resource management will be 
 hard to maintain.
 I think the problem can be solved like this:
 1. Define user-queue mapping in Fair-Scheduler.xml. Inside each queue, use 
 aclSubmitApps to control user's ability.
 2. Each time a user submits an app to YARN, if the user has been mapped to a 
 queue, the app will be scheduled to that queue; otherwise, the app will be 
 submitted to the default queue.
 3. If the user cannot pass aclSubmitApps limits, the app will not be accepted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2257) Add user to queue mappings to automatically place users' apps into specific queues

2014-07-14 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060982#comment-14060982
 ] 

Sandy Ryza commented on YARN-2257:
--

Wangda,

That policy would work well for some situations, but I don't think it covers 
many reasonable scenarios.  For example, we might want to ignore the queue that 
the user defines entirely.  Or admins might want to be able to just send apps 
to queues named with the user's group, instead of specifying the mapping for 
every group. Or a subdivision in an organization might want to make placements 
based on group, while a different subdivision using the same cluster might want 
to make placements based on user.

Would you mind taking a look at the "Automatically placing applications in 
queues" section and the corresponding configuration example in
http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
My opinion is that this is a good fit for YARN in general.

 Add user to queue mappings to automatically place users' apps into specific 
 queues
 --

 Key: YARN-2257
 URL: https://issues.apache.org/jira/browse/YARN-2257
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Patrick Liu
Assignee: Vinod Kumar Vavilapalli
  Labels: features

 Currently, the fair-scheduler supports two modes, default queue or individual 
 queue for each user.
 Apparently, the default queue is not a good option, because the resources 
 cannot be managed for each user or group.
 However, individual queue for each user is not good enough. Especially when 
 connecting yarn with hive. There will be increasing hive users in a corporate 
 environment. If we create a queue for a user, the resource management will be 
 hard to maintain.
 I think the problem can be solved like this:
 1. Define user-queue mapping in Fair-Scheduler.xml. Inside each queue, use 
 aclSubmitApps to control user's ability.
 2. Each time a user submits an app to YARN, if the user has been mapped to a 
 queue, the app will be scheduled to that queue; otherwise, the app will be 
 submitted to the default queue.
 3. If the user cannot pass aclSubmitApps limits, the app will not be accepted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-796) Allow for (admin) labels on nodes and resource-requests

2014-07-11 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14059074#comment-14059074
 ] 

Sandy Ryza commented on YARN-796:
-

+1 on reducing the complexity of the label predicates.  We should only use OR 
if we can think of a few concrete use cases where we would need it.

 Allow for (admin) labels on nodes and resource-requests
 ---

 Key: YARN-796
 URL: https://issues.apache.org/jira/browse/YARN-796
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Wangda Tan
 Attachments: LabelBasedScheduling.pdf, 
 Node-labels-Requirements-Design-doc-V1.pdf, YARN-796.patch


 It will be useful for admins to specify labels for nodes. Examples of labels 
 are OS, processor architecture etc.
 We should expose these labels and allow applications to specify labels on 
 resource-requests.
 Obviously we need to support admin operations on adding/removing node labels.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2274) FairScheduler: Add debug information about cluster capacity, availability and reservations

2014-07-11 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14059292#comment-14059292
 ] 

Sandy Ryza commented on YARN-2274:
--

{code}
+if (--updatesToSkipForDebug < 0) {
+  updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY;
+  if (LOG.isDebugEnabled()) {
+    LOG.debug("Cluster Capacity: " + clusterResource +
+        " Allocations: " + rootMetrics.getAllocatedResources() +
+        " Availability: " + Resource.newInstance(
+            rootMetrics.getAvailableMB(),
+            rootMetrics.getAvailableVirtualCores()) +
+        " Demand: " + rootQueue.getDemand());
+  }
+}
{code}
Moving the if (LOG.isDebugEnabled) to the outside of this chunk would make it 
easier for readers who don't care about what's debug logged to realize they can 
skip this whole segment.  If you're OK with that change, +1 and it can be fixed 
on commit?
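
Roughly, the restructured block would read as follows (sketch only, reusing the names from the quoted patch):
{code}
// Guard the whole chunk so readers can skip it at a glance when debug is off.
if (LOG.isDebugEnabled()) {
  if (--updatesToSkipForDebug < 0) {
    updatesToSkipForDebug = UPDATE_DEBUG_FREQUENCY;
    LOG.debug("Cluster Capacity: " + clusterResource +
        " Allocations: " + rootMetrics.getAllocatedResources() +
        " Availability: " + Resource.newInstance(
            rootMetrics.getAvailableMB(),
            rootMetrics.getAvailableVirtualCores()) +
        " Demand: " + rootQueue.getDemand());
  }
}
{code}
(This also means the skip counter only advances while debug logging is enabled, which seems fine for a debug-only aid.)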

 FairScheduler: Add debug information about cluster capacity, availability and 
 reservations
 --

 Key: YARN-2274
 URL: https://issues.apache.org/jira/browse/YARN-2274
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Trivial
 Attachments: yarn-2274-1.patch, yarn-2274-2.patch


 FairScheduler logs have little information on cluster capacity and 
 availability. Need this information to debug production issues. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2026) Fair scheduler : Fair share for inactive queues causes unfair allocation in some scenarios

2014-07-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058009#comment-14058009
 ] 

Sandy Ryza commented on YARN-2026:
--

I think Ashwin makes a good point.

I think displaying both is reasonable if we present it in a careful way.  For 
example, it might make sense to add tooltips that explain the difference.

 Fair scheduler : Fair share for inactive queues causes unfair allocation in 
 some scenarios
 --

 Key: YARN-2026
 URL: https://issues.apache.org/jira/browse/YARN-2026
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar
  Labels: scheduler
 Attachments: YARN-2026-v1.txt, YARN-2026-v2.txt


 Problem 1 - While using hierarchical queues in the fair scheduler, there are a 
 few scenarios where we have seen a leaf queue with the least fair share take a 
 majority of the cluster and starve a sibling parent queue which has a greater 
 weight/fair share, and preemption doesn't kick in to reclaim resources.
 The root cause seems to be that the fair share of a parent queue is distributed 
 to all its children irrespective of whether each is an active or an inactive (no 
 apps running) queue. Preemption based on fair share kicks in only if the 
 usage of a queue is less than 50% of its fair share and if it has demands 
 greater than that. When there are many queues under a parent queue (with a high 
 fair share), the child queues' fair shares become really low. As a result, when 
 only a few of these child queues have apps running, they reach their *tiny* fair 
 shares quickly and preemption doesn't happen even if other leaf 
 queues (non-sibling) are hogging the cluster.
 This can be solved by dividing the fair share of a parent queue only among its 
 active child queues.
 Here is an example describing the problem and proposed solution:
 root.lowPriorityQueue is a leaf queue with weight 2
 root.HighPriorityQueue is a parent queue with weight 8
 root.HighPriorityQueue has 10 child leaf queues: 
 root.HighPriorityQueue.childQ(1..10)
 The above config results in root.HighPriorityQueue having an 80% fair share, 
 and each of its ten child queues would have an 8% fair share. Preemption would 
 happen only if a child queue is below 4% (0.5*8=4). 
 Let's say at the moment no apps are running in any of the 
 root.HighPriorityQueue.childQ(1..10) and a few apps are running in 
 root.lowPriorityQueue, which is taking up 95% of the cluster.
 Up till this point, the behavior of FS is correct.
 Now, let's say root.HighPriorityQueue.childQ1 got a big job which requires 30% 
 of the cluster. It would get only the available 5% in the cluster, and 
 preemption wouldn't kick in since it is above 4% (half its fair share). This is 
 bad considering childQ1 is under a highPriority parent queue which has an *80% 
 fair share*.
 Until root.lowPriorityQueue starts relinquishing containers, we would see the 
 following allocation on the scheduler page:
 *root.lowPriorityQueue = 95%*
 *root.HighPriorityQueue.childQ1 = 5%*
 This can be solved by distributing a parent's fair share only to active 
 queues.
 So in the example above, since childQ1 is the only active queue
 under root.HighPriorityQueue, it would get all its parent's fair share, i.e. 
 80%.
 This would cause preemption to reclaim the 30% needed by childQ1 from 
 root.lowPriorityQueue after fairSharePreemptionTimeout seconds.
 Problem 2 - Also note that a similar situation can happen between 
 root.HighPriorityQueue.childQ1 and root.HighPriorityQueue.childQ2, if childQ2 
 hogs the cluster. childQ2 can take up 95% of the cluster and childQ1 would be 
 stuck at 5%, until childQ2 starts relinquishing containers. We would like each 
 of childQ1 and childQ2 to get half of root.HighPriorityQueue's fair share, i.e. 
 40%, which would ensure childQ1 gets up to 40% of resources if needed through 
 preemption.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2274) FairScheduler: Add debug information about cluster capacity, availability and reservations

2014-07-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14058085#comment-14058085
 ] 

Sandy Ryza commented on YARN-2274:
--

Demanded resources could also be a useful statistic to report.  The update 
thread typically runs twice every second, so it might make sense to only log 
every 5th update or something to avoid a flood of messages.

 FairScheduler: Add debug information about cluster capacity, availability and 
 reservations
 --

 Key: YARN-2274
 URL: https://issues.apache.org/jira/browse/YARN-2274
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Trivial
 Attachments: yarn-2274-1.patch


 FairScheduler logs have little information on cluster capacity and 
 availability. Need this information to debug production issues. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2257) Add user to queue mappings to automatically place users' apps into specific queues

2014-07-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054617#comment-14054617
 ] 

Sandy Ryza commented on YARN-2257:
--

To add some background: since MR1, the Fair Scheduler has been able to place 
apps into queues named with the username or group of the submitter.  Last year, 
YARN-1392 extended this to accept more general policies  - essentially any 
function of (submitter's username, submitter's groups, requested queue), with a 
structure that allows phrasing the policy in terms of simple rules and 
fallbacks.

This generality is useful because different organizations have a variety of 
ways they organize their users on their Hadoop clusters.  Some administrators 
want to be able to decide which queue a user's job goes into by placing them 
into a unix group, while others want a queue for every user, with the option 
for certain users to intentionally submit their jobs to certain queues.  The 
Fair Scheduler in particular needs some added complexity here, 
because it models users within a queue with their own queues, unlike the 
Capacity Scheduler, which has a construct for this.

We chose to put these queue placement policies in the Fair Scheduler because 
other schedulers didn't have a precedent for placing apps in queues other than 
the one requested, but my opinion is that they could be a useful feature 
for YARN.  If not, we should at least add user-queue mappings in a way that's 
compatible with more general mappings.

 Add user to queue mappings to automatically place users' apps into specific 
 queues
 --

 Key: YARN-2257
 URL: https://issues.apache.org/jira/browse/YARN-2257
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Patrick Liu
Assignee: Vinod Kumar Vavilapalli
  Labels: features

 Currently, the fair-scheduler supports two modes, default queue or individual 
 queue for each user.
 Apparently, the default queue is not a good option, because the resources 
 cannot be managed for each user or group.
 However, individual queue for each user is not good enough. Especially when 
 connecting yarn with hive. There will be increasing hive users in a corporate 
 environment. If we create a queue for a user, the resource management will be 
 hard to maintain.
 I think the problem can be solved like this:
 1. Define user-queue mapping in Fair-Scheduler.xml. Inside each queue, use 
 aclSubmitApps to control user's ability.
 2. Each time a user submits an app to YARN, if the user has been mapped to a 
 queue, the app will be scheduled to that queue; otherwise, the app will be 
 submitted to the default queue.
 3. If the user cannot pass aclSubmitApps limits, the app will not be accepted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2026) Fair scheduler : Fair share for inactive queues causes unfair allocation in some scenarios

2014-07-08 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054629#comment-14054629
 ] 

Sandy Ryza commented on YARN-2026:
--

I had a conversation with [~kkambatl] about this, and he convinced me that we 
should turn this on in all cases - i.e. modify FairSharePolicy and 
DominantResourceFairnessPolicy instead of creating additional policies.  Sorry 
to vacillate on this.

Some additional comments on the code:
{code}
+return this.getNumRunnableApps() > 0;
{code}

{code}
+  || (sched instanceof FSQueue && ((FSQueue) sched).isActive())) {
{code}
Instead of using instanceof, can we add an isActive method to Schedulable, and 
always return true for it in AppSchedulable?
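Something along these lines is what I have in mind (a simplified sketch, not the actual class hierarchy):
{code}
// Sketch: declare isActive() on Schedulable so callers don't need instanceof.
abstract class Schedulable {
  public abstract boolean isActive();
}

class AppSchedulableSketch extends Schedulable {
  @Override
  public boolean isActive() {
    return true;   // apps always count as active
  }
}

class FSQueueSketch extends Schedulable {
  private int numRunnableApps;

  @Override
  public boolean isActive() {
    return numRunnableApps > 0;   // queues are active only with runnable apps
  }
}
{code}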

{code}
+out.println("   <queue name=\"childA1\" />");
+out.println("   <queue name=\"childA2\" />");
+out.println("   <queue name=\"childA3\" />");
+out.println("   <queue name=\"childA4\" />");
+out.println("   <queue name=\"childA5\" />");
+out.println("   <queue name=\"childA6\" />");
+out.println("   <queue name=\"childA7\" />");
+out.println("   <queue name=\"childA8\" />");
{code}
Do we need this many children?

{code}
+out.println("</queue>");
+
+out.println("</allocations>");
{code}
Unnecessary newline

{code}
+  public void testFairShareActiveOnly_ShareResetsToZeroWhenAppsComplete()
{code}
Take out underscore.

{code}
+  private void setupCluster(int mem, int vCores) throws IOException {
{code}
Give this method a name that's more descriptive of the kind of configuration 
it's setting up.

{code}
+  private void setupCluster(int nodeMem) throws IOException {
{code}
Can this call the setupCluster that takes two arguments?

To help with the fight against TestFairScheduler becoming a monstrosity, the 
tests should go into a new test file.  TestFairSchedulerPreemption is a good 
example of how to do this.

{code}
+int nodeVcores = 10;
{code}
Nit: nodeVCores

 Fair scheduler : Fair share for inactive queues causes unfair allocation in 
 some scenarios
 --

 Key: YARN-2026
 URL: https://issues.apache.org/jira/browse/YARN-2026
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar
  Labels: scheduler
 Attachments: YARN-2026-v1.txt, YARN-2026-v2.txt


 Problem 1 - While using hierarchical queues in the fair scheduler, there are a 
 few scenarios where we have seen a leaf queue with the least fair share take a 
 majority of the cluster and starve a sibling parent queue which has a greater 
 weight/fair share, and preemption doesn't kick in to reclaim resources.
 The root cause seems to be that the fair share of a parent queue is distributed 
 to all its children irrespective of whether each is an active or an inactive (no 
 apps running) queue. Preemption based on fair share kicks in only if the 
 usage of a queue is less than 50% of its fair share and if it has demands 
 greater than that. When there are many queues under a parent queue (with a high 
 fair share), the child queues' fair shares become really low. As a result, when 
 only a few of these child queues have apps running, they reach their *tiny* fair 
 shares quickly and preemption doesn't happen even if other leaf 
 queues (non-sibling) are hogging the cluster.
 This can be solved by dividing the fair share of a parent queue only among its 
 active child queues.
 Here is an example describing the problem and proposed solution:
 root.lowPriorityQueue is a leaf queue with weight 2
 root.HighPriorityQueue is a parent queue with weight 8
 root.HighPriorityQueue has 10 child leaf queues: 
 root.HighPriorityQueue.childQ(1..10)
 The above config results in root.HighPriorityQueue having an 80% fair share, 
 and each of its ten child queues would have an 8% fair share. Preemption would 
 happen only if a child queue is below 4% (0.5*8=4). 
 Let's say at the moment no apps are running in any of the 
 root.HighPriorityQueue.childQ(1..10) and a few apps are running in 
 root.lowPriorityQueue, which is taking up 95% of the cluster.
 Up till this point, the behavior of FS is correct.
 Now, let's say root.HighPriorityQueue.childQ1 got a big job which requires 30% 
 of the cluster. It would get only the available 5% in the cluster, and 
 preemption wouldn't kick in since it is above 4% (half its fair share). This is 
 bad considering childQ1 is under a highPriority parent queue which has an *80% 
 fair share*.
 Until root.lowPriorityQueue starts relinquishing containers, we would see the 
 following allocation on the scheduler page:
 *root.lowPriorityQueue = 95%*
 *root.HighPriorityQueue.childQ1 = 5%*
 This can be solved by distributing a parent's fair share only to active 
 queues.
 So in the example above, since childQ1 is the only active queue
 under root.HighPriorityQueue, it would get 

[jira] [Commented] (YARN-2257) Add user to queue mapping in Fair-Scheduler

2014-07-07 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053846#comment-14053846
 ] 

Sandy Ryza commented on YARN-2257:
--

Definitely needed.  This should be implemented as a QueuePlacementRule.

 Add user to queue mapping in Fair-Scheduler
 ---

 Key: YARN-2257
 URL: https://issues.apache.org/jira/browse/YARN-2257
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Patrick Liu
  Labels: features

 Currently, the fair-scheduler supports two modes, default queue or individual 
 queue for each user.
 Apparently, the default queue is not a good option, because the resources 
 cannot be managed for each user or group.
 However, individual queue for each user is not good enough. Especially when 
 connecting yarn with hive. There will be increasing hive users in a corporate 
 environment. If we create a queue for a user, the resource management will be 
 hard to maintain.
 I think the problem can be solved like this:
 1. Define user-queue mapping in Fair-Scheduler.xml. Inside each queue, use 
 aclSubmitApps to control user's ability.
 2. Each time a user submits an app to YARN, if the user has been mapped to a 
 queue, the app will be scheduled to that queue; otherwise, the app will be 
 submitted to the default queue.
 3. If the user cannot pass aclSubmitApps limits, the app will not be accepted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2250) FairScheduler.findLowestCommonAncestorQueue returns null when queues not identical

2014-07-04 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052497#comment-14052497
 ] 

Sandy Ryza commented on YARN-2250:
--

+1.  A couple of lines go over 80 characters, and the "names are identical" 
comment still applies.  Fixing these on commit.

 FairScheduler.findLowestCommonAncestorQueue returns null when queues not 
 identical
 --

 Key: YARN-2250
 URL: https://issues.apache.org/jira/browse/YARN-2250
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.4.0, 2.4.1
Reporter: Krisztian Horvath
 Attachments: YARN-2250-1.patch, YARN-2250-2.patch


 We need to update the queue metrics up to the lowest common ancestor of the 
 target and source queue. This method fails to retrieve the right queue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2250) Moving apps between queues - FairScheduler

2014-07-03 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2250:
-

Target Version/s: 2.6.0
   Fix Version/s: (was: 3.0.0)

 Moving apps between queues - FairScheduler
 --

 Key: YARN-2250
 URL: https://issues.apache.org/jira/browse/YARN-2250
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.4.0, 2.4.1
Reporter: Krisztian Horvath

 We need to update the queue metrics up to the lowest common ancestor of the 
 target and source queue. This method fails to retrieve the right queue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2250) Moving apps between queues - FairScheduler

2014-07-03 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051169#comment-14051169
 ] 

Sandy Ryza commented on YARN-2250:
--

Hi Krisztian,
Would you mind including an example of a situation where the metrics become off?

 Moving apps between queues - FairScheduler
 --

 Key: YARN-2250
 URL: https://issues.apache.org/jira/browse/YARN-2250
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.4.0, 2.4.1
Reporter: Krisztian Horvath

 We need to update the queue metrics up to the lowest common ancestor of the 
 target and source queue. This method fails to retrieve the right queue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2250) Moving apps between queues - FairScheduler

2014-07-03 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051692#comment-14051692
 ] 

Sandy Ryza commented on YARN-2250:
--

I think the bug can be fixed by replacing name1.substring(lastPeriodIndex) with 
name1.substring(0, lastPeriodIndex).  I tried this out and all your tests 
passed.
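
To illustrate the difference with a hypothetical queue name (not from the patch):
{code}
String name1 = "root.parent.child1";
int lastPeriodIndex = name1.lastIndexOf('.');

// substring(lastPeriodIndex) keeps ".child1", which is never a valid ancestor name.
String wrong = name1.substring(lastPeriodIndex);      // ".child1"
// substring(0, lastPeriodIndex) keeps the parent prefix, which is what the walk needs.
String right = name1.substring(0, lastPeriodIndex);   // "root.parent"
{code}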

 Moving apps between queues - FairScheduler
 --

 Key: YARN-2250
 URL: https://issues.apache.org/jira/browse/YARN-2250
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.4.0, 2.4.1
Reporter: Krisztian Horvath
 Attachments: YARN-2250-1.patch


 We need to update the queue metrics up to the lowest common ancestor of the 
 target and source queue. This method fails to retrieve the right queue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2250) FairScheduler.findLowestCommonAncestorQueue returns null when queues not identical

2014-07-03 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2250:
-

Summary: FairScheduler.findLowestCommonAncestorQueue returns null when 
queues not identical  (was: Moving apps between queues - FairScheduler)

 FairScheduler.findLowestCommonAncestorQueue returns null when queues not 
 identical
 --

 Key: YARN-2250
 URL: https://issues.apache.org/jira/browse/YARN-2250
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.4.0, 2.4.1
Reporter: Krisztian Horvath
 Attachments: YARN-2250-1.patch


 We need to update the queue metrics up to the lowest common ancestor of the 
 target and source queue. This method fails to retrieve the right queue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2214) preemptContainerPreCheck() in FSParentQueue delays convergence towards fairness

2014-06-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044066#comment-14044066
 ] 

Sandy Ryza commented on YARN-2214:
--

Makes sense

 preemptContainerPreCheck() in FSParentQueue delays convergence towards 
 fairness
 ---

 Key: YARN-2214
 URL: https://issues.apache.org/jira/browse/YARN-2214
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.5.0
Reporter: Ashwin Shankar

 preemptContainerPreCheck() in FSParentQueue rejects preemption requests if 
 the parent queue is below fair share. This can cause a delay in converging 
 towards fairness when the starved leaf queue and the queue above fair share 
 belong under a non-root parent queue (i.e. their least common ancestor is a 
 parent queue which is not root).
 Here is an example:
 root.parent has fair share = 80% and usage = 80%
 root.parent.child1 has fair share = 40%, usage = 80%
 root.parent.child2 has fair share = 40%, usage = 0%
 Now a job is submitted to child2 and the demand is 40%.
 Preemption will kick in and try to reclaim all the 40% from child1.
 When it preempts the first container from child1, the usage of root.parent 
 will drop below 80%, which is less than root.parent's fair share, causing 
 preemption to stop. So only one container gets preempted in this round 
 although the need is a lot more. child2 would eventually get to half its fair 
 share, but only after multiple rounds of preemption.
 The solution is to remove preemptContainerPreCheck() in FSParentQueue and keep 
 it only in FSLeafQueue (which is already there).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

