[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-12 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358708#comment-14358708
 ] 

Nathan Roberts commented on YARN-3298:
--

I agree. Let's not change anything for the time being. If YARN-2113 requires 
some tweaking in this area, we can do it at that time.


 User-limit should be enforced in CapacityScheduler
 --

 Key: YARN-3298
 URL: https://issues.apache.org/jira/browse/YARN-3298
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, yarn
Reporter: Wangda Tan
Assignee: Wangda Tan

 User-limit is not treated as a hard limit for now: it does not consider the 
 required resource (the resource of the request being allocated). Also, when a 
 user's used resource equals the user-limit, allocation still continues. This 
 will cause jitter once we have YARN-2069 (the preemption policy kills a 
 container under a user, and the scheduler allocates a container under the same 
 user soon after).
 The expected behavior should be the same as for a queue's capacity:
 Only when user.usage + required <= user-limit (1) will the queue continue to 
 allocate containers.
 (1) The user-limit mentioned here is computed as follows:
 {code}
 current-capacity = queue.used + now-required (when queue.used > queue.capacity)
                    queue.capacity            (when queue.used < queue.capacity)
 user-limit = min(max(current-capacity / #active-users,
                      current-capacity * user-limit / 100),
                  queue-capacity * user-limit-factor)
 {code}
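
A minimal sketch of the computation above, assuming scalar resources (a single long, for example memory in MB) rather than YARN's multi-dimensional Resource objects. The class and method names are hypothetical and are not taken from CapacityScheduler. Treating queue.used == queue.capacity as "not above capacity" is also an assumption, since the description does not cover that boundary.

```java
/** Illustrative only: the user-limit formula from the description, on scalar resources. */
final class UserLimitSketch {

    /** current-capacity: follows usage once the queue is above its configured capacity. */
    static long currentCapacity(long queueUsed, long queueCapacity, long nowRequired) {
        // Assumption: the boundary case (used == capacity) falls back to queueCapacity.
        return queueUsed > queueCapacity ? queueUsed + nowRequired : queueCapacity;
    }

    /**
     * user-limit = min(max(current-capacity / #active-users,
     *                      current-capacity * user-limit / 100),
     *                  queue-capacity * user-limit-factor)
     */
    static long userLimit(long currentCapacity, int activeUsers, int userLimitPercent,
                          double userLimitFactor, long queueCapacity) {
        long equalShare = currentCapacity / activeUsers;
        long minimumShare = currentCapacity * userLimitPercent / 100;
        long factorCap = (long) (queueCapacity * userLimitFactor);
        return Math.min(Math.max(equalShare, minimumShare), factorCap);
    }

    public static void main(String[] args) {
        // Queue below capacity: the limit is derived from the configured capacity.
        long below = currentCapacity(40, 100, 2);
        System.out.println(userLimit(below, 2, 50, 5.0, 100));
        // Queue above capacity: the limit follows actual usage plus the request.
        long above = currentCapacity(398, 100, 2);
        System.out.println(userLimit(above, 2, 50, 5.0, 100));
    }
}
```

The outer min() is what caps a single user at queue-capacity * user-limit-factor, however far the queue itself has grown.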



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357564#comment-14357564
 ] 

Wangda Tan commented on YARN-3298:
--

[~nroberts], 
I think I get your point now. Yes, as you said, if we enforce the limit (used + 
required <= user-limit) and don't change the user-limit computation, the queue 
cannot go over its configured capacity.

Originally, this ticket was trying to solve the jitter problem that arises once 
we have YARN-2069. However, YARN-2069 only takes effect when a queue becomes 
over-satisfied, and at that point CS will not give the queue more resources. So 
the jitter won't actually happen.

Jitter will happen once we have YARN-2113 (preemption to balance usage between 
users while the queue is not over its capacity); at that time, user-limit 
enforcement should be done.

Basically, I agree with your method, which is {{current_capacity = 
max(queue.used,queue.capacity)+now_required}}. It can solve the problem of the 
queue being unable to go over its configured capacity, but it does not seem 
necessary, at least for now. We can delay this change until YARN-2113 requires 
it.

Thoughts?

Thanks,
Wangda



[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-10 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355030#comment-14355030
 ] 

Nathan Roberts commented on YARN-3298:
--

If you have a prototype patch, please post it since that will make the proposal 
crystal clear. 

The issue I raised at 
https://issues.apache.org/jira/browse/YARN-3298?focusedCommentId=14353053&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14353053
 doesn't happen today. I think it will happen if we try to be precisely strict 
with user-limit. It's also not just a small amount of resource that could not 
be used in a queue: it's everything between capacity and max-capacity, which 
can be a large percentage of the cluster.

In my mind:
- I would be ok with changing to current-capacity = 
max(queue.used,queue.capacity)+now-required, because I think it's more 
consistent. Not strictly necessary though, just an improvement.
- I don't see an overwhelming reason to make user-limit a precisely enforced 
hard limit. Currently, users can't get beyond it by very much, and that seems 
ok to me. 






[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-09 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353053#comment-14353053
 ] 

Nathan Roberts commented on YARN-3298:
--

Thanks [~leftnoteasy] for the additional detail. Maybe I should just wait for 
the patch, but here's the case I'm worried about.

queue.used is just under queue.capacity, so current-capacity = queue.capacity.
There are two users in the queue, and both have the same used resources.

user-limit will be slightly less than (queue-capacity/2). (so user-limit can be 
extremely close to user.usage)

user.usage + required might now be slightly greater than user-limit. If that 
happens, it seems like we'll be unable to cross the capacity threshold. Once 
above capacity, I think it will work, but crossing that threshold might be hard.

Seems like current-capacity should be calculated as:
{code}
current-capacity = max(queue.used,queue.capacity)+now-required;
{code}
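
The scenario above can be sketched with concrete numbers. This is an illustration under stated assumptions, not scheduler code: resources are scalar, the class and method names are hypothetical, and the values (capacity 100, two users at 49 each, a request of 2, user-limit 50%, user-limit-factor 5) are chosen only to make the threshold effect visible.

```java
/** Illustrative only: strict enforcement near the capacity threshold, with both formulas. */
final class ThresholdCrossingSketch {

    /** current-capacity as in the issue description (boundary case treated as below capacity). */
    static long currentCapacityAsDescribed(long used, long capacity, long required) {
        return used > capacity ? used + required : capacity;
    }

    /** current-capacity as proposed in this comment. */
    static long currentCapacityProposed(long used, long capacity, long required) {
        return Math.max(used, capacity) + required;
    }

    static long userLimit(long currentCapacity, int activeUsers, int userLimitPercent,
                          long queueCapacity, double userLimitFactor) {
        long share = Math.max(currentCapacity / activeUsers,
                              currentCapacity * userLimitPercent / 100);
        return Math.min(share, (long) (queueCapacity * userLimitFactor));
    }

    /** Strict enforcement: the request must fit entirely under the limit. */
    static boolean strictlyAllowed(long userUsed, long required, long userLimit) {
        return userUsed + required <= userLimit;
    }

    public static void main(String[] args) {
        long capacity = 100, queueUsed = 98, userUsed = 49, required = 2;

        long described = userLimit(
            currentCapacityAsDescribed(queueUsed, capacity, required), 2, 50, capacity, 5.0);
        long proposed = userLimit(
            currentCapacityProposed(queueUsed, capacity, required), 2, 50, capacity, 5.0);

        System.out.println("described: limit=" + described
            + " allowed=" + strictlyAllowed(userUsed, required, described));
        System.out.println("proposed:  limit=" + proposed
            + " allowed=" + strictlyAllowed(userUsed, required, proposed));
    }
}
```

With two equally loaded users, the proposed formula raises the limit by only half of the request, so whether a given request fits still depends on its size relative to the remaining headroom.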






[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-09 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353553#comment-14353553
 ] 

Wangda Tan commented on YARN-3298:
--

Hi [~nroberts],
If I understand you correctly, maybe we can just relax the check to allow 
allocation when user.used < user.limit (instead of user.used + now_required <= 
user.limit), which can solve the problem you mentioned.



[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-09 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353852#comment-14353852
 ] 

Wangda Tan commented on YARN-3298:
--

[~nroberts],
As you mentioned, that is mostly the same as what we have today, and I think it 
cannot solve the jitter problem. What I really want is to enforce the limit. 
As for the problem you mentioned in 
https://issues.apache.org/jira/browse/YARN-3298?focusedCommentId=14353053&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14353053
 (a small amount of resource cannot be used in a queue), setting user-limit a 
little bit higher (for example, from 50 to 51) should also solve it.

Sounds like a plan?



[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-09 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353833#comment-14353833
 ] 

Nathan Roberts commented on YARN-3298:
--

[~leftnoteasy], won't that be extremely close to what it is today? If so, then 
does it really solve the jitter issue you originally cited?

Just to make sure I'm in-sync with your proposed direction, this is the code 
you're thinking about modifying, correct? 
{code}
// Note: We aren't considering the current request since there is a fixed
// overhead of the AM, but it's a > check, not a >= check, so...
if (Resources
    .greaterThan(resourceCalculator, clusterResource,
        user.getConsumedResourceByLabel(label),
        limit)) {
{code}
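
To make the comparison concrete, here is a small sketch of the three checks discussed in this thread, written as predicates over scalar values. The class and method names are hypothetical, and the real scheduler compares Resource objects through a ResourceCalculator rather than plain longs.

```java
/** Illustrative only: the user-limit checks discussed in the thread, on scalar resources. */
final class UserLimitChecks {

    /** Today's behaviour: reject only when usage is already above the limit. */
    static boolean currentCheck(long userUsed, long userLimit) {
        return !(userUsed > userLimit);
    }

    /** Strict enforcement from the issue description: the request must fit under the limit. */
    static boolean strictCheck(long userUsed, long nowRequired, long userLimit) {
        return userUsed + nowRequired <= userLimit;
    }

    /** Relaxed variant: allow while usage is strictly below the limit. */
    static boolean relaxedCheck(long userUsed, long userLimit) {
        return userUsed < userLimit;
    }

    public static void main(String[] args) {
        long limit = 50;
        // A user sitting exactly at the limit: only today's check still allocates.
        System.out.println(currentCheck(50, limit) + " "
            + strictCheck(50, 1, limit) + " " + relaxedCheck(50, limit));
        // A user just below the limit with a request that overshoots it.
        System.out.println(currentCheck(49, limit) + " "
            + strictCheck(49, 2, limit) + " " + relaxedCheck(49, limit));
    }
}
```

In this sketch the relaxed check differs from today's check only when usage equals the limit exactly, which is why the two behave almost identically.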



[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-06 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351171#comment-14351171
 ] 

Wangda Tan commented on YARN-3298:
--

Hi [~nroberts],
In your example, each user can still get 5x (after my proposal).

*According to how user-limit gets computed* (my proposal doesn't change this 
part):
current-capacity = queue.used + now-required (assuming the queue's usage is 
more than the queue's capacity)
user-limit = min(max(current-capacity / #active-users, current-capacity * 
user-limit / 100), queue-capacity * user-limit-factor)

I realized that maybe you understood user-limit to mean the user-limit option 
only, but what I actually meant is the above equation :).

bq. What I see user limit doing is controlling which of the actively requesting 
applications are getting newly available resources. Basically, making it so 
that the queue can grow to 10x in the above example while trying to make sure 
that each of the users within the queue are getting equal shares of capacity.
This will be enforced: each user will be allowed to use more than the queue's 
minimum share, and can grow to an equal share of capacity when user-limit and 
user-limit-factor are properly set.

The only difference is that in the past each user could get (5x + 1 container 
resource), but after this patch each user can get <= 5x resource.

Does this make sense to you?



[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-06 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351178#comment-14351178
 ] 

Wangda Tan commented on YARN-3298:
--

Updated the description a little bit to make it less confusing.



[jira] [Commented] (YARN-3298) User-limit should be enforced in CapacityScheduler

2015-03-06 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350614#comment-14350614
 ] 

Nathan Roberts commented on YARN-3298:
--

Hi Wangda. I'm a little concerned about this proposal. I think userlimit has 
been acting this way for a long time, so a change could have a very significant 
impact on how queues behave.  If I'm understanding the proposal correctly, a 
queue that is configured with minimum_user_limit_percent=50, capacity=x, 
max_capacity=10x, user_limit_factor=5 and 2 active users would not be able to 
get the queue above x. Please correct me if that's not the case. Assuming that 
is the case, I'm not sure that's what we want.

What I see user limit doing is controlling which of the actively requesting 
applications are getting newly available resources. Basically, making it so 
that the queue can grow to 10x in the above example while trying to make sure 
that each of the users within the queue are getting equal shares of capacity.   
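
The concern can be explored with a toy simulation of the configuration above. Everything here is an assumption made for illustration: resources are scalar, x is 10 units, every container is 1 unit, the two users request in strict round-robin order, and the user-limit is computed with the formula from the issue description (with queue.used == queue.capacity treated as not above capacity). The class and method names are hypothetical.

```java
/** Illustrative only: two equal users growing a queue under strict vs. current checks. */
final class QueueGrowthSketch {

    /** user-limit per the issue description, on scalar resources. */
    static long userLimit(long queueUsed, long capacity, long required,
                          int activeUsers, int userLimitPercent, double userLimitFactor) {
        long currentCapacity = queueUsed > capacity ? queueUsed + required : capacity;
        long share = Math.max(currentCapacity / activeUsers,
                              currentCapacity * userLimitPercent / 100);
        return Math.min(share, (long) (capacity * userLimitFactor));
    }

    /** Returns the total queue usage once neither user can get another container. */
    static long simulate(boolean strict, long capacity, long maxCapacity,
                         int userLimitPercent, double userLimitFactor) {
        long[] userUsed = new long[2];
        long queueUsed = 0;
        final long required = 1; // every request is one unit-sized container
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (int u = 0; u < userUsed.length; u++) {
                if (queueUsed + required > maxCapacity) {
                    continue; // the queue is at max-capacity
                }
                long limit = userLimit(queueUsed, capacity, required,
                                       userUsed.length, userLimitPercent, userLimitFactor);
                boolean allowed = strict
                    ? userUsed[u] + required <= limit   // hard limit including the request
                    : userUsed[u] <= limit;             // current check, request ignored
                if (allowed) {
                    userUsed[u] += required;
                    queueUsed += required;
                    progressed = true;
                }
            }
        }
        return queueUsed;
    }

    public static void main(String[] args) {
        // capacity = x = 10, max_capacity = 10x, minimum_user_limit_percent = 50, factor = 5
        System.out.println("strict:  " + simulate(true, 10, 100, 50, 5.0));
        System.out.println("current: " + simulate(false, 10, 100, 50, 5.0));
    }
}
```

In this model the strict check stalls once both users hold x/2, because each user's limit is exactly half of a current-capacity that no longer grows, while the current check lets the queue keep growing toward max_capacity.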
