[jira] [Commented] (YARN-392) Make it possible to specify hard locality constraints in resource requests

2013-05-18 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661403#comment-13661403
 ] 

Sandy Ryza commented on YARN-392:
-

bq. it would still be possible to blacklist racks by setting the disable flag 
on a rack and submitting node requests for nodes under it.
By which I mean: it would still be possible to blacklist racks by setting the 
disable flag on a rack and submitting *no* node requests for nodes under it.

 Make it possible to specify hard locality constraints in resource requests
 --

 Key: YARN-392
 URL: https://issues.apache.org/jira/browse/YARN-392
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Sandy Ryza
 Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, 
 YARN-392-2.patch, YARN-392-3.patch, YARN-392-4.patch, YARN-392.patch


 Currently it's not possible to specify scheduling requests for specific nodes 
 and nowhere else. The RM automatically relaxes locality to rack and * and 
 assigns non-specified machines to the app.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-05-18 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661410#comment-13661410
 ] 

Carlo Curino commented on YARN-569:
---

The findbugs warnings are on accesses of a ResourceCalculator and 
minAllocation, so not really concerning.

 CapacityScheduler: support for preemption (using a capacity monitor)
 

 Key: YARN-569
 URL: https://issues.apache.org/jira/browse/YARN-569
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf, 
 preemption.2.patch, YARN-569.1.patch, YARN-569.2.patch, YARN-569.patch, 
 YARN-569.patch


 There is a tension between the fast-paced, reactive role of the 
 CapacityScheduler, which needs to respond quickly to applications' resource 
 requests and node updates, and the more introspective, time-based 
 considerations needed to observe and correct for capacity balance. For this 
 reason, instead of hacking the delicate mechanisms of the CapacityScheduler 
 directly, we opted to add support for preemption by means of a Capacity 
 Monitor, which can optionally be run as a separate service (much like the 
 NMLivelinessMonitor).
 The capacity monitor (similarly to equivalent functionality in the Fair 
 Scheduler) runs at intervals (e.g., every 3 seconds), observes the state of 
 the assignment of resources to queues from the capacity scheduler, performs 
 off-line computation to determine whether preemption is needed and how best 
 to edit the current schedule to improve capacity, and generates events that 
 produce four possible actions:
 # Container de-reservations
 # Resource-based preemptions
 # Container-based preemptions
 # Container killing
 The actions listed above are progressively more costly, and it is up to the 
 policy to use them as desired to achieve the rebalancing goals. 
 Note that due to the lag in the effect of these actions, the policy should 
 operate at the macroscopic level (e.g., preempt tens of containers from a 
 queue) and not try to tightly and consistently micromanage container 
 allocations. 
 - Preemption policy  (ProportionalCapacityPreemptionPolicy): 
 - 
 Preemption policies are pluggable by design. In the following we present an 
 initial policy (ProportionalCapacityPreemptionPolicy) we have been 
 experimenting with. The ProportionalCapacityPreemptionPolicy behaves as 
 follows:
 # it gathers from the scheduler the state of the queues, in particular their 
 current capacity, guaranteed capacity and pending requests (*)
 # if there are pending requests from queues that are under capacity, it 
 computes a new ideal balanced state (**)
 # it computes the set of preemptions needed to repair the current schedule 
 and achieve capacity balance (accounting for natural completion rates, and 
 respecting bounds on the amount of preemption we allow for each round)
 # it selects which applications to preempt from each over-capacity queue (the 
 last one in the FIFO order)
 # it removes reservations from the most recently assigned app until the 
 amount of resources to reclaim is obtained, or until no more reservations 
 exist
 # (if not enough) it issues preemptions for containers from the same 
 application (reverse chronological order, last assigned container first), 
 again until enough has been reclaimed or until no containers except the AM 
 container are left
 # (if not enough) it moves on to unreserve and preempt from the next 
 application
 # containers that have been asked to preempt are tracked across executions. 
 If a container is among those to be preempted for more than a certain time, 
 it is moved to the list of containers to be forcibly killed.
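
 As a rough illustration of steps 4 to 7, the following self-contained Java 
 sketch walks the applications of one over-capacity queue starting from the 
 last one in FIFO order, drops reservations first, and then preempts 
 containers newest-first while sparing the AM container. The App, Container 
 and Plan types, the integer resource sizes, and the select method are 
 hypothetical stand-ins for illustration, not the classes used in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class VictimSelectionSketch {
    /** Hypothetical stand-in for a running container. */
    static final class Container {
        final String id;
        final int size;
        final boolean isAm;
        Container(String id, int size, boolean isAm) {
            this.id = id;
            this.size = size;
            this.isAm = isAm;
        }
    }

    /** Hypothetical application; both lists are in assignment order (oldest first). */
    static final class App {
        final String name;
        final List<Integer> reservations;
        final List<Container> containers;
        App(String name, List<Integer> reservations, List<Container> containers) {
            this.name = name;
            this.reservations = reservations;
            this.containers = containers;
        }
    }

    /** Actions chosen for one round, plus the amount they are counted as reclaiming. */
    static final class Plan {
        final List<String> actions = new ArrayList<String>();
        int reclaimed;
    }

    /** fifoApps holds the apps of one over-capacity queue in FIFO order. */
    static Plan select(List<App> fifoApps, int toReclaim) {
        Plan plan = new Plan();
        // Step 4 and 7: start from the last app in FIFO order, move on only if needed.
        for (int a = fifoApps.size() - 1; a >= 0 && plan.reclaimed < toReclaim; a--) {
            App app = fifoApps.get(a);
            // Step 5: remove reservations before touching running containers.
            for (int r = app.reservations.size() - 1; r >= 0 && plan.reclaimed < toReclaim; r--) {
                plan.actions.add("unreserve " + app.name);
                plan.reclaimed += app.reservations.get(r);
            }
            // Step 6: preempt containers, last assigned first, never the AM container.
            for (int c = app.containers.size() - 1; c >= 0 && plan.reclaimed < toReclaim; c--) {
                Container k = app.containers.get(c);
                if (k.isAm) {
                    continue;
                }
                plan.actions.add("preempt " + k.id);
                plan.reclaimed += k.size;
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        List<App> apps = Arrays.asList(
            new App("app1", Collections.<Integer>emptyList(), Arrays.asList(
                new Container("am1", 1, true),
                new Container("c1", 2, false),
                new Container("c2", 2, false))),
            new App("app2", Arrays.asList(4), Arrays.asList(
                new Container("am2", 1, true),
                new Container("c3", 2, false),
                new Container("c4", 2, false))));
        Plan plan = select(apps, 9);
        System.out.println(plan.actions + " reclaimed=" + plan.reclaimed);
    }
}
```

 In this sketch the reservation of app2 is dropped first, then c4 and c3 are 
 preempted, and only then does the selection move on to app1. Step 8 (tracking 
 containers across rounds and killing after a timeout) is deliberately left 
 out.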
 Notes:
 (*) At the moment, in order to avoid double-counting of the requests, we only 
 look at the ANY part of pending resource requests, which means we might not 
 preempt on behalf of AMs that ask only for specific locations but not ANY. 
 (**) The ideal balanced state is one in which each queue has at least its 
 guaranteed capacity, and the spare capacity is distributed among the queues 
 that want some as a weighted fair share, where the weighting is based on the 
 guaranteed capacity of a queue and the function runs to a fixed point.  
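
 One way to read note (**) is as an iterative water-filling computation. The 
 Java sketch below is only an illustration under stated assumptions: the 
 idealShares method and its parameters are hypothetical, resources are plain 
 doubles, demand[i] means current usage plus pending requests of queue i, and 
 the guaranteed capacities are assumed to sum to no more than the cluster 
 total. It is not the code from the attached patch.

```java
import java.util.Arrays;

class IdealBalanceSketch {
    private static final double EPS = 1e-9;

    /**
     * Repeatedly offers the still-unassigned capacity to unsatisfied queues in
     * proportion to their guaranteed capacity. A queue never accepts more than
     * its demand; whatever it declines is re-offered in the next round. The
     * loop stops at a fixed point: nothing left to hand out, or nobody wants more.
     */
    static double[] idealShares(double total, double[] guaranteed, double[] demand) {
        int n = guaranteed.length;
        double[] ideal = new double[n];
        boolean[] satisfied = new boolean[n];
        double unassigned = total;
        while (unassigned > EPS) {
            double weightSum = 0;
            for (int i = 0; i < n; i++) {
                if (!satisfied[i]) {
                    weightSum += guaranteed[i];
                }
            }
            if (weightSum <= 0) {
                break; // no unsatisfied queue with a positive weight remains
            }
            double pool = unassigned; // amount offered in this round
            boolean anySatisfied = false;
            for (int i = 0; i < n; i++) {
                if (satisfied[i]) {
                    continue;
                }
                double offer = pool * guaranteed[i] / weightSum;
                double wanted = demand[i] - ideal[i];
                double accepted = Math.max(0, Math.min(offer, wanted));
                ideal[i] += accepted;
                unassigned -= accepted;
                if (wanted <= offer + EPS) {
                    satisfied[i] = true;
                    anySatisfied = true;
                }
            }
            if (!anySatisfied) {
                break; // every queue took its full offer, so the pool is exhausted
            }
        }
        return ideal;
    }

    public static void main(String[] args) {
        double[] guaranteed = {50, 30, 20};
        double[] demand = {20, 60, 60};
        System.out.println(Arrays.toString(idealShares(100, guaranteed, demand)));
    }
}
```

 With a cluster of 100, guarantees of 50/30/20 and demands of 20/60/60, the 
 first queue keeps only the 20 it asks for, and the 30 it leaves unused is 
 split 3:2 between the other two queues, giving 20/48/32.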
 Tunables of the ProportionalCapacityPreemptionPolicy:
 # observe-only mode (i.e., log the actions it would take, but behave as 
 read-only)
 # how frequently to run the policy
 # how long to wait between preemption and kill of a container
 # what fraction of the containers the policy would like to obtain should 
 actually be preempted (has to do with the natural rate at which containers 
 are returned)
 # deadzone size, i.e., what 

[jira] [Commented] (YARN-392) Make it possible to specify hard locality constraints in resource requests

2013-05-18 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661427#comment-13661427
 ] 

Alejandro Abdelnur commented on YARN-392:
-

Latest patch LGTM. Bikas, do Sandy's responses address your concerns? I'd 
like to get this in so we can move to the next step, which is getting this 
exposed in the client API.

 Make it possible to specify hard locality constraints in resource requests
 --

 Key: YARN-392
 URL: https://issues.apache.org/jira/browse/YARN-392
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Sandy Ryza
 Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, 
 YARN-392-2.patch, YARN-392-3.patch, YARN-392-4.patch, YARN-392.patch


 Currently it's not possible to specify scheduling requests for specific nodes 
 and nowhere else. The RM automatically relaxes locality to rack and * and 
 assigns non-specified machines to the app.
