[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-05-31 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Attachment: YARN-569.3.patch

 CapacityScheduler: support for preemption (using a capacity monitor)
 

 Key: YARN-569
 URL: https://issues.apache.org/jira/browse/YARN-569
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf, 
 preemption.2.patch, YARN-569.1.patch, YARN-569.2.patch, YARN-569.3.patch, 
 YARN-569.patch, YARN-569.patch


 There is a tension between the fast-paced, reactive role of the 
 CapacityScheduler, which needs to respond quickly to application resource 
 requests and node updates, and the more introspective, time-based reasoning 
 needed to observe and correct capacity imbalances. For this purpose, rather 
 than hacking the delicate mechanisms of the CapacityScheduler directly, we 
 opted to add support for preemption by means of a Capacity Monitor, which can 
 optionally be run as a separate service (much like the NMLivelinessMonitor).
 The capacity monitor (similar to equivalent functionality in the fair 
 scheduler) runs at regular intervals (e.g., every 3 seconds), observes the 
 current assignment of resources to queues in the capacity scheduler, performs 
 an off-line computation to determine whether preemption is needed and how best 
 to edit the current schedule to improve capacity balance, and generates events 
 that produce four possible actions:
 # Container de-reservations
 # Resource-based preemptions
 # Container-based preemptions
 # Container killing
 The actions listed above are progressively more costly, and it is up to the 
 policy to use them as needed to achieve its rebalancing goals. 
 Note that, due to the lag in the effect of these actions, the policy should 
 operate at a macroscopic level (e.g., preempt tens of containers from a queue) 
 rather than trying to tightly and consistently micromanage container 
 allocations. 
 Preemption policy (ProportionalCapacityPreemptionPolicy):
 Preemption policies are pluggable by design; below we present an initial 
 policy (ProportionalCapacityPreemptionPolicy) that we have been experimenting 
 with. The ProportionalCapacityPreemptionPolicy behaves as follows:
 # it gathers from the scheduler the state of the queues, in particular their 
 current capacity, guaranteed capacity, and pending requests (*)
 # if there are pending requests from queues that are under capacity, it 
 computes a new ideal balanced state (**)
 # it computes the set of preemptions needed to repair the current schedule and 
 achieve capacity balance (accounting for natural completion rates, and 
 respecting bounds on the amount of preemption allowed in each round)
 # it selects which application to preempt from each over-capacity queue (the 
 last one in FIFO order)
 # it removes reservations from the most recently assigned application until 
 the amount of resources to reclaim is obtained, or until no more reservations 
 exist
 # (if not enough) it issues preemptions for containers of the same application 
 (in reverse chronological order, last assigned container first), again until 
 the target is met or until no containers other than the AM container are left
 # (if not enough) it moves on to unreserve and preempt from the next 
 application
 # containers that have been asked to preempt are tracked across executions; if 
 a container remains marked for preemption for more than a configurable time, 
 it is moved to the list of containers to be forcibly killed
 Notes:
 (*) at the moment, in order to avoid double-counting of requests, we only look 
 at the ANY part of pending resource requests, which means we might not preempt 
 on behalf of AMs that ask only for specific locations and not for ANY. 
 (**) The ideal balanced state is one in which each queue has at least its 
 guaranteed capacity, and the spare capacity is distributed as a weighted fair 
 share among the queues that want it, where the weighting is based on the 
 guaranteed capacity of each queue and the computation runs to a fixed point.
 Tunables of the ProportionalCapacityPreemptionPolicy (an illustrative sketch 
 of the overall computation follows this list):
 # observe-only mode (i.e., log the actions it would take, but behave as 
 read-only)
 # how frequently to run the policy
 # how long to wait between preemption and kill of a container
 # which fraction of the containers I would like to obtain should I actually 
 preempt (this accounts for the natural rate at which containers are returned)
 # deadzone size, i.e., what % of over-capacity should be ignored (if we are 
 off perfect balance by some small %, we ignore it)
 # overall amount of preemption allowed per round
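
For illustration, here is a minimal, self-contained sketch of the kind of 
computation described above (ideal-capacity calculation plus per-round 
preemption targets). This is not the patch's implementation: the QueueState 
class and all names are invented for the example, and the real policy 
additionally handles hierarchical queues, reservations, per-application 
container selection, and the overall per-round preemption cap.

{noformat}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the proportional preemption computation; illustrative only. */
public class ProportionalPreemptionSketch {

  /** Invented stand-in for the per-queue snapshot cloned from the scheduler. */
  static class QueueState {
    final String name;
    final double guaranteed; // guaranteed capacity, in absolute resources
    final double used;       // resources currently used by the queue
    final double pending;    // pending (unsatisfied) demand of the queue
    double ideal;            // computed ideal assignment
    QueueState(String name, double guaranteed, double used, double pending) {
      this.name = name; this.guaranteed = guaranteed;
      this.used = used; this.pending = pending;
    }
    double wanted() { return used + pending; }
  }

  /**
   * Ideal balanced state: every queue gets at most what it wants; spare
   * capacity is re-offered to still-hungry queues weighted by their
   * guarantees, iterating to a fixed point.
   */
  static void computeIdealAssignment(List<QueueState> queues, double total) {
    double spare = total;
    for (QueueState q : queues) {
      q.ideal = Math.min(q.guaranteed, q.wanted());
      spare -= q.ideal;
    }
    while (spare > 1e-6) {
      double weight = 0;
      List<QueueState> hungry = new ArrayList<>();
      for (QueueState q : queues) {
        if (q.ideal < q.wanted()) { hungry.add(q); weight += q.guaranteed; }
      }
      if (hungry.isEmpty() || weight <= 0) break;
      double granted = 0;
      for (QueueState q : hungry) {
        double grant = Math.min(spare * q.guaranteed / weight, q.wanted() - q.ideal);
        q.ideal += grant;
        granted += grant;
      }
      spare -= granted;
      if (granted < 1e-6) break; // nothing left to hand out
    }
  }

  /** Per-queue preemption target for one round, after applying two tunables. */
  static double toPreempt(QueueState q, double naturalTerminationFactor,
                          double maxIgnoredOverCapacity) {
    double over = q.used - q.ideal;
    if (over <= q.guaranteed * maxIgnoredOverCapacity) return 0; // deadzone
    return over * naturalTerminationFactor; // only a fraction per round
  }

  public static void main(String[] args) {
    List<QueueState> queues = Arrays.asList(
        new QueueState("A", 30, 80, 0),   // over capacity, no pending demand
        new QueueState("B", 70, 20, 60)); // under capacity, starving
    computeIdealAssignment(queues, 100);
    for (QueueState q : queues) {
      System.out.printf("%s: ideal=%.1f preempt=%.1f%n",
          q.name, q.ideal, toPreempt(q, 0.5, 0.05));
    }
  }
}
{noformat}

With these toy numbers, queue A ends up 50 units over its ideal share and is 
asked to give back 25 of them in the first round, illustrating how the 
natural-termination factor spreads the correction over several rounds.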

[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-04 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Attachment: YARN-569.4.patch

Rebase after YARN-635, YARN-735, YARN-748, YARN-749. Fixed findbugs warnings.


[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Attachment: YARN-569.6.patch


[jira] [Commented] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13682571#comment-13682571
 ] 

Chris Douglas commented on YARN-569:


Thanks for the feedback; we revised the patch. We comment below on the 
questions that required explanation; the smaller ones are addressed directly in 
the code, following your suggestions.

bq. This doesnt seem to affect the fair scheduler or does it? If not, then it 
can be misleading for users.
bq. How do we envisage multiple policies working together without stepping on 
each other? Better off limiting to 1?

The intent was for orthogonal policies to interact with the scheduler or, if 
conflicting, to be coordinated by a composite policy. Though you're right, the 
naming oriented toward preemption is confusing; the patch renames the 
properties to refer only to monitors. Because the only example is the 
{{ProportionalCapacityPreemptionPolicy}}, {{null}} seemed like the correct 
default. As for limiting to one monitor or not: we are experimenting with other 
policies that focus on different aspects of the schedule (e.g., deadlines and 
automatic tuning of queue capacity), and they seem able to play nicely with 
other policies (e.g., the ProportionalCapacityPreemptionPolicy), so we would 
prefer the mechanism to remain capable of loading multiple monitors.
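
For context, this is roughly how we expect a monitor to be wired up through 
configuration. The property names below follow the monitor-oriented naming 
discussed above but should be treated as illustrative of the patch under 
review rather than final; only the policy class name is taken from the patch.

{noformat}
import org.apache.hadoop.conf.Configuration;

public class MonitorConfigSketch {
  public static void main(String[] args) {
    // Illustrative only: property names may differ from what the patch
    // finally commits; the policy class name matches the one in the patch.
    Configuration conf = new Configuration();
    conf.setBoolean("yarn.resourcemanager.scheduler.monitor.enable", true);
    conf.set("yarn.resourcemanager.scheduler.monitor.policies",
        "org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity"
            + ".ProportionalCapacityPreemptionPolicy");
    System.out.println(
        conf.get("yarn.resourcemanager.scheduler.monitor.policies"));
  }
}
{noformat}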

bq. Not joining the thread to make sure its cleaned up?

The contract for shutting down a monitor is not yet baked into the API. While 
the proportional policy runs quickly, it's not obvious that other policies 
would be both long-running and responsive to interrupts. By way of 
illustration, other monitors we've experimented with call into third-party code 
for CPU-intensive calculations. Since YARN-117 went in a few hours ago, this 
might be a chance to define that more crisply. Thoughts?

bq. Why no lock here when the other new methods have a lock? Do we not care 
that the app remains in applications during the duration of the operations?

The semantics of the {{\@Lock}} annotation were not entirely clear from the 
examples in the code, so it's possible the inconsistency is in our application 
of it. We were probably making the situation worse, so we omitted the 
annotations in the updated patch. To answer your question: we don't care, 
because the selected container has already exited (this is part of the natural 
termination factor in the policy).

bq. There is one critical difference between old and new behavior. The new code 
will not send the finish event to the container if its not part of the 
liveContainers. This probably is wrong.
bq. FicaSchedulerNode.unreserveResource(). Checks have been added for the 
reserved container but will the code reach that point if there was no 
reservation actually left on that node? In the same vein, can it happen that 
the node has a new reservation that was made out of band of the preemption 
logic cycle. Hence, the reserved container on the node would exist but could be 
from a different application. 

Good catch; these are related. The change to boolean was necessary because 
we're calling the {{unreserve}} logic from a new context. Since only one 
application can hold a reservation on a node at a time, and because we're 
freeing it through that application, we won't accidentally free another 
application's reservation. However, calling {{unreserve}} on a reservation that 
has since converted to a container will fail, so we need to know whether the 
state changed before updating the metric.

bq. Couldnt quite grok this. What is delta? What is 0.5? A percentage? Whats 
the math behind the calculation? Should it be even absent preemption instead 
of even absent natural termination? Is this applied before or after 
TOTAL_PREEMPTION_PER_ROUND?

The delta is the difference between the computed ideal capacity and the actual 
capacity. A value of 0.5 would preempt only 50% of the containers the policy 
thinks should be preempted, as the rest are expected to exit naturally. The 
comment is saying that, even without any containers exiting on their own, the 
policy will geometrically push capacity into the deadzone: at 50% per round, 
within 5 rounds the policy is inside a 5% deadzone around the ideal capacity. 
It's applied before the total preemption per round; the latter proportionally 
scales all preemption targets.

Because some containers will complete while the policy runs, it may make sense 
to tune it aggressively (or affect it with observed completion rates), but 
we'll want to get some experience running with this.
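
To make the arithmetic concrete (our own numbers, not from the patch): each 
round preempts a fraction f of the remaining gap between actual and ideal 
capacity, so even with no natural completions at most (1 - f)^k of the gap 
remains after k rounds. A tiny self-contained check:

{noformat}
public class GeometricConvergenceSketch {
  public static void main(String[] args) {
    double gap = 1.0;                      // over-capacity gap, normalized to 100%
    final double terminationFactor = 0.5;  // fraction of the gap preempted per round
    for (int round = 1; round <= 5; round++) {
      gap *= (1 - terminationFactor);      // assume nothing exits naturally
      System.out.printf("after round %d: %.1f%% of the original gap remains%n",
          round, gap * 100);
    }
    // prints 50.0, 25.0, 12.5, 6.3, 3.1 -> inside a 5% deadzone after 5 rounds
  }
}
{noformat}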



[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Attachment: YARN-569.8.patch


[jira] [Commented] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13687515#comment-13687515
 ] 

Chris Douglas commented on YARN-569:


Updated patch, rebased on YARN-117, etc. On configuration, we didn't include 
the knobs for the proportional policy, but left it as a default with a warning 
to look at the config for the policy. Does that seem reasonable? We can add a 
section on it as part of YARN-650.

bq. We are setting values on the allocateresponse after replacing lastResponse 
in the responseMap. This entire section is guarded by the lastResponse value 
obtained from this map (questionable effectiveness perhaps but orthogonal). So 
we should probably be setting everything in the new response (the preemption 
stuff) before the new response replaces the lastResponse in the responseMap.

You're saying the block updating the {{responseMap}} probably belongs just 
before the return? That makes sense, though I haven't traced it explicitly.
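
If it helps, the pattern being suggested, sketched below with invented stand-in 
types rather than the actual ApplicationMasterService code, is to fully 
populate the new response (including the preemption message) and publish it to 
the shared map only as the last step before returning:

{noformat}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PublishOrderingSketch {
  // Invented stand-ins for the RM types; illustrative only.
  static class AllocateResponse { Object allocation; Object preemptionMessage; }

  static final ConcurrentMap<String, AllocateResponse> responseMap =
      new ConcurrentHashMap<>();

  static AllocateResponse allocate(String appAttemptId, Object allocation,
                                   Object preemptionMessage) {
    AllocateResponse response = new AllocateResponse();
    // Set everything on the new response first...
    response.allocation = allocation;
    response.preemptionMessage = preemptionMessage;
    // ...and only then make it the last response, just before returning,
    // so readers never observe a partially filled response.
    responseMap.put(appAttemptId, response);
    return response;
  }

  public static void main(String[] args) {
    allocate("appattempt_0001", "containers...", "preempt container_42");
    System.out.println(responseMap.get("appattempt_0001").preemptionMessage);
  }
}
{noformat}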


[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Attachment: YARN-569.9.patch

bq. One other thing to check would be if the preemption policy will use 
refreshed values when the capacity scheduler config is refreshed on the fly. 
Looks like cloneQueues() will take the absolute used and guaranteed numbers on 
every clone. So we should be good wrt that. Would be good to check other values 
the policy looks at.

*nod* Right now, the policy rebuilds its view of the scheduler at every pass, 
but it doesn't refresh its own config parameters.

bq. Noticed formatting issues with spaces in the patch. eg. cloneQueues()

Did another pass over the patch, fixed up spacing, formatting, and removed 
obvious whitespace changes. Sorry, did a few of these already, but missed a few.

Also moved the check in the {{ApplicationMasterService}} as part of this patch.


[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-24 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Attachment: YARN-569.10.patch


[jira] [Commented] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-06-24 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13692572#comment-13692572
 ] 

Chris Douglas commented on YARN-569:


{{TestAMAuthorization}} also fails on trunk, YARN-878

[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-07-10 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Attachment: YARN-569.11.patch

Rebase.

[jira] [Updated] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)

2013-07-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-569:
---

Fix Version/s: 2.1.0-beta


[jira] [Commented] (YARN-1184) ClassCastException is thrown during preemption When a huge job is submitted to a queue B whose resources is used by a job in queueA

2013-09-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13769067#comment-13769067
 ] 

Chris Douglas commented on YARN-1184:
-

I committed this.

Thanks Bikas for the review.

 ClassCastException is thrown during preemption When a huge job is submitted 
 to a queue B whose resources is used by a job in queueA
 ---

 Key: YARN-1184
 URL: https://issues.apache.org/jira/browse/YARN-1184
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: J.Andreina
Assignee: Chris Douglas
 Fix For: 2.1.1-beta

 Attachments: Y1184-0.patch, Y1184-1.patch


 Preemption is enabled.
 Queues: a, b
 a capacity = 30%
 b capacity = 70%
 Step 1: Assign a big job to queue a (so that job_a will utilize some resources 
 from queue b).
 Step 2: Assign a big job to queue b.
 The following exception is thrown at the ResourceManager:
 {noformat}
 2013-09-12 10:42:32,535 ERROR [SchedulingMonitor 
 (ProportionalCapacityPreemptionPolicy)] yarn.YarnUncaughtExceptionHandler 
 (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
 Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw 
 an Exception.
 java.lang.ClassCastException: java.util.Collections$UnmodifiableSet cannot be 
 cast to java.util.NavigableSet
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.getContainersToPreempt(ProportionalCapacityPreemptionPolicy.java:403)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:202)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:173)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}
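
The underlying Java behavior is easy to reproduce in isolation: 
{{Collections.unmodifiableSet}} returns a wrapper that implements only 
{{Set}}, regardless of what the backing collection implements, so downcasting 
the wrapper to {{NavigableSet}} fails at runtime. The snippet below is a 
standalone reproduction, not the scheduler code; see the attached patches for 
the actual fix, which needs to avoid that downcast (e.g., by taking an ordered 
copy).

{noformat}
import java.util.Collections;
import java.util.NavigableSet;
import java.util.Set;
import java.util.TreeSet;

public class NavigableSetCastRepro {
  public static void main(String[] args) {
    NavigableSet<Integer> live = new TreeSet<>();
    live.add(1);
    live.add(2);

    // The unmodifiable wrapper only implements Set, even though the backing
    // collection is a NavigableSet...
    Set<Integer> readOnly = Collections.unmodifiableSet(live);

    // ...so this is the failing pattern from the stack trace above:
    try {
      NavigableSet<Integer> ordered = (NavigableSet<Integer>) readOnly;
      System.out.println(ordered.descendingSet());
    } catch (ClassCastException cce) {
      System.out.println("cast fails: " + cce.getMessage());
    }

    // One safe alternative: take a defensive, ordered copy instead of casting.
    NavigableSet<Integer> orderedCopy = new TreeSet<>(readOnly);
    System.out.println(orderedCopy.descendingSet()); // [2, 1]
  }
}
{noformat}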



[jira] [Updated] (YARN-1184) ClassCastException is thrown during preemption When a huge job is submitted to a queue B whose resources is used by a job in queueA

2013-09-16 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1184:


Attachment: Y1184-1.patch

 ClassCastException is thrown during preemption When a huge job is submitted 
 to a queue B whose resources is used by a job in queueA
 ---

 Key: YARN-1184
 URL: https://issues.apache.org/jira/browse/YARN-1184
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Affects Versions: 2.1.0-beta
Reporter: J.Andreina
Assignee: Chris Douglas
 Fix For: 2.1.1-beta

 Attachments: Y1184-0.patch, Y1184-1.patch


 preemption is enabled.
 Queue = a,b
 a capacity = 30%
 b capacity = 70%
 Step 1: Assign a big job to queue a (so that job_a will utilize some 
 resources from queue b)
 Step 2: Assign a big job to queue b.
 Following exception is thrown at Resource Manager
 {noformat}
 2013-09-12 10:42:32,535 ERROR [SchedulingMonitor 
 (ProportionalCapacityPreemptionPolicy)] yarn.YarnUncaughtExceptionHandler 
 (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
 Thread[SchedulingMonitor (ProportionalCapacityPreemptionPolicy),5,main] threw 
 an Exception.
 java.lang.ClassCastException: java.util.Collections$UnmodifiableSet cannot be 
 cast to java.util.NavigableSet
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.getContainersToPreempt(ProportionalCapacityPreemptionPolicy.java:403)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.containerBasedPreemptOrKill(ProportionalCapacityPreemptionPolicy.java:202)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy.editSchedule(ProportionalCapacityPreemptionPolicy.java:173)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor.invokePolicy(SchedulingMonitor.java:72)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.monitor.SchedulingMonitor$PreemptionChecker.run(SchedulingMonitor.java:82)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-2297) Preemption can hang in corner case by not allowing any task container to proceed.

2014-07-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063086#comment-14063086
 ] 

Chris Douglas commented on YARN-2297:
-

Are there realistic configurations where this creates a problem? If a queue is 
configured with less than a container's capacity, what is the intent?

 Preemption can hang in corner case by not allowing any task container to 
 proceed.
 -

 Key: YARN-2297
 URL: https://issues.apache.org/jira/browse/YARN-2297
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.5.0
Reporter: Tassapol Athiapinya
Assignee: Wangda Tan
Priority: Critical

 Preemption can cause hang issue in single-node cluster. Only AMs run. No task 
 container can run.
 h3. queue configuration
 Queue A/B has 1% and 99% respectively. 
 No max capacity.
 h3. scenario
 Turn on preemption. Configure 1 NM with 4 GB of memory. Use only 2 apps. Use 
 1 user.
 Submit app 1 to queue A. AM needs 2 GB. There is 1 task that needs 2 GB. 
 Occupy entire cluster.
 Submit app 2 to queue B. AM needs 2 GB. There are 3 tasks that need 2 GB each.
 Instead of entire app 1 preempted, app 1 AM will stay. App 2 AM will launch. 
 No task of either app can proceed. 
 h3. commands
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomtextwriter 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.bytespermap=2147483648 
 -Dmapreduce.job.queuename=A -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.mapsperhost=1 
 -Dmapreduce.randomtextwriter.totalbytes=2147483648 dir1
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.job.queuename=B -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M -m 1 -r 0 -mt 4000  -rt 0



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2297) Preemption can hang in corner case by not allowing any task container to proceed.

2014-07-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063115#comment-14063115
 ] 

Chris Douglas commented on YARN-2297:
-

I'll try asking the question differently. Does this occur when the absolute 
guaranteed capacity of a queue is smaller than the minimum container size? If 
so, then what is the operator expressing with that configuration?

 Preemption can hang in corner case by not allowing any task container to 
 proceed.
 -

 Key: YARN-2297
 URL: https://issues.apache.org/jira/browse/YARN-2297
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.5.0
Reporter: Tassapol Athiapinya
Assignee: Wangda Tan
Priority: Critical

 Preemption can cause hang issue in single-node cluster. Only AMs run. No task 
 container can run.
 h3. queue configuration
 Queue A/B has 1% and 99% respectively. 
 No max capacity.
 h3. scenario
 Turn on preemption. Configure 1 NM with 4 GB of memory. Use only 2 apps. Use 
 1 user.
 Submit app 1 to queue A. AM needs 2 GB. There is 1 task that needs 2 GB. 
 Occupy entire cluster.
 Submit app 2 to queue B. AM needs 2 GB. There are 3 tasks that need 2 GB each.
 Instead of entire app 1 preempted, app 1 AM will stay. App 2 AM will launch. 
 No task of either app can proceed. 
 h3. commands
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomtextwriter 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.bytespermap=2147483648 
 -Dmapreduce.job.queuename=A -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.mapsperhost=1 
 -Dmapreduce.randomtextwriter.totalbytes=2147483648 dir1
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.job.queuename=B -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M -m 1 -r 0 -mt 4000  -rt 0



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2297) Preemption can hang in corner case by not allowing any task container to proceed.

2014-07-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063232#comment-14063232
 ] 

Chris Douglas commented on YARN-2297:
-

The parameter defining the deadzone around the computed ideal [1] flattens out 
that jitter. When the guaranteed capacity for the queue is so vanishingly small 
that the deadzone is smaller than a single container allocation, then the 
deadzone (and guaranteed queue capacity) is effectively zero.

[1] 
{{yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity}}
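A back-of-the-envelope check of the configuration reported in this issue (a
sketch, not RM code; 0.1 is an assumed illustrative value for [1]):
{code}
// with a 4 GB cluster, a 1% guarantee for queue A, and an assumed deadzone
// fraction of 0.1, the deadzone is far smaller than one 2 GB container, so
// the queue's guarantee is effectively invisible to the policy
public class DeadzoneSketch {
  public static void main(String[] args) {
    long clusterMB = 4096;                // single NM with 4 GB
    double queueAGuarantee = 0.01;        // queue A capacity = 1%
    double maxIgnoredOverCapacity = 0.1;  // assumed setting for [1]
    long containerMB = 2048;              // AM / task container size in this report

    double guaranteedMB = clusterMB * queueAGuarantee;          // ~41 MB
    double deadzoneMB = guaranteedMB * maxIgnoredOverCapacity;  // ~4 MB
    System.out.println("deadzone smaller than one container: "
        + (deadzoneMB < containerMB));                          // true
  }
}
{code}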

 Preemption can hang in corner case by not allowing any task container to 
 proceed.
 -

 Key: YARN-2297
 URL: https://issues.apache.org/jira/browse/YARN-2297
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.5.0
Reporter: Tassapol Athiapinya
Assignee: Wangda Tan
Priority: Critical

 Preemption can cause hang issue in single-node cluster. Only AMs run. No task 
 container can run.
 h3. queue configuration
 Queue A/B has 1% and 99% respectively. 
 No max capacity.
 h3. scenario
 Turn on preemption. Configure 1 NM with 4 GB of memory. Use only 2 apps. Use 
 1 user.
 Submit app 1 to queue A. AM needs 2 GB. There is 1 task that needs 2 GB. 
 Occupy entire cluster.
 Submit app 2 to queue B. AM needs 2 GB. There are 3 tasks that need 2 GB each.
 Instead of entire app 1 preempted, app 1 AM will stay. App 2 AM will launch. 
 No task of either app can proceed. 
 h3. commands
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomtextwriter 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.bytespermap=2147483648 
 -Dmapreduce.job.queuename=A -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.mapsperhost=1 
 -Dmapreduce.randomtextwriter.totalbytes=2147483648 dir1
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.job.queuename=B -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M -m 1 -r 0 -mt 4000  -rt 0



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2297) Preemption can hang when configured ridiculously

2014-07-16 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2297:


Summary: Preemption can hang when configured ridiculously  (was: Preemption 
can hang in corner case by not allowing any task container to proceed.)

 Preemption can hang when configured ridiculously
 

 Key: YARN-2297
 URL: https://issues.apache.org/jira/browse/YARN-2297
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.5.0
Reporter: Tassapol Athiapinya
Assignee: Wangda Tan
Priority: Critical

 Preemption can cause hang issue in single-node cluster. Only AMs run. No task 
 container can run.
 h3. queue configuration
 Queue A/B has 1% and 99% respectively. 
 No max capacity.
 h3. scenario
 Turn on preemption. Configure 1 NM with 4 GB of memory. Use only 2 apps. Use 
 1 user.
 Submit app 1 to queue A. AM needs 2 GB. There is 1 task that needs 2 GB. 
 Occupy entire cluster.
 Submit app 2 to queue B. AM needs 2 GB. There are 3 tasks that need 2 GB each.
 Instead of entire app 1 preempted, app 1 AM will stay. App 2 AM will launch. 
 No task of either app can proceed. 
 h3. commands
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomtextwriter 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.bytespermap=2147483648 
 -Dmapreduce.job.queuename=A -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.mapsperhost=1 
 -Dmapreduce.randomtextwriter.totalbytes=2147483648 dir1
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.job.queuename=B -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M -m 1 -r 0 -mt 4000  -rt 0



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2297) Preemption can prevent progress in small queues

2014-07-16 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2297:


Summary: Preemption can prevent progress in small queues  (was: Preemption 
can hang when configured ridiculously)

 Preemption can prevent progress in small queues
 ---

 Key: YARN-2297
 URL: https://issues.apache.org/jira/browse/YARN-2297
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.5.0
Reporter: Tassapol Athiapinya
Assignee: Wangda Tan
Priority: Critical

 Preemption can cause hang issue in single-node cluster. Only AMs run. No task 
 container can run.
 h3. queue configuration
 Queue A/B has 1% and 99% respectively. 
 No max capacity.
 h3. scenario
 Turn on preemption. Configure 1 NM with 4 GB of memory. Use only 2 apps. Use 
 1 user.
 Submit app 1 to queue A. AM needs 2 GB. There is 1 task that needs 2 GB. 
 Occupy entire cluster.
 Submit app 2 to queue B. AM needs 2 GB. There are 3 tasks that need 2 GB each.
 Instead of entire app 1 preempted, app 1 AM will stay. App 2 AM will launch. 
 No task of either app can proceed. 
 h3. commands
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomtextwriter 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.bytespermap=2147483648 
 -Dmapreduce.job.queuename=A -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M 
 -Dmapreduce.randomtextwriter.mapsperhost=1 
 -Dmapreduce.randomtextwriter.totalbytes=2147483648 dir1
 /usr/lib/hadoop/bin/hadoop jar 
 /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep 
 -Dmapreduce.map.memory.mb=2000 
 -Dyarn.app.mapreduce.am.command-opts=-Xmx1800M 
 -Dmapreduce.job.queuename=B -Dmapreduce.map.maxattempts=100 
 -Dmapreduce.am.max-attempts=1 -Dyarn.app.mapreduce.am.resource.mb=2000 
 -Dmapreduce.map.java.opts=-Xmx1800M -m 1 -r 0 -mt 4000  -rt 0



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2424) LCE should support non-cgroups, non-secure mode

2014-08-20 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2424:


Attachment: Y2424-1.patch

Added a version with a log statement that warns on startup. [~tucu00], is this 
sufficient? The config docs are pretty clear about the effect of setting the 
parameter, and this should be noticed if someone is experimenting with LCE.

 LCE should support non-cgroups, non-secure mode
 ---

 Key: YARN-2424
 URL: https://issues.apache.org/jira/browse/YARN-2424
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.4.1
Reporter: Allen Wittenauer
Priority: Blocker
 Attachments: Y2424-1.patch, YARN-2424.patch


 After YARN-1253, LCE no longer works for non-secure, non-cgroup scenarios.  
 This is a fairly serious regression, as turning on LCE prior to turning on 
 full-blown security is a fairly standard procedure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2424) LCE should support non-cgroups, non-secure mode

2014-08-21 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2424:


Assignee: Allen Wittenauer  (was: Chris Douglas)

 LCE should support non-cgroups, non-secure mode
 ---

 Key: YARN-2424
 URL: https://issues.apache.org/jira/browse/YARN-2424
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.3.0, 2.4.0, 2.5.0, 2.4.1
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer
Priority: Blocker
 Attachments: Y2424-1.patch, YARN-2424.patch


 After YARN-1253, LCE no longer works for non-secure, non-cgroup scenarios.  
 This is a fairly serious regression, as turning on LCE prior to turning on 
 full-blown security is a fairly standard procedure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2470) A high value for yarn.nodemanager.delete.debug-delay-sec causes Nodemanager to crash. Slider needs this value to be high. Setting a very high value throws an exception a

2014-08-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14115581#comment-14115581
 ] 

Chris Douglas commented on YARN-2470:
-

Failing to start is the correct behavior; that timeout is not valid. Is your 
intent to disable cleanup entirely?

 A high value for yarn.nodemanager.delete.debug-delay-sec causes Nodemanager 
 to crash. Slider needs this value to be high. Setting a very high value 
 throws an exception and nodemanager does not start
 --

 Key: YARN-2470
 URL: https://issues.apache.org/jira/browse/YARN-2470
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.4.1
Reporter: Shivaji Dutta
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1709) Admission Control: Reservation subsystem

2014-09-05 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14123885#comment-14123885
 ] 

Chris Douglas commented on YARN-1709:
-

Overall, the patch lgtm. Just a few minor tweaks, then I'm +1
* very minor: Javadoc could be compressed a bit (empty lines)

{{InMemoryPlan}}
* The {{ZERO_RESOURCE}} instance escapes via {{getConsumptionForUser}}
* Some lines are more than 80 characters
* The logging can use built-in substitution more efficiently. Instead of:
{code}
String errMsg =
    MessageFormat
        .format(
            "The specified Reservation with ID {0} does not exist in the plan",
            reservation.getReservationId());
LOG.error(errMsg);
{code}
Prefer:
{code}
LOG.error("The specified Reservation with ID {} does not exist in the plan",
    reservation.getReservationId());
{code}
Some of the code already uses this construction, but a few still use 
{{MessageFormat}}.
* This form is harder to read:
{code}
InMemoryReservationAllocation inMemReservation = null;
if (reservation instanceof InMemoryReservationAllocation) {
  inMemReservation = (InMemoryReservationAllocation) reservation;
} else {
  // [snip] log error
  throw new RuntimeException(errMsg);
}
{code}
than the if (error) { throw; } construction used in the other checks. Is it an 
improvement over {{ClassCastException}}?
* {{addReservation}} doesn't need to hold the write lock while it checks 
invariants on its arguments
* The private methods that assume locks are held ({{incrementAllocation}}, 
{{decrementAllocation}}, {{removeReservation}}, etc.) should probably 
{{assert}} that precondition (e.g., {{RRWL::isWriteLockedByCurrentThread()}}); 
see the sketch after this list
* {{getMinimumAllocation}} and {{getMaximumAllocation}} return mutable data 
that should probably be cloned
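A minimal sketch of the precondition check mentioned above (not the
{{InMemoryPlan}} code; names are illustrative):
{code}
// crashes under test (with -ea) if a private helper runs without the write
// lock; the check compiles to nothing when assertions are disabled
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedPlanSketch {
  private final ReentrantReadWriteLock readWriteLock = new ReentrantReadWriteLock();

  private void incrementAllocation() {
    assert readWriteLock.isWriteLockedByCurrentThread()
        : "incrementAllocation requires the plan write lock";
    // ... mutate internal accounting ...
  }

  void addReservation() {
    readWriteLock.writeLock().lock();
    try {
      incrementAllocation();
    } finally {
      readWriteLock.writeLock().unlock();
    }
  }
}
{code}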

{{InMemoryReservationAllocation}}
* minor style: redundant {{this}} in get methods
* {{toString}} should use {{StringBuilder}} instead of {{StringBuffer}}

{{PlanView}}
* Mismatched javadoc on {{getEarliestStartTime}}
* {{getLastEndTime}} specifies UTC. Is that enforced in the implementation?

{{ReservationInterval}}
* Can this be made immutable? It's a key in several maps
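For illustration, a hedged sketch of what an immutable interval key could look
like (not the actual {{ReservationInterval}}):
{code}
// final fields plus value-based equals/hashCode make it a stable map key
final class ReservationIntervalSketch {
  private final long startTime;
  private final long endTime;

  ReservationIntervalSketch(long startTime, long endTime) {
    this.startTime = startTime;
    this.endTime = endTime;
  }

  long getStartTime() { return startTime; }
  long getEndTime() { return endTime; }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof ReservationIntervalSketch)) return false;
    ReservationIntervalSketch other = (ReservationIntervalSketch) o;
    return startTime == other.startTime && endTime == other.endTime;
  }

  @Override
  public int hashCode() {
    return java.util.Objects.hash(startTime, endTime);
  }
}
{code}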

{{RLESparseResourceAllocation}}
* As in some methods of {{InMemoryPlan}}, the {{ZERO_RESOURCE}} internal 
variable can escape via {{getCapacityAtTime}}.

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1710) Admission Control: agents to allocate reservation

2014-09-10 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129038#comment-14129038
 ] 

Chris Douglas commented on YARN-1710:
-

{{GreedyReservationAgent}}
* Consider {{@link}} for {{ReservationRequest}} in class javadoc
* An inline comment could replace the {{adjustContract()}} method
* Most of the javadoc on private methods can be cut
* {{currentReservationStage}} does not need to be declared outside the loop
* {{allocations}} cannot be null
* An internal {{Resource(0, 0)}} could be reused
* {{li}} should be part of the loop ({{for}} not {{while}}). Its initialization 
is unreadable; please use temp vars.
* Generally, embedded calls are difficult to read:
{code}
if (findEarliestTime(allocations.keySet()) > earliestStart) {
  allocations.put(new ReservationInterval(earliestStart,
      findEarliestTime(allocations.keySet())), ReservationRequest
      .newInstance(Resource.newInstance(0, 0), 0));
  // consider adding trailing zeros at the end for symmetry
}
{code}
Assuming the {{ReservationRequest}} is never modified by the plan:
{code}
private final ReservationRequest ZERO_RSRC =
    ReservationRequest.newInstance(Resource.newInstance(0, 0), 0);
// ...
long allocStart = findEarliestTime(allocations.keySet());
if (allocStart > earliestStart) {
  ReservationInterval preAlloc =
      new ReservationInterval(earliestStart, allocStart);
  allocations.put(preAlloc, ZERO_RSRC);
}
{code}
* {{findEarliestTime(allocations.keySet())}} is called several times and should 
be memoized
** Would a {{TreeSet}} be more appropriate, given this access pattern?
* Instead of:
{code}
boolean result = false;
if (oldReservation != null) {
  result = plan.updateReservation(capReservation);
} else {
  result = plan.addReservation(capReservation);
}
return result;
{code}
Consider:
{code}
if (oldReservation != null) {
  return plan.updateReservation(capReservation);
}
return plan.addReservation(capReservation);
{code}
* A comment unpacking the arithmetic for calculating {{curMaxGang}} would help 
readability

{{TestGreedyReservationAgent}}
* Instead of fixing the seed, consider setting and logging it for each run.
* {{testStress}} is brittle, as it verifies only the timeout; {{testBig}} and 
{{testSmall}} don't verify anything. Both tests are useful, but probably not as 
part of the build. Dropping the annotation and adding a {{main()}} that calls 
each of them would be one alternative.

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1710.1.patch, YARN-1710.patch


 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution for the set of constraints 
 provided by the user, and the physical constraints of the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2475) ReservationSystem: replan upon capacity reduction

2014-09-10 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129041#comment-14129041
 ] 

Chris Douglas commented on YARN-2475:
-

{{SimpleCapacityReplanner}}
* The Clock can be initialized in the constructor, declared private and final
* The exception refers to an InventorySizeAdjusmentPolicy
* nit: redundant parenthesis in the main loop, exceeds 80 char
* {{curSessions}} cannot be null; prefer {{!isEmpty()}} to {{size() > 0}}
** Is this check even necessary? {{sort}} and the following loop should be noops
* A brief comment about the natural order of {{ReservationAllocations}} would 
help readability of this loop. It's in the class doc, but something inline 
would be helpful
* An internal {{Resource(0,0)}} could be reused, instead of creating it in the 
loop
* Could the inner loop be more readable? The embedded function calls in the 
{{Resource}} arithmetic are hard to read (pseudo):
{code}
ArrayList curSessions = new ArrayList(plan.getResourcesAtTime(t));
Collections.sort(curSessions);
for (Iterator i = curSessions.iterator(); i.hasNext() && excessCap > 0;) {
  InMemoryReservationAllocation a = (InMemoryReservationAllocation) i.next();
  plan.deleteReservation(a.getReservationId());
  excessCap -= a.getResourcesAtTime(t);
}
{code}
* Why is the enforcement window tied to {{CapacitySchedulerConfiguration}}?

{{TestSimpleCapacityReplanner}}
* Tests should not call {{Thread.sleep}}; instead update the mock
* Passing in a mocked {{Clock}} to the cstr rather than assigning it in the 
test is cleaner
* Instead of {{assertTrue(cond != null)}} use {{assertNotNull(cond)}} (same for 
positive null check)
* The test should not catch and discard {{PlanningException}}
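An illustrative test skeleton for the points above (class and method names are
hypothetical, not the actual {{TestSimpleCapacityReplanner}}; it assumes the
component accepts a {{Clock}} in its constructor):
{code}
// inject a mocked Clock through the constructor and advance it explicitly
import static org.junit.Assert.assertNotNull;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.apache.hadoop.yarn.util.Clock;
import org.junit.Test;

public class MockedClockSketchTest {

  @Test
  public void testReplanAfterTimeAdvances() {
    Clock clock = mock(Clock.class);
    when(clock.getTime()).thenReturn(0L);

    // hypothetical component under test that takes the clock in its constructor
    ReplannerUnderTest replanner = new ReplannerUnderTest(clock);

    when(clock.getTime()).thenReturn(10000L);   // "advance" time; no Thread.sleep
    Object result = replanner.replan();

    assertNotNull(result);                      // rather than assertTrue(result != null)
  }

  // stand-in for the class under test
  static class ReplannerUnderTest {
    private final Clock clock;
    ReplannerUnderTest(Clock clock) { this.clock = clock; }
    Object replan() { return "replanned at " + clock.getTime(); }
  }
}
{code}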

 ReservationSystem: replan upon capacity reduction
 -

 Key: YARN-2475
 URL: https://issues.apache.org/jira/browse/YARN-2475
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-2475.patch


 In the context of YARN-1051, if capacity of the cluster drops significantly 
 upon machine failures we need to trigger a reorganization of the planned 
 reservations. As reservations are absolute it is possible that they will 
 not all fit, and some need to be rejected a-posteriori.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1710) Admission Control: agents to allocate reservation

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131700#comment-14131700
 ] 

Chris Douglas commented on YARN-1710:
-

bq. I am not memoizing findEarliestTime, as it would only save one invocation 
(the others are on diff sets, or updated version of the same set)

I'm confused. There are three invocations:
{code}
if (findEarliestTime(allocations.keySet()) > earliestStart) {
  allocations.put(new ReservationInterval(earliestStart,
  findEarliestTime(allocations.keySet())), ZERO_RES);
}
ReservationAllocation capReservation =
new InMemoryReservationAllocation(reservationId, contract, user,
plan.getQueueName(), findEarliestTime(allocations.keySet()),
findLatestTime(allocations.keySet()), allocations,
plan.getResourceCalculator(), plan.getMinimumAllocation());
{code}
Isn't the earliest time either the earliest in the set, or the interval this 
block just added?
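For concreteness, a possible memoized form of the fragment quoted above (same
identifiers as the patch, sketch only; it assumes {{allocations}} is not
modified concurrently):
{code}
long earliestAlloc = findEarliestTime(allocations.keySet());
if (earliestAlloc > earliestStart) {
  allocations.put(new ReservationInterval(earliestStart, earliestAlloc), ZERO_RES);
  earliestAlloc = earliestStart;  // the padding interval just added is now the earliest
}
ReservationAllocation capReservation =
    new InMemoryReservationAllocation(reservationId, contract, user,
        plan.getQueueName(), earliestAlloc,
        findLatestTime(allocations.keySet()), allocations,
        plan.getResourceCalculator(), plan.getMinimumAllocation());
{code}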

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1710.1.patch, YARN-1710.2.patch, YARN-1710.patch


 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution for the set of constraints 
 provided by the user, and the physical constraints of the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2475) ReservationSystem: replan upon capacity reduction

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131713#comment-14131713
 ] 

Chris Douglas commented on YARN-2475:
-

+1, other than a couple very minor nits:
* the new cstr accepting {{Clock}} can be package-private, with the no-arg cstr 
calling {{this(new UTCClock());}} (comment unnecessary, or replace with 
{{@VisibleForTesting}})
* The unit test could have a more descriptive name than {{test()}}, declare 
{{PlanningException}} in its throws clause instead of calling 
{{Assert::fail()}} on catching it, and not declare {{InterruptedException}} 
which it no longer throws

Just a minor clarification: as this iterates over each instant of the plan, are 
others allowed to modify it?
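A small sketch of the constructor arrangement from the first nit (class name is
illustrative; it assumes the {{Clock}}/{{UTCClock}} utilities used by these
patches):
{code}
import com.google.common.annotations.VisibleForTesting;
import org.apache.hadoop.yarn.util.Clock;
import org.apache.hadoop.yarn.util.UTCClock;

public class ReplannerSketch {
  private final Clock clock;

  public ReplannerSketch() {
    this(new UTCClock());          // default wiring, no comment needed
  }

  @VisibleForTesting
  ReplannerSketch(Clock clock) {   // package-private, for test injection
    this.clock = clock;
  }
}
{code}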

 ReservationSystem: replan upon capacity reduction
 -

 Key: YARN-2475
 URL: https://issues.apache.org/jira/browse/YARN-2475
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-2475.patch, YARN-2475.patch


 In the context of YARN-1051, if capacity of the cluster drops significantly 
 upon machine failures we need to trigger a reorganization of the planned 
 reservations. As reservations are absolute it is possible that they will 
 not all fit, and some need to be rejected a-posteriori.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1709) Admission Control: Reservation subsystem

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14131727#comment-14131727
 ] 

Chris Douglas commented on YARN-1709:
-

Thanks for the updates. Just a few minor tweaks, then I'm +1
* In checking the preconditions:
{code}
if (!readWriteLock.isWriteLockedByCurrentThread()) {
  return;
}
{code}
The intent was to {{assert}} and crash, so tests against this code can detect 
violations if the code is modified. When assertions are disabled, the check is 
elided
* Instead of two cstr that assign all the final fields, the no-arg should call 
the other
* Instead of explicitly throwing {{ClassCastException}}, this should just 
attempt the cast. The cause is implicit, and doesn't require a custom error 
string
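A tiny sketch of the last point, with stand-in types rather than the patch's
classes: the failed cast already raises {{ClassCastException}} naming the
offending type.
{code}
class CastSketch {
  static class ReservationAllocation {}
  static class InMemoryReservationAllocation extends ReservationAllocation {}

  static InMemoryReservationAllocation asInMemory(ReservationAllocation r) {
    // a mismatched implementation raises ClassCastException with the actual type;
    // no hand-built error string or explicit instanceof check is needed
    return (InMemoryReservationAllocation) r;
  }
}
{code}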

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch, 
 YARN-1709.patch, YARN-1709.patch, YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2475) ReservationSystem: replan upon capacity reduction

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132210#comment-14132210
 ] 

Chris Douglas commented on YARN-2475:
-

Yes, that makes sense. Just curious about the contract.

 ReservationSystem: replan upon capacity reduction
 -

 Key: YARN-2475
 URL: https://issues.apache.org/jira/browse/YARN-2475
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-2475.patch, YARN-2475.patch, YARN-2475.patch


 In the context of YARN-1051, if capacity of the cluster drops significantly 
 upon machine failures we need to trigger a reorganization of the planned 
 reservations. As reservations are absolute it is possible that they will 
 not all fit, and some need to be rejected a-posteriori.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1710) Admission Control: agents to allocate reservation

2014-09-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132466#comment-14132466
 ] 

Chris Douglas commented on YARN-1710:
-

+1 lgtm. Thanks [~curino] for all the iterations on this

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1710.1.patch, YARN-1710.2.patch, YARN-1710.3.patch, 
 YARN-1710.4.patch, YARN-1710.patch


 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution for the set of constraints 
 provided by the user, and the physical constraints of the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1711) CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709

2014-09-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14133406#comment-14133406
 ] 

Chris Douglas commented on YARN-1711:
-

General
- The public classes should be annotated with the correct visibility and 
stability annotations (probably {{@Public}} and {{@Unstable}})
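For example (the class name is a placeholder; the annotations are Hadoop's):
{code}
import org.apache.hadoop.classification.InterfaceAudience.Public;
import org.apache.hadoop.classification.InterfaceStability.Unstable;

@Public
@Unstable
public class CapacityOverTimePolicySketch {
  // ... policy implementation ...
}
{code}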

{{SharingPolicy}}
- Javadoc throws clause and some parameters not populated
- In particular, the {{excludeList}} parameter could use some unpacking.

{{CapacityOverTimePolicy}}
- Just performing the cast, and throwing {{ClassCastException}} implicitly is 
equally clear
- nit: spacing/concat: {{plan.getTotalCapacity() + )by  + accepting 
reservation: }}
- Just as an observation, no change requested: the assumption that the sharing 
policy holds the lock on the plan is probably OK, but since both are interfaces 
there may be a missing abstraction that associates compatible sets of 
interlocking components.
- I can't think of a more appropriate solution to handling aggregates of 
{{Resource}}. Anything more correct doesn't really justify the complexity, 
certainly not before we get some more experience with planning. Since enabling 
this is optional, enforcement with the {{IntegralResource}} is a pragmatic 
tradeoff.

{{*Exception}}
- s/MismatchingUserException/MismatchedUserException/
- Are the subclasses of {{PlanningException}} for a caller to distinguish the 
cause for the rejected request, so it can refine it? If that's the case, should 
they contain diagnostic information as a payload e.g., requested vs actual 
user? If the intent is to extract it, then some more easily parsed format for 
the message might be appropriate (e.g., JSON).

{{NoOverCommitPolicy}}
- The {{excludeList}} should probably be final, and cleared/populated with a 
clone of the set on calls to {{init()}}

{{CapacitySchedulerConfiguration}}
- Missing javadoc for the new parameters.

{{TestNoOverCommitPolicy}}
- Consider using {{@Test(expected = SomeException.class)}} instead of 
{{Assert::fail()}} and try/catch for {{testSingleFail()}}
- Consider specifying the expected cause/subtype instead of generic 
{{PlanningException}}
- {{testMultiTenantFail}} only verifies that a {{PlanningException}} is thrown, 
not that it fails as expected
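A sketch of the {{@Test(expected = ...)}} form suggested above (test name and
exception type are placeholders for the real ones):
{code}
import org.junit.Test;

public class OverCommitSketchTest {

  @Test(expected = IllegalStateException.class)
  public void testSingleFail() {
    // stand-in for submitting a reservation that must be rejected; JUnit fails
    // the test unless this exception type escapes, so no try/catch/fail() is needed
    throw new IllegalStateException("reservation exceeds available capacity");
  }
}
{code}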

{{TestCapacityOverTimePolicy}}
- Most of the tests don't verify that the failure occurs when and how its 
parameters specify, but only check that a {{PlanningException}} is thrown.

 CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709
 --

 Key: YARN-1711
 URL: https://issues.apache.org/jira/browse/YARN-1711
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations
 Attachments: YARN-1711.1.patch, YARN-1711.2.patch, YARN-1711.3.patch, 
 YARN-1711.patch


 This JIRA tracks the development of a policy that enforces user quotas (a 
 time-extension of the notion of capacity) in the inventory subsystem 
 discussed in YARN-1709.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1711) CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709

2014-09-15 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134718#comment-14134718
 ] 

Chris Douglas commented on YARN-1711:
-

+1 Thanks for addressing the feedback on the patch

 CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709
 --

 Key: YARN-1711
 URL: https://issues.apache.org/jira/browse/YARN-1711
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations
 Attachments: YARN-1711.1.patch, YARN-1711.2.patch, YARN-1711.3.patch, 
 YARN-1711.4.patch, YARN-1711.patch


 This JIRA tracks the development of a policy that enforces user quotas (a 
 time-extension of the notion of capacity) in the inventory subsystem 
 discussed in YARN-1709.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.

2014-10-06 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1051:

Fix Version/s: (was: 3.0.0)
   2.6.0

 YARN Admission Control/Planner: enhancing the resource allocation model with 
 time.
 --

 Key: YARN-1051
 URL: https://issues.apache.org/jira/browse/YARN-1051
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler, resourcemanager, scheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Fix For: 2.6.0

 Attachments: YARN-1051-design.pdf, YARN-1051.1.patch, 
 YARN-1051.patch, curino_MSR-TR-2013-108.pdf, socc14-paper15.pdf, 
 techreport.pdf


 In this umbrella JIRA we propose to extend the YARN RM to handle time 
 explicitly, allowing users to reserve capacity over time. This is an 
 important step towards SLAs, long-running services, workflows, and helps for 
 gang scheduling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13630893#comment-13630893
 ] 

Chris Douglas commented on YARN-45:
---

[~sandyr]: Yes, but the correct format/semantics for time are a complex 
discussion in themselves. To keep this easy to review and the discussion 
focused, we were going to file that separately. But I totally agree: for the AM 
to respond intelligently, the time before it's forced to give up the container 
is valuable input.

[~bikash]: Agree almost completely. In YARN-569, the hysteresis you cite 
motivated several design points, including multiple dampers on actions taken by 
the preemption policy, out-of-band observation/enforcement, and no effort to 
fine-tune particular allocations. The role of preemption (to summarize what 
[~curino] discussed in detail in the prenominate JIRA) is to make coarse 
corrections around the core scheduler invariants (e.g., capacity, fairness). 
Rather than introducing new races or complexity, one could argue that 
preemption is a dual of allocation in an inconsistent environment.

Your proposal matches case (1) in the above 
[comment|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950],
 where the RM specifies the set of containers in jeopardy and a contract (as 
{{ResourceRequest}}) for avoiding the kills, should the AM have cause to pick 
different containers. Further, your observation that the RM has enough 
information in priorities, etc. to make an educated guess at those containers 
is spot-on. IIRC, the policy uses allocation order when selecting containers, 
but that should be a secondary key after priority.

The disputed point, and I'm not sure we actually disagree, is the claim that 
the AM should never kill things in response to this message. To be fair, that 
can be implemented by just ignoring the requests, so it's orthogonal to this 
particular protocol, but it's certainly an important best practice to discuss 
to ensure we're capturing the right thing. Certainly there are many cases where 
ignoring the message is correct; most CDFs of map task execution time show that 
over 80% finish in less than a minute, so the AM has few reasons to 
pessimistically kill them.

There are a few scenarios where this isn't optimal. Take the case of YARN-415, 
where the AM is billed cumulatively for cluster time. Assume an AM knows (a) 
the container will not finish (reinforcing [~sandyr]'s point about including 
time in the preemption message) and (b) the work done is not worth 
checkpointing. It can conclude that killing the container is in its best 
interest, because squatting on the resource could affect its ability to get 
containers in the future (or simply cost more). Moreover, for long-lived 
services and speculative container allocation/retention, the AM may actually be 
holding the container only as an optimization or for a future execution, so it 
could release it at low cost to itself. Finally, the time allowed before the RM 
starts killing containers can be extended if AMs typically return resources 
before the deadline.

It's also a mechanism for the RM to advise the AM about constraints that 
prevent it from granting its pending requests. The AM currently kills reducers 
if it can't get containers to regenerate lost map output. If the scheduler 
values some containers more than others, the AM's response to starvation can be 
improved from random killing. This is a case where the current implementation 
acknowledges the fact that it already runs in an inconsistent environment.

 Scheduler feedback to AM to release containers
 --

 Key: YARN-45
 URL: https://issues.apache.org/jira/browse/YARN-45
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Chris Douglas
Assignee: Carlo Curino
 Attachments: YARN-45.patch, YARN-45.patch


 The ResourceManager strikes a balance between cluster utilization and strict 
 enforcement of resource invariants in the cluster. Individual allocations of 
 containers must be reclaimed- or reserved- to restore the global invariants 
 when cluster load shifts. In some cases, the ApplicationMaster can respond to 
 fluctuations in resource availability without losing the work already 
 completed by that task (MAPREDUCE-4584). Supplying it with this information 
 would be helpful for overall cluster utilization [1]. To this end, we want to 
 establish a protocol for the RM to ask the AM to release containers.
 [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on 

[jira] [Commented] (YARN-573) Shared data structures in Public Localizer and Private Localizer are not Thread safe.

2013-04-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13631268#comment-13631268
 ] 

Chris Douglas commented on YARN-573:


Pardon?

 Shared data structures in Public Localizer and Private Localizer are not 
 Thread safe.
 -

 Key: YARN-573
 URL: https://issues.apache.org/jira/browse/YARN-573
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Omkar Vinit Joshi

 PublicLocalizer
 1) pending accessed by addResource (part of event handling) and run method 
 (as a part of PublicLocalizer.run() ).
 PrivateLocalizer
 1) pending accessed by addResource (part of event handling) and 
 findNextResource (i.remove()).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13632630#comment-13632630
 ] 

Chris Douglas commented on YARN-45:
---

bq. ResourceRequest is not actionable in the sense that neither of the 
schedulers can currently send a non-empty ResourceRequest to preempt. Both only 
do preemption by containers though they have some plumbing to send RR's if they 
want to do so. So I am not quite sure what you mean by "We indeed have code 
that exercises the ResourceRequest version of it".

A prototype impl against MapReduce responds to {{ResourceRequest}} in the 
preempt message. We're currently polishing and splitting that up for review, 
but wanted to get consensus on the Yarn changes in case new requirements 
required reworking the rest.

An RM impl that includes killing for {{ResourceRequest}} (or {{Resource}}) is a 
more invasive change, particularly because (a) the AM needs to reason about 
which recently finished containers are included in the message (i.e., it needs 
to reason about what the RM knows, so the RM needs to be consistent in what it 
tells the AM) and (b) the RM needs to track its previous preemption requests, 
timing them out in the context of existing allocations and exited containers 
(i.e., decisions to preempt need to incorporate subsequent information).

To get experience before proposing anything drastic, we marked this API as 
experimental, wrote the enforcement policy against {{ContainerID}}, and tucked 
it behind a pluggable interface. This way, the AM can ignore stale requests for 
exited containers and the RM can time out particular containers it asked for 
easily; every computed preemption set is bound in a namespace that sidesteps 
the most disruptive impl issues on both sides.

bq.  By not using location we are implicitly using the * location right? 
Might as well make it explicit. Non * locations will make sense when affinity 
based preemptions occur.

Yes, that's exactly the intent. The policy in YARN-569 doesn't attempt to bias 
the preemptions to match the requests in under-capacity queues, but that's a 
natural policy to implement against this protocol.

{quote}
The bare-minimum requirement seems:

# RM should notify the AM that a certain amount of resources will need to be 
reclaimed (ala SIGTERM).
# Thus, the AM gets an opportunity to *pick* which containers it will sacrifice 
to satisfy the RM's requirements.
# Iff the AM doesn't act, the RM will go ahead and terminate some containers 
(probably the most-recently allocated ones); ala SIGKILL.

Given the above, I feel that this is a set of changes we need to be 
conservative about - particularly since the really simple pre-emption i.e. 
SIGKILL alone on RM side is trivial (from an API perspective).
{quote}

Totally agreed. The symmetry of {{ResourceRequest}} in the ask-back is 
attractive, but it's not a sufficient condition. To it, I'd add all the 
familiar attributes of using them in allocation requests (economy, 
expressiveness, versatility). While {{Resource}} covers the current impl, it 
leaves little room for related improvements, or even refinements (e.g., 
preferring resources requested by under-capacity queues, prioritizing types of 
containers, and time).

The API isn't that complex, but a strict implementation would change the RM 
more, adding risk. To mitigate that, but still encourage applications to write 
against the richer type while we get experience with it, [~curino]'s 
formulation 
[above|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950]
 seems like a decent set of semantics...

We could add a new type that encodes a subset of the {{ResourceRequest}} type. 
It lacks symmetry, but it also allows them to evolve independently.

 Scheduler feedback to AM to release containers
 --

 Key: YARN-45
 URL: https://issues.apache.org/jira/browse/YARN-45
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Chris Douglas
Assignee: Carlo Curino
 Attachments: YARN-45.patch, YARN-45.patch


 The ResourceManager strikes a balance between cluster utilization and strict 
 enforcement of resource invariants in the cluster. Individual allocations of 
 containers must be reclaimed- or reserved- to restore the global invariants 
 when cluster load shifts. In some cases, the ApplicationMaster can respond to 
 fluctuations in resource availability without losing the work already 
 completed by that task (MAPREDUCE-4584). Supplying it with this information 
 would be helpful for overall cluster utilization [1]. To this end, we want to 
 establish a protocol for the RM to ask the AM to release containers.
 [1] 

[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-25 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642163#comment-13642163
 ] 

Chris Douglas commented on YARN-45:
---

If everyone's OK with the current patch as a base, I'll commit it in the next 
couple days.

 Scheduler feedback to AM to release containers
 --

 Key: YARN-45
 URL: https://issues.apache.org/jira/browse/YARN-45
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Chris Douglas
Assignee: Carlo Curino
 Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch


 The ResourceManager strikes a balance between cluster utilization and strict 
 enforcement of resource invariants in the cluster. Individual allocations of 
 containers must be reclaimed- or reserved- to restore the global invariants 
 when cluster load shifts. In some cases, the ApplicationMaster can respond to 
 fluctuations in resource availability without losing the work already 
 completed by that task (MAPREDUCE-4584). Supplying it with this information 
 would be helpful for overall cluster utilization [1]. To this end, we want to 
 establish a protocol for the RM to ask the AM to release containers.
 [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644626#comment-13644626
 ] 

Chris Douglas commented on YARN-45:
---

I'm also a fan of {{ResourceRequest}}, but we're not really using all its 
features, yet. Similarly, {{Resource}} bakes in the fungibility of resources, 
which could be awkward as the RM accommodates richer requests (as in YARN-392).

We could use {{ResourceRequest}}- so the API is there for extensions- but only 
populate the capability as an aggregate. With the convention that \-1 
containers can mean packed as you see fit, it expresses {{Resource}} (which 
we need in practice, since the priorities for requests don't always [match the 
preemption 
order|https://issues.apache.org/jira/browse/YARN-569?focusedCommentId=13638825page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13638825]),
 which is sufficient for the current schedulers.

If we're adding the contract back with the set of containers, the 
[semantics|https://issues.apache.org/jira/browse/YARN-45?focusedCommentId=13628950page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13628950]
 we discussed earlier still seem OK.
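To make the convention concrete, a rough sketch of such an aggregate ask-back
built with the current record helpers (an illustration of the proposal, not
committed API behavior):
{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class AggregateAskBackSketch {
  static ResourceRequest aggregateAskBack(int memoryMB, int vcores) {
    return ResourceRequest.newInstance(
        Priority.newInstance(0),                 // priority carries no meaning here
        ResourceRequest.ANY,                     // the "*" location discussed above
        Resource.newInstance(memoryMB, vcores),  // aggregate amount to release
        -1);                                     // convention: packed as the AM sees fit
  }
}
{code}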

 Scheduler feedback to AM to release containers
 --

 Key: YARN-45
 URL: https://issues.apache.org/jira/browse/YARN-45
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Chris Douglas
Assignee: Carlo Curino
 Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, 
 YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf


 The ResourceManager strikes a balance between cluster utilization and strict 
 enforcement of resource invariants in the cluster. Individual allocations of 
 containers must be reclaimed- or reserved- to restore the global invariants 
 when cluster load shifts. In some cases, the ApplicationMaster can respond to 
 fluctuations in resource availability without losing the work already 
 completed by that task (MAPREDUCE-4584). Supplying it with this information 
 would be helpful for overall cluster utilization [1]. To this end, we want to 
 establish a protocol for the RM to ask the AM to release containers.
 [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644675#comment-13644675
 ] 

Chris Douglas commented on YARN-45:
---

bq. we could express the ResourceRequest as a multiple of the minimum allocation

+1 This is better

 Scheduler feedback to AM to release containers
 --

 Key: YARN-45
 URL: https://issues.apache.org/jira/browse/YARN-45
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Chris Douglas
Assignee: Carlo Curino
 Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, 
 YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf


 The ResourceManager strikes a balance between cluster utilization and strict 
 enforcement of resource invariants in the cluster. Individual allocations of 
 containers must be reclaimed- or reserved- to restore the global invariants 
 when cluster load shifts. In some cases, the ApplicationMaster can respond to 
 fluctuations in resource availability without losing the work already 
 completed by that task (MAPREDUCE-4584). Supplying it with this information 
 would be helpful for overall cluster utilization [1]. To this end, we want to 
 establish a protocol for the RM to ask the AM to release containers.
 [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (YARN-45) Scheduler feedback to AM to release containers

2013-05-07 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650527#comment-13650527
 ] 

Chris Douglas edited comment on YARN-45 at 5/7/13 6:20 AM:
---

bq. Would be great if you could add a version number to your patches.

Sorry, we weren't sure of the current convention.

{quote}
 - PreemptionMessage.strict should perhaps be named strictContract
explicitly. You did name the setters and the getters verbosely which
is good.
 - You should mark all the api getters and setters to be synchronized.
There are similar locking bugs in other existing records too but we
are tracking them elsewhere.
 - PreemptionContainer.getId() - Javadoc should refer to containers
instead of Resource?
 - PreemptionContract.getContainers() - Javadoc referring to
<code>ResourceManager</code> "may also include a @link
PreemptionContract that, if satisfied, may replace these" doesn't make
sense to me.
{quote}

Fixed all of these; last one was a copy/paste of an older version of
the code. Thanks for catching these.

[~bikassaha]: we took another attempt at the javadoc, but it's
probably still not sufficient. We opened YARN-650 to track
documentation of this feature in the AM how-to, which we'll address
presently.

(thanks everyone for the great feedback!)


  was (Author: curino):
bq. Would be great if you could add a version number to your patches.

Sorry, we weren't sure of the current convention.

{quote}
 - PreemptionMessage.strict should perhaps be named strictContract
explicitly. You did name the setters and the getters verbosely which
is good.
 - You should mark all the api getters and setters to be synchronized.
There are similar locking bugs in other existing records too but we
are tracking them elsewhere.
 - PreemptionContainer.getId() - Javadoc should refer to containers
instead of Resource?
 - PreemptionContract.getContainers() - Javadoc referring to
"<code>ResourceManager</code> may also include a {@link
PreemptionContract} that, if satisfied, may replace these" doesn't make
sense to me.
{quote}

Fixed all of these; last one was a copy/paste of an older version of
the code. Thanks for catching these.

[~bikassaha]: we took another attempt at the javadoc, but it's
probably still not sufficient. We opened YARN-XXX to track
documentation of this feature in the AM how-to, which we'll address
presently.

(thanks everyone for the great feedback!)

  
 Scheduler feedback to AM to release containers
 --

 Key: YARN-45
 URL: https://issues.apache.org/jira/browse/YARN-45
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Chris Douglas
Assignee: Carlo Curino
 Fix For: 2.0.5-beta

 Attachments: YARN-45.1.patch, YARN-45.patch, YARN-45.patch, 
 YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, 
 YARN-45_summary_of_alternatives.pdf


 The ResourceManager strikes a balance between cluster utilization and strict 
 enforcement of resource invariants in the cluster. Individual allocations of 
 containers must be reclaimed- or reserved- to restore the global invariants 
 when cluster load shifts. In some cases, the ApplicationMaster can respond to 
 fluctuations in resource availability without losing the work already 
 completed by that task (MAPREDUCE-4584). Supplying it with this information 
 would be helpful for overall cluster utilization [1]. To this end, we want to 
 establish a protocol for the RM to ask the AM to release containers.
 [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-650) User guide for preemption

2013-05-07 Thread Chris Douglas (JIRA)
Chris Douglas created YARN-650:
--

 Summary: User guide for preemption
 Key: YARN-650
 URL: https://issues.apache.org/jira/browse/YARN-650
 Project: Hadoop YARN
  Issue Type: Task
  Components: documentation
Reporter: Chris Douglas
Priority: Minor
 Fix For: 2.0.5-beta


YARN-45 added a protocol for the RM to ask for resources back. The docs on 
writing YARN applications should include a section on how to interpret this 
message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-567) RM changes to support preemption for FairScheduler and CapacityScheduler

2013-05-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-567:
---

Attachment: (was: YARN-567-1.patch)

 RM changes to support preemption for FairScheduler and CapacityScheduler
 

 Key: YARN-567
 URL: https://issues.apache.org/jira/browse/YARN-567
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-567.patch, YARN-567.patch


 A common tradeoff in scheduling jobs is between keeping the cluster busy and 
 enforcing capacity/fairness properties. FairScheduler and CapacityScheduler 
 take opposite stances on how to achieve this. 
 The FairScheduler leverages task-killing to quickly reclaim resources from 
 currently running jobs and redistribute them among new jobs, thus keeping 
 the cluster busy but wasting useful work. The CapacityScheduler is typically 
 tuned to limit the portion of the cluster used by each queue so that the 
 likelihood of violating capacity is low, thus never wasting work, but risking 
 keeping the cluster underutilized or having jobs waiting to obtain their 
 rightful capacity. 
 By introducing the notion of a work-preserving preemption we can remove this 
 tradeoff. This requires a protocol for preemption (YARN-45), and 
 ApplicationMasters that can respond to preemption efficiently (e.g., by 
 saving their intermediate state; this will be posted for MapReduce in a 
 separate JIRA soon), together with a scheduler that can issue preemption 
 requests (discussed in separate JIRAs YARN-568 and YARN-569).
 The changes we track with this JIRA are common to FairScheduler and 
 CapacityScheduler, and are mostly the propagation of preemption decisions 
 through the ApplicationMasterService.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-567) RM changes to support preemption for FairScheduler and CapacityScheduler

2013-05-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-567:
---

Attachment: YARN-567-1.patch

 RM changes to support preemption for FairScheduler and CapacityScheduler
 

 Key: YARN-567
 URL: https://issues.apache.org/jira/browse/YARN-567
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-567.patch, YARN-567.patch


 A common tradeoff in scheduling jobs is between keeping the cluster busy and 
 enforcing capacity/fairness properties. FairScheduler and CapacityScheduler 
 take opposite stances on how to achieve this. 
 The FairScheduler leverages task-killing to quickly reclaim resources from 
 currently running jobs and redistribute them among new jobs, thus keeping 
 the cluster busy but wasting useful work. The CapacityScheduler is typically 
 tuned to limit the portion of the cluster used by each queue so that the 
 likelihood of violating capacity is low, thus never wasting work, but risking 
 keeping the cluster underutilized or having jobs waiting to obtain their 
 rightful capacity. 
 By introducing the notion of a work-preserving preemption we can remove this 
 tradeoff. This requires a protocol for preemption (YARN-45), and 
 ApplicationMasters that can respond to preemption efficiently (e.g., by 
 saving their intermediate state; this will be posted for MapReduce in a 
 separate JIRA soon), together with a scheduler that can issue preemption 
 requests (discussed in separate JIRAs YARN-568 and YARN-569).
 The changes we track with this JIRA are common to FairScheduler and 
 CapacityScheduler, and are mostly the propagation of preemption decisions 
 through the ApplicationMasterService.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-568) FairScheduler: support for work-preserving preemption

2013-05-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-568:
---

Attachment: YARN-568-1.patch

 FairScheduler: support for work-preserving preemption 
 --

 Key: YARN-568
 URL: https://issues.apache.org/jira/browse/YARN-568
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-568-1.patch, YARN-568.patch, YARN-568.patch


 In the attached patch, we modified the FairScheduler to substitute its 
 preemption-by-killing with a work-preserving version of preemption (followed 
 by killing if the AMs do not respond quickly enough). This should allow 
 running preemption checks more often but killing less often (proper tuning to 
 be investigated). Depends on YARN-567 and YARN-45; related to YARN-569.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-650) User guide for preemption

2013-05-07 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-650:
---

Attachment: Y650-0.patch

 User guide for preemption
 -

 Key: YARN-650
 URL: https://issues.apache.org/jira/browse/YARN-650
 Project: Hadoop YARN
  Issue Type: Task
  Components: documentation
Reporter: Chris Douglas
Priority: Minor
 Fix For: 2.0.5-beta

 Attachments: Y650-0.patch


 YARN-45 added a protocol for the RM to ask for resources back. The docs on 
 writing YARN applications should include a section on how to interpret this 
 message.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-568) FairScheduler: support for work-preserving preemption

2013-05-09 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13653234#comment-13653234
 ] 

Chris Douglas commented on YARN-568:


+1 I committed this. Thanks Carlo and Sandy

 FairScheduler: support for work-preserving preemption 
 --

 Key: YARN-568
 URL: https://issues.apache.org/jira/browse/YARN-568
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-568-1.patch, YARN-568-2.patch, YARN-568-2.patch, 
 YARN-568.patch, YARN-568.patch


 In the attached patch, we modified the FairScheduler to substitute its 
 preemption-by-killing with a work-preserving version of preemption (followed 
 by killing if the AMs do not respond quickly enough). This should allow 
 running preemption checks more often but killing less often (proper tuning to 
 be investigated). Depends on YARN-567 and YARN-45; related to YARN-569.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-568) FairScheduler: support for work-preserving preemption

2013-05-19 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13661793#comment-13661793
 ] 

Chris Douglas commented on YARN-568:


bq. From the code in generatePreemptionMessage() the overlap between strict and 
fungible is not obvious. Can both be sent?

Yes. From the discussion in YARN-45, it seemed the consensus was that the RM 
may want to send a mix of both requests. Does that still make sense?

bq. Unused new member seems to have been added: recordFactory?

Sorry, an artifact of a previous version. Cleaned up in a followup commit.

 FairScheduler: support for work-preserving preemption 
 --

 Key: YARN-568
 URL: https://issues.apache.org/jira/browse/YARN-568
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Fix For: 2.0.5-beta

 Attachments: YARN-568-1.patch, YARN-568-2.patch, YARN-568-2.patch, 
 YARN-568.patch, YARN-568.patch


 In the attached patch, we modified the FairScheduler to substitute its 
 preemption-by-killing with a work-preserving version of preemption (followed 
 by killing if the AMs do not respond quickly enough). This should allow 
 running preemption checks more often but killing less often (proper tuning to 
 be investigated). Depends on YARN-567 and YARN-45; related to YARN-569.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1771) many getFileStatus calls made from node manager for localizing a public distributed cache resource

2014-03-03 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918548#comment-13918548
 ] 

Chris Douglas commented on YARN-1771:
-

The simpler check doesn't seem to have any practical issues. Since the cache is 
keyed on Paths, the case where a user can refer to an object without access to 
it seems pretty esoteric. As long as the public cache runs with lowered 
privileges, the check isn't necessary to verify that the public resource 
isn't private to YARN. Copying with the user's HDFS credentials avoids that, 
though that seems like a heavyweight solution if reducing getFileStatus calls 
is the only motivation.
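
For context, a rough sketch of the kind of per-ancestor check that generates those getFileStatus calls (illustrative; not the exact FSDownload code):

{code}
// Illustrative sketch of the per-ancestor check that produces one
// getFileStatus call per path component; not the exact FSDownload code.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;

public class PublicnessCheck {
  /** A resource is "public" only if the file is world-readable and every
   *  ancestor directory is world-executable. */
  public static boolean isPublic(FileSystem fs, Path file) throws IOException {
    FileStatus stat = fs.getFileStatus(file);                   // 1 RPC
    if (!stat.getPermission().getOtherAction().implies(FsAction.READ)) {
      return false;
    }
    for (Path dir = file.getParent(); dir != null; dir = dir.getParent()) {
      FileStatus dirStat = fs.getFileStatus(dir);               // 1 RPC per ancestor
      if (!dirStat.getPermission().getOtherAction().implies(FsAction.EXECUTE)) {
        return false;
      }
    }
    return true;
  }
}
{code}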

 many getFileStatus calls made from node manager for localizing a public 
 distributed cache resource
 --

 Key: YARN-1771
 URL: https://issues.apache.org/jira/browse/YARN-1771
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical

 We're observing that the getFileStatus calls are putting a fair amount of 
 load on the name node as part of checking the public-ness for localizing a 
 resource that belongs in the public cache.
 We see 7 getFileStatus calls made for each of these resources. We should look 
 into reducing the number of calls to the name node. One example:
 {noformat}
 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   src=/tmp ...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   src=/...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,355 INFO audit: ... cmd=open  
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1771) many getFileStatus calls made from node manager for localizing a public distributed cache resource

2014-03-03 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918625#comment-13918625
 ] 

Chris Douglas commented on YARN-1771:
-

bq. Orthogonal to this we have been discussing adding a FileStatus[] 
getFileStatus(Path f) API that returns FileStatus for each path component of f 
in a single RPC.

Symlinks might be awkward to support, but that discussion is for a separate 
ticket. Do you have a JIRA ref?

bq. So I think we need some kind of access check, either as the requesting user 
or explicit access checks like it does today, to avoid a malicious client 
obtaining access to private files via the NM.

An HDFS nobody account?

A cache would probably be correct in almost all cases, though. Since the check 
is only performed when the resource is localized, there could be cases where 
the filesystem is never in the cached state, but those are rare (and as Sandy 
points out, already in the current design). To attack the cache, the writer 
would need to take an unprotected directory, change its permissions, then 
populate it with private data (whose attributes are guessable). Expiring after 
short intervals and not populating the cache with failed localization attempts 
could help mitigate its effectiveness.

 many getFileStatus calls made from node manager for localizing a public 
 distributed cache resource
 --

 Key: YARN-1771
 URL: https://issues.apache.org/jira/browse/YARN-1771
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical

 We're observing that the getFileStatus calls are putting a fair amount of 
 load on the name node as part of checking the public-ness for localizing a 
 resource that belongs in the public cache.
 We see 7 getFileStatus calls made for each of these resources. We should look 
 into reducing the number of calls to the name node. One example:
 {noformat}
 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   src=/tmp ...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   src=/...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,355 INFO audit: ... cmd=open  
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1771) many getFileStatus calls made from node manager for localizing a public distributed cache resource

2014-03-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932203#comment-13932203
 ] 

Chris Douglas commented on YARN-1771:
-

I just skimmed the patch, but it lgtm. The LoadingCache impl is very clean, and 
only caching over the course of a container localization relieves one of any 
practical responsibility to limit the cache size (that said, might as well add 
something fixed). Only minor, optional nits: If a path is invalid/inaccessible, 
it might make sense to memoize the failure, also. {{FSDownload::isPublic}} can 
be package-private (and annotated w/ {{\@VisibleForTesting}} for the unit 
test), rather than public.

 many getFileStatus calls made from node manager for localizing a public 
 distributed cache resource
 --

 Key: YARN-1771
 URL: https://issues.apache.org/jira/browse/YARN-1771
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical
 Attachments: yarn-1771.patch, yarn-1771.patch, yarn-1771.patch


 We're observing that the getFileStatus calls are putting a fair amount of 
 load on the name node as part of checking the public-ness for localizing a 
 resource that belongs in the public cache.
 We see 7 getFileStatus calls made for each of these resources. We should look 
 into reducing the number of calls to the name node. One example:
 {noformat}
 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   src=/tmp ...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   src=/...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,355 INFO audit: ... cmd=open  
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1771) many getFileStatus calls made from node manager for localizing a public distributed cache resource

2014-03-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932651#comment-13932651
 ] 

Chris Douglas commented on YARN-1771:
-

bq. I'm also going to make changes to memoize failures. The only slight 
hesitation I have is normally that would be quite rare, but I think it's a good 
thing to have.

Agreed, I doubt it will have a significant impact, here. In a 
shared/longer-lived cache it might be marginally more useful, but still rare.

bq. I did think about making the stat cache longer-lived. But the complexity of 
managing its size as well as the values getting quite stale dissuaded me from 
it. Let me know if you agree...

*nod* Since the goal is to reduce stress on the NN, deferring that complexity 
until necessary is a good plan.
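
For illustration, a minimal sketch of the kind of short-lived, bounded stat cache discussed here; the class name, bound, and expiry are assumptions, not taken from the patch:

{code}
// Minimal sketch of a short-lived, bounded stat cache; the bound, expiry and
// names are assumptions for illustration, not taken from the patch.
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.util.concurrent.Futures;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StatCache {
  private final LoadingCache<Path, Future<FileStatus>> cache;

  public StatCache(final FileSystem fs) {
    cache = CacheBuilder.newBuilder()
        .maximumSize(1000)                       // fixed bound, cheap insurance
        .expireAfterWrite(30, TimeUnit.SECONDS)  // short-lived: one localization
        .build(new CacheLoader<Path, Future<FileStatus>>() {
          @Override
          public Future<FileStatus> load(Path p) {
            try {
              return Futures.immediateFuture(fs.getFileStatus(p)); // 1 NN RPC
            } catch (IOException e) {
              // memoize the failure so repeated lookups don't hit the NN again
              return Futures.immediateFailedFuture(e);
            }
          }
        });
  }

  public FileStatus getStatus(Path p)
      throws ExecutionException, InterruptedException {
    return cache.get(p).get();                   // served from cache after first call
  }
}
{code}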

 many getFileStatus calls made from node manager for localizing a public 
 distributed cache resource
 --

 Key: YARN-1771
 URL: https://issues.apache.org/jira/browse/YARN-1771
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical
 Attachments: yarn-1771.patch, yarn-1771.patch, yarn-1771.patch


 We're observing that the getFileStatus calls are putting a fair amount of 
 load on the name node as part of checking the public-ness for localizing a 
 resource that belongs in the public cache.
 We see 7 getFileStatus calls made for each of these resources. We should look 
 into reducing the number of calls to the name node. One example:
 {noformat}
 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   src=/tmp ...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   src=/...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,355 INFO audit: ... cmd=open  
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1771) many getFileStatus calls made from node manager for localizing a public distributed cache resource

2014-03-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1771:


Issue Type: Improvement  (was: Bug)

 many getFileStatus calls made from node manager for localizing a public 
 distributed cache resource
 --

 Key: YARN-1771
 URL: https://issues.apache.org/jira/browse/YARN-1771
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical
 Attachments: yarn-1771.patch, yarn-1771.patch, yarn-1771.patch, 
 yarn-1771.patch


 We're observing that the getFileStatus calls are putting a fair amount of 
 load on the name node as part of checking the public-ness for localizing a 
 resource that belongs in the public cache.
 We see 7 getFileStatus calls made for each of these resources. We should look 
 into reducing the number of calls to the name node. One example:
 {noformat}
 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   src=/tmp ...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   src=/...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,355 INFO audit: ... cmd=open  
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1771) many getFileStatus calls made from node manager for localizing a public distributed cache resource

2014-03-14 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1771:


Fix Version/s: 2.5.0

 many getFileStatus calls made from node manager for localizing a public 
 distributed cache resource
 --

 Key: YARN-1771
 URL: https://issues.apache.org/jira/browse/YARN-1771
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical
 Fix For: 3.0.0, 2.4.0, 2.5.0

 Attachments: yarn-1771.patch, yarn-1771.patch, yarn-1771.patch, 
 yarn-1771.patch


 We're observing that the getFileStatus calls are putting a fair amount of 
 load on the name node as part of checking the public-ness for localizing a 
 resource that belongs in the public cache.
 We see 7 getFileStatus calls made for each of these resources. We should look 
 into reducing the number of calls to the name node. One example:
 {noformat}
 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   src=/tmp ...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   src=/...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,355 INFO audit: ... cmd=open  
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1771) many getFileStatus calls made from node manager for localizing a public distributed cache resource

2014-03-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935310#comment-13935310
 ] 

Chris Douglas commented on YARN-1771:
-

bq. It would be great if you could commit this to branch-2.4 too...

Sure, np. Done

 many getFileStatus calls made from node manager for localizing a public 
 distributed cache resource
 --

 Key: YARN-1771
 URL: https://issues.apache.org/jira/browse/YARN-1771
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.3.0
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Critical
 Fix For: 3.0.0, 2.4.0, 2.5.0

 Attachments: yarn-1771.patch, yarn-1771.patch, yarn-1771.patch, 
 yarn-1771.patch


 We're observing that the getFileStatus calls are putting a fair amount of 
 load on the name node as part of checking the public-ness for localizing a 
 resource that belongs in the public cache.
 We see 7 getFileStatus calls made for each of these resources. We should look 
 into reducing the number of calls to the name node. One example:
 {noformat}
 2014-02-27 18:07:27,351 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,352 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724 ...
 2014-02-27 18:07:27,353 INFO audit: ... cmd=getfileinfo   src=/tmp ...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   src=/...
 2014-02-27 18:07:27,354 INFO audit: ... cmd=getfileinfo   
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 2014-02-27 18:07:27,355 INFO audit: ... cmd=open  
 src=/tmp/temp-887708724/tmp883330348/foo-0.0.44.jar ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1927) Preemption message shouldn’t be created multiple times for same container-id in ProportionalCapacityPreemptionPolicy

2014-04-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967685#comment-13967685
 ] 

Chris Douglas commented on YARN-1927:
-

The decision to preempt a container may be reversed. The policy reiterates its 
request and only kills containers consistently recalled over a grace period. 
The application clears the containers requested in 
{{FiCaSchedulerApp::getAllocation}} after reporting them to the AM.

[~curino], can you confirm that this is the intent?
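
For illustration, a simplified sketch of the reiterate-then-kill bookkeeping described above; the class, field and event names are hypothetical, not the policy's actual code:

{code}
// Simplified sketch of the reiterate-then-kill bookkeeping described above;
// class, field and event names are hypothetical, not the policy's actual code.
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.ContainerId;

public class PreemptionTracker {
  private final long maxWaitBeforeKillMs;
  private final Map<ContainerId, Long> firstMarked = new HashMap<ContainerId, Long>();

  public PreemptionTracker(long maxWaitBeforeKillMs) {
    this.maxWaitBeforeKillMs = maxWaitBeforeKillMs;
  }

  /** Called once per editSchedule() round with the containers selected. */
  public void onRound(Set<ContainerId> selected, long now) {
    // Containers no longer selected get a reprieve: the decision was reversed.
    firstMarked.keySet().retainAll(selected);
    for (ContainerId id : selected) {
      Long since = firstMarked.get(id);
      if (since == null) {
        firstMarked.put(id, now);
        // emit a preempt request so the AM can yield the container gracefully
      } else if (now - since > maxWaitBeforeKillMs) {
        // consistently recalled over the grace period: emit a kill event
      } else {
        // still within the grace period: reiterate the preemption request
      }
    }
  }
}
{code}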

 Preemption message shouldn’t be created multiple times for same container-id 
 in ProportionalCapacityPreemptionPolicy
 

 Key: YARN-1927
 URL: https://issues.apache.org/jira/browse/YARN-1927
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.4.0
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Minor
 Attachments: YARN-1927.patch


 Currently, after each editSchedule() call, a preemption message will be 
 created and sent to the scheduler. ProportionalCapacityPreemptionPolicy should 
 only send a preemption message once for each container.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1957) ProportionalCapacitPreemptionPolicy handling of corner cases...

2014-04-28 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983978#comment-13983978
 ] 

Chris Douglas commented on YARN-1957:
-

+1

Enforcing {{maxCapacity}} in the calculation of the ideal capacity is a good 
fix, and distributing capacity over queues with zero capacity (with the config 
knob to restore the existing 0 == disabled with aggressive preemption) makes 
sense. The code appears to effect this, also. There's a slight optimization 
that can separate the zero-capacity queues during cloning, but the overhead is 
negligible.
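
As a toy illustration of the maxCapacity clamp (plain arithmetic with hypothetical names, not the policy's actual code):

{code}
// Toy arithmetic illustrating the maxCapacity clamp on a queue's ideal
// assignment; hypothetical names, not the actual policy code.
public final class IdealAssignment {
  private IdealAssignment() {}

  /**
   * @param guaranteed  the queue's guaranteed capacity (absolute, e.g. MB)
   * @param extraShare  extra capacity the rebalancing would hand to the queue
   * @param maxCapacity the queue's absolute maximum capacity
   * @param demand      the queue's current usage plus pending requests
   */
  public static long idealAssigned(long guaranteed, long extraShare,
                                   long maxCapacity, long demand) {
    long ideal = guaranteed + extraShare;
    ideal = Math.min(ideal, maxCapacity);  // never plan above maxCapacity
    ideal = Math.min(ideal, demand);       // no demand (e.g. an idle zero-capacity
                                           // queue) means nothing to assign
    return ideal;
  }
}
{code}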

 ProportionalCapacitPreemptionPolicy handling of corner cases...
 ---

 Key: YARN-1957
 URL: https://issues.apache.org/jira/browse/YARN-1957
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler, preemption
 Attachments: YARN-1957.patch, YARN-1957.patch, YARN-1957_test.patch


 The current version of ProportionalCapacityPreemptionPolicy should be 
 improved to deal with the following two scenarios:
 1) when rebalancing over-capacity allocations, it potentially preempts 
 without considering the maxCapacity constraints of a queue (i.e., preempting 
 possibly more than strictly necessary)
 2) a zero capacity queue is preempted even if there is no demand (consistent 
 with the old use of zero capacity to disable queues)
 The proposed patch fixes both issues and introduces a few new test cases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1957) ProportionalCapacitPreemptionPolicy handling of corner cases...

2014-05-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1957:


Fix Version/s: 3.0.0
   2.5.0

 ProportionalCapacitPreemptionPolicy handling of corner cases...
 ---

 Key: YARN-1957
 URL: https://issues.apache.org/jira/browse/YARN-1957
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler, preemption
 Fix For: 3.0.0, 2.5.0, 2.4.1

 Attachments: YARN-1957.patch, YARN-1957.patch, YARN-1957_test.patch


 The current version of ProportionalCapacityPreemptionPolicy should be 
 improved to deal with the following two scenarios:
 1) when rebalancing over-capacity allocations, it potentially preempts 
 without considering the maxCapacity constraints of a queue (i.e., preempting 
 possibly more than strictly necessary)
 2) a zero capacity queue is preempted even if there is no demand (consistent 
 with the old use of zero capacity to disable queues)
 The proposed patch fixes both issues and introduces a few new test cases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1709) Admission Control: Reservation subsystem

2014-06-09 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025401#comment-14025401
 ] 

Chris Douglas commented on YARN-1709:
-

First pass:

{{RLE::addInterval/removeInterval}}
* This can return true even when totCap == 0, i.e., when there's no work to do 
(a compact sketch of the interval bookkeeping is included at the end of these 
comments)
* There are some redundant calls to {{isSameAsPrevious()}} and 
{{isSameAsNext()}} when adding a non-zero interval (it wouldn't be in the set 
if it were equal, so applying the same delta to each preserves this)
* Allowing a min/max allocation for the RLE would make this data structure more 
general/reusable outside this context
* These return true for all cases (the {{result}} variable is not necessary). 
This makes sense until the class adds invariants like min/max constraints.
* {{removeInterval}}: is it sufficient to compare against Resource(0, 0) 
instead of each member separately?
* {{removeInterval}}: doesn't roll back the transaction when it throws an 
exception, so it could leave an interval partially applied.

{{RLE::getCapacityAtTime/get*}}
* Many of the {{get*}} methods return mutable data, violating the locking. 
These should clone the objects before returning them.

{{RLE::toMemJSONString}}/{{RLE::toString}}
* Consider removing the spaces/newlines in the JSON representation
* Please use a {{StringBuilder}} instead of concatenation 
* Please use one of the JSON libraries on the classpath
* The {{toString()}} could be very verbose. Consider printing a summary 
(#steps, min/max, etc.) instead.

{{InMemoryPlan}}
* This should use a consistent clock time for the tick and archive, or it may 
archive ticks that have not been observed.
* The logic using {{isSuccess}} is confusing. Instead, return false/throw as 
constraints and invariants are violated, and return true when successful.
* {{ReentrantReadWriteLock}} is much slower w/ fair == true; are those 
semantics required?
* Using {{Class::isInstance}} is unconventional; using the instanceof operator 
or equals() (if it requires an exact match) is more common
* The {{*Cis}} fields and functions appear misnamed
* The {{updateCis}} function should be two functions, rather than passing 
{{addOrRemove}}
* {{updateCis}} would be easier to read if it established the invariant of the 
user in the collection, then called {{addInterval}} at the end.
* It looks like users in {{userCis}} are not GC'd
** If this is fixed, there's a potential NPE on 
{{userCis.get(reservation.getUser())}} deref
* {{getAllReservations}} should be package-private, {{@VisibleForTesting}} ; 
remove from interface
* {{headMap}} doesn't return null; these checks can be removed
* This returns mutable {{Resource}} instances from some {{get*}} methods, 
violating locking
* This creates many instances of {{Resource(0, 0)}}; can some of these be 
avoided?
* This should probably clone the {{Resource}} passed to setTotalCapacity
* Please remove the newlines in {{toString()}}
 
{{InMemoryReservationAllocation}}
* Fields can be final
* Why are some fields protected?
* remove newline from {{toString()}}
* Since this implements {{compareTo}}, it should also implement {{equals()}} 
and (particularly since it's added to collections calling it) {{hashCode()}}

General/nit
* some lines are more than 80 characters
* Javadocs contain empty lines
* Instead of two lookups on the HashMap ({{containsKey()}}/{{get()}}), this can 
call {{get()}} once and check for {{null}}


 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1374) Resource Manager fails to start due to ConcurrentModificationException

2013-10-30 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809874#comment-13809874
 ] 

Chris Douglas commented on YARN-1374:
-

+1 lgtm

 Resource Manager fails to start due to ConcurrentModificationException
 --

 Key: YARN-1374
 URL: https://issues.apache.org/jira/browse/YARN-1374
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Devaraj K
Assignee: Karthik Kambatla
Priority: Blocker
 Attachments: yarn-1374-1.patch, yarn-1374-1.patch


 Resource Manager is failing to start with the below 
 ConcurrentModificationException.
 {code:xml}
 2013-10-30 20:22:42,371 INFO org.apache.hadoop.util.HostsFileReader: 
 Refreshing hosts (include/exclude) list
 2013-10-30 20:22:42,376 INFO org.apache.hadoop.service.AbstractService: 
 Service ResourceManager failed in state INITED; cause: 
 java.util.ConcurrentModificationException
 java.util.ConcurrentModificationException
   at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
   at java.util.AbstractList$Itr.next(AbstractList.java:343)
   at 
 java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010)
   at 
 org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944)
 2013-10-30 20:22:42,378 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: 
 Transitioning to standby
 2013-10-30 20:22:42,378 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.RMHAProtocolService: 
 Transitioned to standby
 2013-10-30 20:22:42,378 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 java.util.ConcurrentModificationException
   at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
   at java.util.AbstractList$Itr.next(AbstractList.java:343)
   at 
 java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1010)
   at 
 org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:187)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:944)
 2013-10-30 20:22:42,379 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG: 
 /
 SHUTDOWN_MSG: Shutting down ResourceManager at HOST-10-18-40-24/10.18.40.24
 /
 {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (YARN-1324) NodeManager potentially causes unnecessary operations on all its disks

2013-10-30 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809885#comment-13809885
 ] 

Chris Douglas commented on YARN-1324:
-

bq. When does MR use multiple disks in the same task/container? Isn't the map 
output written to a single indexed partition file?

Spills are spread across all volumes, but merged into a single file at the end.

Would randomizing the order of disks be a reasonable short-term workaround for 
(1)? Future changes could weight/elide directories based on other criteria, but 
that's a simple change. So would changing the random selection to bias its 
search order using a hash of the task id (instead of disk usage when creating 
the spill), so the ShuffleHandler could search fewer directories on average. I 
agree with Vinod, it would be hard to prevent the search altogether...
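
As a tiny illustration of the hash-biased ordering (hypothetical helper, not existing NM code):

{code}
// Hypothetical helper: bias the local-dir search order by a hash of the task
// id so writer and reader (e.g. the ShuffleHandler) start at the same
// directory and usually find the file on the first probe.
import java.util.ArrayList;
import java.util.List;

public final class LocalDirOrder {
  private LocalDirOrder() {}

  public static List<String> searchOrder(List<String> localDirs, String taskId) {
    List<String> order = new ArrayList<String>(localDirs.size());
    if (localDirs.isEmpty()) {
      return order;
    }
    int n = localDirs.size();
    int start = (taskId.hashCode() & Integer.MAX_VALUE) % n; // stable starting disk
    for (int i = 0; i < n; i++) {
      order.add(localDirs.get((start + i) % n));             // rotate from start
    }
    return order;
  }
}
{code}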

bq. Requiring apps to specify the number of disks for a container is also a 
viable solution and can be done in a back-compatible manner by changing MR to 
specify multiple disks and leaving the default to 1 for apps that dont care.

This makes sense as a hint, but some users might interpret it as a constraint 
and be confused when an NM schedules them on a node that reports fewer local 
dirs (due to failure or a heterogeneous config).

 NodeManager potentially causes unnecessary operations on all its disks
 --

 Key: YARN-1324
 URL: https://issues.apache.org/jira/browse/YARN-1324
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Bikas Saha

 Currently, for every container, the NM creates a directory on every disk and 
 expects the container-task to choose 1 of them and load balance the use of 
 the disks across all containers. 
 1) This may have worked fine in the MR world where MR tasks would randomly 
 choose dirs but in general we cannot expect every app/task writer to 
 understand these nuances and randomly pick disks. So we could end up 
 overloading the first disk if most people decide to use the first disk.
 2) This makes a number of NM operations scan every disk (thus randomizing 
 that disk) to locate the dir which the task has actually chosen to use for 
 its files. Makes all these operations expensive for the NM as well as 
 disruptive for users of disks that did not have the real task working dirs.
 I propose that NM should up-front decide the disk it is assigning to tasks. 
 It could choose to do so randomly or weighted-randomly by looking at space 
 and load on each disk. So it could do a better job of load balancing. Then, 
 it would associate the chosen working directory with the container context so 
 that subsequent operations on the NM can directly seek to the correct 
 location instead of having to seek on every disk.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (YARN-1471) The SLS simulator is not running the preemption policy for CapacityScheduler

2013-12-17 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1471:


Attachment: (was: YARN-1471.patch.2)

 The SLS simulator is not running the preemption policy for CapacityScheduler
 

 Key: YARN-1471
 URL: https://issues.apache.org/jira/browse/YARN-1471
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Carlo Curino
Assignee: Carlo Curino
Priority: Minor
 Attachments: SLSCapacityScheduler.java, YARN-1471.patch


 The simulator does not run the ProportionalCapacityPreemptionPolicy monitor.  
 This is because the policy needs to interact with a CapacityScheduler, and 
 the wrapping done by the simulator breaks this. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1471) The SLS simulator is not running the preemption policy for CapacityScheduler

2013-12-17 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1471:


Attachment: YARN-1471.patch

 The SLS simulator is not running the preemption policy for CapacityScheduler
 

 Key: YARN-1471
 URL: https://issues.apache.org/jira/browse/YARN-1471
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Carlo Curino
Assignee: Carlo Curino
Priority: Minor
 Attachments: SLSCapacityScheduler.java, YARN-1471.patch, 
 YARN-1471.patch


 The simulator does not run the ProportionalCapacityPreemptionPolicy monitor.  
 This is because the policy needs to interact with a CapacityScheduler, and 
 the wrapping done by the simulator breaks this. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-1471) The SLS simulator is not running the preemption policy for CapacityScheduler

2013-12-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-1471:


Attachment: YARN-1471.2.patch

 The SLS simulator is not running the preemption policy for CapacityScheduler
 

 Key: YARN-1471
 URL: https://issues.apache.org/jira/browse/YARN-1471
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Carlo Curino
Assignee: Carlo Curino
Priority: Minor
 Attachments: SLSCapacityScheduler.java, YARN-1471.2.patch, 
 YARN-1471.patch, YARN-1471.patch


 The simulator does not run the ProportionalCapacityPreemptionPolicy monitor.  
 This is because the policy needs to interact with a CapacityScheduler, and 
 the wrapping done by the simulator breaks this. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (YARN-1471) The SLS simulator is not running the preemption policy for CapacityScheduler

2013-12-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852373#comment-13852373
 ] 

Chris Douglas commented on YARN-1471:
-

I committed this. Thanks Carlo

 The SLS simulator is not running the preemption policy for CapacityScheduler
 

 Key: YARN-1471
 URL: https://issues.apache.org/jira/browse/YARN-1471
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Carlo Curino
Assignee: Carlo Curino
Priority: Minor
 Fix For: 3.0.0

 Attachments: SLSCapacityScheduler.java, YARN-1471.2.patch, 
 YARN-1471.patch, YARN-1471.patch


 The simulator does not run the ProportionalCapacityPreemptionPolicy monitor.  
 This is because the policy needs to interact with a CapacityScheduler, and 
 the wrapping done by the simulator breaks this. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Created] (YARN-1518) Ensure CapacityScheduler remains compatible with SLS simulator

2013-12-18 Thread Chris Douglas (JIRA)
Chris Douglas created YARN-1518:
---

 Summary: Ensure CapacityScheduler remains compatible with SLS 
simulator
 Key: YARN-1518
 URL: https://issues.apache.org/jira/browse/YARN-1518
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Chris Douglas
Priority: Minor


YARN-1471 added a workaround for the CapacityScheduler and monitors to work 
with the SLS simulator. This issue explores a cleaner integration, including 
tests to verify continued compatibility.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (YARN-2664) Improve RM webapp to expose info about reservations.

2014-11-10 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2664:

Assignee: Matteo Mazzucchelli

 Improve RM webapp to expose info about reservations.
 

 Key: YARN-2664
 URL: https://issues.apache.org/jira/browse/YARN-2664
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Matteo Mazzucchelli
 Attachments: PlannerPage_screenshot.pdf, YARN-2664.1.patch, 
 YARN-2664.patch


 YARN-1051 provides a new functionality in the RM to ask for reservation on 
 resources. Exposing this through the webapp GUI is important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (YARN-2877) Extend YARN to support distributed scheduling

2014-11-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2877:

Comment: was deleted

(was: (ignore that comment, was for YARN-2875))

 Extend YARN to support distributed scheduling
 -

 Key: YARN-2877
 URL: https://issues.apache.org/jira/browse/YARN-2877
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Sriram Rao

 This is an umbrella JIRA that proposes to extend YARN to support distributed 
 scheduling.  Briefly, some of the motivations for distributed scheduling are 
 the following:
 1. Improve cluster utilization by opportunistically executing tasks on 
 otherwise idle resources on individual machines.
 2. Reduce allocation latency for tasks where the scheduling time dominates 
 (i.e., the task execution time is much less than the time required to obtain 
 a container from the RM).
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (YARN-2877) Extend YARN to support distributed scheduling

2014-11-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2877:

Comment: was deleted

(was: Linking to HADOOP-11317 to cover project-wide use.

I don't think yarn-common needs to explicitly declare a dependency on log4j, at 
least outside the test run. If you comment out that dependency, does everything 
still build?)

 Extend YARN to support distributed scheduling
 -

 Key: YARN-2877
 URL: https://issues.apache.org/jira/browse/YARN-2877
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Sriram Rao

 This is an umbrella JIRA that proposes to extend YARN to support distributed 
 scheduling.  Briefly, some of the motivations for distributed scheduling are 
 the following:
 1. Improve cluster utilization by opportunistically executing tasks on 
 otherwise idle resources on individual machines.
 2. Reduce allocation latency for tasks where the scheduling time dominates 
 (i.e., the task execution time is much less than the time required to obtain 
 a container from the RM).
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2877) Extend YARN to support distributed scheduling

2014-12-19 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-2877:

Assignee: Konstantinos Karanasos

 Extend YARN to support distributed scheduling
 -

 Key: YARN-2877
 URL: https://issues.apache.org/jira/browse/YARN-2877
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, resourcemanager
Reporter: Sriram Rao
Assignee: Konstantinos Karanasos

 This is an umbrella JIRA that proposes to extend YARN to support distributed 
 scheduling.  Briefly, some of the motivations for distributed scheduling are 
 the following:
 1. Improve cluster utilization by opportunistically executing tasks on 
 otherwise idle resources on individual machines.
 2. Reduce allocation latency for tasks where the scheduling time dominates 
 (i.e., the task execution time is much less than the time required to obtain 
 a container from the RM).
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2718) Create a CompositeConatainerExecutor that combines DockerContainerExecutor and DefaultContainerExecutor

2015-01-26 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292258#comment-14292258
 ] 

Chris Douglas commented on YARN-2718:
-

I share Allen's skepticism. Adding this to the CLC is an invasive change. If 
the purpose is debugging, wouldn't a composite CE that does the demux be 
sufficient? Are there other use cases this supports?

 Create a CompositeConatainerExecutor that combines DockerContainerExecutor 
 and DefaultContainerExecutor
 ---

 Key: YARN-2718
 URL: https://issues.apache.org/jira/browse/YARN-2718
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Abin Shahab
 Attachments: YARN-2718.patch


 There should be a composite container executor that allows users to run their 
 jobs in DockerContainerExecutor, but switch to DefaultContainerExecutor for 
 debugging purposes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-04 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306738#comment-14306738
 ] 

Chris Douglas commented on YARN-3100:
-

Motivation for the conversion from {{QueueACL}} to the nearly identical, new 
{{YarnAuthorizationProvider.AccessType}}- like the introduction of 
{{PrivilegedEntity}}- is not obvious. Are these pluggable types? Are there 
other, future entities besides queues? Should the authorizer plugin perform the 
mapping from {{QueueACL}}? Just trying to understand the design...

For the {{Default\*}} impl, partial updates for {{refreshQueues}} that become 
visible during the update and after a partial, failed update are hard to reason 
about. While it's a noop for external services, aren't these different 
semantics from the current implementation? Readers are blocked, so there are no 
locks necessary for modifications by {{setPermission}}?
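
For context, the conversion in question is essentially a one-to-one mapping of this shape; the {{AccessType}} constant names below are assumptions for illustration only:

{code}
// Shape of the QueueACL -> AccessType conversion under discussion; the
// AccessType constant names are assumptions for illustration only.
import org.apache.hadoop.yarn.api.records.QueueACL;
import org.apache.hadoop.yarn.security.AccessType;

public final class AclMapping {
  private AclMapping() {}

  public static AccessType toAccessType(QueueACL acl) {
    switch (acl) {
      case SUBMIT_APPLICATIONS:
        return AccessType.SUBMIT_APP;        // assumed constant name
      case ADMINISTER_QUEUE:
        return AccessType.ADMINISTER_QUEUE;  // assumed constant name
      default:
        throw new IllegalArgumentException("Unknown acl: " + acl);
    }
  }
}
{code}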

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to make the YARN ACL model pluggable so as to integrate other 
 authorization tools such as Apache Ranger and Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. The current 
 implementation will be the default implementation. A Ranger or Sentry plug-in 
 can implement this interface.
 Benefit:
 - Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger and Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-04 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306177#comment-14306177
 ] 

Chris Douglas commented on YARN-3100:
-

[~aw], have you read through the patch? What it implements looks like a pretty 
straightforward application of the common ACL libraries to queues and 
applications. It just routes some of the YARN checks to a configurable 
component. Is there functionality implemented in the common libs that's not 
being used?

A few quick questions:
* What is the behavior of {{refreshQueues}}? It looks like the provider class 
remains fixed (should it throw an exception if the class in the conf doesn't 
match the singleton?), but every queue's ACLs get reset from the config. The 
refresh isn't transactional, though... if it fails partway through, the ACLs 
could be partially refreshed in the provider. Is that correct? If the provider 
is {{Configurable}}, then it also doesn't get reconfigured, as it will return 
the singleton from the first call to {{getInstance()}}.
* Could we avoid pluggable implementations with a {{Default\*}} class? A 
descriptive name is easier to change and... well, descriptive.
* {{PrivilegedEntity}} is an odd class. Would it be possible to expand on its 
definition in the javadoc, and (as a public class) add annotations for its 
intended audience (HADOOP-5073)?


 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-05 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308559#comment-14308559
 ] 

Chris Douglas commented on YARN-3100:
-

bq. I agree with you that if construction of Q' fails, we possibly get a mix of 
Q' and Q ACLs, which happens in the existing code.

I think the existing code doesn't have this property. ACLs 
[parsed|https://git1-us-west.apache.org/repos/asf?p=hadoop.git;a=blob;f=hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java;h=c1432101510b30cab5979223c4a52b813cfc7aee;hb=HEAD#l156]
 from the config are stored in a [member 
field|https://git1-us-west.apache.org/repos/asf?p=hadoop.git;a=blob;f=hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java;h=e4c26658b0bf5301892ce7c618402ece3a6ea360;hb=HEAD#l273].
 If construction fails, those ACLs aren't installed. The patch moves 
enforcement to the authorizer:
{noformat}
   public boolean hasAccess(QueueACL acl, UserGroupInformation user) {
 synchronized (this) {
-  if (acls.get(acl).isUserAllowed(user)) {
+  if (authorizer.checkPermission(toAccessType(acl), queueEntity, user)) {
 return true;
   }
 }
{noformat}
Which is updated during construction of the replacement queue hierarchy.

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-06 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309909#comment-14309909
 ] 

Chris Douglas commented on YARN-3100:
-

Agreed; definitely a separate JIRA. As state is copied from the old queues, 
some of the methods called in {{CSQueueUtils}} throw exceptions, similar to the 
case you found in {{LeafQueue}}.

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-05 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308126#comment-14308126
 ] 

Chris Douglas commented on YARN-3100:
-

bq.  The reinitializeQueues looks to be transactional, it instantiates all new 
sub queues first and then update the root queue and child queues accordingly. 
And the checkAccess chain will compete the same scheduler lock with the 
refreshQueue.

If there's a queue with root _Q_, say we're constructing _Q'_. In the current 
patch, the {{YarnAuthorizationProvider}} singleton instance will get calls to 
{{setPermission()}} during construction of _Q'_. These (1) will be observable 
by readers of _Q_ who share the instance. I agree that if construction of _Q'_ 
fails then it won't get installed, but (2) _Q_ will run with a mix of _Q'_ and 
_Q_ ACLs because each call to {{setPermission()}} overwrites what was installed 
for _Q_.

I'm curious if (1) and (2) are an artifact of the new plugin architecture or if 
this also happens in the existing code. Not for external implementations, 
but for the {{Default\*}} one.

bq. Alternatively, the plug-in can choose to add new acl via the setPermission 
when refreshQueue is invoked, but not to replace existing acl. Also, whether to 
add new or update or no, this is something that plug-in itself can decide or 
make it configurable by user.

Maybe I'm being dense, but I don't see how a plugin could implement those 
semantics cleanly. {{YarnAuthorizationProvider}} forces the instance to be a 
singleton, and it gets some sequence of calls to {{setPermission()}}. Since 
queues can't be deleted in the CS, I suppose it could track the sequence of 
calls that install ACLs and only publish new ACLs when it's received updates 
for everything, but that could still yield (2) if the refresh adds new queues 
before the refresh fails.
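
A toy model of (1) and (2), assuming nothing about the actual {{Default\*}} 
implementation beyond a shared map that is updated one queue at a time:
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.authorize.AccessControlList;

class ToyProvider {
  // Shared by readers and by the refresh path, as with a singleton provider.
  private final Map<String, AccessControlList> acls = new ConcurrentHashMap<>();

  // Called once per queue while Q' is being constructed. Each call is visible
  // to readers of Q immediately (1). If construction of Q' fails after some of
  // these calls, the map is left holding a mix of Q and Q' ACLs (2).
  void setPermission(String queue, AccessControlList acl) {
    acls.put(queue, acl);
  }

  boolean checkPermission(String queue, UserGroupInformation user) {
    AccessControlList acl = acls.get(queue);
    return acl != null && acl.isUserAllowed(user);
  }
}
{code}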

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-05 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306937#comment-14306937
 ] 

Chris Douglas commented on YARN-3100:
-

bq. I'm not sure if I get your point, the DefaultYarnAuthorizer currently uses 
a concurrentHashMap to store the acls, setPermission is currently used on queue 
initialization. So I think lock on setPermission is not needed ?

Could the RM be in a state where the old version of ACLs are applied to one 
queue, but a new version is applied to another (a client observes the new ACLs 
while they're being installed)? I think this is true of scenarios where 
{{refreshQueues()}} fails, but I don't know if intermediate states are 
observable.

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-06 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308987#comment-14308987
 ] 

Chris Douglas commented on YARN-3100:
-

Looking through {{AbstractCSQueue}} and {{CSQueueUtils}}, it looks like there 
are many misconfigurations that leave queues in an inconsistent state...

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He
 Attachments: YARN-3100.1.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk

2015-01-20 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284587#comment-14284587
 ] 

Chris Douglas commented on YARN-3074:
-

bq. catch FSError since it will be a common and recoverable error in this case.

+1

 Nodemanager dies when localizer runner tries to write to a full disk
 

 Key: YARN-3074
 URL: https://issues.apache.org/jira/browse/YARN-3074
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Varun Saxena

 When a LocalizerRunner tries to write to a full disk it can bring down the 
 nodemanager process.  Instead of failing the whole process we should fail 
 only the container and make a best attempt to keep going.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3177) Fix the order of the parameters in YarnConfiguration

2015-02-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318874#comment-14318874
 ] 

Chris Douglas commented on YARN-3177:
-

[~brahmareddy] moving code for readability is completely reasonable.

In this particular instance, {{YarnConfiguration}} is a set of fields... 
Javadoc orders them and devs will look up the symbol directly. Those two cover 
basically all the users of the class; it's almost never read. Restructuring it 
offers a low payoff, compared to maintaining the history of when and why that 
field was added to {{YarnConfiguration}}. Of course that history is still 
available, but restructuring adds an extra lookup step for developers, which is 
the more common case.

 Fix the order of the parameters in YarnConfiguration
 

 Key: YARN-3177
 URL: https://issues.apache.org/jira/browse/YARN-3177
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Minor
 Attachments: YARN-3177.patch


  *1. keep Process principal and keytab one place..( NM and RM are not placed 
 in order)* 
 {code} 
 public static final String RM_AM_MAX_ATTEMPTS =
     RM_PREFIX + "am.max-attempts";
 public static final int DEFAULT_RM_AM_MAX_ATTEMPTS = 2;

 /** The keytab for the resource manager.*/
 public static final String RM_KEYTAB =
     RM_PREFIX + "keytab";

 /** The kerberos principal to be used for spnego filter for RM.*/
 public static final String RM_WEBAPP_SPNEGO_USER_NAME_KEY =
     RM_PREFIX + "webapp.spnego-principal";

 /** The kerberos keytab to be used for spnego filter for RM.*/
 public static final String RM_WEBAPP_SPNEGO_KEYTAB_FILE_KEY =
     RM_PREFIX + "webapp.spnego-keytab-file";
 {code}
  *2.RM  webapp adress and port are not in order* 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-3192) Empty handler for exception: java.lang.InterruptedException #WebAppProxy.java and #/ResourceManager.java

2015-02-12 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas resolved YARN-3192.
-
Resolution: Not a Problem

Calling {{System.exit(-1)}} is not an acceptable way to shut down the RM. 
Please review the surrounding code.

I'm going to close this, until we can tie a bug to this code. Graceful shutdown 
is difficult to effect, and this issue's scope is too narrow to contribute to 
it.

[~brahmareddy], many of the JIRAs you're filing appear to be detected by 
automated tools. If the interrupt handling here can cause hangs, HA bugs, 
inconsistent replies to users, etc. then please file reports on the 
consequences, citing this as the source.

 Empty handler for exception: java.lang.InterruptedException #WebAppProxy.java 
 and #/ResourceManager.java
 

 Key: YARN-3192
 URL: https://issues.apache.org/jira/browse/YARN-3192
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Attachments: YARN-3192.patch


 The InterruptedException is completely ignored. As a result, any events 
 causing this interrupt will be lost.
  File: org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
 {code}
try {
 event = eventQueue.take();
   } catch (InterruptedException e) {
 LOG.error("Returning, interrupted : " + e);
 return; // TODO: Kill RM.
   }
 {code}
 File: org/apache/hadoop/yarn/server/webproxy/WebAppProxy.java
 {code}
 public void join() {
 if(proxyServer != null) {
   try {
 proxyServer.join();
   } catch (InterruptedException e) {
   }
 }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-01-27 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294529#comment-14294529
 ] 

Chris Douglas commented on YARN-1039:
-

Requiring accurate estimates is not realistic, but no service runs forever in 
the same container(s). If container leases can be renewed/refreshed, that's a 
manageable and realistic guarantee for the user (couldn't find a JIRA; it must 
exist). Migration, decommission, OS upgrades, and other operations-in-time on 
containers seem necessary to support long-running services, since preemption is 
comparably heavy-handed. Specifying a precise duration may be a little pedantic 
for the existing use cases, but it seems like the right abstraction.

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN

2015-02-01 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300975#comment-14300975
 ] 

Chris Douglas commented on YARN-1983:
-

As in YARN-2718: can't this be implemented as a composite CE, rather than 
changing the CLC? Managing versions of the CE, selecting a compatible CE, 
matching in the scheduler, etc. will require more than the classname to match. 
Configuring multiple CEs covers some useful cases, but if a composite CE is 
sufficient to experiment, then we can avoid a kludge in the protocol.

 Support heterogeneous container types at runtime on YARN
 

 Key: YARN-1983
 URL: https://issues.apache.org/jira/browse/YARN-1983
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junping Du
 Attachments: YARN-1983.2.patch, YARN-1983.patch


 Different container types (default, LXC, docker, VM box, etc.) have different 
 semantics on isolation of security, namespace/env, performance, etc.
 Per discussions in YARN-1964, we have some good thoughts on supporting 
 different types of containers running on YARN and specified by application at 
 runtime which largely enhance YARN's flexibility to meet heterogenous app's 
 requirement on isolation at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-01-27 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294613#comment-14294613
 ] 

Chris Douglas commented on YARN-1039:
-

[~cwelch] YARN shouldn't understand the lifecycle for a service or the 
progress/dependencies for task containers. As proposed, an AM will receive a 
lease on a container for some duration. Before the lease expires, it can 
relinquish the lease or request that it be renewed. While this adds some 
complexity in the AM implementation- it needs to track and renew its container 
leases- it's mostly library code that admits straightforward, naive 
implementations. The most obvious strawman would request all resources at the 
longest possible duration and always renew.
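
As a sketch of that strawman (every type and method below is hypothetical; no 
such lease API exists in YARN today):
{code}
// Hypothetical lease API, for illustration only.
interface ContainerLease { long remainingMillis(); }
interface LeaseTracker {
  Iterable<ContainerLease> activeLeases();
  long maxDurationMillis();
  void renew(ContainerLease lease, long durationMillis);
}

// The naive strawman policy: always renew every lease, at the longest
// duration the RM offers.
class AlwaysRenew {
  void renewAll(LeaseTracker tracker, long renewMarginMs) {
    for (ContainerLease lease : tracker.activeLeases()) {
      if (lease.remainingMillis() < renewMarginMs) {
        tracker.renew(lease, tracker.maxDurationMillis());
      }
    }
  }
}
{code}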

Mapping an enumeration expressing an AM lifecycle into a policy for requesting, 
refreshing, and managing resources is an excellent client-side abstraction. 
Even if an implementation of YARN only receives (and only issues) leases from a 
fixed set of values, the underlying abstraction can admit arbitrary durations. 
An enumeration is a good API for applications, but the RM framework could 
have a more fine-grained substrate.

Leases actually help services run under YARN. By way of example, refusing to 
renew a lease could signal that the node will be decommissioned, or that some 
cluster-wide invariant- like balanced utilization or fairness- is better met by 
(re)moving that container. Refusing to renew a lease- or renewing it for a 
shorter period- could signal the service to request new containers.

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3100) Make YARN authorization pluggable

2015-02-10 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313843#comment-14313843
 ] 

Chris Douglas commented on YARN-3100:
-

Sorry, I didn't get to the patch over the weekend. Thanks for addressing the 
review feedback.

Are there JIRAs following some of the types to be added to PrivilegedEntity? 
Just curious.

 Make YARN authorization pluggable
 -

 Key: YARN-3100
 URL: https://issues.apache.org/jira/browse/YARN-3100
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jian He
Assignee: Jian He
 Fix For: 2.7.0

 Attachments: YARN-3100.1.patch, YARN-3100.2.patch, YARN-3100.2.patch


 The goal is to have YARN acl model pluggable so as to integrate other 
 authorization tool such as Apache Ranger, Sentry.
 Currently, we have 
 - admin ACL
 - queue ACL
 - application ACL
 - time line domain ACL
 - service ACL
 The proposal is to create a YarnAuthorizationProvider interface. Current 
 implementation will be the default implementation. Ranger or Sentry plug-in 
 can implement  this interface.
 Benefit:
 -  Unify the code base. With the default implementation, we can get rid of 
 each specific ACL manager such as AdminAclManager, ApplicationACLsManager, 
 QueueAclsManager etc.
 - Enable Ranger, Sentry to do authorization for YARN. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN

2015-02-10 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313831#comment-14313831
 ] 

Chris Douglas commented on YARN-1983:
-

bq. We still need a way to demux the executor to support the case of YARN 
cluster with a mix of executors. That'd mean some impact on the CLC, no?

Policies that select the appropriate executor could demux on the contents of 
the CLC and not a dedicated field. A simple, static dispatch from an 
admin-configured list is a great place to start, but adding a string to the CLC 
that selects the executor class by name is difficult to evolve. Since the same 
semantics are available without changes to the platform, why bake these in?

bq. I think my current patch is intrusive indeed but more general, right?

I'm not sure I follow. How is it more general?

 Support heterogeneous container types at runtime on YARN
 

 Key: YARN-1983
 URL: https://issues.apache.org/jira/browse/YARN-1983
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junping Du
 Attachments: YARN-1983.2.patch, YARN-1983.patch


 Different container types (default, LXC, docker, VM box, etc.) have different 
 semantics on isolation of security, namespace/env, performance, etc.
 Per discussions in YARN-1964, we have some good thoughts on supporting 
 different types of containers running on YARN and specified by application at 
 runtime which largely enhance YARN's flexibility to meet heterogenous app's 
 requirement on isolation at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3192) Empty handler for exception: java.lang.InterruptedException #WebAppProxy.java and #/ResourceManager.java

2015-02-13 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320645#comment-14320645
 ] 

Chris Douglas commented on YARN-3192:
-

bq. w.r.t the WebAppProxy path; we could change the join() method to simply 
pass up the exception; the sole place it is used is WebAppProxyServer.main, 
which catches all throwables and exits with a (-1)

AFAICT, there is no graceful shutdown for {{WebAppProxyServer}}; the intent is 
to exit on interrupt. This would print the error message "Error starting Proxy 
server" when the proxy is shut down, instead of silently exiting.

Though catching the {{InterruptedException}} in {{WebAppProxyServer}} is 
arguably more correct, so throwing out of {{WebAppProxy::join()}} could be a 
useful change if there are ever other users of {{WebAppProxy}}. That said, I'm 
still not clear what this would achieve.
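
Concretely, the change discussed above would be something like (sketch):
{code}
  public void join() throws InterruptedException {
    if (proxyServer != null) {
      // Propagate the interrupt; the caller (today, WebAppProxyServer.main's
      // catch-all) decides whether to log and exit.
      proxyServer.join();
    }
  }
{code}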

 Empty handler for exception: java.lang.InterruptedException #WebAppProxy.java 
 and #/ResourceManager.java
 

 Key: YARN-3192
 URL: https://issues.apache.org/jira/browse/YARN-3192
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Attachments: YARN-3192.patch


 The InterruptedException is completely ignored. As a result, any events 
 causing this interrupt will be lost.
  File: org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
 {code}
try {
 event = eventQueue.take();
   } catch (InterruptedException e) {
 LOG.error("Returning, interrupted : " + e);
 return; // TODO: Kill RM.
   }
 {code}
 File: org/apache/hadoop/yarn/server/webproxy/WebAppProxy.java
 {code}
 public void join() {
 if(proxyServer != null) {
   try {
 proxyServer.join();
   } catch (InterruptedException e) {
   }
 }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3369) Missing NullPointer check in AppSchedulingInfo causes RM to die

2015-03-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-3369:

Description: 
In AppSchedulingInfo.java the method checkForDeactivation() has these 2 
consecutive lines:
{code}
ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
if (request.getNumContainers() > 0) {
{code}
the first line calls getResourceRequest and it can return null.
{code}
synchronized public ResourceRequest getResourceRequest(
    Priority priority, String resourceName) {
  Map<String, ResourceRequest> nodeRequests = requests.get(priority);
  return (nodeRequests == null) ? null : nodeRequests.get(resourceName);
}
{code}
The second line dereferences the pointer directly without a check.
If the pointer is null, the RM dies. 
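
(A minimal guard would look like the following; this is a sketch of the issue, 
not necessarily the committed fix.)
{code}
ResourceRequest request = getResourceRequest(priority, ResourceRequest.ANY);
if (request != null && request.getNumContainers() > 0) {
  // existing deactivation logic
}
{code}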

{quote}2015-03-17 14:14:04,757 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1384)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1263)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:816)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:588)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:449)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1017)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1059)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:114)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:739)
at java.lang.Thread.run(Thread.java:722)
{color:red} *2015-03-17 14:14:04,758 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, 
bbye..*{color} {quote}

  was:
In AppSchedulingInfo.java the method checkForDeactivation() has these 2 
consecutive lines:
{quote} 
{color:red}  ResourceRequest request = getResourceRequest(priority, 
ResourceRequest.ANY);
  if (request.getNumContainers() > 0) {
{color}
{quote}
the first line calls getResourceRequest and it can return null.
{quote}
synchronized public ResourceRequest getResourceRequest(
Priority priority, String resourceName) {
Map<String, ResourceRequest> nodeRequests = requests.get(priority);
{color:red} *return* {color}  (nodeRequests == null) ? {color:red} *null* 
{color} : nodeRequests.get(resourceName);
}
{quote}
The second line dereferences the pointer directly without a check.
If the pointer is null, the RM dies. 

{quote}2015-03-17 14:14:04,757 FATAL 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.checkForDeactivation(AppSchedulingInfo.java:383)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.decrementOutstanding(AppSchedulingInfo.java:375)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateOffSwitch(AppSchedulingInfo.java:360)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:270)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.allocate(FiCaSchedulerApp.java:142)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1559)
at 

[jira] [Commented] (YARN-3338) Exclude jline dependency from YARN

2015-03-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14358963#comment-14358963
 ] 

Chris Douglas commented on YARN-3338:
-

+1 lgtm

 Exclude jline dependency from YARN
 --

 Key: YARN-3338
 URL: https://issues.apache.org/jira/browse/YARN-3338
 Project: Hadoop YARN
  Issue Type: Bug
  Components: build
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Blocker
 Attachments: YARN-3338.1.patch


 It was fixed in YARN-2815, but is broken again by YARN-1514.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-01-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298135#comment-14298135
 ] 

Chris Douglas commented on YARN-1039:
-

bq. That's not necessarily so, there are some cases where the type of life 
cycle for an application is important, for example, when determining whether or 
not it is open-ended (service) or a batch process which entails a notion of 
progress (session), at least for purposes of display.

That's a fair distinction. Would you agree the YARN _scheduler_ should not use 
detailed information about progress, task dependencies, or service lifecycles? 
If an AM registers with a tag that affects the attributes displayed in 
dashboards, then issues like YARN-1079 can be resolved cleanly, as you and 
Zhijie propose.

Steve has a point about mixed-mode AMs that run both long and short-lived 
containers (e.g., a long-lived service supporting a workflow composed of short 
tasks). If it's solely for display, then an enum seems adequate, but I'd like 
to better understand the use cases.

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2015-05-18 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549505#comment-14549505
 ] 

Chris Douglas commented on YARN-1039:
-

The semantics of a boolean flag are opaque. The policies enforced by different 
RM configurations (and versions) will not be- and cannot be made to be- 
consistent. Application and container priority are already encoded (or in 
progress, YARN-1963), so it's not just preemption priority or cost. Affinity 
and anti-affinity are also covered by different features. Discussion has been 
wide-ranging because it is unclear what a long-lived flag guarantees across existing 
features (beyond removing the progress bar from the UI, which I hope we can 
stop mentioning).

An implementation that only recognizes infinite and undefined leases could be 
mapped into duration. Lease duration could also be used to communicate when 
security tokens cannot be renewed, short-lived guarantees for YARN-2877 
containers, boundaries of YARN-1051 reservations, and planned decommissioning. 
In contrast, the long-lived flag cannot be used for these cases. We could 
expose probabilistic guarantees (which are what we give in reality), but that's 
a later issue.
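
As a sketch of that mapping (the enum and the default constant are illustrative, 
not an existing YARN API):
{code}
enum Lifetime { LONG_LIVED, UNDEFINED }   // what a flag/enum API can express

static long toLeaseMillis(Lifetime t, long defaultLeaseMs) {
  // An RM that only understands these two values still fits a duration-based
  // API; richer durations can be added later without changing the protocol.
  return t == Lifetime.LONG_LIVED ? Long.MAX_VALUE : defaultLeaseMs;
}
{code}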

Considering the blockers more concretely:
bq. (a) reservations (b) white-listed requests or (c) node-label requests 
getting stuck on a node used by other services' containers that don't exit.

Aren't these handled by adding a timeout to allocations, which would also catch 
cases where this flag is _not_ set? The timeout value could be set across the 
scheduler to start, but could even be user-visible in later versions...

All said, I don't have time to work on this, agree the API can be evolved from 
the flag, and am -0 on it.

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3806) Proposal of Generic Scheduling Framework for YARN

2015-06-23 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598163#comment-14598163
 ] 

Chris Douglas commented on YARN-3806:
-

[~wshao] Please don't delete obsoleted versions of the design doc, as it 
orphans discussion about them. Also, as you're making updates, please note the 
changes so people don't have to diff the docs.

 Proposal of Generic Scheduling Framework for YARN
 -

 Key: YARN-3806
 URL: https://issues.apache.org/jira/browse/YARN-3806
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Wei Shao
 Attachments: ProposalOfGenericSchedulingFrameworkForYARN-V1.05.pdf, 
 ProposalOfGenericSchedulingFrameworkForYARN-V1.06.pdf


 Currently, a typical YARN cluster runs many different kinds of applications: 
 production applications, ad hoc user applications, long running services and 
 so on. Different YARN scheduling policies may be suitable for different 
 applications. For example, capacity scheduling can manage production 
 applications well since application can get guaranteed resource share, fair 
 scheduling can manage ad hoc user applications well since it can enforce 
 fairness among users. However, current YARN scheduling framework doesn’t have 
 a mechanism for multiple scheduling policies work hierarchically in one 
 cluster.
 YARN-3306 talked about many issues of today’s YARN scheduling framework, and 
 proposed a per-queue policy driven framework. In detail, it supported 
 different scheduling policies for leaf queues. However, support of different 
 scheduling policies for upper level queues is not seriously considered yet. 
 A generic scheduling framework is proposed here to address these limitations. 
 It supports different policies (fair, capacity, fifo and so on) for any queue 
 consistently. The proposal tries to solve many other issues in current YARN 
 scheduling framework as well.
 Two new proposed scheduling policies YARN-3807  YARN-3808 are based on 
 generic scheduling framework brought up in this proposal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit

2015-06-16 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14588429#comment-14588429
 ] 

Chris Douglas commented on YARN-3119:
-

Systems that embrace more forgiving resource enforcement are difficult to tune, 
particularly when jobs run in multiple environments with different 
constraints (as is common when moving from research/test to production). If 
jobs silently and implicitly use more resources than requested, then users only 
learn that their container is under-provisioned when the cluster workload 
shifts, and their pipelines start to fail.

I agree with [~aw]'s 
[feedback|https://issues.apache.org/jira/browse/YARN-3119?focusedCommentId=14303956page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14303956].
 If this workaround is committed, this should be disabled by default and 
strongly discouraged.

 Memory limit check need not be enforced unless aggregate usage of all 
 containers is near limit
 --

 Key: YARN-3119
 URL: https://issues.apache.org/jira/browse/YARN-3119
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3119.prelim.patch


 Today we kill any container preemptively even if the total usage of 
 containers for that is well within the limit for YARN. Instead if we enforce 
 memory limit only if the total limit of all containers is close to some 
 configurable ratio of overall memory assigned to containers, we can allow for 
 flexibility in container memory usage without adverse effects. This is 
 similar in principle to how cgroups uses soft_limit_in_bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN

2015-06-12 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584272#comment-14584272
 ] 

Chris Douglas commented on YARN-1983:
-

(sorry for the delayed reply; missed this)

bq. I was proposing we continue the same without adding a new CLC field. Are we 
both saying the same thing then?

Yeah, I think we agree. We don't need to extend the CLC definition for this use 
case, because it's less invasive to add a composite CE that can inspect the CLC 
and demux on a set of rules.

I scanned the patch on YARN-1964, and maybe I'm being dense but I couldn't find 
the demux. It does some validation using patterns...

 Support heterogeneous container types at runtime on YARN
 

 Key: YARN-1983
 URL: https://issues.apache.org/jira/browse/YARN-1983
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junping Du
 Attachments: YARN-1983.2.patch, YARN-1983.patch


 Different container types (default, LXC, docker, VM box, etc.) have different 
 semantics on isolation of security, namespace/env, performance, etc.
 Per discussions in YARN-1964, we have some good thoughts on supporting 
 different types of containers running on YARN and specified by application at 
 runtime which largely enhance YARN's flexibility to meet heterogenous app's 
 requirement on isolation at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3820) Collect disks usages on the node

2015-06-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605889#comment-14605889
 ] 

Chris Douglas commented on YARN-3820:
-

[~aw] Is there a corresponding part of the datanode already monitoring these 
resources? I looked, but found only the metrics. This JIRA and YARN-3819 only 
extend the monitoring. As Karthik pointed out in YARN-2745, refactoring for 
more unified resource monitoring is in YARN-3332.

On the patch: looks good, though why does the disk need a {{forcedRead}} 
parameter?

 Collect disks usages on the node
 

 Key: YARN-3820
 URL: https://issues.apache.org/jira/browse/YARN-3820
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Robert Grandl
Assignee: Robert Grandl
  Labels: yarn-common, yarn-util
 Attachments: YARN-3820-1.patch, YARN-3820-2.patch, YARN-3820-3.patch, 
 YARN-3820-4.patch


 In this JIRA we propose to collect disks usages on a node. This JIRA is part 
 of a larger effort of monitoring resource usages on the nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3820) Collect disks usages on the node

2015-06-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605891#comment-14605891
 ] 

Chris Douglas commented on YARN-3820:
-

Oh, didn't see YARN-3819. Will continue there.

 Collect disks usages on the node
 

 Key: YARN-3820
 URL: https://issues.apache.org/jira/browse/YARN-3820
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Robert Grandl
Assignee: Robert Grandl
  Labels: yarn-common, yarn-util
 Attachments: YARN-3820-1.patch, YARN-3820-2.patch, YARN-3820-3.patch, 
 YARN-3820-4.patch


 In this JIRA we propose to collect disks usages on a node. This JIRA is part 
 of a larger effort of monitoring resource usages on the nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3819) Collect network usage on the node

2015-06-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606014#comment-14606014
 ] 

Chris Douglas commented on YARN-3819:
-

bq. I think that if we decide to move this to Common, we should move the whole 
ResourceCalculator; otherwise, just finish this one here. I'm willing to start 
the JIRA in Common (or reuse if anybody knows about a JIRA already pushing for 
that) to have the whole ResourceCalculator there.

+1 Let's just do this and move on.

 Collect network usage on the node
 -

 Key: YARN-3819
 URL: https://issues.apache.org/jira/browse/YARN-3819
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Robert Grandl
Assignee: Robert Grandl
  Labels: yarn-common, yarn-util
 Attachments: YARN-3819-1.patch, YARN-3819-2.patch, YARN-3819-3.patch, 
 YARN-3819-4.patch, YARN-3819-5.patch


 In this JIRA we propose to collect the network usage on a node. This JIRA is 
 part of a larger effort of monitoring resource usages on the nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3820) Collect disks usages on the node

2015-06-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606047#comment-14606047
 ] 

Chris Douglas commented on YARN-3820:
-

I understand its function; I'm curious why it was added (CPU doesn't include 
this). Did you notice an overhead?

 Collect disks usages on the node
 

 Key: YARN-3820
 URL: https://issues.apache.org/jira/browse/YARN-3820
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 3.0.0
Reporter: Robert Grandl
Assignee: Robert Grandl
  Labels: yarn-common, yarn-util
 Attachments: YARN-3820-1.patch, YARN-3820-2.patch, YARN-3820-3.patch, 
 YARN-3820-4.patch


 In this JIRA we propose to collect disks usages on a node. This JIRA is part 
 of a larger effort of monitoring resource usages on the nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3784) Indicate preemption timout along with the list of containers to AM (preemption message)

2015-06-29 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606205#comment-14606205
 ] 

Chris Douglas commented on YARN-3784:
-

Minor:
- Docs for timeout don't include units
- Many whitespace changes in {{FiCaSchedulerApp}}
- change nested if to {{&&}} at:
{noformat}
+if (this.preemptionTimeout != 0) {
+  if (timeout  this.preemptionTimeout) {
{noformat}
- Would it be possible to test more than the timeout reported is non-zero? If 
this used a {{Clock}} instead of calling {{System.currentTimeMillis}} directly, 
the unit test could be easier to write...

If containers are preempted for multiple causes (e.g., over-capacity, NM 
decommission), then the time to preempt could vary widely. The ProportionalCPP 
also limits the preempted capacity per round, so a global timeout will be very 
pessimistic. Would it make sense to change {{timeout}} to be {{nextkill}}? More 
general solutions would be significantly more work...
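
On the {{Clock}} point above, the intent is roughly the following (sketch; the 
class and field names are illustrative):
{code}
import org.apache.hadoop.yarn.util.Clock;

// Taking the clock as a dependency lets a test substitute a controllable
// implementation and assert on exact timeout values.
class PreemptionTimer {
  private final Clock clock;

  PreemptionTimer(Clock clock) {
    this.clock = clock;
  }

  long millisUntil(long deadlineMs) {
    // No direct call to System.currentTimeMillis().
    return deadlineMs - clock.getTime();
  }
}
{code}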

 Indicate preemption timout along with the list of containers to AM 
 (preemption message)
 ---

 Key: YARN-3784
 URL: https://issues.apache.org/jira/browse/YARN-3784
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-3784.patch


 Currently during preemption, AM is notified with a list of containers which 
 are marked for preemption. Introducing a timeout duration also along with 
 this container list so that AM can know how much time it will get to do a 
 graceful shutdown to its containers (assuming one of preemption policy is 
 loaded in AM).
 This will help in decommissioning NM scenarios, where NM will be 
 decommissioned after a timeout (also killing containers on it). This timeout 
 will be helpful to indicate AM that those containers can be killed by RM 
 forcefully after the timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3877) YarnClientImpl.submitApplication swallows exceptions

2015-07-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627026#comment-14627026
 ] 

Chris Douglas commented on YARN-3877:
-

* Not sure I understand this change:
{noformat}
+conf.setLong(YarnConfiguration.
+YARN_CLIENT_APPLICATION_CLIENT_PROTOCOL_POLL_TIMEOUT_MS, 2000);
{noformat}
It seems like it would introduce timing bugs rather than prevent them. The 
{{\@Test}} timeout should prevent the test from hanging; if the poll timeout 
fires before the interrupt is triggered, then the unit test will fail. Does 
this config setting enforce a property that would be unverified without it?
* If necessary, then it should probably also be relative to {{pollIntervalMs}}
* This should probably be a separate test, instead of a subsection of 
{{testSubmitApplication}}

 YarnClientImpl.submitApplication swallows exceptions
 

 Key: YARN-3877
 URL: https://issues.apache.org/jira/browse/YARN-3877
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: client
Affects Versions: 2.7.2
Reporter: Steve Loughran
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3877.01.patch


 When {{YarnClientImpl.submitApplication}} spins waiting for the application 
 to be accepted, any interruption during its Sleep() calls are logged and 
 swallowed.
 this makes it hard to interrupt the thread during shutdown. Really it should 
 throw some form of exception and let the caller deal with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3612) Resource calculation in child tasks is CPU-heavy

2015-07-07 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617391#comment-14617391
 ] 

Chris Douglas commented on YARN-3612:
-

bq. Moreover, I have not added the config as I do not see anyone disabling it. 
Thoughts ?

The config change could be useful for MR jobs that want to avoid the CPU 
overhead. Suggest changing the calls from {{currentTimeMillis}} to {{nanoTime}}, 
since the code is measuring durations.
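
For example (sketch; {{doWork()}} is a placeholder for the measured section):
{code}
long start = System.nanoTime();
doWork();
long elapsedMs = (System.nanoTime() - start) / 1_000_000L;  // nanos -> millis
{code}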

+1 overall, even without the change to {{Task}}.

 Resource calculation in child tasks is CPU-heavy
 

 Key: YARN-3612
 URL: https://issues.apache.org/jira/browse/YARN-3612
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Todd Lipcon
  Labels: BB2015-05-RFC, performance
 Attachments: MAPREDUCE-4469.patch, MAPREDUCE-4469_rev2.patch, 
 MAPREDUCE-4469_rev3.patch, MAPREDUCE-4469_rev4.patch, 
 MAPREDUCE-4469_rev5.patch, YARN-3612.01.patch, YARN-3612.02.patch


 In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
 each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
 that it's spending a lot of time looping through all the files in /proc to 
 calculate resource usage.
 As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
 within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
 runtime by about 10% (map slot-seconds by 14%, reduce slot seconds by 8%)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-666) [Umbrella] Support rolling upgrades in YARN

2015-09-09 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated YARN-666:
---
Assignee: (was: Brook Zhou)

> [Umbrella] Support rolling upgrades in YARN
> ---
>
> Key: YARN-666
> URL: https://issues.apache.org/jira/browse/YARN-666
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.0.4-alpha
>Reporter: Siddharth Seth
> Fix For: 2.6.0
>
> Attachments: YARN_Rolling_Upgrades.pdf, YARN_Rolling_Upgrades_v2.pdf
>
>
> Jira to track changes required in YARN to allow rolling upgrades, including 
> documentation and possible upgrade routes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

