[jira] [Commented] (YARN-10633) setup yarn federation failed

2021-02-23 Thread Subramaniam Krishnan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289299#comment-17289299
 ] 

Subramaniam Krishnan commented on YARN-10633:
-

[~hanfrank], multiple other configs are required to enable federation as 
well. Please follow the detailed steps in the Configuration section under: 
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/Federation.html

> setup yarn federation failed
> 
>
> Key: YARN-10633
> URL: https://issues.apache.org/jira/browse/YARN-10633
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 3.2.2
>Reporter: yuguang
>Priority: Major
>
> Hi,
> I am trying to set up YARN federation mode. After I add the configuration 
> below to etc/hadoop/yarn-site.xml:
> <property>
>   <name>yarn.federation.enabled</name>
>   <value>true</value>
> </property>
> running yarn node -list fails with the error below. The historyserver 
> service cannot be started either.
> I am using hadoop-3.2.2.
> [root@yarna hadoop-3.2.2]# yarn node -list
> 2021-02-18 05:51:39,178 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state STARTED
> java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
>     at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175)
>     at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103)
>     at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
>     at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>     at org.apache.hadoop.yarn.client.cli.YarnCLI.createAndStartYarnClient(YarnCLI.java:55)
>     at org.apache.hadoop.yarn.client.cli.NodeCLI.run(NodeCLI.java:110)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hadoop.yarn.client.cli.NodeCLI.main(NodeCLI.java:62)
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
>     at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175)
>     at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130)
>     at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103)
>     at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
>     at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>     at org.apache.hadoop.yarn.client.cli.YarnCLI.createAndStartYarnClient(YarnCLI.java:55)
>     at org.apache.hadoop.yarn.client.cli.NodeCLI.run(NodeCLI.java:110)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hadoop.yarn.client.cli.NodeCLI.main(NodeCLI.java:62)
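
The exception in the quoted trace is Java's standard failure for reading element 0 of an empty array. The sketch below only illustrates that failure mode. It assumes, which this thread does not confirm, that the provider reads the first entry of a list of configured ResourceManager ids and finds none. The class and method names are hypothetical and are not taken from the Hadoop source.

```java
// Hypothetical sketch; not the ConfiguredRMFailoverProxyProvider source.
final class FirstRmIdSketch {

    // Unchecked variant: indexing an empty array throws
    // ArrayIndexOutOfBoundsException, as seen in the quoted trace.
    static String firstRmIdUnchecked(String[] rmIds) {
        return rmIds[0];
    }

    // Checked variant: fails with a descriptive message when no ids are
    // configured, which points the user at the missing configuration.
    static String firstRmId(String[] rmIds) {
        if (rmIds == null || rmIds.length == 0) {
            throw new IllegalStateException("no ResourceManager ids configured");
        }
        return rmIds[0];
    }
}
```

A check of this kind would not fix the setup by itself; the remaining federation properties from the Configuration section linked above would still be needed.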



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10125) In Federation, kill application from client does not kill Unmanaged AM's and containers launched by Unmanaged AM

2021-02-23 Thread Subramaniam Krishnan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289294#comment-17289294
 ] 

Subramaniam Krishnan commented on YARN-10125:
-

Thanks [~brahmareddy] for looping me in; I agree it should be handled. I recall 
testing this in the initial version years back, and I am not sure how it 
got dropped in the interim.

Thanks [~dmmkr] for working on this!

> In Federation, kill application from client does not kill Unmanaged AM's and 
> containers launched by Unmanaged AM
> 
>
> Key: YARN-10125
> URL: https://issues.apache.org/jira/browse/YARN-10125
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client, federation, router
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10125.001.patch
>
>
> In Federation, killing an application from the client using "bin/yarn application 
> -kill " kills only the containers of the home subcluster. 
> The Unmanaged AM and the containers launched in other subclusters are not 
> killed, which blocks resources.
> Those containers are killed only after their tasks complete, and the Unmanaged AM 
> is killed 10 minutes after the application is killed, which also kills any 
> remaining running containers in that subcluster.






[jira] [Commented] (YARN-1187) Add discrete event-based simulation to yarn scheduler simulator

2021-01-11 Thread Subramaniam Krishnan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262891#comment-17262891
 ] 

Subramaniam Krishnan commented on YARN-1187:


Thanks [~ywskycn], I have assigned it to [~afchung90].

> Add discrete event-based simulation to yarn scheduler simulator
> ---
>
> Key: YARN-1187
> URL: https://issues.apache.org/jira/browse/YARN-1187
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Andrew Chung
>Priority: Major
>
> Follow the discussion in YARN-1021.
> Discrete event simulation decouples the running from any real-world clock. 
> This allows users to step through the execution, set debug points, and 
> definitely get a deterministic rexec. 






[jira] [Assigned] (YARN-1187) Add discrete event-based simulation to yarn scheduler simulator

2021-01-11 Thread Subramaniam Krishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan reassigned YARN-1187:
--

Assignee: Andrew Chung  (was: Wei Yan)

> Add discrete event-based simulation to yarn scheduler simulator
> ---
>
> Key: YARN-1187
> URL: https://issues.apache.org/jira/browse/YARN-1187
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Wei Yan
>Assignee: Andrew Chung
>Priority: Major
>
> Follow the discussion in YARN-1021.
> Discrete event simulation decouples the running from any real-world clock. 
> This allows users to step through the execution, set debug points, and 
> definitely get a deterministic rexec. 






[jira] [Updated] (YARN-2475) ReservationSystem: replan upon capacity reduction

2014-09-12 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2475:
---
Attachment: YARN-2475.patch

Thanks [~chris.douglas] for your diligent review. I am attaching a patch that 
has the minor tweaks you suggested.

 ReservationSystem: replan upon capacity reduction
 -

 Key: YARN-2475
 URL: https://issues.apache.org/jira/browse/YARN-2475
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-2475.patch, YARN-2475.patch, YARN-2475.patch


 In the context of YARN-1051, if capacity of the cluster drops significantly 
 upon machine failures we need to trigger a reorganization of the planned 
 reservations. As reservations are absolute it is possible that they will 
 not all fit, and some need to be rejected a-posteriori.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-1709) Admission Control: Reservation subsystem

2014-09-12 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1709:
---
Attachment: YARN-1709.patch

Thanks [~chris.douglas] for your diligent review. I am attaching a patch that 
has the minor tweaks you suggested.

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch, 
 YARN-1709.patch, YARN-1709.patch, YARN-1709.patch, YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.





[jira] [Commented] (YARN-2475) ReservationSystem: replan upon capacity reduction

2014-09-12 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132158#comment-14132158
 ] 

Subramaniam Krishnan commented on YARN-2475:


bq. Just a minor clarification: as this iterates over each instant of the plan, 
are others allowed to modify it?

I prefer the current approach to global locking of the plan, as users can still 
submit requests for new reservations (or modify existing ones) for the future. 
Additional requests within the replanning window will be rejected at the 
validation stage itself, even before they reach the plan, because execution of 
the replanner implies there is no spare capacity. Does that make sense?

 ReservationSystem: replan upon capacity reduction
 -

 Key: YARN-2475
 URL: https://issues.apache.org/jira/browse/YARN-2475
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-2475.patch, YARN-2475.patch, YARN-2475.patch


 In the context of YARN-1051, if capacity of the cluster drops significantly 
 upon machine failures we need to trigger a reorganization of the planned 
 reservations. As reservations are absolute it is possible that they will 
 not all fit, and some need to be rejected a-posteriori.  





[jira] [Updated] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1708:
---
Attachment: YARN-1708.patch

Rebased patch after sync-ing branch yarn-1051 with trunk

 Add a public API to reserve resources (part of YARN-1051)
 -

 Key: YARN-1708
 URL: https://issues.apache.org/jira/browse/YARN-1708
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1708.patch, YARN-1708.patch, YARN-1708.patch, 
 YARN-1708.patch


 This JIRA tracks the definition of a new public API for YARN, which allows 
 users to reserve resources (think of time-bounded queues). This is part of 
 the admission control enhancement proposed in YARN-1051.





[jira] [Updated] (YARN-1707) Making the CapacityScheduler more dynamic

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1707:
---
Attachment: YARN-1707.10.patch

Rebased patch after sync-ing branch yarn-1051 with trunk

 Making the CapacityScheduler more dynamic
 -

 Key: YARN-1707
 URL: https://issues.apache.org/jira/browse/YARN-1707
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler
 Attachments: YARN-1707.10.patch, YARN-1707.2.patch, 
 YARN-1707.3.patch, YARN-1707.4.patch, YARN-1707.5.patch, YARN-1707.6.patch, 
 YARN-1707.7.patch, YARN-1707.8.patch, YARN-1707.9.patch, YARN-1707.patch


 The CapacityScheduler is rather static at the moment, and refreshqueue 
 provides a rather heavy-handed way to reconfigure it. Moving towards 
 long-running services (tracked in YARN-896) and to enable more advanced 
 admission control and resource parcelling we need to make the 
 CapacityScheduler more dynamic. This is instrumental to the umbrella jira 
 YARN-1051.
 Concretely this requires the following changes:
 * create queues dynamically
 * destroy queues dynamically
 * dynamically change queue parameters (e.g., capacity) 
 * modify refreshqueue validation to enforce sum(child.getCapacity()) <= 100% 
 instead of == 100%
 We limit this to LeafQueues. 
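
The refreshqueue validation change listed above (child capacities summing to at most 100% rather than exactly 100%) can be sketched as follows. This is a minimal illustration and not the CapacityScheduler code: the class QueueCapacityCheck, its method names, and the rounding tolerance are all hypothetical.

```java
import java.util.List;

// Hypothetical helper, not part of Hadoop: contrasts the strict and relaxed checks.
final class QueueCapacityCheck {
    // Tolerance for floating-point rounding; the value is an assumption.
    private static final float EPSILON = 1e-4f;

    // Strict rule: child capacities must add up to exactly 100%.
    static boolean sumsToExactly100(List<Float> childCapacities) {
        return Math.abs(sum(childCapacities) - 100f) <= EPSILON;
    }

    // Relaxed rule: child capacities may leave part of the parent unallocated,
    // which is what allows queues to be added or resized later.
    static boolean sumsToAtMost100(List<Float> childCapacities) {
        return sum(childCapacities) <= 100f + EPSILON;
    }

    private static float sum(List<Float> values) {
        float total = 0f;
        for (float v : values) {
            total += v;
        }
        return total;
    }
}
```

With children at 30% and 50%, the strict check rejects the configuration while the relaxed check accepts it, leaving 20% free for queues created dynamically.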





[jira] [Updated] (YARN-1709) Admission Control: Reservation subsystem

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1709:
---
Attachment: YARN-1709.patch

Rebased patch after sync-ing branch yarn-1051 with trunk

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch, 
 YARN-1709.patch, YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.





[jira] [Updated] (YARN-1711) CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1711:
---
Attachment: YARN-1711.3.patch

Rebased patch after sync-ing branch yarn-1051 with trunk

 CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709
 --

 Key: YARN-1711
 URL: https://issues.apache.org/jira/browse/YARN-1711
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations
 Attachments: YARN-1711.1.patch, YARN-1711.2.patch, YARN-1711.3.patch, 
 YARN-1711.patch


 This JIRA tracks the development of a policy that enforces user quotas (a 
 time-extension of the notion of capacity) in the inventory subsystem 
 discussed in YARN-1709.





[jira] [Updated] (YARN-2080) Admission Control: Integrate Reservation subsystem with ResourceManager

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2080:
---
Attachment: YARN-2080.patch

Rebased patch after sync-ing branch yarn-1051 with trunk

 Admission Control: Integrate Reservation subsystem with ResourceManager
 ---

 Key: YARN-2080
 URL: https://issues.apache.org/jira/browse/YARN-2080
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan
 Attachments: YARN-2080.patch, YARN-2080.patch, YARN-2080.patch, 
 YARN-2080.patch, YARN-2080.patch


 This JIRA tracks the integration of Reservation subsystem data structures 
 introduced in YARN-1709 with the YARN RM. This is essentially end2end wiring 
 of YARN-1051.





[jira] [Updated] (YARN-1712) Admission Control: plan follower

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1712:
---
Attachment: YARN-1712.4.patch

Thanks [~leftnoteasy] for reviewing the patch. I have wrapped the debug logs 
with _isDebugEnabled()_. This patch is also rebased post sync-ing of branch 
yarn-1051 with trunk
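
For readers unfamiliar with the idiom, wrapping debug logs in an enabled-check looks roughly like the sketch below. It uses java.util.logging as a stand-in so that it is self-contained; the patch itself uses _isDebugEnabled()_, and the class GuardedDebugLogging and its methods are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustration only: java.util.logging stands in for the logging API used in the patch.
final class GuardedDebugLogging {
    private static final Logger LOG =
        Logger.getLogger(GuardedDebugLogging.class.getName());

    // Counts how often the costly message is built, to make the effect visible.
    static int expensiveCalls = 0;

    // Hypothetical stand-in for a costly description of the plan state.
    static String describePlan() {
        expensiveCalls++;
        return "plan state";
    }

    static void logPlan() {
        // The guard skips building the message when debug output is disabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Current plan: " + describePlan());
        }
    }
}
```

The point of the guard is that the string concatenation and any expensive calls inside it are skipped when debug logging is off, which matters in a thread that runs continuously.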

 Admission Control: plan follower
 

 Key: YARN-1712
 URL: https://issues.apache.org/jira/browse/YARN-1712
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations, scheduler
 Attachments: YARN-1712.1.patch, YARN-1712.2.patch, YARN-1712.3.patch, 
 YARN-1712.4.patch, YARN-1712.patch


 This JIRA tracks a thread that continuously propagates the current state of 
 an inventory subsystem to the scheduler. As the inventory subsystem store the 
 plan of how the resources should be subdivided, the work we propose in this 
 JIRA realizes such plan by dynamically instructing the CapacityScheduler to 
 add/remove/resize queues to follow the plan.





[jira] [Updated] (YARN-2475) ReservationSystem: replan upon capacity reduction

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2475:
---
Attachment: YARN-2475.patch

Thanks [~chris.douglas] for reviewing the patch. I am uploading a patch that 
addresses all your comments (skipping relisting them).

bq. Why is the enforcement window tied to CapacitySchedulerConfiguration?

The replanner can be configured per plan which in turn translates to a leaf 
queue in capacity scheduler configuration. Consequently the enforcement window 
is configured for the replanner via the capacity scheduler leaf queue 
configuration.

 ReservationSystem: replan upon capacity reduction
 -

 Key: YARN-2475
 URL: https://issues.apache.org/jira/browse/YARN-2475
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-2475.patch, YARN-2475.patch


 In the context of YARN-1051, if capacity of the cluster drops significantly 
 upon machine failures we need to trigger a reorganization of the planned 
 reservations. As reservations are absolute it is possible that they will 
 not all fit, and some need to be rejected a-posteriori.  





[jira] [Updated] (YARN-1709) Admission Control: Reservation subsystem

2014-09-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1709:
---
Attachment: YARN-1709.patch

The patch has only one modification: an additional constructor in InMemoryPlan 
that takes a clock, which can be used to pass in mock clocks in test cases 
as [suggested | 
https://issues.apache.org/jira/browse/YARN-2475?focusedCommentId=14129041] by 
[~chris.douglas].
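
The general shape of a clock-accepting constructor is sketched below. All names here (SketchClock, SketchSystemClock, PlanSketch, isExpired) are hypothetical and chosen for illustration; this is not the InMemoryPlan source.

```java
// Hypothetical names throughout; this is not the InMemoryPlan source.
interface SketchClock {
    long getTime(); // current time in milliseconds
}

final class SketchSystemClock implements SketchClock {
    @Override
    public long getTime() {
        return System.currentTimeMillis();
    }
}

final class PlanSketch {
    private final SketchClock clock;

    // Default constructor keeps the production behaviour: read the real clock.
    PlanSketch() {
        this(new SketchSystemClock());
    }

    // Additional constructor: tests can pass a clock they control.
    PlanSketch(SketchClock clock) {
        this.clock = clock;
    }

    // True if an interval ending at endTime is already over.
    boolean isExpired(long endTime) {
        return clock.getTime() >= endTime;
    }
}
```

A test can then construct the plan with a fixed clock, for example `new PlanSketch(() -> 1000L)`, and assert on time-dependent behaviour without sleeping.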

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch, 
 YARN-1709.patch, YARN-1709.patch, YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.





[jira] [Updated] (YARN-2080) Admission Control: Integrate Reservation subsystem with ResourceManager

2014-09-09 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2080:
---
Attachment: YARN-2080.patch

Thanks [~vinodkv] for reviewing the patch. I am uploading a new patch that 
includes your feedback:
  * Renamed all Yarn config variables as you suggested. I prefer using the 
standalone configs as it gives us more flexibility.
  * Removed duplicate logging in _ClientRMService_ and 
_ReservationInputValidator_. Consistently uses RMAuditLogger throughout.
  * Fixes in AbstractReservationSystem as you suggested.
  * Updated stale references to queues in Javadocs of 
_YarnClient.submitReservation()_
  * _TestYarnClient_ and _TestClientRMService_ use newInstance instead of PBImpls
  * Renamed _ReservationRequest.setLeaseDuration()_ to be simply 
_setDuration()_
  * Moved _CapacitySchedulerConfiguration_ to YARN-1711

bq. ReservationInputValidator: Deleting a request shouldn't need 
validateReservationUpdateRequest->validateReservationDefinition. We only need 
the ID validation

That's exactly what's being done. ReservationDefinitions are validated only for 
submission/update.

bq. checkReservationACLs: Today anyone who can submit applications can also 
submit reservations. We may want to separate them, if you agree, I'll file a 
ticket for future separation of these ACLs.

I agree. I have a set of follow up enhancement JIRAs to YARN-1051 in mind one 
of which was exactly to consider separation of ACLs as you pointed out.

 Admission Control: Integrate Reservation subsystem with ResourceManager
 ---

 Key: YARN-2080
 URL: https://issues.apache.org/jira/browse/YARN-2080
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan
 Attachments: YARN-2080.patch, YARN-2080.patch, YARN-2080.patch, 
 YARN-2080.patch


 This JIRA tracks the integration of Reservation subsystem data structures 
 introduced in YARN-1709 with the YARN RM. This is essentially end2end wiring 
 of YARN-1051.





[jira] [Updated] (YARN-1712) Admission Control: plan follower

2014-09-09 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1712:
---
Attachment: YARN-1712.3.patch

Thanks [~jianhe] for your detailed feedback. I am attaching a patch with the 
following updates:
  * Made the move-apps logic synchronous; the move is to defReservationQueue 
(renamed)
  * Removed the synchronized on scheduler as individual calls are already 
synchronized
  * Fixed comment formatting and variable names
  * Created a common method to calculate lhsRes and rhsRes
  * Optimized the loop as suggested

Some clarifications:
  * Exceptions are suppressed deliberately as PlanFollower is a background 
timer thread and we don't want it to exit
  * _plan.getReservationsAtTime(now)_ is used by others like Replanners. We 
need the reservations and not just the names even in PlanFollower so leaving it 
as is
  * Tried moving the default queue creation to when PlanQueue is initialized in 
CapacityScheduler but it was getting overly complex, mainly due to the relaxed 
constraint of child capacities <=100% for PlanQueues. This is just an 
additional hashmap lookup with the code being much cleaner, so not moving it for 
now. If it is still a concern, I can add a flag to Plan and check that instead 
of CapacityScheduler#getQueue
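
The first clarification above (suppressing exceptions so the background timer thread does not exit) can be sketched as follows. This is an illustration under assumptions, not the PlanFollower code: it uses a ScheduledExecutorService rather than whatever timer the patch uses, and the class ResilientPeriodicTask with its simulated failure is hypothetical. The concern is the same in both cases: an exception that escapes a periodic task stops further runs.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a periodic task that survives its own failures.
final class ResilientPeriodicTask {
    final AtomicInteger runs = new AtomicInteger();
    final AtomicInteger failures = new AtomicInteger();

    // Hypothetical unit of work; fails on every second run to exercise the catch.
    private void synchronizeOnce() {
        if (runs.incrementAndGet() % 2 == 0) {
            throw new IllegalStateException("simulated failure");
        }
    }

    // Wraps the work so that an exception never escapes to the scheduler.
    Runnable guarded() {
        return () -> {
            try {
                synchronizeOnce();
            } catch (RuntimeException e) {
                failures.incrementAndGet(); // a real follower would log this
            }
        };
    }

    // Schedules the guarded task; the caller is responsible for shutdown.
    static ScheduledExecutorService start(ResilientPeriodicTask task, long periodMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(task.guarded(), 0, periodMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

Without the try/catch, scheduleAtFixedRate would suppress all subsequent executions after the first exception, so one bad iteration would silently stop the follower.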

 Admission Control: plan follower
 

 Key: YARN-1712
 URL: https://issues.apache.org/jira/browse/YARN-1712
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations, scheduler
 Attachments: YARN-1712.1.patch, YARN-1712.2.patch, YARN-1712.3.patch, 
 YARN-1712.patch


 This JIRA tracks a thread that continuously propagates the current state of 
 an inventory subsystem to the scheduler. As the inventory subsystem store the 
 plan of how the resources should be subdivided, the work we propose in this 
 JIRA realizes such plan by dynamically instructing the CapacityScheduler to 
 add/remove/resize queues to follow the plan.





[jira] [Updated] (YARN-1711) CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709

2014-09-09 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1711:
---
Attachment: YARN-1712.3.patch

Updated patch to include *CapacitySchedulerConfiguration* based on 
[~vinodkv]'s [suggestion | 
https://issues.apache.org/jira/browse/YARN-2080?focusedCommentId=14125994] as 
the _majority_ of the configurations are for enforcement policies

 CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709
 --

 Key: YARN-1711
 URL: https://issues.apache.org/jira/browse/YARN-1711
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations
 Attachments: YARN-1711.1.patch, YARN-1711.patch, YARN-1712.3.patch


 This JIRA tracks the development of a policy that enforces user quotas (a 
 time-extension of the notion of capacity) in the inventory subsystem 
 discussed in YARN-1709.





[jira] [Updated] (YARN-1711) CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709

2014-09-09 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1711:
---
Attachment: (was: YARN-1712.3.patch)

 CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709
 --

 Key: YARN-1711
 URL: https://issues.apache.org/jira/browse/YARN-1711
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations
 Attachments: YARN-1711.1.patch, YARN-1711.2.patch, YARN-1711.patch


 This JIRA tracks the development of a policy that enforces user quotas (a 
 time-extension of the notion of capacity) in the inventory subsystem 
 discussed in YARN-1709.





[jira] [Updated] (YARN-1711) CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709

2014-09-09 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1711:
---
Attachment: YARN-1711.2.patch

 CapacityOverTimePolicy: a policy to enforce quotas over time for YARN-1709
 --

 Key: YARN-1711
 URL: https://issues.apache.org/jira/browse/YARN-1711
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations
 Attachments: YARN-1711.1.patch, YARN-1711.2.patch, YARN-1711.patch


 This JIRA tracks the development of a policy that enforces user quotas (a 
 time-extension of the notion of capacity) in the inventory subsystem 
 discussed in YARN-1709.





[jira] [Updated] (YARN-1709) Admission Control: Reservation subsystem

2014-09-08 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1709:
---
Attachment: YARN-1709.patch

Thanks [~chris.douglas] for your exhaustive review. I am uploading a patch that 
has the following fixes:
  * Cloned _ZERO_RESOURCE_, _minimumAllocation_ and _maximumAllocation_ to 
prevent leaking of mutable data
  * Removed MessageFormat. Strings still have to be concatenated in the few 
cases where they are both logged and included as part of the exception message
  * Fixed the code readability and lock scope in _addReservation()_
  * Added assertions for _isWriteLockedByCurrentThread()_ in private methods 
that assume locks
  * Removed redundant _this_ in get methods
  * toString uses StringBuilder instead of StringBuffer now
  * Fixed Javadoc - content (_getEarliestStartTime()_) and whitespaces
  * Made _ReservationInterval_ immutable, good catch
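
Two of the fixes listed above, the write-lock assertions in private methods and the immutable interval, follow a common pattern that can be sketched as below. The classes IntervalSketch and LockedPlanSketch are hypothetical and much simpler than the patch; they only show the shape of the idiom.

```java
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical immutable value type: final fields, no setters.
final class IntervalSketch {
    private final long start;
    private final long end;

    IntervalSketch(long start, long end) {
        this.start = start;
        this.end = end;
    }

    long getStart() { return start; }
    long getEnd() { return end; }
}

// Hypothetical plan holder showing the lock discipline.
final class LockedPlanSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final TreeMap<Long, IntervalSketch> byStart = new TreeMap<>();

    // Public entry point: takes the write lock, then delegates.
    void add(IntervalSketch interval) {
        lock.writeLock().lock();
        try {
            addLocked(interval);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Private helper that assumes the caller already holds the write lock.
    private void addLocked(IntervalSketch interval) {
        assert lock.isWriteLockedByCurrentThread() : "write lock required";
        byStart.put(interval.getStart(), interval);
    }

    int size() {
        lock.readLock().lock();
        try {
            return byStart.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The assertion is only evaluated when the JVM runs with assertions enabled (-ea), so it documents and checks the locking assumption in tests at no cost in production.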

The ReservationSystem uses UTCClock (added as part of YARN-1708) to enforce UTC 
times.  

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch, 
 YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.





[jira] [Updated] (YARN-1712) Admission Control: plan follower

2014-09-04 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1712:
---
Attachment: YARN-1712.2.patch

[~leftnoteasy], good to hear that you got the full context. Thanks for 
reviewing the patch. I am uploading a new patch that has the following changes:
   * Fix the Log message.
   * Replace stale references to sessions with reservations, good catch.

The currentReservations might have new reservations that start just now and so 
were not active before. These will not yet have corresponding reservation 
queues in CapacityScheduler as we create them after sorting. This is done to 
ensure what you highlighted earlier: we never exceed total capacity.

 Admission Control: plan follower
 

 Key: YARN-1712
 URL: https://issues.apache.org/jira/browse/YARN-1712
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations, scheduler
 Attachments: YARN-1712.1.patch, YARN-1712.2.patch, YARN-1712.patch


 This JIRA tracks a thread that continuously propagates the current state of 
 an inventory subsystem to the scheduler. As the inventory subsystem store the 
 plan of how the resources should be subdivided, the work we propose in this 
 JIRA realizes such plan by dynamically instructing the CapacityScheduler to 
 add/remove/resize queues to follow the plan.





[jira] [Updated] (YARN-1707) Making the CapacityScheduler more dynamic

2014-09-04 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1707:
---
Attachment: YARN-1707.9.patch

Uploading a new patch with a minor change. Renamed 
ReservationQueue#changeCapacity to ReservationQueue#setEntitlement for 
consistency.

 Making the CapacityScheduler more dynamic
 -

 Key: YARN-1707
 URL: https://issues.apache.org/jira/browse/YARN-1707
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler
 Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, 
 YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.8.patch, 
 YARN-1707.9.patch, YARN-1707.patch


 The CapacityScheduler is rather static at the moment, and refreshqueue 
 provides a rather heavy-handed way to reconfigure it. Moving towards 
 long-running services (tracked in YARN-896) and to enable more advanced 
 admission control and resource parcelling we need to make the 
 CapacityScheduler more dynamic. This is instrumental to the umbrella jira 
 YARN-1051.
 Concretely this requires the following changes:
 * create queues dynamically
 * destroy queues dynamically
 * dynamically change queue parameters (e.g., capacity)
 * modify refreshqueue validation to enforce sum(child.getCapacity()) <= 100% 
 instead of == 100%
 We limit this to LeafQueues. 
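
The relaxed validation in the last bullet can be illustrated with a minimal sketch. This is not the CapacityScheduler code: the class QueueCapacityCheck, its method names and the EPSILON tolerance are all hypothetical, and capacities are assumed to be percentages held as floats.

```java
import java.util.List;

/** Hypothetical helper, not part of CapacityScheduler: validates child queue capacities. */
final class QueueCapacityCheck {
    // Assumed tolerance for floating-point rounding when summing percentages.
    private static final float EPSILON = 1e-4f;

    /** Strict rule: child capacities must add up to exactly 100%. */
    static boolean isValidStrict(List<Float> childCapacities) {
        return Math.abs(sum(childCapacities) - 100f) <= EPSILON;
    }

    /** Relaxed rule: child capacities may add up to at most 100%. */
    static boolean isValidRelaxed(List<Float> childCapacities) {
        return sum(childCapacities) <= 100f + EPSILON;
    }

    private static float sum(List<Float> values) {
        float total = 0f;
        for (float v : values) {
            total += v;
        }
        return total;
    }
}
```

Under the relaxed rule a parent whose children currently sum to 80% is accepted, which leaves headroom for queues that are created later at runtime.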





[jira] [Updated] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)

2014-09-04 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1708:
---
Attachment: YARN-1708.patch

Thanks [~vinodkv] for reviewing the patch. I am uploading a new patch that has 
the following fixes based on your comments:
  * All the newInstance methods and setters in the Reservation*Response objects 
are now marked as private.
  * Replaced hashCode with an IDE-generated one in ReservationId.
  * Renamed ReservationRequests.{set|get}Type -> {set|get}Interpretor, also in 
ReservationRequestsProto.type.
  * Renamed ReservationRequest.leaseDuration to simply duration, to make it 
consistent with ReservationRequestProto.duration.

 Add a public API to reserve resources (part of YARN-1051)
 -

 Key: YARN-1708
 URL: https://issues.apache.org/jira/browse/YARN-1708
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1708.patch, YARN-1708.patch, YARN-1708.patch


 This JIRA tracks the definition of a new public API for YARN, which allows 
 users to reserve resources (think of time-bounded queues). This is part of 
 the admission control enhancement proposed in YARN-1051.





[jira] [Commented] (YARN-1707) Making the CapacityScheduler more dynamic

2014-09-04 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122234#comment-14122234
 ] 

Subramaniam Krishnan commented on YARN-1707:


Thanks [~jianhe] and [~leftnoteasy] for taking the time to do a thorough 
review. I am proxying for [~curino] also as he did most of the work for the 
patch. As discussed, we will commit this to the YARN-1051 branch once we have +1s 
for a few other sub-JIRAs.

 Making the CapacityScheduler more dynamic
 -

 Key: YARN-1707
 URL: https://issues.apache.org/jira/browse/YARN-1707
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler
 Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, 
 YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.8.patch, 
 YARN-1707.9.patch, YARN-1707.patch


 The CapacityScheduler is rather static at the moment, and refreshqueue 
 provides a rather heavy-handed way to reconfigure it. Moving towards 
 long-running services (tracked in YARN-896) and to enable more advanced 
 admission control and resource parcelling we need to make the 
 CapacityScheduler more dynamic. This is instrumental to the umbrella jira 
 YARN-1051.
 Concretely this requires the following changes:
 * create queues dynamically
 * destroy queues dynamically
 * dynamically change queue parameters (e.g., capacity)
 * modify refreshqueue validation to enforce sum(child.getCapacity()) <= 100% 
 instead of == 100%
 We limit this to LeafQueues. 





[jira] [Updated] (YARN-1707) Making the CapacityScheduler more dynamic

2014-09-03 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1707:
---
Attachment: YARN-1707.7.patch

Thanks [~jianhe] for your comments. I am uploading a patch that has the 
following fixes:
   * Renamed dyQConf/sesConf to entitlement
   * userLimit is now reinitialized in ReservationQueue/PlanQueue
   * Indentation fixed
   * Renamed SchedulerConfigEditException to SchedulerDynamicEditException
   * Consistently used showReservationsAsQueues for both the method and the 
flag

The newly parsed queues will have the maxApps* values set, as 
CapacityScheduler#reinitialize() invokes parseQueues(), which is where they are 
updated.

 Making the CapacityScheduler more dynamic
 -

 Key: YARN-1707
 URL: https://issues.apache.org/jira/browse/YARN-1707
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler
 Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, 
 YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.patch


 The CapacityScheduler is rather static at the moment, and refreshqueue 
 provides a rather heavy-handed way to reconfigure it. Moving towards 
 long-running services (tracked in YARN-896) and to enable more advanced 
 admission control and resource parcelling we need to make the 
 CapacityScheduler more dynamic. This is instrumental to the umbrella jira 
 YARN-1051.
 Concretely this requires the following changes:
 * create queues dynamically
 * destroy queues dynamically
 * dynamically change queue parameters (e.g., capacity)
 * modify refreshqueue validation to enforce sum(child.getCapacity()) <= 100% 
 instead of == 100%
 We limit this to LeafQueues. 





[jira] [Commented] (YARN-1712) Admission Control: plan follower

2014-09-03 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120729#comment-14120729
 ] 

Subramaniam Krishnan commented on YARN-1712:


Thanks [~leftnoteasy] for taking a look at the patch. Since [~curino] is 
traveling, I'll try to answer your questions. Your understanding is very close: 
your steps 1-7 are correct. There is a little context missing, which 
I will try to explain.

bq. Question: Why not do 2) after 4)? Is it better to do shrink after excluded 
expired reservations?

Shrinking might be required because reservations are absolute while queues express 
relative (% of cluster) capacity. We need to shrink first because shrinking might 
result in additional expired reservations. The expired reservations are 
those that exist in the scheduler but are not 
currently active in the Plan (post shrinking, if required). I should add that 
shrinking is a rare exception, needed only when we lose large chunks of cluster 
capacity.

bq. 6) Sort all reservations, from less to more satisfied, and set their new 
entitlement.
bq. Question: Is it possible totalAssignedCapacity > 1? Could you please 
explain how to avoid it happen?

We sort all reservations based on what was promised at this moment of time. 
That can vary because we support skylines for reservations, i.e. varied 
resource requirements over time. This is required to handle DAGs as in the case 
of Tez, Oozie, Hive or Pig queries as the nodes of the DAG will have different 
resource needs. This is explained in detail in the tech report we uploaded as 
part of YARN-1051. 
The totalAssignedCapacity will never exceed 1 because:
  1) We always release all excess capacity before starting to allocate fresh 
capacity.
  2) The reservations themselves are validated before being added to the Plan 
to ensure that they never exceed (YARN-1709 & YARN-1711) the total capacity of 
the Plan. As mentioned above, shrinking will handle large transient cluster 
failures.
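
To make the capacity argument above concrete, here is a minimal sketch of the "release first, then allocate" ordering. It is not the plan follower code from the patch: the class EntitlementUpdateSketch, the Reservation holder and the choice to grow queues in order of their promised share are assumptions made for illustration. Entitlements are fractions of the Plan capacity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the two-pass entitlement update; not the YARN-1712 code. */
final class EntitlementUpdateSketch {

    /** One reservation queue: what it holds now and what the Plan promises now. */
    static final class Reservation {
        final String name;
        final double current;   // fraction of Plan capacity currently assigned
        final double promised;  // fraction the Plan promises at this instant

        Reservation(String name, double current, double promised) {
            this.name = name;
            this.current = current;
            this.promised = promised;
        }
    }

    /**
     * Writes the new entitlements into {@code entitlements}, shrinking before
     * growing, and returns the highest total assigned at any point.
     */
    static double applyPromised(List<Reservation> reservations,
                                Map<String, Double> entitlements) {
        double total = 0.0;
        for (Reservation r : reservations) {
            entitlements.put(r.name, r.current);
            total += r.current;
        }
        double peak = total;

        // Pass 1: release all excess capacity first.
        for (Reservation r : reservations) {
            if (r.promised < r.current) {
                total -= r.current - r.promised;
                entitlements.put(r.name, r.promised);
            }
        }

        // Pass 2: hand out fresh capacity, smallest promise first (an assumption).
        List<Reservation> growing = new ArrayList<>();
        for (Reservation r : reservations) {
            if (r.promised > r.current) {
                growing.add(r);
            }
        }
        growing.sort(Comparator.comparingDouble((Reservation r) -> r.promised));
        for (Reservation r : growing) {
            total += r.promised - r.current;
            entitlements.put(r.name, r.promised);
            peak = Math.max(peak, total);
        }
        return peak;
    }
}
```

If the promised fractions sum to at most 1 and the current ones do too, the running total stays within 1 throughout, because every increase happens only after all decreases have been applied.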

{quote}
One comment is,
Current compare and sort reservation is comparing (allocatedResource - 
guaranteedResource), one feeling at top of my mind is, this may make larger 
queue can get resource easier than small queue. Is it possible an app can get 
more resource than other by lying to RM that it needs more resource when fierce 
competition on resource?
{quote}

To prevent exactly this, we do our allocations starting from the smallest to the largest 
reservation queue. We enforce sharing policies (YARN-1711) to prevent a single 
user/app from reserving the entire cluster's resources or causing starvation by 
hoarding resources.

Hope this clarifies the logic. Feel free to reply if you have any further 
questions.

 Admission Control: plan follower
 

 Key: YARN-1712
 URL: https://issues.apache.org/jira/browse/YARN-1712
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: reservations, scheduler
 Attachments: YARN-1712.1.patch, YARN-1712.patch


 This JIRA tracks a thread that continuously propagates the current state of 
 an inventory subsystem to the scheduler. As the inventory subsystem stores the 
 plan of how the resources should be subdivided, the work we propose in this 
 JIRA realizes such plan by dynamically instructing the CapacityScheduler to 
 add/remove/resize queues to follow the plan.





[jira] [Updated] (YARN-1707) Making the CapacityScheduler more dynamic

2014-09-03 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1707:
---
Attachment: YARN-1707.8.patch

Thanks [~jianhe] for your insights. I am uploading a new patch that has the 
following fixes:
   * Removed display name
   * Reverted unnecessary visibility change & null check
   * Pass QueueEntitlement to changeCapacity()
   * Handling move to Plan Queue, including unit test case

I have not removed YarnException from setEntitlement as it is thrown by 
getAndCheckLeafQueue()

 Making the CapacityScheduler more dynamic
 -

 Key: YARN-1707
 URL: https://issues.apache.org/jira/browse/YARN-1707
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler
 Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, 
 YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.7.patch, YARN-1707.8.patch, 
 YARN-1707.patch


 The CapacityScheduler is rather static at the moment, and refreshqueue 
 provides a rather heavy-handed way to reconfigure it. Moving towards 
 long-running services (tracked in YARN-896) and to enable more advanced 
 admission control and resource parcelling we need to make the 
 CapacityScheduler more dynamic. This is instrumental to the umbrella jira 
 YARN-1051.
 Concretely this requires the following changes:
 * create queues dynamically
 * destroy queues dynamically
 * dynamically change queue parameters (e.g., capacity)
 * modify refreshqueue validation to enforce sum(child.getCapacity()) <= 100% 
 instead of == 100%
 We limit this to LeafQueues. 





[jira] [Updated] (YARN-1707) Making the CapacityScheduler more dynamic

2014-09-02 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1707:
---
Attachment: YARN-1707.6.patch

Thanks [~jianhe] for your feedback. I am uploading a new patch that addresses 
your comments.

 Making the CapacityScheduler more dynamic
 -

 Key: YARN-1707
 URL: https://issues.apache.org/jira/browse/YARN-1707
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
  Labels: capacity-scheduler
 Attachments: YARN-1707.2.patch, YARN-1707.3.patch, YARN-1707.4.patch, 
 YARN-1707.5.patch, YARN-1707.6.patch, YARN-1707.patch


 The CapacityScheduler is rather static at the moment, and refreshqueue 
 provides a rather heavy-handed way to reconfigure it. Moving towards 
 long-running services (tracked in YARN-896) and to enable more advanced 
 admission control and resource parcelling we need to make the 
 CapacityScheduler more dynamic. This is instrumental to the umbrella jira 
 YARN-1051.
 Concretely this requires the following changes:
 * create queues dynamically
 * destroy queues dynamically
 * dynamically change queue parameters (e.g., capacity)
 * modify refreshqueue validation to enforce sum(child.getCapacity()) <= 100% 
 instead of == 100%
 We limit this to LeafQueues. 





[jira] [Updated] (YARN-1709) Admission Control: Reservation subsystem

2014-08-29 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1709:
---

Attachment: YARN-1709.patch

Updating the patch as a result of API changes based on [~vinodkv]'s [feedback 
|https://issues.apache.org/jira/browse/YARN-1708?focusedCommentId=14112669] on 
YARN-1708.

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch, YARN-1709.patch, YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2080) Admission Control: Integrate Reservation subsystem with ResourceManager

2014-08-29 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2080:
---

Attachment: YARN-2080.patch

Uploading a new patch that adds a scheduler agnostic AbstractReservationSystem 
which is extended by the CapacityReservationSystem scheduler configuration as 
suggested by [~kasha]. CapacityReservationSystem essentially just loads configs 
from capacity scheduler xml. Attempted to converge this with Fair Scheduler as 
part of YARN-2386 but figured that it was not feasible.

It also has minor changes as a result of API changes based on [~vinodkv]'s 
[feedback | 
https://issues.apache.org/jira/browse/YARN-1708?focusedCommentId=14112669] on 
YARN-1708.

 Admission Control: Integrate Reservation subsystem with ResourceManager
 ---

 Key: YARN-2080
 URL: https://issues.apache.org/jira/browse/YARN-2080
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan
 Attachments: YARN-2080.patch, YARN-2080.patch, YARN-2080.patch


 This JIRA tracks the integration of Reservation subsystem data structures 
 introduced in YARN-1709 with the YARN RM. This is essentially end2end wiring 
 of YARN-1051.





[jira] [Commented] (YARN-2080) Admission Control: Integrate Reservation subsystem with ResourceManager

2014-08-29 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115860#comment-14115860
 ] 

Subramaniam Krishnan commented on YARN-2080:


Typo in previous comment. Read it as:

Uploading a new patch that adds a scheduler agnostic AbstractReservationSystem 
which is extended by the CapacityReservationSystem for capacity scheduler as 
suggested by [~kasha]. CapacityReservationSystem essentially just loads configs 
from capacity scheduler xml. Attempted to converge this with Fair Scheduler as 
part of YARN-2386 but figured that it was not feasible.

It also has minor changes as a result of API changes based on [~vinodkv]'s 
[feedback | 
https://issues.apache.org/jira/browse/YARN-1708?focusedCommentId=14112669] on 
YARN-1708.


 Admission Control: Integrate Reservation subsystem with ResourceManager
 ---

 Key: YARN-2080
 URL: https://issues.apache.org/jira/browse/YARN-2080
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan
 Attachments: YARN-2080.patch, YARN-2080.patch, YARN-2080.patch


 This JIRA tracks the integration of Reservation subsystem data structures 
 introduced in YARN-1709 with the YARN RM. This is essentially end2end wiring 
 of YARN-1051.





[jira] [Commented] (YARN-2385) Consider splitting getAppsinQueue to getRunningAppsInQueue + getPendingAppsInQueue

2014-08-29 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115863#comment-14115863
 ] 

Subramaniam Krishnan commented on YARN-2385:


Thanks [~sunilg] for verifying. I am fine either way, i.e. whether you want to take 
up the splitting now or later, as we have currently ensured that the behavior of 
CS & FS is consistent for _getAppsInQueue_. [~leftnoteasy], [~zjshen], what do 
you think?

 Consider splitting getAppsinQueue to getRunningAppsInQueue + 
 getPendingAppsInQueue
 --

 Key: YARN-2385
 URL: https://issues.apache.org/jira/browse/YARN-2385
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, fairscheduler
Reporter: Subramaniam Krishnan
  Labels: abstractyarnscheduler

 Currently getAppsinQueue returns both pending & running apps. The purpose of 
 the JIRA is to explore splitting it to getRunningAppsInQueue + 
 getPendingAppsInQueue that will provide more flexibility to callers





[jira] [Updated] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)

2014-08-27 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1708:
---

Attachment: YARN-1708.patch

Thanks [~vinodkv] for taking the time to review and for the follow up 
discussions. I am uploading a new API patch based on the consensus we reached.

The changes are summarized below:

 - Make all proto fields optional, with defaults, and add server side code to 
check for required fields.
 - Rename ReservationCreateRequestProto -> ReservationSubmissionRequestProto
 - Rename ReservationDescriptionProto -> ReservationRequestsProto
 - Rename ReservationResourceRequestProto -> ReservationRequestsProto
 - Added a new ReservationRequestProto, used specifically to specify 
resources to reserve instead of reusing ResourceRequestProto, as currently 
reservations do not use locality constraints. In future we see convergence of 
both.
 - Rename ReservationDescriptionInterpreterProto -> 
ReservationRequestInterpreterProto. Added examples for each reservation type in 
Javadoc.
 - ReservationHandle is not needed.
 - Add ReservationIdProto: ClusterTimeStamp + long id : Similar to appIDs
 - Rename ReservationCreateResponseProto -> ReservationSubmissionResponseProto
 - ReservationUpdateRequestProto: No need to pass queue-name
  -- Instead should specify ReservationID and the effect will be to replace the 
existing reservation with new one.
 - ReservationUpdateResponseProto: Can just be empty
 - Add a ReservationDeleteRequestProto with ReservationId which will be deleted
 - ReservationDeleteResponseProto: again can just be empty
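
As an illustration of the "ClusterTimeStamp + long id" identifier mentioned above, a value type along these lines would behave much like an application ID. This is only a sketch: the class name ReservationIdSketch and the toString format are hypothetical and are not taken from the patch.

```java
/** Hypothetical identifier made of a cluster timestamp plus a sequence number. */
final class ReservationIdSketch implements Comparable<ReservationIdSketch> {
    private final long clusterTimestamp; // start time of the cluster (RM) instance
    private final long id;               // counter within that cluster instance

    ReservationIdSketch(long clusterTimestamp, long id) {
        this.clusterTimestamp = clusterTimestamp;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ReservationIdSketch)) {
            return false;
        }
        ReservationIdSketch other = (ReservationIdSketch) o;
        return clusterTimestamp == other.clusterTimestamp && id == other.id;
    }

    @Override
    public int hashCode() {
        // Same shape as a typical IDE-generated hashCode over two long fields.
        return 31 * Long.hashCode(clusterTimestamp) + Long.hashCode(id);
    }

    @Override
    public int compareTo(ReservationIdSketch other) {
        // Order by cluster instance first, then by sequence number.
        int byCluster = Long.compare(clusterTimestamp, other.clusterTimestamp);
        return byCluster != 0 ? byCluster : Long.compare(id, other.id);
    }

    @Override
    public String toString() {
        // Assumed format, chosen only for readability in this sketch.
        return "reservation_" + clusterTimestamp + "_" + id;
    }
}
```

Including the cluster timestamp keeps IDs from different ResourceManager incarnations distinct even if the counter restarts.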

 Add a public API to reserve resources (part of YARN-1051)
 -

 Key: YARN-1708
 URL: https://issues.apache.org/jira/browse/YARN-1708
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1708.patch, YARN-1708.patch


 This JIRA tracks the definition of a new public API for YARN, which allows 
 users to reserve resources (think of time-bounded queues). This is part of 
 the admission control enhancement proposed in YARN-1051.





[jira] [Assigned] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.

2014-08-27 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan reassigned YARN-1051:
--

Assignee: Subramaniam Krishnan  (was: Carlo Curino)

 YARN Admission Control/Planner: enhancing the resource allocation model with 
 time.
 --

 Key: YARN-1051
 URL: https://issues.apache.org/jira/browse/YARN-1051
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler, resourcemanager, scheduler
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1051-design.pdf, curino_MSR-TR-2013-108.pdf, 
 techreport.pdf


 In this umbrella JIRA we propose to extend the YARN RM to handle time 
 explicitly, allowing users to reserve capacity over time. This is an 
 important step towards SLAs, long-running services, workflows, and helps for 
 gang scheduling.





[jira] [Commented] (YARN-2385) Consider splitting getAppsinQueue to getRunningAppsInQueue + getPendingAppsInQueue

2014-08-26 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111737#comment-14111737
 ] 

Subramaniam Krishnan commented on YARN-2385:


[~sunilg], the behavior of *getAppsInQueue* should be the same for both CS & FS 
unless I am missing something. As part of YARN-2378, I added pending apps also 
to CS#getAppsInQueue.

 Consider splitting getAppsinQueue to getRunningAppsInQueue + 
 getPendingAppsInQueue
 --

 Key: YARN-2385
 URL: https://issues.apache.org/jira/browse/YARN-2385
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, fairscheduler
Reporter: Subramaniam Krishnan
  Labels: abstractyarnscheduler

 Currently getAppsinQueue returns both pending & running apps. The purpose of 
 the JIRA is to explore splitting it to getRunningAppsInQueue + 
 getPendingAppsInQueue that will provide more flexibility to callers





[jira] [Updated] (YARN-2385) Consider splitting getAppsinQueue to getRunningAppsInQueue + getPendingAppsInQueue

2014-08-21 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2385:
---

Summary: Consider splitting getAppsinQueue to getRunningAppsInQueue + 
getPendingAppsInQueue  (was: Adding support for listing all applications in a 
queue)

 Consider splitting getAppsinQueue to getRunningAppsInQueue + 
 getPendingAppsInQueue
 --

 Key: YARN-2385
 URL: https://issues.apache.org/jira/browse/YARN-2385
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, fairscheduler
Reporter: Subramaniam Krishnan
  Labels: abstractyarnscheduler

 This JIRA proposes adding a method in AbstractYarnScheduler to get all the 
 pending/active applications. Fair scheduler already supports moving a single 
 application from one queue to another. Support for the same is being added to 
 Capacity Scheduler as part of YARN-2378 and YARN-2248. So with the addition 
 of this method, we can transparently add support for moving all applications 
 from source queue to target queue and draining a queue, i.e. killing all 
 applications in a queue as proposed by YARN-2389





[jira] [Updated] (YARN-2385) Consider splitting getAppsinQueue to getRunningAppsInQueue + getPendingAppsInQueue

2014-08-21 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2385:
---

Description: Currently getAppsinQueue returns both pending & running apps. 
The purpose of the JIRA is to explore splitting it to getRunningAppsInQueue + 
getPendingAppsInQueue that will provide more flexibility to callers  (was: This 
JIRA proposes adding a method in AbstractYarnScheduler to get all the 
pending/active applications. Fair scheduler already supports moving a single 
application from one queue to another. Support for the same is being added to 
Capacity Scheduler as part of YARN-2378 and YARN-2248. So with the addition of 
this method, we can transparently add support for moving all applications from 
source queue to target queue and draining a queue, i.e. killing all 
applications in a queue as proposed by YARN-2389)

 Consider splitting getAppsinQueue to getRunningAppsInQueue + 
 getPendingAppsInQueue
 --

 Key: YARN-2385
 URL: https://issues.apache.org/jira/browse/YARN-2385
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, fairscheduler
Reporter: Subramaniam Krishnan
  Labels: abstractyarnscheduler

 Currently getAppsinQueue returns both pending & running apps. The purpose of 
 the JIRA is to explore splitting it to getRunningAppsInQueue + 
 getPendingAppsInQueue that will provide more flexibility to callers





[jira] [Commented] (YARN-2385) Adding support for listing all applications in a queue

2014-08-18 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101034#comment-14101034
 ] 

Subramaniam Krishnan commented on YARN-2385:


[~sunilg], [~leftnoteasy], [~zjshen]

I suggest we either open a new JIRA to discuss splitting getAppsinQueue into 
getRunningAppsInQueue + getPendingAppsInQueue, or update the current JIRA to 
reflect the discussion.

 Adding support for listing all applications in a queue
 --

 Key: YARN-2385
 URL: https://issues.apache.org/jira/browse/YARN-2385
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, fairscheduler
Reporter: Subramaniam Krishnan
Assignee: Karthik Kambatla
  Labels: abstractyarnscheduler

 This JIRA proposes adding a method in AbstractYarnScheduler to get all the 
 pending/active applications. Fair scheduler already supports moving a single 
 application from one queue to another. Support for the same is being added to 
 Capacity Scheduler as part of YARN-2378 and YARN-2248. So with the addition 
 of this method, we can transparently add support for moving all applications 
 from source queue to target queue and draining a queue, i.e. killing all 
 applications in a queue as proposed by YARN-2389





[jira] [Updated] (YARN-2385) Adding support for listing all applications in a queue

2014-08-18 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2385:
---

Assignee: (was: Karthik Kambatla)

 Adding support for listing all applications in a queue
 --

 Key: YARN-2385
 URL: https://issues.apache.org/jira/browse/YARN-2385
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, fairscheduler
Reporter: Subramaniam Krishnan
  Labels: abstractyarnscheduler

 This JIRA proposes adding a method in AbstractYarnScheduler to get all the 
 pending/active applications. Fair scheduler already supports moving a single 
 application from one queue to another. Support for the same is being added to 
 Capacity Scheduler as part of YARN-2378 and YARN-2248. So with the addition 
 of this method, we can transparently add support for moving all applications 
 from source queue to target queue and draining a queue, i.e. killing all 
 applications in a queue as proposed by YARN-2389





[jira] [Resolved] (YARN-2386) Refactor common scheduler configurations into a base ResourceSchedulerConfig class

2014-08-18 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan resolved YARN-2386.


Resolution: Invalid

Took a look into both the scheduler configs and unfortunately the 
configurations are so disparate that there isn't much common to refactor out.

 Refactor common scheduler configurations into a base ResourceSchedulerConfig 
 class
 --

 Key: YARN-2386
 URL: https://issues.apache.org/jira/browse/YARN-2386
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan

 As discussed with [~leftnoteasy], [~jianhe] and [~kasha], this JIRA proposes 
 refactoring common configuration from Capacity & Fair scheduler to a common 
 base class to avoid duplicating configs. Currently Capacity & Fair scheduler 
 directly extend configuration and adding a common base Resource scheduler 
 config class would also align with the Resource Scheduler hierarchy and 
 enable other systems like reservation system (YARN-2080) to be scheduler 
 implementation agnostic.





[jira] [Commented] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.

2014-05-31 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014565#comment-14014565
 ] 

Subramaniam Krishnan commented on YARN-1051:


We have posted patches for YARN-1709 and YARN-2080, looking for feedback.

 YARN Admission Control/Planner: enhancing the resource allocation model with 
 time.
 --

 Key: YARN-1051
 URL: https://issues.apache.org/jira/browse/YARN-1051
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler, resourcemanager, scheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1051-design.pdf, curino_MSR-TR-2013-108.pdf, 
 techreport.pdf


 In this umbrella JIRA we propose to extend the YARN RM to handle time 
 explicitly, allowing users to reserve capacity over time. This is an 
 important step towards SLAs, long-running services, workflows, and helps for 
 gang scheduling.





[jira] [Updated] (YARN-2080) Admission Control: Integrate Reservation subsystem with ResourceManager

2014-05-30 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2080:
---

Attachment: YARN-2080.patch

Attaching a patch file that wires the reservation APIs into existing YARN APIs.

It introduces a new component *ReservationSystem* that essentially manages all 
the _Plans_ (#YARN-1709) configured in the ResourceSchedulers. The 
ReservationSystem is bootstrapped by ResourceManager if it is enabled in 
configuration.

The ClientRMService implements the reservation APIs, which are also exposed 
via the YarnClient.
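
The wiring described above can be pictured with a minimal Java sketch. It is 
not code from the attached patch: every class name below and the 
configuration key "sketch.reservation-system.enable" are hypothetical, and a 
Plan is reduced to a named placeholder. It only shows a ResourceManager-like 
object creating the ReservationSystem, with one Plan per configured queue, 
when the feature is switched on in configuration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

/** Placeholder for a per-queue Plan (YARN-1709); holds only the queue name. */
final class PlanSketch {
    final String queueName;

    PlanSketch(String queueName) {
        this.queueName = queueName;
    }
}

/** Hypothetical stand-in for the ReservationSystem: one Plan per queue. */
final class ReservationSystemSketch {
    private final Map<String, PlanSketch> plans = new HashMap<>();

    ReservationSystemSketch(List<String> planQueues) {
        for (String queue : planQueues) {
            plans.put(queue, new PlanSketch(queue));
        }
    }

    Optional<PlanSketch> getPlan(String queueName) {
        return Optional.ofNullable(plans.get(queueName));
    }
}

/** Hypothetical stand-in for the ResourceManager bootstrap step. */
final class ResourceManagerSketch {
    // Assumed key for illustration only; not a real YARN property name.
    static final String ENABLE_KEY = "sketch.reservation-system.enable";

    // Stays null when the reservation system is disabled.
    private final ReservationSystemSketch reservationSystem;

    ResourceManagerSketch(Properties conf, List<String> planQueues) {
        boolean enabled =
            Boolean.parseBoolean(conf.getProperty(ENABLE_KEY, "false"));
        this.reservationSystem =
            enabled ? new ReservationSystemSketch(planQueues) : null;
    }

    Optional<ReservationSystemSketch> getReservationSystem() {
        return Optional.ofNullable(reservationSystem);
    }
}
```

Returning an Optional keeps callers from assuming the subsystem exists when 
the feature is disabled, which is the default in this sketch.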


 Admission Control: Integrate Reservation subsystem with ResourceManager
 ---

 Key: YARN-2080
 URL: https://issues.apache.org/jira/browse/YARN-2080
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan
 Attachments: YARN-2080.patch


 This JIRA tracks the integration of Reservation subsystem data structures 
 introduced in YARN-1709 with the YARN RM. This is essentially end2end wiring 
 of YARN-1051.





[jira] [Updated] (YARN-1709) Admission Control: Reservation subsystem

2014-05-29 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1709:
---

Attachment: YARN-1709.patch

The attached patch contains in-memory data structures to track reservations 
over time:

 * _Plan_ : The central data structure of a reservation system. It maintains 
the agenda for the cluster, i.e. how the client reservations that have been 
accepted so far will be honoured.

 * _ReservationAllocation_ : A concrete instance of resources allocated over 
time to satisfy a single client reservation request.

 * _RLESparseResourceAllocation_ : A run-length-encoded sparse data structure 
that maintains cumulative resource allocations over time.
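
To make the run-length-encoded idea concrete, here is a small Java sketch. It 
is not the class from the patch: the name RleTimeline, the use of a TreeMap, 
and the reduction of a resource to a single integer are all assumptions made 
for illustration. Only the points in time where the cumulative allocation 
changes are stored, so long stretches with a constant allocation cost one 
entry.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative run-length-encoded timeline of cumulative allocations. */
final class RleTimeline {
    // An entry (t -> v) means "the cumulative allocation is v from time t
    // until the next key". Times with no change are not stored.
    private final TreeMap<Long, Integer> steps = new TreeMap<>();

    /** Adds amount to the half-open interval [start, end). */
    void addInterval(long start, long end, int amount) {
        if (start >= end) {
            throw new IllegalArgumentException("start must be before end");
        }
        // Pin the current values at both boundaries so that shifting the
        // range below does not leak outside [start, end).
        steps.put(start, valueAt(start));
        steps.put(end, valueAt(end));
        for (Map.Entry<Long, Integer> e : steps.subMap(start, end).entrySet()) {
            e.setValue(e.getValue() + amount);
        }
        compact();
    }

    /** Cumulative allocation at time t; zero before the first step. */
    int valueAt(long t) {
        Map.Entry<Long, Integer> e = steps.floorEntry(t);
        return e == null ? 0 : e.getValue();
    }

    /** Number of stored change points. */
    int stepCount() {
        return steps.size();
    }

    // Drops entries that repeat the previous value, keeping the encoding sparse.
    private void compact() {
        int previous = 0;
        Iterator<Map.Entry<Long, Integer>> it = steps.entrySet().iterator();
        while (it.hasNext()) {
            int value = it.next().getValue();
            if (value == previous) {
                it.remove();
            } else {
                previous = value;
            }
        }
    }
}
```

In this sketch, two back-to-back reservations of the same size collapse into 
a single run, which is the property that keeps the structure compact when 
many reservations overlap.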

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1709.patch


 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.





[jira] [Updated] (YARN-1709) Admission Control: Reservation subsystem

2014-05-20 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1709:
---

Description: This JIRA is about the key data structure used to track 
resources over time to enable YARN-1051. The Reservation subsystem is 
conceptually a plan of how the scheduler will allocate resources over-time.  
(was: This JIRA is about the key data structure used to track resources over 
time to enable YARN-1051. The inventory subsystem is conceptually a plan of 
how the capacity scheduler will be configured over-time.)
Summary: Admission Control: Reservation subsystem  (was: Admission 
Control: inventory subsystem)

 Admission Control: Reservation subsystem
 

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan

 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The Reservation subsystem is conceptually a plan of 
 how the scheduler will allocate resources over-time.





[jira] [Created] (YARN-2080) Admission Control: Integrate Reservation subsystem with ResourceManager

2014-05-20 Thread Subramaniam Krishnan (JIRA)
Subramaniam Krishnan created YARN-2080:
--

 Summary: Admission Control: Integrate Reservation subsystem with 
ResourceManager
 Key: YARN-2080
 URL: https://issues.apache.org/jira/browse/YARN-2080
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan


This JIRA is about the key data structure used to track resources over time to 
enable YARN-1051. The Reservation subsystem is conceptually a plan of how the 
scheduler will allocate resources over-time.





[jira] [Updated] (YARN-2080) Admission Control: Integrate Reservation subsystem with ResourceManager

2014-05-20 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-2080:
---

Description: This JIRA tracks the integration of Reservation subsystem data 
structures introduced in YARN-1709 with the YARN RM. This is essentially 
end2end wiring of YARN-1051.  (was: This JIRA is about the key data structure 
used to track resources over time to enable YARN-1051. The Reservation 
subsystem is conceptually a plan of how the scheduler will allocate resources 
over-time.)

 Admission Control: Integrate Reservation subsystem with ResourceManager
 ---

 Key: YARN-2080
 URL: https://issues.apache.org/jira/browse/YARN-2080
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Subramaniam Krishnan
Assignee: Subramaniam Krishnan

 This JIRA tracks the integration of Reservation subsystem data structures 
 introduced in YARN-1709 with the YARN RM. This is essentially end2end wiring 
 of YARN-1051.





[jira] [Updated] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)

2014-05-05 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1708:
---

Attachment: YARN-1708.patch

 Add a public API to reserve resources (part of YARN-1051)
 -

 Key: YARN-1708
 URL: https://issues.apache.org/jira/browse/YARN-1708
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1708.patch


 This JIRA tracks the definition of a new public API for YARN, which allows 
 users to reserve resources (think of time-bounded queues). This is part of 
 the admission control enhancement proposed in YARN-1051.





[jira] [Commented] (YARN-1708) Add a public API to reserve resources (part of YARN-1051)

2014-05-05 Thread Subramaniam Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990239#comment-13990239
 ] 

Subramaniam Krishnan commented on YARN-1708:


Attaching the patch

 Add a public API to reserve resources (part of YARN-1051)
 -

 Key: YARN-1708
 URL: https://issues.apache.org/jira/browse/YARN-1708
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan
 Attachments: YARN-1708.patch


 This JIRA tracks the definition of a new public API for YARN, which allows 
 users to reserve resources (think of time-bounded queues). This is part of 
 the admission control enhancement proposed in YARN-1051.





[jira] [Updated] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.

2014-03-19 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1051:
---

Attachment: techreport.pdf

Attaching an updated tech report that states more clearly what we intend to 
achieve, presents results from our proof of concept, and aligns with the 
design doc on how we propose to implement this in YARN.

 YARN Admission Control/Planner: enhancing the resource allocation model with 
 time.
 --

 Key: YARN-1051
 URL: https://issues.apache.org/jira/browse/YARN-1051
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler, resourcemanager, scheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1051-design.pdf, curino_MSR-TR-2013-108.pdf, 
 techreport.pdf


 In this umbrella JIRA we propose to extend the YARN RM to handle time 
 explicitly, allowing users to reserve capacity over time. This is an 
 important step towards SLAs, long-running services, and workflows, and helps 
 with gang scheduling.





[jira] [Assigned] (YARN-1709) Admission Control: inventory subsystem

2014-02-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan reassigned YARN-1709:
--

Assignee: Subramaniam Krishnan

 Admission Control: inventory subsystem
 --

 Key: YARN-1709
 URL: https://issues.apache.org/jira/browse/YARN-1709
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan

 This JIRA is about the key data structure used to track resources over time 
 to enable YARN-1051. The inventory subsystem is conceptually a plan of how 
 the capacity scheduler will be configured over-time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (YARN-1710) Admission Control: agents to allocate reservation

2014-02-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan reassigned YARN-1710:
--

Assignee: Subramaniam Krishnan

 Admission Control: agents to allocate reservation
 -

 Key: YARN-1710
 URL: https://issues.apache.org/jira/browse/YARN-1710
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Carlo Curino
Assignee: Subramaniam Krishnan

 This JIRA tracks the algorithms used to allocate a user ReservationRequest 
 coming in from the new reservation API (YARN-1708), in the inventory 
 subsystem (YARN-1709) maintaining the current plan for the cluster. The focus 
 of these agents is to quickly find a solution that satisfies both the set of 
 constraints provided by the user and the physical constraints of the plan.
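
As a rough illustration of what such an agent does, the Java sketch below 
places a request greedily at the earliest feasible start. It is an assumption 
made for this example, not the algorithm tracked by this JIRA: time is 
discretised into steps, the plan is a plain array of used capacity per step, 
and the user constraints are reduced to an arrival time, a deadline, a 
duration, and an amount.

```java
/** Hypothetical greedy placement agent over a discretised plan. */
final class GreedyAgentSketch {
    private GreedyAgentSketch() {
    }

    /**
     * Finds the earliest start in [arrival, deadline - duration] at which
     * amount fits under capacity for duration consecutive steps, books it
     * into used, and returns the start step. Returns -1 if nothing fits.
     */
    static int place(int[] used, int capacity, int arrival, int deadline,
                     int duration, int amount) {
        // The reservation must end by the deadline and within the plan horizon.
        int latestEnd = Math.min(deadline, used.length);
        for (int start = Math.max(arrival, 0);
             start + duration <= latestEnd; start++) {
            if (fits(used, capacity, start, duration, amount)) {
                for (int t = start; t < start + duration; t++) {
                    used[t] += amount;
                }
                return start;
            }
        }
        return -1;
    }

    // Physical constraint of the plan: never exceed capacity at any step.
    private static boolean fits(int[] used, int capacity, int start,
                                int duration, int amount) {
        for (int t = start; t < start + duration; t++) {
            if (used[t] + amount > capacity) {
                return false;
            }
        }
        return true;
    }
}
```

A greedy scan like this is quick but can reject requests that a smarter 
search would fit, for example by splitting the allocation across 
non-contiguous steps.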





[jira] [Updated] (YARN-1051) YARN Admission Control/Planner: enhancing the resource allocation model with time.

2014-02-11 Thread Subramaniam Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subramaniam Krishnan updated YARN-1051:
---

Attachment: YARN-1051-design.pdf

Attaching the approach doc that describes the overall intent for interested 
readers. The doc also lists the breakdown into incremental sub-tasks. Any 
suggestions or thoughts are welcome; we will incorporate feedback as it comes in.

 YARN Admission Control/Planner: enhancing the resource allocation model with 
 time.
 --

 Key: YARN-1051
 URL: https://issues.apache.org/jira/browse/YARN-1051
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler, resourcemanager, scheduler
Reporter: Carlo Curino
Assignee: Carlo Curino
 Attachments: YARN-1051-design.pdf, curino_MSR-TR-2013-108.pdf


 In this umbrella JIRA we propose to extend the YARN RM to handle time 
 explicitly, allowing users to reserve capacity over time. This is an 
 important step towards SLAs, long-running services, and workflows, and helps 
 with gang scheduling.


