Re: Launching tasks with reserved resources

2015-08-17 Thread Gidon Gershinsky
Sounds good, will do.



Regards, 
Gidon








Re: Launching tasks with reserved resources

2015-08-17 Thread Alex Rukletsov
> if there were an api for splitting a resource object
I think it's a good idea: "resource math" is something that each framework
re-implements. We were discussing the idea of providing a "framework kit",
but AFAIK no work has been done in this direction yet. Mind filing a
JIRA ticket?

> sending the reserved and unreserved resources in two separate offers
> indeed helps here
I would say this one also deserves a ticket. I may be missing some use cases
where this is undesirable, but I will be happy to see the discussion around
that documented in the ticket. Even if the ticket ends up as "won't
fix", the discussion and reasoning can be helpful for posterity.


Re: Launching tasks with reserved resources

2015-08-17 Thread Gidon Gershinsky
Hi Alex,

Yep, this setup is using static reservations in agents.

I haven't tried running a big task with two or more resources (reserved
and unreserved), but I guess it is quite intuitive for a developer: a
framework is offered two resource objects and launches a task specifying
these objects, with no need to dive too deep into resource roles etc. If a
framework hoards resources, it can "sum up" the offered objects, which
again looks reasonable.
The problem I had is at the opposite end: when a framework needs to split
the offered resources and run many smaller tasks. Eventually, I was able
to work around it by micro-managing the role assignment for each task's
resources; cumbersome, but it works. So it's more of a usage issue: if there
were an API for splitting a resource object (the opposite of the "+" API for
summing/hoarding), things would be more intuitive.
Btw, sending the reserved and unreserved resources in two separate offers
indeed helps here, since each offer comes with a single role.
In any case, I agree it makes sense for a developer to be aware of the
reservation policies.
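
The kind of splitting described above could look roughly like the following Python sketch. It is only an illustration: the `take` helper, the `role_order` parameter and the dictionary keyed by (name, role) are hypothetical and not part of any Mesos API. Each returned piece would still have to be copied into a resource entry (name, role, scalar value) of the task being launched.

```python
# Hypothetical helper, not part of the Mesos API: carve `amount` of one
# scalar resource out of the remaining per-role pool of an offer.

def take(pool, name, amount, role_order=("role1", "*")):
    """Remove `amount` of `name` from `pool`, trying roles in `role_order`.

    `pool` maps (name, role) -> remaining scalar value and is updated in place.
    Returns a list of (name, role, value) pieces, one per role used.
    Raises ValueError (leaving `pool` untouched) if the offer cannot cover it.
    """
    available = sum(pool.get((name, role), 0.0) for role in role_order)
    if amount > available:
        raise ValueError("offer cannot cover %s %s" % (amount, name))

    pieces = []
    remaining = amount
    for role in role_order:
        if remaining <= 0:
            break
        used = min(remaining, pool.get((name, role), 0.0))
        if used > 0:
            pool[(name, role)] -= used
            pieces.append((name, role, used))
            remaining -= used
    return pieces


# Offer with 1.5 cpus reserved for 'role1' and 2 unreserved cpus.
pool = {("cpus", "role1"): 1.5, ("cpus", "*"): 2.0}

first = take(pool, "cpus", 1.0)   # taken entirely from 'role1'
second = take(pool, "cpus", 1.0)  # spans both 'role1' and '*'
print(first)
print(second)
```

Drawing from the reserved role first is just one possible policy; a framework could equally prefer unreserved resources by changing `role_order`.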



Regards, 
Gidon








Re: Launching tasks with reserved resources

2015-08-17 Thread Alex Rukletsov
Hi Gidon,

just to make sure, you mean static reservations on mesos agents (via
--resources flag) and not dynamic reservations, right?

Let me first try to explain why you get the TASK_ERROR message. The
built-in allocator merges '*' and reserved resources, hinting the master to
create a single offer. However, as you mentioned before, validation fails
if you try to mix resources with different roles, because the function
responsible for validation checks whether the task resources are "contained" in
the offered resources, which includes a role equality check. Here are
some source code snippets:
https://github.com/apache/mesos/blob/master/src/master/validation.cpp#L449
https://github.com/apache/mesos/blob/master/src/common/resources.cpp#L598
https://github.com/apache/mesos/blob/master/src/common/resources.cpp#L244
https://github.com/apache/mesos/blob/master/src/common/resources.cpp#L197
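
As a rough illustration of the containment check described above, here is a simplified Python model. It is not the Mesos implementation: the `contains` function and the dictionary keyed by (name, role) are assumptions made purely for this sketch, and only scalar resources are considered.

```python
# Simplified model of the "contained" check; NOT the actual Mesos code.
# Scalar resources are modelled as a dict keyed by (name, role).

def contains(offered, requested):
    """Return True if every requested (name, role) amount fits in the offer."""
    return all(
        offered.get(key, 0.0) >= amount
        for key, amount in requested.items()
    )


# An offer that merges unreserved ('*') and reserved ('role1') cpus.
offer = {("cpus", "*"): 2.0, ("cpus", "role1"): 2.0}

# 3 cpus, all labelled 'role1': rejected, only 2 are reserved for role1.
single_role = {("cpus", "role1"): 3.0}

# The same 3 cpus expressed as two objects, one per role: accepted.
two_roles = {("cpus", "*"): 1.0, ("cpus", "role1"): 2.0}

print(contains(offer, single_role))  # False
print(contains(offer, two_roles))    # True
```

Because the comparison is done per role, the total across roles never enters the check.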

Maybe we should split reserved and unreserved resources into two offers?

Now, to your second concern about whether we should disallow tasks using
both '*' and 'role' resources. I see your point: if a framework is entitled
to use reserved and unreserved resources, why not hoard them and launch a
bigger task? I think it's fine, and you should actually be able to do it by
explicitly specifying two different resource objects in the task launch
message, one for '*' resources and one for your role. Why can't you just
use your framework's role for both? Different roles may have different
guarantees (quota, MESOS-1791), and while reserved resources may still be
available to your framework, '*' resources may become unavailable to you
(in future Mesos releases or with custom allocators), leading to termination
of the whole task. By requiring two different objects in the task launch
message we motivate the framework (i.e. the framework writer) to be aware of
the different policies that may be attached to different roles. Does that
make sense?
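
A bigger task built from both kinds of resources might be assembled along these lines. This is a sketch under stated assumptions: plain dicts stand in for the protobuf resource messages, and the `hoard` helper is a hypothetical name, not something Mesos provides.

```python
# Illustrative only: plain dicts stand in for the protobuf Resource messages.

def hoard(offered):
    """Build the resource list for one task that uses the whole offer.

    `offered` is a list of dicts with 'name', 'role' and 'value' keys.
    Entries with the same name but different roles are kept separate
    rather than summed under a single role.
    """
    totals = {}
    for res in offered:
        key = (res["name"], res["role"])
        totals[key] = totals.get(key, 0.0) + res["value"]
    return [
        {"name": name, "role": role, "value": value}
        for (name, role), value in sorted(totals.items())
    ]


offered = [
    {"name": "cpus", "role": "*", "value": 2.0},
    {"name": "cpus", "role": "role1", "value": 2.0},
    {"name": "mem", "role": "*", "value": 512.0},
    {"name": "mem", "role": "role1", "value": 1024.0},
]
task_resources = hoard(offered)
for res in task_resources:
    print(res)
```

The task ends up with one cpus entry and one mem entry per role, so each entry can be matched against the offer under its own role.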

—Alex



Launching tasks with reserved resources

2015-08-13 Thread Gidon Gershinsky
I have a simple setup where a framework runs with a role, and some
resources are reserved in the cluster for that role.
The resource offers arrive at the framework as a list of two resource
sets: one general (cpus(*), etc.) and one specific to the role
(cpus("role1"), etc.).

So far so good. If two tasks are launched, each with one of the two
resources, things work.

But problems start when I need to launch multiple smaller tasks (with a
total resource consumption equal to the offered). I do this by creating
resource objects and attaching them to tasks, using calls from the
standard Mesos samples (python):
task = mesos_pb2.TaskInfo()
cpus = task.resources.add()
cpus.name = "cpus"
cpus.scalar.value = TASK_CPUS

checking that the total doesn't surpass the offered resources. This starts
fine, but soon I get TASK_ERROR messages, due to the Master validator finding
that more resources are requested by tasks than are available in the offer.
This obviously happens because all task resources, as defined above, come
with the (*) role, while the offer resources are split between "*" and
"role1"! OK, then I assign a role to the task resources by adding
   cpus.role = "role1"

But this fails again, and for the same reason.

Shouldn't this work differently? When a resource offer is received by a
framework with "role1", why should it care which part is unreserved
and which part is reserved for "role1"? When a task launch request is
received by the master from a framework with a role, why can't it check
only the total resource amount, instead of treating unreserved and
reserved resources separately? They are reserved for this role anyway. Or
am I missing something?


Regards, 
Gidon