Re: Support deadline for tasks

2018-03-26 Thread David Morrison
Hi, Benjamin,

Usually, for us, if a task runs longer than a certain period of time, it
means that something has gone wrong and we should just abort and try again.
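
The scheduler-side timeout logic discussed in this thread could be sketched
roughly as follows. This is a minimal illustration, not code from any real
scheduler: `DeadlineTracker`, its method names, and the `kill_task` callback
are all hypothetical, and `kill_task` stands in for whatever call a framework
actually uses to kill a task.

```python
import time
from typing import Callable, Dict, List


class DeadlineTracker:
    """Tracks per-task deadlines and kills tasks that outlive them.

    Hypothetical helper: `kill_task` is a stand-in for the scheduler's
    real kill call; it is not part of any Mesos API.
    """

    def __init__(self, kill_task: Callable[[str], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._kill_task = kill_task
        self._clock = clock
        self._deadlines: Dict[str, float] = {}

    def task_started(self, task_id: str, timeout_secs: float) -> None:
        # Record the absolute time after which the task counts as overdue.
        self._deadlines[task_id] = self._clock() + timeout_secs

    def task_finished(self, task_id: str) -> None:
        # The task reached a terminal state on its own: stop tracking it.
        self._deadlines.pop(task_id, None)

    def kill_overdue(self) -> List[str]:
        # Intended to be called periodically from the scheduler's main loop.
        now = self._clock()
        overdue = [t for t, d in self._deadlines.items() if d <= now]
        for task_id in overdue:
            self._kill_task(task_id)
            del self._deadlines[task_id]
        return overdue
```

Every framework that wants this behaviour has to carry some version of this
bookkeeping, which is the duplication a built-in deadline would remove.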

David (also at Yelp)

On Fri, Mar 23, 2018 at 7:14 PM, Benjamin Mahler wrote:

> Ah, I was more curious about why they need to be killed after a timeout.
> E.g., after a particular deadline the work is useless (in Zhitao's case).
>
> On Fri, Mar 23, 2018 at 6:22 PM Sagar Sadashiv Patwardhan wrote:
>
>> Hi Benjamin,
>> We have a few tasks that should be killed after some timeout. We currently
>> have some logic in our scheduler to kill these tasks. It would be nice to
>> delegate this to the executor.
>>
>> - Sagar
>>
>> On Fri, Mar 23, 2018 at 3:29 PM, Benjamin Mahler wrote:
>>
>> > Sagar, could you share your use case? Or is it exactly the same as
>> > Zhitao's?
>> >
>> > On Fri, Mar 23, 2018 at 3:15 PM, Sagar Sadashiv Patwardhan
>> > <sag...@yelp.com> wrote:
>> >
>> > > +1
>> > >
>> > > This will be useful for us (Yelp) as well.
>> > >
>> > > On Fri, Mar 23, 2018 at 1:31 PM, Benjamin Mahler wrote:
>> > >
>> > > > Also, it's advantageous for Mesos to be aware of a hard deadline when it
>> > > > comes to resource allocation. We know that some resources will free up and
>> > > > can make better decisions when it comes to pre-emption, for example.
>> > > > Currently, Mesos doesn't know if a task will run forever or will run to
>> > > > completion.
>> > > >
>> > > > On Fri, Mar 23, 2018 at 10:07 AM, James Peach wrote:
>> > > >
>> > > > >
>> > > > >
>> > > > > > On Mar 23, 2018, at 9:57 AM, Renan DelValle
>> > > > > > <renanidelva...@gmail.com> wrote:
>> > > > > >
>> > > > > > Hi Zhitao,
>> > > > > >
>> > > > > > Since this is something that could potentially be handled by the
>> > > > > > executor and/or framework, I was wondering if you could speak to the
>> > > > > > advantages of making this a TaskInfo primitive vs having the executor
>> > > > > > (or even the framework) handle it.
>> > > > >
>> > > > > There's some discussion around this on
>> > > > > https://issues.apache.org/jira/browse/MESOS-8725.
>> > > > >
>> > > > > My take is that delegating too much to the scheduler makes schedulers
>> > > > > harder to write and exacerbates the complexity of the system. If 4
>> > > > > different schedulers implement this feature, operators are likely to
>> > > > > need to understand 4 different ways of doing the same thing, which
>> > > > > would be unfortunate.
>> > > > >
>> > > > > J
>> > > >
>> > >
>> >
>>
>


Re: Questions about Pods and the Mesos Containerizer

2018-01-29 Thread David Morrison
On Thu, Jan 25, 2018 at 5:49 PM, Gilbert Song wrote:

>> - Is it possible to allocate a separate IP address per container in a
>>   pod?
>
> Right now, nested containers share the network of their parent container
> (pod). Is there a specific use case that requires containers inside a
> taskgroup to have different IP addresses?
>

For our use case, we need to be able to launch a relatively large number of
containers inside a taskgroup that all listen on the same port (and the
port is not easily changeable). So we need to be able to assign different
IPs to the containers so they don't conflict.
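
The conflict described above can be reproduced outside Mesos. The sketch
below is only an illustration of the underlying socket behaviour:
`bind_conflicts` is a hypothetical helper, and it uses the loopback address
and an OS-chosen port rather than any real service port. It shows that a
second socket cannot bind an IP and port pair that is already in use, which is
what happens when containers sharing one network namespace listen on the same
port.

```python
import errno
import socket


def bind_conflicts(host: str = "127.0.0.1") -> bool:
    """Return True if a second bind to the same host:port is rejected."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))          # port 0: let the OS pick a free port
        first.listen(1)
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same IP and port as `first`
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()
```

Giving each container its own IP address means each listener binds a distinct
IP and port pair, so the collision does not arise.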

Cheers,
David


Questions about Pods and the Mesos Containerizer

2018-01-24 Thread David Morrison
Hi Mesos community!

We’re in the process of designing a Mesos framework to launch multiple
containers together on the same host and are considering a couple of
approaches. The first is to use pods (with the TASK_GROUP primitive), and
the second is to write a custom executor that launches nested containers
and uses CNI to handle networking.

With that in mind, we had the following questions:

Questions about pods/task_groups:

   - Is it possible to do healthchecks per task in a pod?
   - Is it possible to allocate a separate IP address per container in a pod?
   - Is there any plan to support the Docker containerizer with pods?


Questions about UCR/Mesos containerizer:

   - What is the timeframe for debugging tools (the equivalent of docker
     exec, etc.)?
   - Is there any performance data about using the Mesos containerizer with
     container images versus using the Docker containerizer?
   - How does the Mesos containerizer handle extremely large images?
   - How does the Mesos containerizer handle dozens/hundreds of concurrent
     pulls?


If anyone has had any experience using the UCR and/or pods with the sort of
workflow we’re considering, your input would be highly useful!

Cheers,

David Morrison

Software Engineer @ Yelp