Re: [Design Doc] Improve Storage Support in Mesos using Resources Provider

2017-03-23 Thread daemeon reiydelle
Mesos at one point had a serious limitation: network utilization could
not be included as an ask. That proved to be a major issue in utilizing the
frames (physical machines) effectively. If this has been resolved, I
apologize for missing this amazingly important fix. Otherwise, tweaking
disk while network overload remains a recurring risk seems interesting.


Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Thu, Mar 23, 2017 at 5:51 PM, Jie Yu wrote:

> Yes, the idea is to make this general in the future. In fact, the whole
> resource provider design keeps that in mind.
>
> We could add a general "CONVERT" operation in the future that takes
> free-form key-value pairs (as well as the source resources) as parameters.
> It would then be up to the corresponding resource provider to interpret
> them.
>
> - Jie
>
> On Thu, Mar 23, 2017 at 3:50 AM, Sargun Dhillon wrote:
>
>> Is the intent to make this generic beyond disks? I can see the
>> concepts applying beyond volumes and blocks. Perhaps a generic
>> Create{generation} -- where larger generation numbers descend from
>> smaller ones?
>>
>> I can also see this being valuable in networking. My use case is ENIs in
>> AWS. I would like to have a ResourceProvider that can manipulate ENIs
>> based on the invocation of the scheduler. Instead of "CREATE_BLOCK"
>> it'd be CREATE_INTERFACE, with some given options about the ENI,
>> giving us a raw interface. Subsequently, we would want to do a
>> CREATE_IPVLAN, as a subinterface that we can assign an actual IP to.
>> The IPVLAN interface is a descendant of the raw interface, just as
>> volumes are descendants of block devices.
>>
>>
>> On Sun, Mar 12, 2017 at 6:47 PM, Jie Yu wrote:
>> > Hi,
>> >
>> > Currently, Mesos supports both local persistent volumes and external
>> > persistent volumes. However, neither is ideal.
>> >
>> > Local persistent volumes do not support offering physical or logical
>> > block devices directly. Also, frameworks cannot choose the filesystem
>> > for their local persistent volumes. There are also some usability
>> > problems with local persistent volumes. Mesos does support multiple
>> > local disks, but it’s a big burden for operators to configure each
>> > agent properly to leverage this feature.
>> >
>> > External persistent volume support in Mesos currently bypasses the
>> > resource management part. In other words, using an external persistent
>> > volume does not go through the usual offer cycle, and Mesos doesn’t
>> > track resources associated with external volumes. This makes quota
>> > control, reservation, and fair sharing almost impossible to implement.
>> > Also, the interface Mesos currently uses to interact with volume
>> > providers is the Docker Volume Driver Interface (DVDI), which is very
>> > specific to operations on a particular agent.
>> >
>> > The main problem I see currently is that we don’t have a coherent story
>> > for storage. Yes, we have some primitives in Mesos that can support some
>> > stateful services, but this is far from ideal. Some of them are just
>> > stopgap solutions (e.g., the external volume support). This design tries
>> > to tell a coherent story for supporting storage in Mesos.
>> >
>> > https://docs.google.com/document/d/125YWqg_5BB5OY9a6M7LZcby5RSqBwo2PZzpVLuxYXh4/edit?usp=sharing
>> >
>> > Please feel free to reply to this thread or comment on the doc if you
>> > have any comments or suggestions! Thanks!
>> >
>> > - Jie
>>
>
>


Re: Detect when mesos agent needs work directory cleanup

2017-03-23 Thread Benjamin Mahler
I would recommend avoiding a manual clean up of the work directory, since
it's not guaranteed that this approach will remain correct as things
evolve. To have the agent perform the cleanup using its own logic, you can
run:

mesos-agent --recover=cleanup --work_dir= --master=

Also, there is already reboot handling in place to ensure the agent doesn't
bother recovering if the node has rebooted since it last ran, although I
believe this may be in a future release. +yan, neil
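
For automation, the criteria quoted further down in this thread (a change
in resources, isolators, attributes or containerizers) can be checked
before the agent is restarted. The following is only a minimal sketch, not
something Mesos provides: the function names, the WATCHED_KEYS labels and
the idea of comparing a stored fingerprint are assumptions made for
illustration. Only the mesos-agent --recover=cleanup invocation itself is
taken from this thread, and its work_dir and master values must be supplied
by the caller.

```python
import hashlib
import json

# Settings whose change is assumed to require a cleanup, following the
# criteria quoted in this thread. These are hypothetical labels for the
# automation's own config, not necessarily the exact agent flag names.
WATCHED_KEYS = ("resources", "isolators", "attributes", "containerizers")


def fingerprint(config: dict) -> str:
    """Hash only the settings that are assumed to affect compatibility."""
    relevant = {key: config.get(key) for key in WATCHED_KEYS}
    encoded = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def needs_cleanup(previous: dict, current: dict) -> bool:
    """True if any watched setting differs between the two configs."""
    return fingerprint(previous) != fingerprint(current)


def cleanup_command(work_dir: str, master: str) -> list:
    """Build the cleanup invocation; both values come from the caller."""
    return [
        "mesos-agent",
        "--recover=cleanup",
        f"--work_dir={work_dir}",
        f"--master={master}",
    ]


if __name__ == "__main__":
    # Example values are made up for the demonstration.
    old = {"resources": "cpus:4;mem:8192", "attributes": "rack:a", "port": 5051}
    changed = dict(old, resources="cpus:8;mem:8192")
    unrelated = dict(old, port=5052)
    print(needs_cleanup(old, changed))
    print(needs_cleanup(old, unrelated))
```

With a check like this, the automation would run the cleanup only when one
of the watched settings changed, instead of after every single change.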

On Thu, Mar 23, 2017 at 1:11 AM, Michele Bertasi
<michele.bert...@brightcomputing.com> wrote:

> Thank you. I will do that then.
>
> Mike
>
> On Wed, Mar 22, 2017 at 6:57 PM, Tomek Janiszewski wrote:
>
>> 1. Cleanup is required when the agent configuration is not compatible
>> with the previous version, meaning the task runtime changes. This occurs
>> when resources, isolators, attributes or containerizers change.
>> 2. IMO it's OK. When you reboot a node, all its tasks are gone, so you
>> won't lose anything. The node will be registered as a new node.
>>
>> On Wed, Mar 22, 2017 at 15:55, Michele Bertasi
>> <michele.bert...@brightcomputing.com> wrote:
>>
>>> Hi everybody,
>>>
>>> I'm having trouble with the cleanup of the Mesos agent work directory.
>>> When we change some configuration parameters of the agent (e.g. the
>>> --resources flag), the agent refuses to start and asks in the logs for
>>> the removal of */var/lib/mesos-agent/meta/slaves/latest*.
>>>
>>> Since we are automating these changes, we would like to know when to
>>> perform this cleanup programmatically.
>>>
>>> Right now we are doing that *after every single change*, but that's not
>>> ideal, since each time the Mesos master assigns a new ID to the slave.
>>> It seems unnecessary to me.
>>>
>>> My questions are:
>>>
>>>    - is there a way to know when this cleanup is needed (without
>>>      looking at the logs)?
>>>    - do we have problems if we clean up the work directory every time
>>>      the node boots (so the configuration might or might not have been
>>>      changed at all)?
>>>
>>>
>>> Thanks, kind regards,
>>>
>>> Mike
>>>
>>
>


Re: [Design Doc] Improve Storage Support in Mesos using Resources Provider

2017-03-23 Thread Jie Yu
Yes, the idea is to make this general in the future. In fact, the whole
resource provider design keeps that in mind.

We could add a general "CONVERT" operation in the future that takes
free-form key-value pairs (as well as the source resources) as parameters.
It would then be up to the corresponding resource provider to interpret
them.
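
To make the idea concrete, here is a rough sketch of what such an operation
could look like. Nothing here is the design doc's API: the Resource and
Convert types, the BlockProvider class and the "target" and "filesystem"
parameter names are all hypothetical, chosen only to show the framework
passing opaque key-value pairs and the provider deciding what they mean.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    name: str      # hypothetical identifier, e.g. a device name
    kind: str      # hypothetical type tag, e.g. "block" or "volume"
    amount: float  # size in arbitrary units


@dataclass
class Convert:
    """A general operation: source resources plus free-form parameters."""
    sources: List[Resource]
    parameters: Dict[str, str] = field(default_factory=dict)


class BlockProvider:
    """Hypothetical provider that turns block devices into volumes."""

    def apply(self, op: Convert) -> List[Resource]:
        # The provider, not the framework, interprets the parameters.
        target = op.parameters.get("target")
        if target != "volume":
            raise ValueError(f"unsupported target: {target!r}")
        # The default filesystem is an assumption of this sketch.
        filesystem = op.parameters.get("filesystem", "ext4")
        return [
            Resource(name=f"{src.name}/{filesystem}", kind="volume", amount=src.amount)
            for src in op.sources
        ]


if __name__ == "__main__":
    op = Convert(
        sources=[Resource(name="sdb", kind="block", amount=100.0)],
        parameters={"target": "volume", "filesystem": "xfs"},
    )
    for converted in BlockProvider().apply(op):
        print(converted)
```

A different provider could accept the same Convert shape with entirely
different keys, which is what would keep the operation general.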

- Jie

On Thu, Mar 23, 2017 at 3:50 AM, Sargun Dhillon wrote:

> Is the intent to make this generic beyond disks? I can see the
> concepts applying beyond volumes and blocks. Perhaps a generic
> Create{generation} -- where larger generation numbers descend from
> smaller ones?
>
> I can also see this being valuable in networking. My use case is ENIs in
> AWS. I would like to have a ResourceProvider that can manipulate ENIs
> based on the invocation of the scheduler. Instead of "CREATE_BLOCK"
> it'd be CREATE_INTERFACE, with some given options about the ENI,
> giving us a raw interface. Subsequently, we would want to do a
> CREATE_IPVLAN, as a subinterface that we can assign an actual IP to.
> The IPVLAN interface is a descendant of the raw interface, just as
> volumes are descendants of block devices.
>
>
> On Sun, Mar 12, 2017 at 6:47 PM, Jie Yu wrote:
> > Hi,
> >
> > Currently, Mesos supports both local persistent volumes and external
> > persistent volumes. However, neither is ideal.
> >
> > Local persistent volumes do not support offering physical or logical
> > block devices directly. Also, frameworks cannot choose the filesystem
> > for their local persistent volumes. There are also some usability
> > problems with local persistent volumes. Mesos does support multiple
> > local disks, but it’s a big burden for operators to configure each
> > agent properly to leverage this feature.
> >
> > External persistent volume support in Mesos currently bypasses the
> > resource management part. In other words, using an external persistent
> > volume does not go through the usual offer cycle, and Mesos doesn’t
> > track resources associated with external volumes. This makes quota
> > control, reservation, and fair sharing almost impossible to implement.
> > Also, the interface Mesos currently uses to interact with volume
> > providers is the Docker Volume Driver Interface (DVDI), which is very
> > specific to operations on a particular agent.
> >
> > The main problem I see currently is that we don’t have a coherent story
> > for storage. Yes, we have some primitives in Mesos that can support some
> > stateful services, but this is far from ideal. Some of them are just
> > stopgap solutions (e.g., the external volume support). This design tries
> > to tell a coherent story for supporting storage in Mesos.
> >
> > https://docs.google.com/document/d/125YWqg_5BB5OY9a6M7LZcby5RSqBwo2PZzpVLuxYXh4/edit?usp=sharing
> >
> > Please feel free to reply to this thread or comment on the doc if you
> > have any comments or suggestions! Thanks!
> >
> > - Jie
>


Re: Detect when mesos agent needs work directory cleanup

2017-03-23 Thread Michele Bertasi
Thank you. I will do that then.

Mike

On Wed, Mar 22, 2017 at 6:57 PM, Tomek Janiszewski wrote:

> 1. Cleanup is required when the agent configuration is not compatible
> with the previous version, meaning the task runtime changes. This occurs
> when resources, isolators, attributes or containerizers change.
> 2. IMO it's OK. When you reboot a node, all its tasks are gone, so you
> won't lose anything. The node will be registered as a new node.
>
> On Wed, Mar 22, 2017 at 15:55, Michele Bertasi
> <michele.bert...@brightcomputing.com> wrote:
>
>> Hi everybody,
>>
>> I'm having trouble with the cleanup of the Mesos agent work directory.
>> When we change some configuration parameters of the agent (e.g. the
>> --resources flag), the agent refuses to start and asks in the logs for
>> the removal of */var/lib/mesos-agent/meta/slaves/latest*.
>>
>> Since we are automating these changes, we would like to know when to
>> perform this cleanup programmatically.
>>
>> Right now we are doing that *after every single change*, but that's not
>> ideal, since each time the Mesos master assigns a new ID to the slave.
>> It seems unnecessary to me.
>>
>> My questions are:
>>
>>    - is there a way to know when this cleanup is needed (without looking
>>      at the logs)?
>>    - do we have problems if we clean up the work directory every time
>>      the node boots (so the configuration might or might not have been
>>      changed at all)?
>>
>>
>> Thanks, kind regards,
>>
>> Mike
>>
>


Re: [Design Doc] Improve Storage Support in Mesos using Resources Provider

2017-03-23 Thread Sargun Dhillon
Is the intent to make this generic beyond disks? I can see the
concepts applying beyond volumes and blocks. Perhaps a generic
Create{generation} -- where larger generation numbers descend from
smaller ones?

I can also see this being valuable in networking. My use case is ENIs in
AWS. I would like to have a ResourceProvider that can manipulate ENIs
based on the invocation of the scheduler. Instead of "CREATE_BLOCK"
it'd be CREATE_INTERFACE, with some given options about the ENI,
giving us a raw interface. Subsequently, we would want to do a
CREATE_IPVLAN, as a subinterface that we can assign an actual IP to.
The IPVLAN interface is a descendant of the raw interface, just as
volumes are descendants of block devices.
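
As a rough illustration of the generation idea, the sketch below models
each created resource as a node that records its parent and a generation
number one higher than the parent's. The Generated type and the create and
lineage helpers are hypothetical names invented for this example; they are
not part of Mesos or of the design doc.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Generated:
    kind: str                              # e.g. "interface" or "ipvlan"
    generation: int                        # 0 for a root resource
    parent: Optional["Generated"] = None   # the resource this descends from


def create(kind: str, parent: Optional[Generated] = None) -> Generated:
    """Create a resource whose generation is one above its parent's."""
    generation = 0 if parent is None else parent.generation + 1
    return Generated(kind=kind, generation=generation, parent=parent)


def lineage(resource: Generated) -> List[str]:
    """Return the kinds from the root ancestor down to this resource."""
    chain = []
    node: Optional[Generated] = resource
    while node is not None:
        chain.append(node.kind)
        node = node.parent
    return list(reversed(chain))


if __name__ == "__main__":
    # Networking: a raw interface, then an IPVLAN subinterface on top of it.
    raw = create("interface")
    ipvlan = create("ipvlan", parent=raw)
    print(ipvlan.generation, lineage(ipvlan))

    # Storage: the same shape, a volume descending from a block device.
    block = create("block")
    volume = create("volume", parent=block)
    print(volume.generation, lineage(volume))
```

The same two helpers cover both cases, which is the point of the analogy
between interfaces and block devices.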


On Sun, Mar 12, 2017 at 6:47 PM, Jie Yu wrote:
> Hi,
>
> Currently, Mesos supports both local persistent volumes and external
> persistent volumes. However, neither is ideal.
>
> Local persistent volumes do not support offering physical or logical block
> devices directly. Also, frameworks cannot choose the filesystem for their
> local persistent volumes. There are also some usability problems with
> local persistent volumes. Mesos does support multiple local disks, but
> it’s a big burden for operators to configure each agent properly to
> leverage this feature.
>
> External persistent volume support in Mesos currently bypasses the
> resource management part. In other words, using an external persistent
> volume does not go through the usual offer cycle, and Mesos doesn’t track
> resources associated with external volumes. This makes quota control,
> reservation, and fair sharing almost impossible to implement. Also, the
> interface Mesos currently uses to interact with volume providers is the
> Docker Volume Driver Interface (DVDI), which is very specific to
> operations on a particular agent.
>
> The main problem I see currently is that we don’t have a coherent story for
> storage. Yes, we have some primitives in Mesos that can support some
> stateful services, but this is far from ideal. Some of them are just
> stopgap solutions (e.g., the external volume support). This design tries to
> tell a coherent story for supporting storage in Mesos.
>
> https://docs.google.com/document/d/125YWqg_5BB5OY9a6M7LZcby5RSqBwo2PZzpVLuxYXh4/edit?usp=sharing
>
> Please feel free to reply to this thread or comment on the doc if you have
> any comments or suggestions! Thanks!
>
> - Jie