Re: On Mesos versioning and deprecation policy

2016-10-29 Thread haosdent
+1 for the summary. Now it is clear to me.

Re: On Mesos versioning and deprecation policy

2016-10-28 Thread Vinod Kone
We had an extended discussion around this in the last community sync.
Thanks to those who participated!

To sum up the discussion:

--> As Mesos devs, we should strive not to make incompatible changes to
APIs, flags, or environment variables.

--> In the rare case where an incompatible change is preferred (e.g., to
reduce code complexity), we should give users a clear 6-month heads-up that
a breaking change is going to take place.

--> Breaking changes do not necessitate a major version bump. This is
because we want to allow live upgrades between major versions (e.g., 1.10
to 2.0).

--> Compatibility guarantees do not apply to experimental features (incl.
APIs).

--> We need clear documentation of the procedure that devs can
follow when deprecating/removing stable features and adding experimental
features.

--> We need to improve upgrades.md to make it easy for operators to know
what features are deprecated/removed between versions X and Y.

--> We should decouple internal protos used by Mesos from the unversioned
protos used by driver-based frameworks.
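
To make the 6-month notice and the experimental exemption above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Feature` record, the `may_remove` helper, and the flag name are hypothetical and are not part of Mesos or its tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Feature:
    """Hypothetical record for a flag, endpoint, or API (not a Mesos type)."""
    name: str
    experimental: bool = False
    deprecated_on: Optional[date] = None  # date the deprecation was announced


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months.

    Simplification: the day is clamped to 28 so the result is always valid.
    """
    total = d.month - 1 + months
    return date(d.year + total // 12, total % 12 + 1, min(d.day, 28))


def may_remove(feature: Feature, today: date, notice_months: int = 6) -> bool:
    """Return True if a breaking change to `feature` fits the summarized policy."""
    if feature.experimental:
        # Compatibility guarantees do not apply to experimental features.
        return True
    if feature.deprecated_on is None:
        # A stable feature must be announced as deprecated first.
        return False
    return today >= add_months(feature.deprecated_on, notice_months)


old_flag = Feature("--hypothetical_flag", deprecated_on=date(2016, 10, 28))
print(may_remove(old_flag, date(2017, 1, 1)))
print(may_remove(old_flag, date(2017, 4, 28)))
```

Note that the check depends only on dates, not on version numbers, which matches the point that a breaking change does not necessitate a major version bump.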

I will spend some time in the next few weeks to create/update the
documentation reflecting these points.

Anything else I missed?

Thanks,

Re: On Mesos versioning and deprecation policy

2016-10-15 Thread haosdent
Thanks for @yan's great input! I agree with almost all of it.

> Also the API is not just what the machine reads but all the documentation
associated with it, right? It depends on what the documentation says; what
the user _should_ expect.

I think different users may have different expectations. And the person who
developed an API may understand it differently from some users as well.
Our documentation should cover most cases.

But if we didn't, or forgot to, state something explicitly in the
documentation, should we give up on updating the API? It is like user Alice
saying something is a bug while user Bob says it is a feature. I think we
still need to discuss such changes case by case to ensure most users are not
affected by breaking API changes.

Re: On Mesos versioning and deprecation policy

2016-10-14 Thread Vinod Kone
We will chat about this in the upcoming community sync (Thursday, 3 PM). So,
please make sure to attend if you are interested.

Re: On Mesos versioning and deprecation policy

2016-10-14 Thread Yan Xu
On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu  wrote:

> Thanks Alex for starting this!
>
> In addition to comments below, I think it'll be helpful to keep the
> existing versioning doc concise and user-friendly while having a dedicated
> doc for the "implementation details" where precise requirements and
> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
> developers will find the latter much more helpful while users/framework
> developers will find the former easy to read.
>
> e.g., a similar split:
> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
> (which has a lot of details on how the Kubernetes community is thinking
> about similar issues, which we can learn from)
>
> Jiang Yan Xu 
>
> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov 
> wrote:
>
>> Folks,
>>
>> There have been a bunch of online [1, 2] and offline discussions about our
>> deprecation and versioning policy. I found that people—including
>> myself—read the versioning doc [3] differently; moreover some aspects are
>> not captured there. I would like to start a discussion around this topic
>> by
>> sharing my confusions and suggestions. This will hopefully help us stay on
>> the same page and have similar expectations. The second goal is to
>> eliminate ambiguities from the versioning doc (thanks Vinod for
>> volunteering to update it).
>>
>
> +1 Let me know if there are things I can help with.
>
>
>>
>> 1. API vs. semantic changes.
>> The current versioning guide treats features (e.g. flags, metrics, endpoints)
>> and APIs differently: incompatible changes for the former are allowed after
>> a 6-month deprecation cycle, while for the latter they require bumping the
>> major version. I suggest we consolidate these policies.
>>
>
> I feel that the distinction is not API vs. semantic changes. A backwards
> compatible API guarantee should imply backwards compatible semantics (of
> the API).
> I.e., if a change in the API doesn't cause the message to be dropped on the
> floor but leads to a behavior change that causes problems in the system, it
> still breaks compatibility.
>
> IMO the distinction is more between:
> - Compatibility between components that are impossible/very unpleasant to
> upgrade in lockstep - high priority for compatibility guarantee.
> - Compatibility between components that are generally bundled (modules) or
> things that usually aren't built into automated tooling (e.g., the /state
> endpoint) - more relaxed for now but we should explicitly exclude them from
> the guarantee.
>
>
>>
>> We should also define and clearly explain what changes require bumping the
>> major version. I have no strong opinion here and would love to hear what
>> people think. The original motivation for maintaining backwards
>> compatibility is to make sure vN schedulers can correctly work with vN API
>> without being updated. But what about semantic changes that do not touch
>> the API? For example, what if we decide to send fewer task health updates
>> to schedulers based on some health policy? It influences the flow of task
>> status updates; should such a change be considered compatible? Taking it to
>> an extreme, we may not even be able to fix some bugs because someone may
>> already rely on this behaviour!
>>
>
> API changes should warrant a major version bump. Also the API is not just
> what the machine reads but all the documentation associated with it, right?
> It depends on what the documentation says; what the user _should_ expect.
>
> That said, I feel that these things are hard to talk about in the
> abstract. Even with a guideline, we still need to make case-by-case
> decisions. (E.g., has the documentation precisely defined this
> behavior? If not, is it reasonable for users to expect some behavior
> because it's common sense? How bad is it if some behavior changes just a
> tiny bit?) Therefore we need to make sure the process for API changes is
> more rigorously defined.
>
> Whether something is a bug depends on whether the API does what it says
> it'll do. The line may sometimes be blurry but in general I don't feel it's
> a problem. If someone is relying on behavior that is a bug, we should
> still help them fix it, but the bug shouldn't count as "our guarantee".
>
>
>>
>> Another tightly related thing we should explicitly call out is
>> upgradability and rollback capabilities inside a major release. Committing
>> to this may significantly limit what we can change within a major release;
>> on the other hand, it will give users more time and a better experience
>> using and maintaining Mesos clusters.
>>
>
> According to the versioning doc, upgradability depends on whether you
> depend on deprecated/removed features.
>
> That paragraph should be explained more precisely:
> - "deprecated" means your system won't break but warnings are shown (Maybe
> we should 

On Mesos versioning and deprecation policy

2016-10-12 Thread Alex Rukletsov
Folks,

There have been a bunch of online [1, 2] and offline discussions about our
deprecation and versioning policy. I found that people—including
myself—read the versioning doc [3] differently; moreover some aspects are
not captured there. I would like to start a discussion around this topic by
sharing my confusions and suggestions. This will hopefully help us stay on
the same page and have similar expectations. The second goal is to
eliminate ambiguities from the versioning doc (thanks Vinod for
volunteering to update it).

1. API vs. semantic changes.
The current versioning guide treats features (e.g. flags, metrics, endpoints)
and APIs differently: incompatible changes for the former are allowed after
a 6-month deprecation cycle, while for the latter they require bumping the
major version. I suggest we consolidate these policies.

We should also define and clearly explain what changes require bumping the
major version. I have no strong opinion here and would love to hear what
people think. The original motivation for maintaining backwards
compatibility is to make sure vN schedulers can correctly work with vN API
without being updated. But what about semantic changes that do not touch
the API? For example, what if we decide to send fewer task health updates to
schedulers based on some health policy? It influences the flow of task
status updates; should such a change be considered compatible? Taking it to
an extreme, we may not even be able to fix some bugs because someone may
already rely on this behaviour!
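
The task health example can be sketched in Python to show how a change that leaves the message format untouched can still alter what a scheduler observes. Everything here is an assumption made for illustration: `TaskStatus` is a stand-in rather than the Mesos protobuf, both sending policies are invented, and the scheduler logic is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TaskStatus:
    """Stand-in for a status update; the 'API' (its fields) never changes."""
    task_id: str
    healthy: bool


def send_every_check(results: List[bool], task_id: str = "t1") -> List[TaskStatus]:
    """Assumed old policy: one update per health check result."""
    return [TaskStatus(task_id, r) for r in results]


def send_on_change(results: List[bool], task_id: str = "t1") -> List[TaskStatus]:
    """Assumed new policy: an update only when the health state flips."""
    updates: List[TaskStatus] = []
    previous: Optional[bool] = None
    for r in results:
        if r != previous:
            updates.append(TaskStatus(task_id, r))
            previous = r
    return updates


def longest_unhealthy_streak(updates: List[TaskStatus]) -> int:
    """Hypothetical scheduler logic that counts consecutive unhealthy updates."""
    streak = best = 0
    for u in updates:
        streak = 0 if u.healthy else streak + 1
        best = max(best, streak)
    return best


checks = [True, False, False, False]
print(longest_unhealthy_streak(send_every_check(checks)))
print(longest_unhealthy_streak(send_on_change(checks)))
```

A scheduler that restarts a task after three consecutive unhealthy updates would act under the first policy and never act under the second, even though every message it receives is still a valid `TaskStatus`.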

Another tightly related thing we should explicitly call out is
upgradability and rollback capabilities inside a major release. Committing
to this may significantly limit what we can change within a major release;
on the other hand, it will give users more time and a better experience
using and maintaining Mesos clusters.

2. Versioned vs. unversioned protobufs.
Currently we have v1 and unnamed protobufs, the latter of which simultaneously
mean v0, v2, and internal. I am sometimes confused about the right way to
update or introduce a field or message there; do people feel the same? How
about splitting the unnamed version into explicit v0, v2, and internal?

Food for thought: it would be great if we could maintain only "diffs" to the
internal protobufs in the code, instead of duplicating them altogether.
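
One way to picture the "diffs" idea is the Python sketch below, which derives an internal schema from a versioned one plus a small diff. This is purely illustrative: the message and field names are made up, schemas are modelled as plain dicts, and nothing here reflects how protobuf or the Mesos build actually works.

```python
from typing import Dict, List, TypedDict


class SchemaDiff(TypedDict):
    add: Dict[str, str]   # field name -> type, present only internally
    remove: List[str]     # versioned fields the internal schema drops


# Hypothetical versioned schema (field name -> type); not a real Mesos message.
V1_EXAMPLE_MESSAGE: Dict[str, str] = {
    "id": "string",
    "resources": "repeated Resource",
}

# Only the difference is maintained by hand for the internal variant.
INTERNAL_DIFF: SchemaDiff = {
    "add": {"internal_bookkeeping": "bool"},
    "remove": [],
}


def apply_diff(base: Dict[str, str], diff: SchemaDiff) -> Dict[str, str]:
    """Build the derived schema without modifying the versioned one."""
    fields = {k: v for k, v in base.items() if k not in diff["remove"]}
    fields.update(diff["add"])
    return fields


internal_schema = apply_diff(V1_EXAMPLE_MESSAGE, INTERNAL_DIFF)
print(sorted(internal_schema))
```

With this shape, a field added to the versioned schema shows up in the internal one automatically, which is the duplication the suggestion is trying to avoid.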

3. API and feature labelling.
I suggest introducing explicit labels for APIs and features, to ensure
users have the right assumptions about their lifetime while engineers
have the ability to change a WIP feature in a non-compatible way. I
propose the following:
API: stable, non-stable, pure (not used by Mesos components)
Feature: experimental, normal.
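
As a rough sketch, the proposed labels could be modelled as Python enums with a helper that says which labels permit incompatible changes without a deprecation cycle. The label names come from the list above; the helper and the decision to treat "pure" like "stable" are assumptions, since the proposal does not spell out that mapping.

```python
from enum import Enum
from typing import Union


class ApiLabel(Enum):
    STABLE = "stable"
    NON_STABLE = "non-stable"
    PURE = "pure"  # not used by Mesos components


class FeatureLabel(Enum):
    EXPERIMENTAL = "experimental"
    NORMAL = "normal"


def may_change_incompatibly(label: Union[ApiLabel, FeatureLabel]) -> bool:
    """Return True if the label allows breaking changes without deprecation.

    Assumption: only non-stable APIs and experimental features qualify;
    "pure" is treated like "stable" here because the proposal leaves it open.
    """
    return label in (ApiLabel.NON_STABLE, FeatureLabel.EXPERIMENTAL)


for label in list(ApiLabel) + list(FeatureLabel):
    print(label.value, may_change_incompatibly(label))
```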

Looking forward to your thoughts and suggestions.
AlexR

[1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
[2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
[3] https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md