On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xuj...@apple.com> wrote:

> Thanks Alex for starting this!
>
> In addition to comments below, I think it'll be helpful to keep the
> existing versioning doc concise and user-friendly while having a dedicated
> doc for the "implementation details" where precise requirements and
> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
> developers will find the latter much more helpful while the users/framework
> developer will find the former easy to read.
>
> e.g., a similar split:
> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
> (which has a lot of details on how the Kubernetes community is thinking
> about similar issues, which we can learn from)
>
> Jiang Yan Xu 
>
> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <a...@mesosphere.com>
> wrote:
>
>> Folks,
>>
>> There have been a bunch of online [1, 2] and offline discussions about our
>> deprecation and versioning policy. I found that people—including
>> myself—read the versioning doc [3] differently; moreover some aspects are
>> not captured there. I would like to start a discussion around this topic
>> by sharing my confusions and suggestions. This will hopefully help us stay
>> on the same page and have similar expectations. The second goal is to
>> eliminate ambiguities from the versioning doc (thanks Vinod for
>> volunteering to update it).
>>
>
> +1 Let me know if there are things I can help with.
>
>
>>
>> 1. API vs. semantic changes.
>> The current versioning guide treats features (e.g. flags, metrics,
>> endpoints) and the API differently: incompatible changes to the former are
>> allowed after a 6-month deprecation cycle, while for the latter they
>> require bumping the major version. I suggest we consolidate these policies.
>>
>
> I feel that the distinction is not API vs. semantic changes. A backwards
> compatible API guarantee should imply backwards compatible semantics (of
> the API), i.e., if a change in the API doesn't cause the message to be
> dropped on the floor but leads to a behavior change that causes problems in
> the system, it still breaks compatibility.
>
> IMO the distinction is more between:
> - Compatibility between components that are impossible/very unpleasant to
> upgrade in lockstep - high priority for compatibility guarantee.
> - Compatibility between components that are generally bundled (modules) or
> things that usually aren't built into automated tooling (e.g., the /state
> endpoint) - more relaxed for now but we should explicitly exclude them from
> the guarantee.
>
>
>>
>> We should also define and clearly explain what changes require bumping the
>> major version. I have no strong opinion here and would love to hear what
>> people think. The original motivation for maintaining backwards
>> compatibility is to make sure vN schedulers can correctly work with vN API
>> without being updated. But what about semantic changes that do not touch
>> the API? For example, what if we decide to send fewer task health updates
>> to schedulers based on some health policy? That influences the flow of task
>> status updates; should such a change be considered compatible? Taking it to
>> an extreme, we may not even be able to fix some bugs because someone may
>> already rely on this behaviour!
>>
>
> API changes should warrant a major version bump. Also the API is not just
> what the machine reads but all the documentation associated with it, right?
> It depends on what the documentation says; what the user _should_ expect.
>
> That said, I feel that these things are hard to discuss in the abstract.
> Even with a guideline, we still need to make case-by-case decisions (e.g.,
> has the documentation precisely defined this behavior? If not, is it
> reasonable for users to expect some behavior because it's common sense? How
> bad is it if some behavior changes just a tiny bit?). Therefore we need to
> make sure the process for API changes is more rigorously defined.
>
> Whether something is a bug depends on whether the API does what it says
> it'll do. The line may sometimes be blurry but in general I don't feel it's
> a problem. If someone is relying on the behavior that is a bug, we should
> still help them fix it but the bug shouldn't count as "our guarantee".
>
>
>>
>> Another tightly related thing we should explicitly call out is
>> upgradability and rollback capabilities inside a major release. Committing
>> to this may significantly limit what we can change within a major release;
>> on the other hand, it will give users more time and a better experience
>> using and maintaining Mesos clusters.
>>
>
> According to the versioning doc, upgradability depends on whether you
> depend on deprecated/removed features.
>
> That paragraph should be explained more precisely:
> - "deprecated" means your system won't break but warnings are shown. (Maybe
> we should use some standard deprecation warning keywords so the operator
> can monitor the log for such warnings!)
> - "removed" means it may break.
>
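
To illustrate the log-monitoring idea, here is a rough sketch of what an
operator-side check could look like. Everything in it is an assumption: Mesos
does not define a standard deprecation keyword today, so the "DEPRECATED"
marker, the function name, and the sample log lines are all made up for
illustration.

```python
import re

# Hypothetical marker: no standard deprecation keyword is defined by Mesos.
DEPRECATION_MARKER = re.compile(r"\bDEPRECATED\b")


def find_deprecation_warnings(log_lines):
    """Return (line_number, text) pairs for lines that carry the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if DEPRECATION_MARKER.search(line)
    ]


# Invented sample lines, not real Mesos log output.
sample_log = [
    "I example: agent registered\n",
    "W example: DEPRECATED: --example_flag will be removed\n",
]

for number, text in find_deprecation_warnings(sample_log):
    print(f"line {number}: {text}")
```

With an agreed keyword, a check like this could run against the logs before
and after an upgrade to see which deprecated features are still in use.
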
> If you deprecate a flag/env variable that interfaces with operator tooling
> in the next minor release, the operator basically has 6 months from that
> release to change her tooling. I feel this is pretty acceptable.
> If you deprecate a flag/env variable that interfaces with the framework
> (executor) in the next minor release, I feel 6 months may not be enough and
> it probably warrants a major version bump. So perhaps the API shouldn't be
> just the protos.
>
>
>> 2. Versioned vs. unversioned protobufs.
>> Currently we have v1 and unnamed protobufs, the latter simultaneously
>> meaning v0, v2, and internal. I am sometimes confused about the right way
>> to update or introduce a field or message there. Do people feel the same?
>> How about splitting the unnamed version into explicit v0, v2, and internal?
>>
>
> As haosdent mentioned, we have captured this in MESOS-6268. The benefit is
> clear but I guess people will be more motivated when we find that some v2
> feature can't be made compatible with the v0 API (Anand's point in
> MESOS-6016). On the other hand, if we cut v0 API access before that happens
> (is the v0 API obsolete, and should it be removed 6 months after 1.0?) then
> we don't need to worry about v0 and can use the unversioned protos as
> "internal"?
>
>
>> Food for thought: it would be great if we could maintain only "diffs" to
>> the internal protobufs in the code, instead of duplicating them altogether.
>>
>> 3. API and feature labelling.
>> I suggest introducing explicit labels for APIs and features, to ensure
>> users have the right assumptions about their lifetime while engineers
>> have the ability to change a WIP feature in a non-compatible way. I
>> propose the following:
>> API: stable, non-stable, pure (not used by Mesos components).
>> Feature: experimental, normal.
>>
>
> +1 on formalizing the terminology.
>
> Historically, the distinction between the following two has not been clear:
>
> 1. The API has no compatibility guarantee at all.
> 2. The feature provided by this API is experimental.
>

To add to this point: because 2) logically doesn't apply to the "pure (not
used by Mesos components)" fields in the API, it could be more confusing
and thus require more precise definition.
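
As a rough sketch of how the proposed labels could be written down, here is
one possible encoding. The label values come from Alex's proposal; the class
and function names are hypothetical, and the rule that a "pure" field carries
no feature label is only my reading of the point above, not an agreed policy.

```python
from enum import Enum


class ApiStability(Enum):
    STABLE = "stable"
    NON_STABLE = "non-stable"
    PURE = "pure"  # not used by Mesos components


class FeatureMaturity(Enum):
    EXPERIMENTAL = "experimental"
    NORMAL = "normal"


def validate_labels(api, feature=None):
    """Check one (API label, feature label) pair.

    Assumption for illustration: a "pure" field is not backed by any Mesos
    feature, so it must not carry a feature label at all.
    """
    if api is ApiStability.PURE and feature is not None:
        raise ValueError("a 'pure' field cannot carry a feature label")
    return (api, feature)


print(validate_labels(ApiStability.NON_STABLE, FeatureMaturity.EXPERIMENTAL))
print(validate_labels(ApiStability.PURE))
```

Keeping the two labels as separate types makes it explicit that "the API may
change" and "the feature is experimental" are independent statements.
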


>
> IMO it's OK to say that we don't distinguish the two (the API has no
> compatibility guarantee until the feature is fully released) but we have to
> make it clear.
> If we don't make such a distinction, ALL API additions should be marked as
> unstable first and be changed to stable later (as a formal process).
>
>
>>
>> Looking forward to your thoughts and suggestions.
>> AlexR
>>
>> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
>> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
>> [3] https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
>>
>
>
