There have been a bunch of online [1, 2] and offline discussions about our
deprecation and versioning policy. I found that people—including
myself—read the versioning doc  differently; moreover some aspects are
not captured there. I would like to start a discussion around this topic by
sharing my confusions and suggestions. This will hopefully help us stay on
the same page and have similar expectations. The second goal is to
eliminate ambiguities from the versioning doc (thanks Vinod for
volunteering to update it).
1. API vs. semantic changes.
Current versioning guide treat features (e.g. flags, metrics, endpoints)
and API differently: incompatible changes for the former are allowed after
6 month deprecation cycle, while for the latter they require bumping a
major version. I suggest we consolidate these policies.
We should also define and clearly explain what changes require bumping the
major version. I have no strong opinion here and would love to hear what
people think. The original motivation for maintaining backwards
compatibility is to make sure vN schedulers can correctly work with vN API
without being updated. But what about semantic changes that do not touch
the API? For example, what if we decide to send less task health updates to
schedulers based on some health policy? It influences the flow of task
status updates, should such change be considered compatible? Taking it to
an extreme, we may not even be able to fix some bugs because someone may
already rely on this behaviour!
Another tightly related thing we should explicitly call out is
upgradability and rollback capabilities inside a major release. Committing
to this may significantly limit what we can change within a major release;
on the other side it will give users more time and a better experience
about using and maintaining Mesos clusters.
2. Versioned vs. unversioned protobufs.
Currently we have v1 and unnamed protobufs, which simultaneously mean v0,
v2, and internal. I am sometimes confused about what is the right way to
update or introduce a field or message there, do people feel the same? How
about splitting the unnamed version into explicit v0, v2, and internal?
Food for thought. It would be great if we can only maintain "diffs" to the
internal protobufs in the code, instead of duplicating them altogether.
3. API and feature labelling.
I suggest to introduce explicit labels for API and features, to ensure
users have the right assumptions about the their lifetime while engineers
have the ability to change a wip feature in an non-compatible way. I
propose the following:
API: stable, non-stable, pure (not used by Mesos components)
Feature: experimental, normal.
Looking forward to your thoughts and suggestions.