On Mesos versioning and deprecation policy
Folks,

There have been a number of online [1, 2] and offline discussions about our deprecation and versioning policy. I found that people, including myself, read the versioning doc [3] differently; moreover, some aspects are not captured there.

I would like to start a discussion around this topic by sharing my confusions and suggestions. This will hopefully help us stay on the same page and have similar expectations. The second goal is to eliminate ambiguities from the versioning doc (thanks Vinod for volunteering to update it).

1. API vs. semantic changes.

The current versioning guide treats features (e.g. flags, metrics, endpoints) and the API differently: incompatible changes to the former are allowed after a 6-month deprecation cycle, while for the latter they require bumping the major version. I suggest we consolidate these policies. We should also define and clearly explain which changes require bumping the major version. I have no strong opinion here and would love to hear what people think.

The original motivation for maintaining backwards compatibility is to make sure vN schedulers can work correctly with the vN API without being updated. But what about semantic changes that do not touch the API? For example, what if we decide to send fewer task health updates to schedulers based on some health policy? That influences the flow of task status updates; should such a change be considered compatible? Taking it to an extreme, we may not even be able to fix some bugs because someone may already rely on the buggy behaviour!

Another closely related thing we should explicitly call out is upgradability and rollback capabilities within a major release. Committing to this may significantly limit what we can change within a major release; on the other hand, it will give users more time and a better experience using and maintaining Mesos clusters.

2. Versioned vs. unversioned protobufs.

Currently we have v1 and unnamed protobufs, the latter simultaneously meaning v0, v2, and internal. I am sometimes confused about the right way to update or introduce a field or message there; do people feel the same? How about splitting the unnamed version into explicit v0, v2, and internal?

Food for thought: it would be great if we could maintain only "diffs" to the internal protobufs in the code, instead of duplicating them altogether.

3. API and feature labelling.

I suggest introducing explicit labels for APIs and features, to ensure users have the right assumptions about their lifetime while engineers have the ability to change a WIP feature in a non-compatible way. I propose the following:

API: stable, non-stable, pure (not used by Mesos components)
Feature: experimental, normal

Looking forward to your thoughts and suggestions.

AlexR

[1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
[2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
[3] https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
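To make the two policies from point 1 of the message above concrete, here is a minimal sketch of how the current rule could be written down as a check. It only encodes what the message says about the versioning guide: incompatible API changes need a major version bump, while incompatible feature changes need a completed 6-month deprecation cycle. All names (ChangeTarget, incompatible_change_allowed, DEPRECATION_CYCLE) are hypothetical and not part of Mesos, and treating "6 months" as 183 days is an assumption made for illustration.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class ChangeTarget(Enum):
    """What an incompatible change touches (hypothetical classification)."""
    API = "api"          # e.g. the scheduler API
    FEATURE = "feature"  # e.g. flags, metrics, endpoints


# "6 month deprecation cycle", approximated here as 183 days (assumption).
DEPRECATION_CYCLE = timedelta(days=183)


def incompatible_change_allowed(
    target: ChangeTarget,
    today: date,
    deprecated_on: Optional[date] = None,
    bumps_major_version: bool = False,
) -> bool:
    """Return True if an incompatible change may land under the policy
    as described in the message (not an official Mesos rule set)."""
    if target is ChangeTarget.API:
        # API: incompatible changes require bumping the major version.
        return bumps_major_version
    # Feature: the change must have been announced as deprecated ...
    if deprecated_on is None:
        return False
    # ... and the deprecation cycle must have fully elapsed.
    return today - deprecated_on >= DEPRECATION_CYCLE


if __name__ == "__main__":
    today = date(2016, 10, 12)
    print(incompatible_change_allowed(ChangeTarget.API, today))
    print(incompatible_change_allowed(
        ChangeTarget.FEATURE, today, deprecated_on=date(2016, 3, 1)))
```

Written this way, the asymmetry that the message proposes to consolidate is visible in the two branches: one is gated by a version number, the other by elapsed time. Semantic changes that touch neither branch are not covered at all, which is the gap the message asks about.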
Re: How to shutdown mesos-agent gracefully?
I'd like to notify the framework to kill its tasks and then terminate the mesos-agent. As for the Maintenance feature, I cannot remember whether the slave info will be cleaned up if that slave never re-registers.

On Wed, Oct 12, 2016 at 10:13 PM Alex Rukletsov wrote:

> To make sure: you are aware of SIGUSR1?
>
> On Tue, Oct 11, 2016 at 5:37 PM, tommy xiao wrote:
>
> > Hi Ma,
> >
> > could you please give more background on why the Maintenance feature is
> > not the best option for your request?
> >
> > 2016-10-11 14:47 GMT+08:00 haosdent:
> >
> > > "Gracefully" means not affecting running tasks?
> > >
> > > On Tue, Oct 11, 2016 at 2:36 PM, Klaus Ma wrote:
> > >
> > >> It seems there is no way to shut down mesos-agent gracefully.
> > >> The Maintenance feature expects the agents to re-register in the future.
> > >>
> > >> Thanks
> > >> Klaus
> > >> --
> > >>
> > >> Regards,
> > >>
> > >> Da (Klaus), Ma (马达), PMP® | Software Architect
> > >> IBM Platform Development & Support, STG, IBM GCG
> > >> +86-10-8245 4084 | mad...@cn.ibm.com | http://k82.me
> > >
> > > --
> > > Best Regards,
> > > Haosdent Huang
> >
> > --
> > Deshi Xiao
> > Twitter: xds2000
> > E-mail: xiaods(AT)gmail.com

--
Regards,

Da (Klaus), Ma (马达), PMP® | Software Architect
IBM Platform Development & Support, STG, IBM GCG
+86-10-8245 4084 | mad...@cn.ibm.com | http://k82.me
Re: How to shutdown mesos-agent gracefully?
To make sure: you are aware of SIGUSR1?

On Tue, Oct 11, 2016 at 5:37 PM, tommy xiao wrote:

> Hi Ma,
>
> could you please give more background on why the Maintenance feature is
> not the best option for your request?
>
> 2016-10-11 14:47 GMT+08:00 haosdent:
>
> > "Gracefully" means not affecting running tasks?
> >
> > On Tue, Oct 11, 2016 at 2:36 PM, Klaus Ma wrote:
> >
> >> It seems there is no way to shut down mesos-agent gracefully.
> >> The Maintenance feature expects the agents to re-register in the future.
> >>
> >> Thanks
> >> Klaus
> >> --
> >>
> >> Regards,
> >>
> >> Da (Klaus), Ma (马达), PMP® | Software Architect
> >> IBM Platform Development & Support, STG, IBM GCG
> >> +86-10-8245 4084 | mad...@cn.ibm.com | http://k82.me
> >
> > --
> > Best Regards,
> > Haosdent Huang
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
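For readers unfamiliar with the SIGUSR1 hint in the message above, here is a minimal sketch of sending that signal to a running agent process from Python. The function name request_agent_shutdown and the way the PID is obtained are assumptions made for illustration; the thread itself only mentions the signal. What the agent does on receipt (in particular, what happens to running tasks) should be checked against the documentation for the Mesos version in use.

```python
import os
import signal


def request_agent_shutdown(agent_pid: int) -> None:
    """Send SIGUSR1 to the process with the given PID.

    Hypothetical helper: it assumes agent_pid is the PID of the
    mesos-agent process and that the caller is allowed to signal it.
    Raises ProcessLookupError if no such process exists.
    """
    os.kill(agent_pid, signal.SIGUSR1)
```

The shell equivalent is a plain `kill -USR1` with the agent's PID. Note that this only delivers the signal; it does not by itself notify frameworks beforehand, which is the extra step asked for in the later reply.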
Re: 1.1.0 release
Folks,

we have 23 unresolved tickets targeted for the Mesos 1.1.0 release, including 7 blockers and 3 epics (MESOS-5344, MESOS-3421, MESOS-2449), which turns 23 into 55. Obviously, we can’t make the cut today.

Shepherds, please either commit your blockers by Thu EOD PST or declare them non-blockers. For unfinished epics, please transition all unresolved tickets to a new epic (see previous email) or retarget the epic. Make sure the CHANGELOG is in good shape.

We strive to cut the release on Fri Oct 14 around 13:00 CEST. At that time we will bulk-transition all unresolved tickets to 1.2.

Rigorously,
Alex & Till

On Tue, Oct 11, 2016 at 5:30 PM, Alex Rukletsov wrote:

> Folks,
>
> in preparation for the Mesos 1.1.0 release we would like to ask people who
> have worked on features in 1.1.0 to either:
> * update the CHANGELOG and declare the feature implemented or
>   experimental, and make sure documentation is updated as well;
> * postpone to 1.2 and update the related epic;
> * promote an experimental feature to stable if necessary.
>
> If you think you need to land something in 1.1.0, please mark the
> respective JIRA as a blocker and set the target version to 1.1.0. Bear in
> mind the release will be cut *tomorrow*, Oct 12 2016.
>
> For experimental features, consider creating a separate epic and moving
> all unresolved tickets there, while marking the original epic as resolved
> for 1.1.0. For example, see MESOS-2449 (pods) and MESOS-6355
> (pods-improvements).
>
> Below is the list of candidates for the CHANGELOG update with their
> respective owners:
> MESOS-6014  CNI port-mapping                      Avinash, Jie
> MESOS-2449  Pods, subtopics: nested containers,
>             nested isolators, default executor    Vinod
> MESOS-5676  New Mesos CLI                         Kevin
> MESOS-4697  Unified Cgroups isolator              Haosdent, Jie
> MESOS-6007  v1 API                                Anand, Vinod
> MESOS-3302  - // -
> MESOS-4855  - // -
> MESOS-4791  - // -
> MESOS-4766  Allocator performance                 BenM
> MESOS-4936  Container security                    Jie
> MESOS-4936  Capabilities and container security   Benjamin Bannier, Jie
> MESOS-3421  Shared resources                      Yan Xu
> MESOS-5344  Partition awareness                   Neil
>
> Below is the list of features marked as experimental in 1.0. Are they
> ready to be promoted and called out in the CHANGELOG?
> MESOS-4312  Power PC                              Vinod
> MESOS-4828  XFS disk isolator                     Yan Xu
> MESOS-4641  Network CNI isolator                  Qian, Jie
> MESOS-3094  Mesos tasks on Windows                Joseph
> MESOS-4355  Docker volume isolator                Guangya, Qian, Jie
>
> This one has never even been called experimental. Joseph, is it time to do
> so?
> MESOS-898   CMake (never declared even experimental)   Joseph
>
> Thanks in advance for your cooperation,
> Till and AlexR
>
> On Fri, Oct 7, 2016 at 7:47 PM, Vinod Kone wrote:
>
>> I think you need to clean up the JIRA a bit.
>>
>> 1) Make sure unresolved tickets do not have fix version (1.1.0) set.
>> 2) Move "Fix version 1.1.0" to "Target version 1.1.0".
>>
>> 2) might obviate the need for 1).
>>
>> On Fri, Oct 7, 2016 at 7:24 AM, Till Toenshoff wrote:
>>
>>> Hi everyone!
>>>
>>> It's us who will be the Release Managers for 1.1.0 - Alex and Till!
>>>
>>> We are planning to cut the next release (1.1.0) within three workdays -
>>> that would be Wednesday next week. So, if you have any patches that need
>>> to get into 1.1.0, make sure that either they are already in the master
>>> branch or the corresponding ticket has its target version set to 1.1.0.
>>>
>>> The release dashboard:
>>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329720
>>>
>>> Alex & Till