If every other release is a bug fix release, would the versioning go:

3.1.0 <-- feature release
3.1.1 <-- bug fix release

Eventually, it seems like it might be possible to push out a bug fix
release more frequently than once a month?

On Wed, Mar 18, 2015 at 7:59 AM Josh McKenzie <josh.mcken...@datastax.com>
wrote:

> +1
>
> On Wed, Mar 18, 2015 at 7:54 AM, Jake Luciani <jak...@gmail.com> wrote:
>
> > +1
> >
> > On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >
> > > Cassandra 2.1 was released in September, which means that if we were
> > > on track with our stated goal of six month releases, 3.0 would be
> > > done about now. Instead, we haven't even delivered a beta. The
> > > immediate cause this time is blocking for 8099
> > > <https://issues.apache.org/jira/browse/CASSANDRA-8099>, but the
> > > reality is that nobody should really be surprised. Something always
> > > comes up -- we've averaged about nine months since 1.0, with 2.1
> > > taking an entire year.
> > >
> > > We could make theory align with reality by acknowledging, "if nine
> > > months is our 'natural' release schedule, then so be it." But I
> > > think we can do better.
> > >
> > > Broadly speaking, we have two constituencies with Cassandra releases:
> > >
> > > First, we have the users who are building or porting an application
> > > on Cassandra. These users want the newest features to make their job
> > > easier. If 2.1.0 has a few bugs, it's not the end of the world. They
> > > have time to wait for 2.1.x to stabilize while they write their
> > > code. They would like to see us deliver on our six month schedule or
> > > even faster.
> > >
> > > Second, we have the users who have an application in production.
> > > These users, or their bosses, want Cassandra to be as stable as
> > > possible. Assuming they deploy on a stable release like 2.0.12, they
> > > don't want to touch it. They would like to see us release *less*
> > > often. (Because that means they have to do less upgrades while
> > > remaining in our backwards compatibility window.)
> > >
> > > With our current "big release every X months" model, these users'
> > > needs are in tension.
> > >
> > > We discussed this six months ago, and ended up with this:
> > >
> > >> What if we tried a [four month] release cycle, BUT we would
> > >> guarantee that you could do a rolling upgrade until we bump the
> > >> supermajor version? So 2.0 could upgrade to 3.0 without having to
> > >> go through 2.1. (But to go to 3.1 or 4.0 you would have to go
> > >> through 3.0.)
> > >
> > > Crucially, I added
> > >
> > >> Whether this is reasonable depends on how fast we can stabilize
> > >> releases. 2.1.0 will be a good test of this.
> > >
> > > Unfortunately, even after DataStax hired half a dozen full-time test
> > > engineers, 2.1.0 continued the proud tradition of being unready for
> > > production use, with "wait for .5 before upgrading" once again
> > > looking like a good guideline.
> > >
> > > I’m starting to think that the entire model of “write a bunch of new
> > > features all at once and then try to stabilize it for release” is
> > > broken. We’ve been trying that for years and empirically speaking
> > > the evidence is that it just doesn’t work, either from a stability
> > > standpoint or even just shipping on time.
> > >
> > > A big reason that it takes us so long to stabilize new releases now
> > > is that, because our major release cycle is so long, it’s super
> > > tempting to slip in “just one” new feature into bugfix releases, and
> > > I’m as guilty of that as anyone.
> > >
> > > For similar reasons, it’s difficult to do a meaningful freeze with
> > > big feature releases. A look at 3.0 shows why: we have 8099 coming,
> > > but we also have significant work done (but not finished) on 6230,
> > > 7970, 6696, and 6477, all of which are meaningful improvements that
> > > address demonstrated user pain.
> > > So if we keep doing what we’ve been doing, our choices are to
> > > either delay 3.0 further while we finish and stabilize these, or we
> > > wait nine months to a year for the next release. Either way, one of
> > > our constituencies gets disappointed.
> > >
> > > So, I’d like to try something different. I think we were on the
> > > right track with shorter releases with more compatibility. But I’d
> > > like to throw in a twist. Intel cuts down on risk with a “tick-tock”
> > > schedule for new architectures and process shrinks instead of trying
> > > to do both at once. We can do something similar here:
> > >
> > > One month releases. Period. If it’s not done, it can wait.
> > > *Every other release only accepts bug fixes.*
> > >
> > > By itself, one-month releases are going to dramatically reduce the
> > > complexity of testing and debugging new releases -- and bugs that do
> > > slip past us will only affect a smaller percentage of users,
> > > avoiding the “big release has a bunch of bugs no one has seen before
> > > and pretty much everyone is hit by something” scenario. But by
> > > adding in the second rule, I think we have a real chance to make a
> > > quantum leap here: stable, production-ready releases every two
> > > months.
> > >
> > > So here is my proposal for 3.0:
> > >
> > > We’re just about ready to start serious review of 8099. When that’s
> > > done, we branch 3.0 and cut a beta and then release candidates.
> > > Whatever isn’t done by then, has to wait; unlike prior betas, we
> > > will only accept bug fixes into 3.0 after branching.
> > >
> > > One month after 3.0, we will ship 3.1 (with new features). At the
> > > same time, we will branch 3.2. New features in trunk will go into
> > > 3.3. The 3.2 branch will only get bug fixes.
> > > We will maintain backwards compatibility for all of 3.x; eventually
> > > (no less than a year) we will pick a release to be 4.0, and drop
> > > deprecated features and old backwards compatibilities. Otherwise
> > > there will be nothing special about the 4.0 designation. (Note that
> > > with an “odd releases have new features, even releases only have bug
> > > fixes” policy, 4.0 will actually be *more* stable than 3.11.)
> > >
> > > Larger features can continue to be developed in separate branches,
> > > the way 8099 is being worked on today, and committed to trunk when
> > > ready. So this is not saying that we are limited only to features we
> > > can build in a single month.
> > >
> > > Some things will have to change with our dev process, for the
> > > better. In particular, with one month to commit new features, we
> > > don’t have room for committing sloppy work and stabilizing it later.
> > > Trunk has to be stable at all times. I asked Ariel Weisberg to put
> > > together his thoughts separately on what worked for his team at
> > > VoltDB, and how we can apply that to Cassandra -- see his email from
> > > Friday <http://bit.ly/1MHaOKX>. (TLDR: Redefine “done” to include
> > > automated tests. Infrastructure to run tests against github branches
> > > before merging to trunk. A new test harness for long-running
> > > regression tests.)
> > >
> > > I’m optimistic that as we improve our process this way, our even
> > > releases will become increasingly stable. If so, we can skip
> > > sub-minor releases (3.2.x) entirely, and focus on keeping the
> > > release train moving. In the meantime, we will continue delivering
> > > 2.1.x stability releases.
> > >
> > > This won’t be an entirely smooth transition.
> > > In particular, you will have noticed that 3.1 will get more than a
> > > month’s worth of new features while we stabilize 3.0 as the last of
> > > the old way of doing things, so some patience is in order as we try
> > > this out. By 3.4 and 3.6 later this year we should have a good idea
> > > if this is working, and we can make adjustments as warranted.
> > >
> > > --
> > > Jonathan Ellis
> > > Project Chair, Apache Cassandra
> > > co-founder, http://www.datastax.com
> > > @spyced
> > >
> >
> > --
> > http://twitter.com/tjake
> >
>
> --
> Joshua McKenzie
> DataStax -- The Apache Cassandra Company