I like the idea but I agree that every month is a bit aggressive. I have no say but:
I would say 4 releases a year instead of 12, with 2 months of new features
and 1 month of bug squashing per release, and the 4th quarter spent only on
bugs. I would also propose 2-year LTS support for the releases that follow
the 4th quarter. That way everyone could get a new feature release every
quarter and the stability of supermajor versions for 2 years.

On Wed, Mar 18, 2015 at 2:34 PM, Dave Brosius <dbros...@mebigfatguy.com> wrote:

> It would seem the practical implication of this is that there would be
> significantly more development on branches, with potentially more
> significant delays on merging these branches. This would imply to me that
> more Jenkins servers would need to be set up to handle auto-testing of more
> branches, as if feature work spends more time on external branches, it is
> then likely to be less tested (even if by accident) as fewer developers
> would be working on that branch. Only when a feature was blessed to make it
> to the release-tracked branch would it become exposed to the majority of
> developers/testers, etc. doing normal running/playing/testing.
>
> This isn't to knock the idea in any way, just wanted to mention what I
> think the outcome would be.
>
> dave
>
>>> > On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>> > > Cassandra 2.1 was released in September, which means that if we were on
>>> > > track with our stated goal of six month releases, 3.0 would be done
>>> > > about now. Instead, we haven't even delivered a beta. The immediate
>>> > > cause this time is blocking for 8099
>>> > > <https://issues.apache.org/jira/browse/CASSANDRA-8099>, but the reality
>>> > > is that nobody should really be surprised. Something always comes up --
>>> > > we've averaged about nine months since 1.0, with 2.1 taking an entire
>>> > > year.
>>> > >
>>> > > We could make theory align with reality by acknowledging, "if nine
>>> > > months is our 'natural' release schedule, then so be it." But I think
>>> > > we can do better.
>>> > >
>>> > > Broadly speaking, we have two constituencies with Cassandra releases:
>>> > >
>>> > > First, we have the users who are building or porting an application on
>>> > > Cassandra. These users want the newest features to make their job
>>> > > easier. If 2.1.0 has a few bugs, it's not the end of the world. They
>>> > > have time to wait for 2.1.x to stabilize while they write their code.
>>> > > They would like to see us deliver on our six month schedule or even
>>> > > faster.
>>> > >
>>> > > Second, we have the users who have an application in production. These
>>> > > users, or their bosses, want Cassandra to be as stable as possible.
>>> > > Assuming they deploy on a stable release like 2.0.12, they don't want
>>> > > to touch it. They would like to see us release *less* often. (Because
>>> > > that means they have to do less upgrades while remaining in our
>>> > > backwards compatibility window.)
>>> > >
>>> > > With our current "big release every X months" model, these users' needs
>>> > > are in tension.
>>> > >
>>> > > We discussed this six months ago, and ended up with this:
>>> > >
>>> > >> What if we tried a [four month] release cycle, BUT we would guarantee
>>> > >> that you could do a rolling upgrade until we bump the supermajor
>>> > >> version? So 2.0 could upgrade to 3.0 without having to go through
>>> > >> 2.1. (But to go to 3.1 or 4.0 you would have to go through 3.0.)
>>> > >
>>> > > Crucially, I added
>>> > >
>>> > >> Whether this is reasonable depends on how fast we can stabilize
>>> > >> releases. 2.1.0 will be a good test of this.
>>> > >
>>> > > Unfortunately, even after DataStax hired half a dozen full-time test
>>> > > engineers, 2.1.0 continued the proud tradition of being unready for
>>> > > production use, with "wait for .5 before upgrading" once again looking
>>> > > like a good guideline.
>>> > >
>>> > > I’m starting to think that the entire model of “write a bunch of new
>>> > > features all at once and then try to stabilize it for release” is
>>> > > broken. We’ve been trying that for years and empirically speaking the
>>> > > evidence is that it just doesn’t work, either from a stability
>>> > > standpoint or even just shipping on time.
>>> > >
>>> > > A big reason that it takes us so long to stabilize new releases now is
>>> > > that, because our major release cycle is so long, it’s super tempting
>>> > > to slip in “just one” new feature into bugfix releases, and I’m as
>>> > > guilty of that as anyone.
>>> > >
>>> > > For similar reasons, it’s difficult to do a meaningful freeze with big
>>> > > feature releases. A look at 3.0 shows why: we have 8099 coming, but we
>>> > > also have significant work done (but not finished) on 6230, 7970, 6696,
>>> > > and 6477, all of which are meaningful improvements that address
>>> > > demonstrated user pain. So if we keep doing what we’ve been doing, our
>>> > > choices are to either delay 3.0 further while we finish and stabilize
>>> > > these, or we wait nine months to a year for the next release. Either
>>> > > way, one of our constituencies gets disappointed.
>>> > >
>>> > > So, I’d like to try something different. I think we were on the right
>>> > > track with shorter releases with more compatibility. But I’d like to
>>> > > throw in a twist. Intel cuts down on risk with a “tick-tock” schedule
>>> > > for new architectures and process shrinks instead of trying to do both
>>> > > at once.
>>> > > We can do something similar here:
>>> > >
>>> > > One month releases. Period. If it’s not done, it can wait.
>>> > > *Every other release only accepts bug fixes.*
>>> > >
>>> > > By itself, one-month releases are going to dramatically reduce the
>>> > > complexity of testing and debugging new releases -- and bugs that do
>>> > > slip past us will only affect a smaller percentage of users, avoiding
>>> > > the “big release has a bunch of bugs no one has seen before and pretty
>>> > > much everyone is hit by something” scenario. But by adding in the
>>> > > second rule, I think we have a real chance to make a quantum leap here:
>>> > > stable, production-ready releases every two months.
>>> > >
>>> > > So here is my proposal for 3.0:
>>> > >
>>> > > We’re just about ready to start serious review of 8099. When that’s
>>> > > done, we branch 3.0 and cut a beta and then release candidates.
>>> > > Whatever isn’t done by then, has to wait; unlike prior betas, we will
>>> > > only accept bug fixes into 3.0 after branching.
>>> > >
>>> > > One month after 3.0, we will ship 3.1 (with new features). At the same
>>> > > time, we will branch 3.2. New features in trunk will go into 3.3. The
>>> > > 3.2 branch will only get bug fixes. We will maintain backwards
>>> > > compatibility for all of 3.x; eventually (no less than a year) we will
>>> > > pick a release to be 4.0, and drop deprecated features and old
>>> > > backwards compatibilities. Otherwise there will be nothing special
>>> > > about the 4.0 designation. (Note that with an “odd releases have new
>>> > > features, even releases only have bug fixes” policy, 4.0 will actually
>>> > > be *more* stable than 3.11.)
>>> > >
>>> > > Larger features can continue to be developed in separate branches, the
>>> > > way 8099 is being worked on today, and committed to trunk when ready.
>>> > > So this is not saying that we are limited only to features we can build
>>> > > in a single month.
>>> > >
>>> > > Some things will have to change with our dev process, for the better.
>>> > > In particular, with one month to commit new features, we don’t have
>>> > > room for committing sloppy work and stabilizing it later. Trunk has to
>>> > > be stable at all times. I asked Ariel Weisberg to put together his
>>> > > thoughts separately on what worked for his team at VoltDB, and how we
>>> > > can apply that to Cassandra -- see his email from Friday
>>> > > <http://bit.ly/1MHaOKX>. (TLDR: Redefine “done” to include automated
>>> > > tests. Infrastructure to run tests against github branches before
>>> > > merging to trunk. A new test harness for long-running regression
>>> > > tests.)
>>> > >
>>> > > I’m optimistic that as we improve our process this way, our even
>>> > > releases will become increasingly stable. If so, we can skip sub-minor
>>> > > releases (3.2.x) entirely, and focus on keeping the release train
>>> > > moving. In the meantime, we will continue delivering 2.1.x stability
>>> > > releases.
>>> > >
>>> > > This won’t be an entirely smooth transition. In particular, you will
>>> > > have noticed that 3.1 will get more than a month’s worth of new
>>> > > features while we stabilize 3.0 as the last of the old way of doing
>>> > > things, so some patience is in order as we try this out. By 3.4 and
>>> > > 3.6 later this year we should have a good idea if this is working, and
>>> > > we can make adjustments as warranted.
>>> > >
>>> > > --
>>> > > Jonathan Ellis
>>> > > Project Chair, Apache Cassandra
>>> > > co-founder, http://www.datastax.com
>>> > > @spyced