RE: 3.0 and the Cassandra release process

2015-04-15 Thread Chuck Allen -X (charlall - RANDSTAD NORTH AMERICA LP at Cisco)
Oh yeah, and BGL4 is now green without any impending risks.

Additionally, the other yellow projects, LWR05 and MTV05, are on a path that will 
lead to green in the coming weeks.

That's All, Folks

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Wednesday, April 15, 2015 3:40 AM
To: dev
Subject: Re: 3.0 and the Cassandra release process

Short answer: yes.

Longer answer, pasted from my reply to Jon Haddad elsewhere in the thread:

We are moving away from designating major releases like 3.0 as special,
other than as a marker of compatibility.  In fact we are moving away from major 
releases entirely, with each release being a much smaller, digestible unit of 
change, and the ultimate goal of every even release being production-quality.

This means that bugs won't pile up and compound each other.  And bugs that do 
slip through will affect fewer users.  As 3.x stabilizes, more people will try 
out the releases, yielding better quality, yielding even more people trying 
them out in a virtuous cycle.

This won't just happen by wishing for it.  I am very serious about investing 
the energy we would have spent on backporting fixes to a stable
branch, into improving our QA process and test coverage.  After a very short 
list of in-progress features that may not make the 3.0 cutoff (#6477,
#6696 come to mind) I'm willing to virtually pause new feature development 
entirely to make this happen.


On Tue, Apr 14, 2015 at 11:53 PM, Phil Yang ud1...@gmail.com wrote:

 Hi Jonathan,

 How long will tick-tock releases be maintained? Do users have to 
 upgrade to a new even release, with new features, to fix the bugs in an 
 older even release?

 2015-04-14 6:28 GMT+08:00 Jonathan Ellis jbel...@gmail.com:

  On Tue, Mar 17, 2015 at 4:06 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
 
  
   I’m optimistic that as we improve our process this way, our even
 releases
   will become increasingly stable.  If so, we can skip sub-minor 
   releases
   (3.2.x) entirely, and focus on keeping the release train moving.  
   In
 the
   meantime, we will continue delivering 2.1.x stability releases.
  
 
  The weak point of this plan is the transition from the big release
  development methodology culminating in 3.0, to the monthly tick-tock 
  releases.  Since 3.0 needs to go through a beta/release candidate 
  phase, during which we're going to be serious about not adding new 
  features,
 that
   means that 3.1 will come with multiple months' worth of features, so 
  right off the bat we're starting from a disadvantage from a 
  stability
 standpoint.
 
  Recognizing that it will take several months for the tick-tock 
  releases
 to
  stabilize, I would like to ship 3.0.x stability releases 
  concurrently
 with
  3.y tick-tock releases.  This should stabilize 3.0.x faster than
 tick-tock,
  while at the same time hedging our bets such that if we assess 
  tick-tock
 in
  six months and decide it's not delivering on its goals, we're not 
  six months behind in having a usable set of features that we shipped in 3.0.
 
  So, to summarize:
 
  - New features will *only* go into tick-tock releases.
  - Bug fixes will go into tick-tock releases and a 3.0.x branch, 
  which
 will
  be maintained for at least a year
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder, http://www.datastax.com
  @spyced
 



 --
 Thanks,
 Phil Yang




--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced


Re: 3.0 and the Cassandra release process

2015-04-15 Thread Jonathan Ellis
Short answer: yes.

Longer answer, pasted from my reply to Jon Haddad elsewhere in the thread:

We are moving away from designating major releases like 3.0 as special,
other than as a marker of compatibility.  In fact we are moving away from
major releases entirely, with each release being a much smaller, digestible
unit of change, and the ultimate goal of every even release being
production-quality.

This means that bugs won't pile up and compound each other.  And bugs that
do slip through will affect fewer users.  As 3.x stabilizes, more people
will try out the releases, yielding better quality, yielding even more
people trying them out in a virtuous cycle.

This won't just happen by wishing for it.  I am very serious about
investing the energy we would have spent on backporting fixes to a stable
branch, into improving our QA process and test coverage.  After a very
short list of in-progress features that may not make the 3.0 cutoff (#6477,
#6696 come to mind) I'm willing to virtually pause new feature development
entirely to make this happen.


On Tue, Apr 14, 2015 at 11:53 PM, Phil Yang ud1...@gmail.com wrote:

 Hi Jonathan,

 How long will tick-tock releases be maintained? Do users have to
 upgrade to a new even release, with new features, to fix the bugs in an older
 even release?

 2015-04-14 6:28 GMT+08:00 Jonathan Ellis jbel...@gmail.com:

  On Tue, Mar 17, 2015 at 4:06 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
 
  
   I’m optimistic that as we improve our process this way, our even
 releases
   will become increasingly stable.  If so, we can skip sub-minor releases
   (3.2.x) entirely, and focus on keeping the release train moving.  In
 the
   meantime, we will continue delivering 2.1.x stability releases.
  
 
  The weak point of this plan is the transition from the big release
  development methodology culminating in 3.0, to the monthly tick-tock
  releases.  Since 3.0 needs to go through a beta/release candidate phase,
  during which we're going to be serious about not adding new features,
 that
   means that 3.1 will come with multiple months' worth of features, so right
  off the bat we're starting from a disadvantage from a stability
 standpoint.
 
  Recognizing that it will take several months for the tick-tock releases
 to
  stabilize, I would like to ship 3.0.x stability releases concurrently
 with
  3.y tick-tock releases.  This should stabilize 3.0.x faster than
 tick-tock,
  while at the same time hedging our bets such that if we assess tick-tock
 in
  six months and decide it's not delivering on its goals, we're not six
  months behind in having a usable set of features that we shipped in 3.0.
 
  So, to summarize:
 
  - New features will *only* go into tick-tock releases.
  - Bug fixes will go into tick-tock releases and a 3.0.x branch, which
 will
  be maintained for at least a year
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder, http://www.datastax.com
  @spyced
 



 --
 Thanks,
 Phil Yang




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced


Re: 3.0 and the Cassandra release process

2015-04-14 Thread Phil Yang
Hi Jonathan,

How long will tick-tock releases be maintained? Do users have to
upgrade to a new even release, with new features, to fix the bugs in an older
even release?

2015-04-14 6:28 GMT+08:00 Jonathan Ellis jbel...@gmail.com:

 On Tue, Mar 17, 2015 at 4:06 PM, Jonathan Ellis jbel...@gmail.com wrote:

 
  I’m optimistic that as we improve our process this way, our even releases
  will become increasingly stable.  If so, we can skip sub-minor releases
  (3.2.x) entirely, and focus on keeping the release train moving.  In the
  meantime, we will continue delivering 2.1.x stability releases.
 

 The weak point of this plan is the transition from the big release
 development methodology culminating in 3.0, to the monthly tick-tock
 releases.  Since 3.0 needs to go through a beta/release candidate phase,
 during which we're going to be serious about not adding new features, that
  means that 3.1 will come with multiple months' worth of features, so right
 off the bat we're starting from a disadvantage from a stability standpoint.

 Recognizing that it will take several months for the tick-tock releases to
 stabilize, I would like to ship 3.0.x stability releases concurrently with
 3.y tick-tock releases.  This should stabilize 3.0.x faster than tick-tock,
 while at the same time hedging our bets such that if we assess tick-tock in
 six months and decide it's not delivering on its goals, we're not six
 months behind in having a usable set of features that we shipped in 3.0.

 So, to summarize:

 - New features will *only* go into tick-tock releases.
 - Bug fixes will go into tick-tock releases and a 3.0.x branch, which will
 be maintained for at least a year

 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder, http://www.datastax.com
 @spyced




-- 
Thanks,
Phil Yang


Re: 3.0 and the Cassandra release process

2015-04-13 Thread Jonathan Ellis
On Tue, Mar 17, 2015 at 4:06 PM, Jonathan Ellis jbel...@gmail.com wrote:


 I’m optimistic that as we improve our process this way, our even releases
 will become increasingly stable.  If so, we can skip sub-minor releases
 (3.2.x) entirely, and focus on keeping the release train moving.  In the
 meantime, we will continue delivering 2.1.x stability releases.


The weak point of this plan is the transition from the big release
development methodology culminating in 3.0, to the monthly tick-tock
releases.  Since 3.0 needs to go through a beta/release candidate phase,
during which we're going to be serious about not adding new features, that
means that 3.1 will come with multiple months' worth of features, so right
off the bat we're starting from a disadvantage from a stability standpoint.

Recognizing that it will take several months for the tick-tock releases to
stabilize, I would like to ship 3.0.x stability releases concurrently with
3.y tick-tock releases.  This should stabilize 3.0.x faster than tick-tock,
while at the same time hedging our bets such that if we assess tick-tock in
six months and decide it's not delivering on its goals, we're not six
months behind in having a usable set of features that we shipped in 3.0.

So, to summarize:

- New features will *only* go into tick-tock releases.
- Bug fixes will go into tick-tock releases and a 3.0.x branch, which will
be maintained for at least a year

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced


Re: 3.0 and the Cassandra release process

2015-04-02 Thread Jonathan Haddad
In this tick-tock cycle, is there still a long-term release that's
maintained, meant for production?  Will bug fixes be backported to 3.0
(stable), with new stuff going forward to 3.x?

On Thu, Mar 26, 2015 at 6:50 AM Aleksey Yeschenko alek...@apache.org
wrote:

 Hey Jason. I think pretty much everybody is on board with:

 1) A monthly release cycle
 2) Keeping trunk releasable at all times

 And that’s what my personal +1 was for.

 The tick-tock mechanism details and bug fix policy for the maintained
 stable lines should be fleshed out before we proceed. I believe that once
 they are explained better, the concerns will mostly, or entirely, go away.

 --
 AY

 On Mon, Mar 23, 2015 at 11:15 PM, Jason Brown jasedbr...@gmail.com
 wrote:

  Hey all,
 
  I had a hallway conversation with some folks here last week, and they
  expressed some concerns with this proposal. I will not attempt to
 summarize
  their arguments as I don't believe I could do them ample justice, but I
  strongly encouraged those individuals to speak up and be heard on this
  thread (I know they are watching!).
 
  Thanks,
 
  -Jason
 
  On Mon, Mar 23, 2015 at 6:32 AM, 曹志富 cao.zh...@gmail.com wrote:
 
   +1
  
   --
   Ranger Tsao
  
   2015-03-20 22:57 GMT+08:00 Ryan McGuire r...@datastax.com:
  
I'm taking notes from the infrastructure doc and wrote down some
 action
items for my team:
   
https://gist.github.com/EnigmaCurry/d53eccb55f5d0986c976
   
   
--
   
[image: datastax_logo.png] http://www.datastax.com/
   
Ryan McGuire
   
Software Engineering Manager in Test | r...@datastax.com
   
[image: linkedin.png] https://www.linkedin.com/in/enigmacurry
  [image:
twitter.png] http://twitter.com/enigmacurry
http://github.com/enigmacurry
   
   
On Thu, Mar 19, 2015 at 1:08 PM, Ariel Weisberg 
ariel.weisb...@datastax.com
 wrote:
   
 Hi,

 I realized one of the documents we didn't send out was the
   infrastructure
 side changes I am looking for. This one is maybe a little rougher
 as
  it
was
 the first one I wrote on the subject.



   
  
  https://docs.google.com/document/d/1Seku0vPwChbnH3uYYxon0UO-
 b6LDtSqluZiH--sWWi0/edit?usp=sharing

 The goal is to have infrastructure that gives developers as close
 to
 immediate feedback as possible on their code before they merge.
   Feedback
 that is delayed to after merging to trunk should come in a day or
 two
   and
 there is a product owner (Michael Shuler) responsible for making
 sure
that
 issues are addressed quickly.

 QA is going to help by providing developers with a better tools for
writing
 higher level functional tests that explore all of the functions
   together
 along with the configuration space without developers having to do
  any
work
 other then plugging in functionality to exercise and then validate
 something specific. This kind of harness is hard to get right and
  make
 reliable and expressive so they have their work cut out for them.

 It's going to be an iterative process where the tests improve as
 new
   work
 introduces missing coverage and as bugs/regressions drive the
introduction
 of new tests. The monthly retrospective (planning on doing that
 first
   of
 the month) is also going to help us refine the testing and
  development
 process.

 Ariel

 On Thu, Mar 19, 2015 at 7:23 AM, Jason Brown jasedbr...@gmail.com
 
wrote:

  +1 to this general proposal. I think the time has finally come
 for
  us
to
  try something new, and this sounds legit. Thanks!
 
  On Thu, Mar 19, 2015 at 12:49 AM, Phil Yang ud1...@gmail.com
   wrote:
 
   Can I regard the odd version as the development preview and
 the
even
   version as the production ready?
  
   IMO, as a database infrastructure project, stable is more
   important
  than
   other kinds of projects. LTS is a good idea, but if we don't
   support
   non-LTS releases for enough time to fix their bugs, users on
   non-LTS
   release may have to upgrade a new major release to fix the bugs
  and
may
   have to handle some new bugs by the new features. I'm afraid
 that
   eventually people would only think about the LTS one.
  
  
   2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:
  
+1
   
On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
mkjell...@internalcircle.com wrote:
   
 For most of my life I’ve lived on the software bleeding
 edge
   both
 personally and professionally. Maybe it’s a personal
  weakness,
but
 I
guess
 I get a thrill out of the problem solving aspect?

 Recently I came to a bit of an epiphany — the closer I keep
  to
the
   daily
 build — generally the happier 

Re: 3.0 and the Cassandra release process

2015-04-02 Thread Jonathan Ellis
We are moving away from designating major releases like 3.0 as special,
other than as a marker of compatibility.  In fact we are moving away from
major releases entirely, with each release being a much smaller, digestible
unit of change, and the ultimate goal of every even release being
production-quality.

This means that bugs won't pile up and compound each other.  And bugs that
do slip through will affect fewer users.  As 3.x stabilizes, more people
will try out the releases, yielding better quality, yielding even more
people trying them out in a virtuous cycle.

This won't just happen by wishing for it.  I am very serious about
investing the energy we would have spent on backporting fixes to a stable
branch, into improving our QA process and test coverage.  After a very
short list of in-progress features that may not make the 3.0 cutoff (#6477,
#6696 come to mind) I'm willing to virtually pause new feature development
entirely to make this happen.

Some patience will be necessary with the first few releases.  But at this
point, people are used to about six months of waiting for a new major to
stabilize.  So, let's give this a try until 3.6.  If that still hasn't
materially stabilized, then we need to go back to the drawing board.  But
I'm optimistic that it will.

On Thu, Apr 2, 2015 at 5:04 PM, Jonathan Haddad j...@jonhaddad.com wrote:

 In this tick-tock cycle, is there still a long-term release that's
 maintained, meant for production?  Will bug fixes be backported to 3.0
 (stable), with new stuff going forward to 3.x?


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced


Re: 3.0 and the Cassandra release process

2015-04-02 Thread Colin
Hey Jonathan,

I have been hoping for this approach for years now. One of the reasons I left 
DataStax was my feeling that quality was always on the back burner and 
never really taken seriously versus marketing-driven releases.

I sincerely hope this approach reverses that perceived trend.
--
Colin 
+1 612 859 6129
Skype colin.p.clark

 On Apr 2, 2015, at 5:54 PM, Jonathan Ellis jbel...@gmail.com wrote:
 
 We are moving away from designating major releases like 3.0 as special,
 other than as a marker of compatibility.  In fact we are moving away from
 major releases entirely, with each release being a much smaller, digestible
 unit of change, and the ultimate goal of every even release being
 production-quality.
 
 This means that bugs won't pile up and compound each other.  And bugs that
  do slip through will affect fewer users.  As 3.x stabilizes, more people
 will try out the releases, yielding better quality, yielding even more
 people trying them out in a virtuous cycle.
 
 This won't just happen by wishing for it.  I am very serious about
 investing the energy we would have spent on backporting fixes to a stable
 branch, into improving our QA process and test coverage.  After a very
 short list of in-progress features that may not make the 3.0 cutoff (#6477,
 #6696 come to mind) I'm willing to virtually pause new feature development
 entirely to make this happen.
 
 Some patience will be necessary with the first few releases.  But at this
 point, people are used to about six months of waiting for a new major to
 stabilize.  So, let's give this a try until 3.6.  If that still hasn't
 materially stabilized, then we need to go back to the drawing board.  But
 I'm optimistic that it will.
 
 On Thu, Apr 2, 2015 at 5:04 PM, Jonathan Haddad j...@jonhaddad.com wrote:
 
 In this tick tock cycle, is there still a long term release that's
 maintained, meant for production?  Will bug fixes be back ported to 3.0
 (stable) with new stuff going forward to 3.x?
 
 -- 
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder, http://www.datastax.com
 @spyced


Re: 3.0 and the Cassandra release process

2015-03-26 Thread Aleksey Yeschenko
Hey Jason. I think pretty much everybody is on board with:

1) A monthly release cycle
2) Keeping trunk releasable at all times

And that’s what my personal +1 was for.

The tick-tock mechanism details and bug fix policy for the maintained
stable lines should be fleshed out before we proceed. I believe that once
they are explained better, the concerns will mostly, or entirely, go away.

--
AY

On Mon, Mar 23, 2015 at 11:15 PM, Jason Brown jasedbr...@gmail.com wrote:

 Hey all,

 I had a hallway conversation with some folks here last week, and they
 expressed some concerns with this proposal. I will not attempt to summarize
 their arguments as I don't believe I could do them ample justice, but I
 strongly encouraged those individuals to speak up and be heard on this
 thread (I know they are watching!).

 Thanks,

 -Jason

 On Mon, Mar 23, 2015 at 6:32 AM, 曹志富 cao.zh...@gmail.com wrote:

  +1
 
  --
  Ranger Tsao
 
  2015-03-20 22:57 GMT+08:00 Ryan McGuire r...@datastax.com:
 
   I'm taking notes from the infrastructure doc and wrote down some action
   items for my team:
  
   https://gist.github.com/EnigmaCurry/d53eccb55f5d0986c976
  
  
   --
  
   [image: datastax_logo.png] http://www.datastax.com/
  
   Ryan McGuire
  
   Software Engineering Manager in Test | r...@datastax.com
  
   [image: linkedin.png] https://www.linkedin.com/in/enigmacurry
 [image:
   twitter.png] http://twitter.com/enigmacurry
   http://github.com/enigmacurry
  
  
   On Thu, Mar 19, 2015 at 1:08 PM, Ariel Weisberg 
   ariel.weisb...@datastax.com
wrote:
  
Hi,
   
I realized one of the documents we didn't send out was the
  infrastructure
side changes I am looking for. This one is maybe a little rougher as
 it
   was
the first one I wrote on the subject.
   
   
   
  
 
 https://docs.google.com/document/d/1Seku0vPwChbnH3uYYxon0UO-b6LDtSqluZiH--sWWi0/edit?usp=sharing
   
The goal is to have infrastructure that gives developers as close to
immediate feedback as possible on their code before they merge.
  Feedback
that is delayed to after merging to trunk should come in a day or two
  and
there is a product owner (Michael Shuler) responsible for making sure
   that
issues are addressed quickly.
   
QA is going to help by providing developers with a better tools for
   writing
higher level functional tests that explore all of the functions
  together
along with the configuration space without developers having to do
 any
   work
other then plugging in functionality to exercise and then validate
something specific. This kind of harness is hard to get right and
 make
reliable and expressive so they have their work cut out for them.
   
It's going to be an iterative process where the tests improve as new
  work
introduces missing coverage and as bugs/regressions drive the
   introduction
of new tests. The monthly retrospective (planning on doing that first
  of
the month) is also going to help us refine the testing and
 development
process.
   
Ariel
   
On Thu, Mar 19, 2015 at 7:23 AM, Jason Brown jasedbr...@gmail.com
   wrote:
   
 +1 to this general proposal. I think the time has finally come for
 us
   to
 try something new, and this sounds legit. Thanks!

 On Thu, Mar 19, 2015 at 12:49 AM, Phil Yang ud1...@gmail.com
  wrote:

  Can I regard the odd version as the development preview and the
   even
  version as the production ready?
 
  IMO, as a database infrastructure project, stable is more
  important
 than
  other kinds of projects. LTS is a good idea, but if we don't
  support
  non-LTS releases for enough time to fix their bugs, users on
  non-LTS
  release may have to upgrade a new major release to fix the bugs
 and
   may
  have to handle some new bugs by the new features. I'm afraid that
  eventually people would only think about the LTS one.
 
 
  2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:
 
   +1
  
   On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
   mkjell...@internalcircle.com wrote:
  
For most of my life I’ve lived on the software bleeding edge
  both
personally and professionally. Maybe it’s a personal
 weakness,
   but
I
   guess
I get a thrill out of the problem solving aspect?
   
Recently I came to a bit of an epiphany — the closer I keep
 to
   the
  daily
build — generally the happier I am on a daily basis. Bugs
  happen,
but
  for
the most part (aside from show stopper bugs), pain points for
myself
  in a
given daily build can generally can be debugged to 1 or
 maybe 2
root
causes, fixed in ~24 hours, and then life is better the next
  day
 again.
   In
comparison, the old waterfall model generally means taking an
  “official”
release at 

Re: 3.0 and the Cassandra release process

2015-03-24 Thread Chris Burroughs
Broadly, as a contributor and operator, I like the idea of more frequent 
releases off an always-stable master.  First customer ship quality 
all the time [1]!


I'm a little concerned that the specific tick-tock proposal could 
devolve into a 'devodd' style where the 'feature release' becomes a 
thing no one wants to run in production.  However, if master is always 
stable, it doesn't really matter when releases are cut, and if master is 
*not* stable, that is a larger problem than the details of the release 
cadence.  I say give it a shot.



[1] http://wiki.illumos.org/display/illumos/On+the+Quality+Death+Spiral


Re: 3.0 and the Cassandra release process

2015-03-23 Thread 曹志富
+1

--
Ranger Tsao

2015-03-20 22:57 GMT+08:00 Ryan McGuire r...@datastax.com:

 I'm taking notes from the infrastructure doc and wrote down some action
 items for my team:

 https://gist.github.com/EnigmaCurry/d53eccb55f5d0986c976


 --

 [image: datastax_logo.png] http://www.datastax.com/

 Ryan McGuire

 Software Engineering Manager in Test | r...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/in/enigmacurry [image:
 twitter.png] http://twitter.com/enigmacurry
 http://github.com/enigmacurry


 On Thu, Mar 19, 2015 at 1:08 PM, Ariel Weisberg 
 ariel.weisb...@datastax.com
  wrote:

  Hi,
 
  I realized one of the documents we didn't send out was the infrastructure
  side changes I am looking for. This one is maybe a little rougher as it
 was
  the first one I wrote on the subject.
 
 
 
 https://docs.google.com/document/d/1Seku0vPwChbnH3uYYxon0UO-b6LDtSqluZiH--sWWi0/edit?usp=sharing
 
  The goal is to have infrastructure that gives developers as close to
  immediate feedback as possible on their code before they merge. Feedback
  that is delayed to after merging to trunk should come in a day or two and
  there is a product owner (Michael Shuler) responsible for making sure
 that
  issues are addressed quickly.
 
  QA is going to help by providing developers with a better tools for
 writing
  higher level functional tests that explore all of the functions together
  along with the configuration space without developers having to do any
 work
  other then plugging in functionality to exercise and then validate
  something specific. This kind of harness is hard to get right and make
  reliable and expressive so they have their work cut out for them.
 
  It's going to be an iterative process where the tests improve as new work
  introduces missing coverage and as bugs/regressions drive the
 introduction
  of new tests. The monthly retrospective (planning on doing that first of
  the month) is also going to help us refine the testing and development
  process.
 
  Ariel
 
  On Thu, Mar 19, 2015 at 7:23 AM, Jason Brown jasedbr...@gmail.com
 wrote:
 
   +1 to this general proposal. I think the time has finally come for us
 to
   try something new, and this sounds legit. Thanks!
  
   On Thu, Mar 19, 2015 at 12:49 AM, Phil Yang ud1...@gmail.com wrote:
  
Can I regard the odd version as the development preview and the
 even
version as the production ready?
   
IMO, as a database infrastructure project, stable is more important
   than
other kinds of projects. LTS is a good idea, but if we don't support
non-LTS releases for enough time to fix their bugs, users on non-LTS
release may have to upgrade a new major release to fix the bugs and
 may
have to handle some new bugs by the new features. I'm afraid that
eventually people would only think about the LTS one.
   
   
2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:
   
 +1

 On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
 mkjell...@internalcircle.com wrote:

  For most of my life I’ve lived on the software bleeding edge both
  personally and professionally. Maybe it’s a personal weakness,
 but
  I
 guess
  I get a thrill out of the problem solving aspect?
 
  Recently I came to a bit of an epiphany — the closer I keep to
 the
daily
  build — generally the happier I am on a daily basis. Bugs happen,
  but
for
  the most part (aside from show stopper bugs), pain points for
  myself
in a
  given daily build can generally can be debugged to 1 or maybe 2
  root
  causes, fixed in ~24 hours, and then life is better the next day
   again.
 In
  comparison, the old waterfall model generally means taking an
“official”
  release at some point and waiting for some poor soul (or
 developer)
   to
  actually run the thing. No matter how good the QA team is, until
  it’s
  actually used in the real world, most bugs aren’t found.
 
  If you and your organization can wait 24 hours * number of bugs
 discovered
  after people actually started using the thing, you end up with a
“usable
  build” around the holy-grail minor X.X.5 release of Cassandra.
 
  I love the idea of the LTS model Jonathan describes because it
  means
more
  code can get real testing and “bake” for longer instead of
 sitting
 largely
  unused on some git repository in a datacenter far far away. A lot
  of
code
  has changed between 2.0 and trunk today. The code has diverged to
  the
 point
  that if you write something for 2.0 (as the most stable major
  branch
  currently available), merging it forward to 3.0 or after
 generally
means
  rewriting it. If the only thing that comes out of this is a
 smaller
delta
  of LOC between the deployable version/branch and what we can
  develop
  against and what QA is focused on I think 

Re: 3.0 and the Cassandra release process

2015-03-23 Thread Jason Brown
Hey all,

I had a hallway conversation with some folks here last week, and they
expressed some concerns with this proposal. I will not attempt to summarize
their arguments as I don't believe I could do them ample justice, but I
strongly encouraged those individuals to speak up and be heard on this
thread (I know they are watching!).

Thanks,

-Jason

On Mon, Mar 23, 2015 at 6:32 AM, 曹志富 cao.zh...@gmail.com wrote:

 +1

 --
 Ranger Tsao

 2015-03-20 22:57 GMT+08:00 Ryan McGuire r...@datastax.com:

  I'm taking notes from the infrastructure doc and wrote down some action
  items for my team:
 
  https://gist.github.com/EnigmaCurry/d53eccb55f5d0986c976
 
 
  --
 
  [image: datastax_logo.png] http://www.datastax.com/
 
  Ryan McGuire
 
  Software Engineering Manager in Test | r...@datastax.com
 
  [image: linkedin.png] https://www.linkedin.com/in/enigmacurry [image:
  twitter.png] http://twitter.com/enigmacurry
  http://github.com/enigmacurry
 
 
  On Thu, Mar 19, 2015 at 1:08 PM, Ariel Weisberg 
  ariel.weisb...@datastax.com
   wrote:
 
   Hi,
  
   I realized one of the documents we didn't send out was the
 infrastructure
   side changes I am looking for. This one is maybe a little rougher as it
  was
   the first one I wrote on the subject.
  
  
  
 
 https://docs.google.com/document/d/1Seku0vPwChbnH3uYYxon0UO-b6LDtSqluZiH--sWWi0/edit?usp=sharing
  
   The goal is to have infrastructure that gives developers as close to
   immediate feedback as possible on their code before they merge.
 Feedback
   that is delayed to after merging to trunk should come in a day or two
 and
   there is a product owner (Michael Shuler) responsible for making sure
  that
   issues are addressed quickly.
  
   QA is going to help by providing developers with a better tools for
  writing
   higher level functional tests that explore all of the functions
 together
   along with the configuration space without developers having to do any
  work
   other then plugging in functionality to exercise and then validate
   something specific. This kind of harness is hard to get right and make
   reliable and expressive so they have their work cut out for them.
  
   It's going to be an iterative process where the tests improve as new
 work
   introduces missing coverage and as bugs/regressions drive the
  introduction
   of new tests. The monthly retrospective (planning on doing that first
 of
   the month) is also going to help us refine the testing and development
   process.
  
   Ariel
  
   On Thu, Mar 19, 2015 at 7:23 AM, Jason Brown jasedbr...@gmail.com
  wrote:
  
+1 to this general proposal. I think the time has finally come for us
  to
try something new, and this sounds legit. Thanks!
   
On Thu, Mar 19, 2015 at 12:49 AM, Phil Yang ud1...@gmail.com
 wrote:
   
 Can I regard the odd version as the development preview and the
  even
 version as the production ready?

 IMO, as a database infrastructure project, stable is more
 important
than
 other kinds of projects. LTS is a good idea, but if we don't
 support
 non-LTS releases for enough time to fix their bugs, users on
 non-LTS
 release may have to upgrade a new major release to fix the bugs and
  may
 have to handle some new bugs by the new features. I'm afraid that
 eventually people would only think about the LTS one.


 2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:

  +1
 
  On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
  mkjell...@internalcircle.com wrote:
 
   For most of my life I’ve lived on the software bleeding edge
 both
   personally and professionally. Maybe it’s a personal weakness,
  but
   I
  guess
   I get a thrill out of the problem solving aspect?
  
   Recently I came to a bit of an epiphany — the closer I keep to
  the
 daily
   build — generally the happier I am on a daily basis. Bugs
 happen,
   but
 for
   the most part (aside from show stopper bugs), pain points for
   myself
 in a
   given daily build can generally can be debugged to 1 or maybe 2
   root
   causes, fixed in ~24 hours, and then life is better the next
 day
again.
  In
   comparison, the old waterfall model generally means taking an
 “official”
   release at some point and waiting for some poor soul (or
  developer)
to
   actually run the thing. No matter how good the QA team is,
 until
   it’s
   actually used in the real world, most bugs aren’t found.
  
   If you and your organization can wait 24 hours * number of bugs
  discovered
   after people actually started using the thing, you end up with
 a
 “usable
   build” around the holy-grail minor X.X.5 release of Cassandra.
  
   I love the idea of the LTS model Jonathan describes because it
   means
 more
   code can get real testing and “bake” for longer instead of
  

Re: 3.0 and the Cassandra release process

2015-03-20 Thread Ryan McGuire
I'm taking notes from the infrastructure doc and wrote down some action
items for my team:

https://gist.github.com/EnigmaCurry/d53eccb55f5d0986c976


--

Ryan McGuire

Software Engineering Manager in Test | r...@datastax.com

http://www.datastax.com/
https://www.linkedin.com/in/enigmacurry
http://twitter.com/enigmacurry
http://github.com/enigmacurry


On Thu, Mar 19, 2015 at 1:08 PM, Ariel Weisberg ariel.weisb...@datastax.com
 wrote:

 Hi,

 I realized one of the documents we didn't send out was the infrastructure
 side changes I am looking for. This one is maybe a little rougher as it was
 the first one I wrote on the subject.


 https://docs.google.com/document/d/1Seku0vPwChbnH3uYYxon0UO-b6LDtSqluZiH--sWWi0/edit?usp=sharing

 The goal is to have infrastructure that gives developers as close to
 immediate feedback as possible on their code before they merge. Feedback
 that is delayed to after merging to trunk should come in a day or two and
 there is a product owner (Michael Shuler) responsible for making sure that
 issues are addressed quickly.

 QA is going to help by providing developers with better tools for writing
 higher level functional tests that explore all of the functions together
 along with the configuration space without developers having to do any work
 other than plugging in functionality to exercise and then validate
 something specific. This kind of harness is hard to get right and make
 reliable and expressive so they have their work cut out for them.

 It's going to be an iterative process where the tests improve as new work
 introduces missing coverage and as bugs/regressions drive the introduction
 of new tests. The monthly retrospective (planning on doing that first of
 the month) is also going to help us refine the testing and development
 process.

 Ariel

 On Thu, Mar 19, 2015 at 7:23 AM, Jason Brown jasedbr...@gmail.com wrote:

  +1 to this general proposal. I think the time has finally come for us to
  try something new, and this sounds legit. Thanks!
 
  On Thu, Mar 19, 2015 at 12:49 AM, Phil Yang ud1...@gmail.com wrote:
 
   Can I regard the odd version as the development preview and the even
   version as the production ready?
  
   IMO, as a database infrastructure project, stable is more important
  than
   other kinds of projects. LTS is a good idea, but if we don't support
   non-LTS releases for enough time to fix their bugs, users on non-LTS
   release may have to upgrade a new major release to fix the bugs and may
   have to handle some new bugs by the new features. I'm afraid that
   eventually people would only think about the LTS one.
  
  
   2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:
  
+1
   
On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
mkjell...@internalcircle.com wrote:
   
 For most of my life I’ve lived on the software bleeding edge both
 personally and professionally. Maybe it’s a personal weakness, but
 I
guess
 I get a thrill out of the problem solving aspect?

 Recently I came to a bit of an epiphany — the closer I keep to the
   daily
 build — generally the happier I am on a daily basis. Bugs happen,
 but
   for
 the most part (aside from show stopper bugs), pain points for
 myself
   in a
  given daily build can generally be debugged to 1 or maybe 2
 root
 causes, fixed in ~24 hours, and then life is better the next day
  again.
In
 comparison, the old waterfall model generally means taking an
   “official”
 release at some point and waiting for some poor soul (or developer)
  to
 actually run the thing. No matter how good the QA team is, until
 it’s
 actually used in the real world, most bugs aren’t found.

 If you and your organization can wait 24 hours * number of bugs
discovered
 after people actually started using the thing, you end up with a
   “usable
 build” around the holy-grail minor X.X.5 release of Cassandra.

 I love the idea of the LTS model Jonathan describes because it
 means
   more
 code can get real testing and “bake” for longer instead of sitting
largely
 unused on some git repository in a datacenter far far away. A lot
 of
   code
 has changed between 2.0 and trunk today. The code has diverged to
 the
point
 that if you write something for 2.0 (as the most stable major
 branch
 currently available), merging it forward to 3.0 or after generally
   means
 rewriting it. If the only thing that comes out of this is a smaller
   delta
 of LOC between the deployable version/branch and what we can
 develop
 against and what QA is focused on I think that’s a massive win.

 Something like CASSANDRA-8099 will need 2x the baking time of even
  many
of
 the more risky changes the project has made. While I wouldn’t want
 to
run a
 build with CASSANDRA-8099 in it anytime soon, there 

Re: 3.0 and the Cassandra release process

2015-03-19 Thread Phil Yang
Can I regard the odd versions as development previews and the even
versions as production-ready?

IMO, for a database infrastructure project, stability is more important than
for other kinds of projects. LTS is a good idea, but if we don't support
non-LTS releases long enough to fix their bugs, users on a non-LTS
release may have to upgrade to a new major release to fix those bugs, and may
then have to handle new bugs introduced by the new features. I'm afraid that
eventually people would only consider the LTS one.
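
A minimal sketch of the numbering convention as I read it from this thread
(odd minor versions carry new features, even minor versions are
bug-fix/stabilization releases); the function name and version strings below
are illustrative only, not anything the project ships:

    # Illustrative only: classify a tick-tock release by the parity of its
    # minor version, per the odd = features / even = bug-fix reading above.
    def release_kind(version):
        major, minor = (int(part) for part in version.split(".")[:2])
        if major < 3:
            return "pre-tick-tock"   # 2.1.x and earlier keep the old model
        return "feature (tick)" if minor % 2 == 1 else "bug-fix (tock)"

    assert release_kind("3.1") == "feature (tick)"
    assert release_kind("3.2") == "bug-fix (tock)"
    assert release_kind("2.1.9") == "pre-tick-tock"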


2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:

 +1

 On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
 mkjell...@internalcircle.com wrote:

  For most of my life I’ve lived on the software bleeding edge both
  personally and professionally. Maybe it’s a personal weakness, but I
 guess
  I get a thrill out of the problem solving aspect?
 
  Recently I came to a bit of an epiphany — the closer I keep to the daily
  build — generally the happier I am on a daily basis. Bugs happen, but for
  the most part (aside from show stopper bugs), pain points for myself in a
  given daily build can generally be debugged to 1 or maybe 2 root
  causes, fixed in ~24 hours, and then life is better the next day again.
 In
  comparison, the old waterfall model generally means taking an “official”
  release at some point and waiting for some poor soul (or developer) to
  actually run the thing. No matter how good the QA team is, until it’s
  actually used in the real world, most bugs aren’t found.
 
  If you and your organization can wait 24 hours * number of bugs
 discovered
  after people actually started using the thing, you end up with a “usable
  build” around the holy-grail minor X.X.5 release of Cassandra.
 
  I love the idea of the LTS model Jonathan describes because it means more
  code can get real testing and “bake” for longer instead of sitting
 largely
  unused on some git repository in a datacenter far far away. A lot of code
  has changed between 2.0 and trunk today. The code has diverged to the
 point
  that if you write something for 2.0 (as the most stable major branch
  currently available), merging it forward to 3.0 or after generally means
  rewriting it. If the only thing that comes out of this is a smaller delta
  of LOC between the deployable version/branch and what we can develop
  against and what QA is focused on I think that’s a massive win.
 
  Something like CASSANDRA-8099 will need 2x the baking time of even many
 of
  the more risky changes the project has made. While I wouldn’t want to
 run a
  build with CASSANDRA-8099 in it anytime soon, there are now hundreds of
  other changes blocked, most likely many containing new bugs of their own,
  but have no exposure at all to even the most involved C* developers.
 
  I really think this will be a huge win for the project and I’m super
   thankful for Sylvain, Ariel, Jonathan, Aleksey, and Jake for guiding this
  change to a much more sustainable release model for the entire community.
 
  best,
  kjellman
 
 
   On Mar 18, 2015, at 3:02 PM, Ariel Weisberg 
 ariel.weisb...@datastax.com
  wrote:
  
   Hi,
  
   Keep in mind it is a bug fix release every month and a feature release
  every two months.
  
   For development that is really a two month cycle with all bug fixes
  being backported one release. As a developer if you want to get something
  in a release you have two months and you should be sizing pieces of large
  tasks so they ship at least every two months.
  
   Ariel
   On Mar 18, 2015, at 5:58 PM, Terrance Shepherd tscana...@gmail.com
  wrote:
  
   I like the idea but I agree that every month is a bit aggressive. I
  have no
   say but:
  
   I would say 4 releases a year instead of 12. with 2 months of new
  features
   and 1 month of bug squashing per a release. With the 4th quarter just
  bugs.
  
   I would also proposed 2 year LTS releases for the releases after the
 4th
   quarter. So everyone could get a new feature release every quarter and
  the
   stability of super major versions for 2 years.
  
   On Wed, Mar 18, 2015 at 2:34 PM, Dave Brosius 
 dbros...@mebigfatguy.com
  
   wrote:
  
   It would seem the practical implications of this is that there would
 be
   significantly more development on branches, with potentially more
   significant delays on merging these branches. This would imply to me
  that
   more Jenkins servers would need to be set up to handle auto-testing
 of
  more
   branches, as if feature work spends more time on external branches,
 it
  is
   then likely to be be less tested (even if by accident) as less
  developers
   would be working on that branch. Only when a feature was blessed to
  make it
   to the release-tracked branch, would it become exposed to the
 majority
  of
   developers/testers, etc doing normal running/playing/testing.
  
   This isn't to knock the idea in anyway, just wanted to mention what i
   think the outcome would be.
  
   dave
  
  
  
  
   On Tue, Mar 

Re: 3.0 and the Cassandra release process

2015-03-19 Thread Jason Brown
+1 to this general proposal. I think the time has finally come for us to
try something new, and this sounds legit. Thanks!

On Thu, Mar 19, 2015 at 12:49 AM, Phil Yang ud1...@gmail.com wrote:

 Can I regard the odd version as the development preview and the even
 version as the production ready?

 IMO, as a database infrastructure project, stable is more important than
 other kinds of projects. LTS is a good idea, but if we don't support
 non-LTS releases for enough time to fix their bugs, users on non-LTS
 release may have to upgrade a new major release to fix the bugs and may
 have to handle some new bugs by the new features. I'm afraid that
 eventually people would only think about the LTS one.


 2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:

  +1
 
  On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
  mkjell...@internalcircle.com wrote:
 
   For most of my life I’ve lived on the software bleeding edge both
   personally and professionally. Maybe it’s a personal weakness, but I
  guess
   I get a thrill out of the problem solving aspect?
  
   Recently I came to a bit of an epiphany — the closer I keep to the
 daily
   build — generally the happier I am on a daily basis. Bugs happen, but
 for
   the most part (aside from show stopper bugs), pain points for myself
 in a
    given daily build can generally be debugged to 1 or maybe 2 root
   causes, fixed in ~24 hours, and then life is better the next day again.
  In
   comparison, the old waterfall model generally means taking an
 “official”
   release at some point and waiting for some poor soul (or developer) to
   actually run the thing. No matter how good the QA team is, until it’s
   actually used in the real world, most bugs aren’t found.
  
   If you and your organization can wait 24 hours * number of bugs
  discovered
   after people actually started using the thing, you end up with a
 “usable
   build” around the holy-grail minor X.X.5 release of Cassandra.
  
   I love the idea of the LTS model Jonathan describes because it means
 more
   code can get real testing and “bake” for longer instead of sitting
  largely
   unused on some git repository in a datacenter far far away. A lot of
 code
   has changed between 2.0 and trunk today. The code has diverged to the
  point
   that if you write something for 2.0 (as the most stable major branch
   currently available), merging it forward to 3.0 or after generally
 means
   rewriting it. If the only thing that comes out of this is a smaller
 delta
   of LOC between the deployable version/branch and what we can develop
   against and what QA is focused on I think that’s a massive win.
  
   Something like CASSANDRA-8099 will need 2x the baking time of even many
  of
   the more risky changes the project has made. While I wouldn’t want to
  run a
   build with CASSANDRA-8099 in it anytime soon, there are now hundreds of
   other changes blocked, most likely many containing new bugs of their
 own,
   but have no exposure at all to even the most involved C* developers.
  
   I really think this will be a huge win for the project and I’m super
    thankful for Sylvain, Ariel, Jonathan, Aleksey, and Jake for guiding
 this
   change to a much more sustainable release model for the entire
 community.
  
   best,
   kjellman
  
  
On Mar 18, 2015, at 3:02 PM, Ariel Weisberg 
  ariel.weisb...@datastax.com
   wrote:
   
Hi,
   
Keep in mind it is a bug fix release every month and a feature
 release
   every two months.
   
For development that is really a two month cycle with all bug fixes
   being backported one release. As a developer if you want to get
 something
   in a release you have two months and you should be sizing pieces of
 large
   tasks so they ship at least every two months.
   
Ariel
On Mar 18, 2015, at 5:58 PM, Terrance Shepherd tscana...@gmail.com
 
   wrote:
   
I like the idea but I agree that every month is a bit aggressive. I
   have no
say but:
   
I would say 4 releases a year instead of 12. with 2 months of new
   features
and 1 month of bug squashing per a release. With the 4th quarter
 just
   bugs.
   
I would also proposed 2 year LTS releases for the releases after the
  4th
quarter. So everyone could get a new feature release every quarter
 and
   the
stability of super major versions for 2 years.
   
On Wed, Mar 18, 2015 at 2:34 PM, Dave Brosius 
  dbros...@mebigfatguy.com
   
wrote:
   
It would seem the practical implications of this is that there
 would
  be
significantly more development on branches, with potentially more
significant delays on merging these branches. This would imply to
 me
   that
more Jenkins servers would need to be set up to handle auto-testing
  of
   more
branches, as if feature work spends more time on external branches,
  it
   is
 then likely to be less tested (even if by accident) as fewer
   developers
would be working on that branch. Only 

Re: 3.0 and the Cassandra release process

2015-03-19 Thread Ariel Weisberg
Hi,

I realized one of the documents we didn't send out was the infrastructure
side changes I am looking for. This one is maybe a little rougher as it was
the first one I wrote on the subject.

https://docs.google.com/document/d/1Seku0vPwChbnH3uYYxon0UO-b6LDtSqluZiH--sWWi0/edit?usp=sharing

The goal is to have infrastructure that gives developers as close to
immediate feedback as possible on their code before they merge. Feedback
that is delayed until after merging to trunk should come in a day or two, and
there is a product owner (Michael Shuler) responsible for making sure that
issues are addressed quickly.

QA is going to help by providing developers with better tools for writing
higher-level functional tests that explore all of the functions together,
along with the configuration space, without developers having to do any work
other than plugging in functionality to exercise and then validate
something specific. This kind of harness is hard to get right and make
reliable and expressive, so they have their work cut out for them.
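
A rough sketch of the kind of harness I have in mind (the test author plugs in
an "exercise" step and a "validate" step, and the harness sweeps a
configuration matrix); the config keys and function names below are invented
for illustration and are not the actual tooling:

    import itertools

    # Hypothetical configuration space to sweep; keys and values are made up.
    CONFIG_SPACE = {
        "compaction": ["STCS", "LCS"],
        "compression": [True, False],
        "replication_factor": [1, 3],
    }

    def run_matrix(exercise, validate):
        keys = sorted(CONFIG_SPACE)
        for values in itertools.product(*(CONFIG_SPACE[k] for k in keys)):
            config = dict(zip(keys, values))
            state = exercise(config)    # plug in functionality to exercise
            validate(config, state)     # then validate something specific

    # Example plug-ins: write some rows, then check they all read back.
    def write_rows(config):
        return {"written": 100}

    def check_rows(config, state):
        assert state["written"] == 100, config

    run_matrix(write_rows, check_rows)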

It's going to be an iterative process where the tests improve as new work
introduces missing coverage and as bugs/regressions drive the introduction
of new tests. The monthly retrospective (planning on doing that first of
the month) is also going to help us refine the testing and development
process.

Ariel

On Thu, Mar 19, 2015 at 7:23 AM, Jason Brown jasedbr...@gmail.com wrote:

 +1 to this general proposal. I think the time has finally come for us to
 try something new, and this sounds legit. Thanks!

 On Thu, Mar 19, 2015 at 12:49 AM, Phil Yang ud1...@gmail.com wrote:

  Can I regard the odd version as the development preview and the even
  version as the production ready?
 
  IMO, as a database infrastructure project, stable is more important
 than
  other kinds of projects. LTS is a good idea, but if we don't support
  non-LTS releases for enough time to fix their bugs, users on non-LTS
  release may have to upgrade a new major release to fix the bugs and may
  have to handle some new bugs by the new features. I'm afraid that
  eventually people would only think about the LTS one.
 
 
  2015-03-19 8:48 GMT+08:00 Pavel Yaskevich pove...@gmail.com:
 
   +1
  
   On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
   mkjell...@internalcircle.com wrote:
  
For most of my life I’ve lived on the software bleeding edge both
personally and professionally. Maybe it’s a personal weakness, but I
   guess
I get a thrill out of the problem solving aspect?
   
Recently I came to a bit of an epiphany — the closer I keep to the
  daily
build — generally the happier I am on a daily basis. Bugs happen, but
  for
the most part (aside from show stopper bugs), pain points for myself
  in a
 given daily build can generally be debugged to 1 or maybe 2 root
causes, fixed in ~24 hours, and then life is better the next day
 again.
   In
comparison, the old waterfall model generally means taking an
  “official”
release at some point and waiting for some poor soul (or developer)
 to
actually run the thing. No matter how good the QA team is, until it’s
actually used in the real world, most bugs aren’t found.
   
If you and your organization can wait 24 hours * number of bugs
   discovered
after people actually started using the thing, you end up with a
  “usable
build” around the holy-grail minor X.X.5 release of Cassandra.
   
I love the idea of the LTS model Jonathan describes because it means
  more
code can get real testing and “bake” for longer instead of sitting
   largely
unused on some git repository in a datacenter far far away. A lot of
  code
has changed between 2.0 and trunk today. The code has diverged to the
   point
that if you write something for 2.0 (as the most stable major branch
currently available), merging it forward to 3.0 or after generally
  means
rewriting it. If the only thing that comes out of this is a smaller
  delta
of LOC between the deployable version/branch and what we can develop
against and what QA is focused on I think that’s a massive win.
   
Something like CASSANDRA-8099 will need 2x the baking time of even
 many
   of
the more risky changes the project has made. While I wouldn’t want to
   run a
build with CASSANDRA-8099 in it anytime soon, there are now hundreds
 of
other changes blocked, most likely many containing new bugs of their
  own,
but have no exposure at all to even the most involved C* developers.
   
I really think this will be a huge win for the project and I’m super
 thankful for Sylvain, Ariel, Jonathan, Aleksey, and Jake for guiding
  this
change to a much more sustainable release model for the entire
  community.
   
best,
kjellman
   
   
 On Mar 18, 2015, at 3:02 PM, Ariel Weisberg 
   ariel.weisb...@datastax.com
wrote:

 Hi,

 Keep in mind it is a bug fix release every month and a 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Sylvain Lebresne
+1

On Tue, Mar 17, 2015 at 10:06 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.

 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.

 Broadly speaking, we have two constituencies with Cassandra releases:

 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.

 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do less upgrades while remaining in our backwards
 compatibility window.)

 With our current big release every X months model, these users' needs are
 in tension.

 We discussed this six months ago, and ended up with this:

 What if we tried a [four month] release cycle, BUT we would guarantee that
  you could do a rolling upgrade until we bump the supermajor version? So
 2.0
  could upgrade to 3.0 without having to go through 2.1.  (But to go to 3.1
  or 4.0 you would have to go through 3.0.)
 

 Crucially, I added

 Whether this is reasonable depends on how fast we can stabilize releases.
  2.1.0 will be a good test of this.
 

 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
  production use, with “wait for .5 before upgrading” once again looking like
 a good guideline.
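
 A minimal sketch of one reading of the rolling-upgrade rule quoted above
 (upgrades within a major are fine, and you can roll straight to the next
 supermajor's .0 release, but not past it); the function and version strings
 are illustrative only:

     # Illustrative only: encode the "upgrade until we bump the supermajor"
     # rule.  2.0 -> 3.0 is allowed directly; 2.0 -> 3.1 or 4.0 is not.
     def rolling_upgrade_ok(source, target):
         s_major, s_minor = (int(x) for x in source.split(".")[:2])
         t_major, t_minor = (int(x) for x in target.split(".")[:2])
         if t_major == s_major:
             return t_minor >= s_minor
         return t_major == s_major + 1 and t_minor == 0

     assert rolling_upgrade_ok("2.0", "3.0")      # no need to go through 2.1
     assert not rolling_upgrade_ok("2.0", "3.1")  # must go through 3.0 first
     assert not rolling_upgrade_ok("2.0", "4.0")  # must go through 3.0 first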

 I’m starting to think that the entire model of “write a bunch of new
 features all at once and then try to stabilize it for release” is broken.
 We’ve been trying that for years and empirically speaking the evidence is
 that it just doesn’t work, either from a stability standpoint or even just
 shipping on time.

 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it’s super tempting to
 slip in “just one” new feature into bugfix releases, and I’m as guilty of
 that as anyone.

 For similar reasons, it’s difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep doing what we’ve been doing, our choices are to
 either delay 3.0 further while we finish and stabilize these, or we wait
 nine months to a year for the next release.  Either way, one of our
 constituencies gets disappointed.

 So, I’d like to try something different.  I think we were on the right
 track with shorter releases with more compatibility.  But I’d like to throw
 in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
 architectures and process shrinks instead of trying to do both at once.  We
 can do something similar here:

 One month releases.  Period.  If it’s not done, it can wait.
 *Every other release only accepts bug fixes.*

 By itself, one-month releases are going to dramatically reduce the
 complexity of testing and debugging new releases -- and bugs that do slip
 past us will only affect a smaller percentage of users, avoiding the “big
 release has a bunch of bugs no one has seen before and pretty much everyone
 is hit by something” scenario.  But by adding in the second rule, I think
 we have a real chance to make a quantum leap here: stable, production-ready
 releases every two months.

 So here is my proposal for 3.0:

 We’re just about ready to start serious review of 8099.  When that’s done,
 we branch 3.0 and cut a beta and then release candidates.  Whatever isn’t
 done by then, has to wait; unlike prior betas, we will only accept bug
 fixes into 3.0 after branching.

 One month after 3.0, we will ship 3.1 (with new features).  At the same
 time, we will branch 3.2.  New features in trunk will go into 3.3.  The 3.2
 branch will only get bug fixes.  We will maintain backwards compatibility
 for all of 3.x; eventually (no less than a year) we will pick a release to
 be 4.0, and drop deprecated features and old backwards 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Robert Stupp
+1

I also appreciate Ariel’s effort. The improved CI integration is great: being 
able to run a huge number of tests on different platforms against one's 
development branch is a huge improvement.


 Am 17.03.2015 um 22:06 schrieb Jonathan Ellis jbel...@gmail.com:
 
 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.
 
 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.
 
 Broadly speaking, we have two constituencies with Cassandra releases:
 
 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.
 
 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do less upgrades while remaining in our backwards
 compatibility window.)
 
 With our current big release every X months model, these users' needs are
 in tension.
 
 We discussed this six months ago, and ended up with this:
 
 What if we tried a [four month] release cycle, BUT we would guarantee that
 you could do a rolling upgrade until we bump the supermajor version? So 2.0
 could upgrade to 3.0 without having to go through 2.1.  (But to go to 3.1
 or 4.0 you would have to go through 3.0.)
 
 
 Crucially, I added
 
 Whether this is reasonable depends on how fast we can stabilize releases.
 2.1.0 will be a good test of this.
 
 
 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
 production use, with wait for .5 before upgrading once again looking like
 a good guideline.
 
 I’m starting to think that the entire model of “write a bunch of new
 features all at once and then try to stabilize it for release” is broken.
 We’ve been trying that for years and empirically speaking the evidence is
 that it just doesn’t work, either from a stability standpoint or even just
 shipping on time.
 
 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it’s super tempting to
 slip in “just one” new feature into bugfix releases, and I’m as guilty of
 that as anyone.
 
 For similar reasons, it’s difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep doing what we’ve been doing, our choices are to
 either delay 3.0 further while we finish and stabilize these, or we wait
 nine months to a year for the next release.  Either way, one of our
 constituencies gets disappointed.
 
 So, I’d like to try something different.  I think we were on the right
 track with shorter releases with more compatibility.  But I’d like to throw
 in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
 architectures and process shrinks instead of trying to do both at once.  We
 can do something similar here:
 
 One month releases.  Period.  If it’s not done, it can wait.
 *Every other release only accepts bug fixes.*
 
 By itself, one-month releases are going to dramatically reduce the
 complexity of testing and debugging new releases -- and bugs that do slip
 past us will only affect a smaller percentage of users, avoiding the “big
 release has a bunch of bugs no one has seen before and pretty much everyone
 is hit by something” scenario.  But by adding in the second rule, I think
 we have a real chance to make a quantum leap here: stable, production-ready
 releases every two months.
 
 So here is my proposal for 3.0:
 
 We’re just about ready to start serious review of 8099.  When that’s done,
 we branch 3.0 and cut a beta and then release candidates.  Whatever isn’t
 done by then, has to wait; unlike prior betas, we will only accept bug
 fixes into 3.0 after branching.
 
 One month after 3.0, we will ship 3.1 (with new features).  At the same
 time, we will branch 3.2.  New features in trunk will go into 3.3.  The 3.2
 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Gary Dusbabek
+1. This sounds like a step in a better direction.

Gary.

On Tue, Mar 17, 2015 at 4:06 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.

 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.

 Broadly speaking, we have two constituencies with Cassandra releases:

 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.

 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do less upgrades while remaining in our backwards
 compatibility window.)

 With our current big release every X months model, these users' needs are
 in tension.

 We discussed this six months ago, and ended up with this:

 What if we tried a [four month] release cycle, BUT we would guarantee that
  you could do a rolling upgrade until we bump the supermajor version? So
 2.0
  could upgrade to 3.0 without having to go through 2.1.  (But to go to 3.1
  or 4.0 you would have to go through 3.0.)
 

 Crucially, I added

 Whether this is reasonable depends on how fast we can stabilize releases.
  2.1.0 will be a good test of this.
 

 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
 production use, with wait for .5 before upgrading once again looking like
 a good guideline.

 I'm starting to think that the entire model of "write a bunch of new
 features all at once and then try to stabilize it for release" is broken.
 We've been trying that for years and empirically speaking the evidence is
 that it just doesn't work, either from a stability standpoint or even just
 shipping on time.

 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it's super tempting to
 slip in "just one" new feature into bugfix releases, and I'm as guilty of
 that as anyone.

 For similar reasons, it's difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep doing what we've been doing, our choices are to
 either delay 3.0 further while we finish and stabilize these, or we wait
 nine months to a year for the next release.  Either way, one of our
 constituencies gets disappointed.

 So, I'd like to try something different.  I think we were on the right
 track with shorter releases with more compatibility.  But I'd like to throw
 in a twist.  Intel cuts down on risk with a "tick-tock" schedule for new
 architectures and process shrinks instead of trying to do both at once.  We
 can do something similar here:

 One month releases.  Period.  If it's not done, it can wait.
 *Every other release only accepts bug fixes.*

 By itself, one-month releases are going to dramatically reduce the
 complexity of testing and debugging new releases -- and bugs that do slip
 past us will only affect a smaller percentage of users, avoiding the "big
 release has a bunch of bugs no one has seen before and pretty much everyone
 is hit by something" scenario.  But by adding in the second rule, I think
 we have a real chance to make a quantum leap here: stable, production-ready
 releases every two months.

 So here is my proposal for 3.0:

 We're just about ready to start serious review of 8099.  When that's done,
 we branch 3.0 and cut a beta and then release candidates.  Whatever isn't
 done by then, has to wait; unlike prior betas, we will only accept bug
 fixes into 3.0 after branching.

 One month after 3.0, we will ship 3.1 (with new features).  At the same
 time, we will branch 3.2.  New features in trunk will go into 3.3.  The 3.2
 branch will only get bug fixes.  We will maintain backwards compatibility
 for all of 3.x; eventually (no less than a year) we will pick a release to
 be 4.0, 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Jake Luciani
+1

On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.

 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.

 Broadly speaking, we have two constituencies with Cassandra releases:

 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.

 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do less upgrades while remaining in our backwards
 compatibility window.)

 With our current big release every X months model, these users' needs are
 in tension.

 We discussed this six months ago, and ended up with this:

 What if we tried a [four month] release cycle, BUT we would guarantee that
 you could do a rolling upgrade until we bump the supermajor version? So 2.0
 could upgrade to 3.0 without having to go through 2.1.  (But to go to 3.1
 or 4.0 you would have to go through 3.0.)


 Crucially, I added

 Whether this is reasonable depends on how fast we can stabilize releases.
 2.1.0 will be a good test of this.


 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
 production use, with wait for .5 before upgrading once again looking like
 a good guideline.

 I’m starting to think that the entire model of “write a bunch of new
 features all at once and then try to stabilize it for release” is broken.
 We’ve been trying that for years and empirically speaking the evidence is
 that it just doesn’t work, either from a stability standpoint or even just
 shipping on time.

 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it’s super tempting to
 slip in “just one” new feature into bugfix releases, and I’m as guilty of
 that as anyone.

 For similar reasons, it’s difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep doing what we’ve been doing, our choices are to
 either delay 3.0 further while we finish and stabilize these, or we wait
 nine months to a year for the next release.  Either way, one of our
 constituencies gets disappointed.

 So, I’d like to try something different.  I think we were on the right
 track with shorter releases with more compatibility.  But I’d like to throw
 in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
 architectures and process shrinks instead of trying to do both at once.  We
 can do something similar here:

 One month releases.  Period.  If it’s not done, it can wait.
 *Every other release only accepts bug fixes.*

 By itself, one-month releases are going to dramatically reduce the
 complexity of testing and debugging new releases -- and bugs that do slip
 past us will only affect a smaller percentage of users, avoiding the “big
 release has a bunch of bugs no one has seen before and pretty much everyone
 is hit by something” scenario.  But by adding in the second rule, I think
 we have a real chance to make a quantum leap here: stable, production-ready
 releases every two months.

 So here is my proposal for 3.0:

 We’re just about ready to start serious review of 8099.  When that’s done,
 we branch 3.0 and cut a beta and then release candidates.  Whatever isn’t
 done by then, has to wait; unlike prior betas, we will only accept bug
 fixes into 3.0 after branching.

 One month after 3.0, we will ship 3.1 (with new features).  At the same
 time, we will branch 3.2.  New features in trunk will go into 3.3.  The 3.2
 branch will only get bug fixes.  We will maintain backwards compatibility
 for all of 3.x; eventually (no less than a year) we will pick a release to
 be 4.0, and drop deprecated features and old backwards 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Aleksey Yeschenko
+1

-- 
AY

On March 17, 2015 at 14:07:03, Jonathan Ellis (jbel...@gmail.com) wrote:

Cassandra 2.1 was released in September, which means that if we were on  
track with our stated goal of six month releases, 3.0 would be done about  
now. Instead, we haven't even delivered a beta. The immediate cause this  
time is blocking for 8099  
https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is  
that nobody should really be surprised. Something always comes up -- we've  
averaged about nine months since 1.0, with 2.1 taking an entire year.  

We could make theory align with reality by acknowledging, if nine months  
is our 'natural' release schedule, then so be it. But I think we can do  
better.  

Broadly speaking, we have two constituencies with Cassandra releases:  

First, we have the users who are building or porting an application on  
Cassandra. These users want the newest features to make their job easier.  
If 2.1.0 has a few bugs, it's not the end of the world. They have time to  
wait for 2.1.x to stabilize while they write their code. They would like  
to see us deliver on our six month schedule or even faster.  

Second, we have the users who have an application in production. These  
users, or their bosses, want Cassandra to be as stable as possible.  
Assuming they deploy on a stable release like 2.0.12, they don't want to  
touch it. They would like to see us release *less* often. (Because that  
means they have to do less upgrades while remaining in our backwards  
compatibility window.)  

With our current big release every X months model, these users' needs are  
in tension.  

We discussed this six months ago, and ended up with this:  

What if we tried a [four month] release cycle, BUT we would guarantee that  
 you could do a rolling upgrade until we bump the supermajor version? So 2.0  
 could upgrade to 3.0 without having to go through 2.1. (But to go to 3.1  
 or 4.0 you would have to go through 3.0.)  
  

Crucially, I added  

Whether this is reasonable depends on how fast we can stabilize releases.  
 2.1.0 will be a good test of this.  
  

Unfortunately, even after DataStax hired half a dozen full-time test  
engineers, 2.1.0 continued the proud tradition of being unready for  
production use, with wait for .5 before upgrading once again looking like  
a good guideline.  

I’m starting to think that the entire model of “write a bunch of new  
features all at once and then try to stabilize it for release” is broken.  
We’ve been trying that for years and empirically speaking the evidence is  
that it just doesn’t work, either from a stability standpoint or even just  
shipping on time.  

A big reason that it takes us so long to stabilize new releases now is  
that, because our major release cycle is so long, it’s super tempting to  
slip in “just one” new feature into bugfix releases, and I’m as guilty of  
that as anyone.  

For similar reasons, it’s difficult to do a meaningful freeze with big  
feature releases. A look at 3.0 shows why: we have 8099 coming, but we  
also have significant work done (but not finished) on 6230, 7970, 6696, and  
6477, all of which are meaningful improvements that address demonstrated  
user pain. So if we keep doing what we’ve been doing, our choices are to  
either delay 3.0 further while we finish and stabilize these, or we wait  
nine months to a year for the next release. Either way, one of our  
constituencies gets disappointed.  

So, I’d like to try something different. I think we were on the right  
track with shorter releases with more compatibility. But I’d like to throw  
in a twist. Intel cuts down on risk with a “tick-tock” schedule for new  
architectures and process shrinks instead of trying to do both at once. We  
can do something similar here:  

One month releases. Period. If it’s not done, it can wait.  
*Every other release only accepts bug fixes.*  

By itself, one-month releases are going to dramatically reduce the  
complexity of testing and debugging new releases -- and bugs that do slip  
past us will only affect a smaller percentage of users, avoiding the “big  
release has a bunch of bugs no one has seen before and pretty much everyone  
is hit by something” scenario. But by adding in the second rule, I think  
we have a real chance to make a quantum leap here: stable, production-ready  
releases every two months.  

So here is my proposal for 3.0:  

We’re just about ready to start serious review of 8099. When that’s done,  
we branch 3.0 and cut a beta and then release candidates. Whatever isn’t  
done by then, has to wait; unlike prior betas, we will only accept bug  
fixes into 3.0 after branching.  

One month after 3.0, we will ship 3.1 (with new features). At the same  
time, we will branch 3.2. New features in trunk will go into 3.3. The 3.2  
branch will only get bug fixes. We will maintain backwards compatibility  
for all of 3.x; eventually (no less than a year) we will pick a release to  

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Josh McKenzie
+1

On Wed, Mar 18, 2015 at 7:54 AM, Jake Luciani jak...@gmail.com wrote:

 +1

 On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com wrote:
  Cassandra 2.1 was released in September, which means that if we were on
  track with our stated goal of six month releases, 3.0 would be done about
  now.  Instead, we haven't even delivered a beta.  The immediate cause
 this
  time is blocking for 8099
  https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality
 is
  that nobody should really be surprised.  Something always comes up --
 we've
  averaged about nine months since 1.0, with 2.1 taking an entire year.
 
  We could make theory align with reality by acknowledging, if nine months
  is our 'natural' release schedule, then so be it.  But I think we can do
  better.
 
  Broadly speaking, we have two constituencies with Cassandra releases:
 
  First, we have the users who are building or porting an application on
  Cassandra.  These users want the newest features to make their job
 easier.
  If 2.1.0 has a few bugs, it's not the end of the world.  They have time
 to
  wait for 2.1.x to stabilize while they write their code.  They would like
  to see us deliver on our six month schedule or even faster.
 
  Second, we have the users who have an application in production.  These
  users, or their bosses, want Cassandra to be as stable as possible.
  Assuming they deploy on a stable release like 2.0.12, they don't want to
  touch it.  They would like to see us release *less* often.  (Because that
  means they have to do less upgrades while remaining in our backwards
  compatibility window.)
 
  With our current big release every X months model, these users' needs
 are
  in tension.
 
  We discussed this six months ago, and ended up with this:
 
  What if we tried a [four month] release cycle, BUT we would guarantee
 that
  you could do a rolling upgrade until we bump the supermajor version? So
 2.0
  could upgrade to 3.0 without having to go through 2.1.  (But to go to
 3.1
  or 4.0 you would have to go through 3.0.)
 
 
  Crucially, I added
 
  Whether this is reasonable depends on how fast we can stabilize releases.
  2.1.0 will be a good test of this.
 
 
  Unfortunately, even after DataStax hired half a dozen full-time test
  engineers, 2.1.0 continued the proud tradition of being unready for
  production use, with wait for .5 before upgrading once again looking
 like
  a good guideline.
 
  I’m starting to think that the entire model of “write a bunch of new
  features all at once and then try to stabilize it for release” is broken.
  We’ve been trying that for years and empirically speaking the evidence is
  that it just doesn’t work, either from a stability standpoint or even
 just
  shipping on time.
 
  A big reason that it takes us so long to stabilize new releases now is
  that, because our major release cycle is so long, it’s super tempting to
  slip in “just one” new feature into bugfix releases, and I’m as guilty of
  that as anyone.
 
  For similar reasons, it’s difficult to do a meaningful freeze with big
  feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
  also have significant work done (but not finished) on 6230, 7970, 6696,
 and
  6477, all of which are meaningful improvements that address demonstrated
  user pain.  So if we keep doing what we’ve been doing, our choices are to
  either delay 3.0 further while we finish and stabilize these, or we wait
  nine months to a year for the next release.  Either way, one of our
  constituencies gets disappointed.
 
  So, I’d like to try something different.  I think we were on the right
  track with shorter releases with more compatibility.  But I’d like to
 throw
  in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
  architectures and process shrinks instead of trying to do both at once.
 We
  can do something similar here:
 
  One month releases.  Period.  If it’s not done, it can wait.
  *Every other release only accepts bug fixes.*
 
  By itself, one-month releases are going to dramatically reduce the
  complexity of testing and debugging new releases -- and bugs that do slip
  past us will only affect a smaller percentage of users, avoiding the “big
  release has a bunch of bugs no one has seen before and pretty much
 everyone
  is hit by something” scenario.  But by adding in the second rule, I think
  we have a real chance to make a quantum leap here: stable,
 production-ready
  releases every two months.
 
  So here is my proposal for 3.0:
 
  We’re just about ready to start serious review of 8099.  When that’s
 done,
  we branch 3.0 and cut a beta and then release candidates.  Whatever isn’t
  done by then, has to wait; unlike prior betas, we will only accept bug
  fixes into 3.0 after branching.
 
  One month after 3.0, we will ship 3.1 (with new features).  At the same
  time, we will branch 3.2.  New features in trunk will go into 3.3.  The
 3.2
  branch will only get bug 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Michael Kjellman
For most of my life I’ve lived on the software bleeding edge both personally 
and professionally. Maybe it’s a personal weakness, but I guess I get a thrill 
out of the problem solving aspect?

Recently I came to a bit of an epiphany — the closer I keep to the daily build 
— generally the happier I am on a daily basis. Bugs happen, but for the most 
part (aside from show stopper bugs), pain points for myself in a given daily 
build can generally can be debugged to 1 or maybe 2 root causes, fixed in ~24 
hours, and then life is better the next day again. In comparison, the old 
waterfall model generally means taking an “official” release at some point and 
waiting for some poor soul (or developer) to actually run the thing. No matter 
how good the QA team is, until it’s actually used in the real world, most bugs 
aren’t found.

If you and your organization can wait 24 hours * number of bugs discovered 
after people actually started using the thing, you end up with a “usable build” 
around the holy-grail minor X.X.5 release of Cassandra.
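
A toy sketch of that back-of-envelope estimate (Python; the helper name and the 
sample bug counts are invented for illustration, the only grounded input being 
the roughly 24-hour turnaround per bug described above):

    # Toy model only: time until a "usable build" is roughly 24 hours per
    # bug that real users actually hit after the release ships.
    def hours_until_usable(bugs_hit_in_production, hours_per_fix=24):
        return bugs_hit_in_production * hours_per_fix

    if __name__ == "__main__":
        for bugs in (5, 20, 60):
            print(bugs, "bugs -> ~", hours_until_usable(bugs) // 24, "days")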

I love the idea of the LTS model Jonathan describes because it means more code 
can get real testing and “bake” for longer instead of sitting largely unused on 
some git repository in a datacenter far far away. A lot of code has changed 
between 2.0 and trunk today. The code has diverged to the point that if you 
write something for 2.0 (as the most stable major branch currently available), 
merging it forward to 3.0 or after generally means rewriting it. If the only 
thing that comes out of this is a smaller delta of LOC between the deployable 
version/branch and what we can develop against and what QA is focused on I 
think that’s a massive win.

Something like CASSANDRA-8099 will need 2x the baking time of even many of the 
more risky changes the project has made. While I wouldn’t want to run a build 
with CASSANDRA-8099 in it anytime soon, there are now hundreds of other changes 
blocked, most likely many containing new bugs of their own, but with no 
exposure at all to even the most involved C* developers.

I really think this will be a huge win for the project and I’m super thankful 
for Sylvain, Ariel, Jonathan, Aleksey, and Jake for guiding this change to a 
much more sustainable release model for the entire community.

best,
kjellman

 
 On Mar 18, 2015, at 3:02 PM, Ariel Weisberg ariel.weisb...@datastax.com 
 wrote:
 
 Hi,
 
 Keep in mind it is a bug fix release every month and a feature release every 
 two months.
 
 For development that is really a two month cycle with all bug fixes being 
 backported one release. As a developer if you want to get something in a 
 release you have two months and you should be sizing pieces of large tasks so 
 they ship at least every two months.
 
 Ariel
 On Mar 18, 2015, at 5:58 PM, Terrance Shepherd tscana...@gmail.com wrote:
 
 I like the idea but I agree that every month is a bit aggressive. I have no
 say but:
 
 I would say 4 releases a year instead of 12. with 2 months of new features
 and 1 month of bug squashing per a release. With the 4th quarter just bugs.
 
 I would also proposed 2 year LTS releases for the releases after the 4th
 quarter. So everyone could get a new feature release every quarter and the
 stability of super major versions for 2 years.
 
 On Wed, Mar 18, 2015 at 2:34 PM, Dave Brosius dbros...@mebigfatguy.com
 wrote:
 
 It would seem the practical implications of this is that there would be
 significantly more development on branches, with potentially more
 significant delays on merging these branches. This would imply to me that
 more Jenkins servers would need to be set up to handle auto-testing of more
 branches, as if feature work spends more time on external branches, it is
 then likely to be less tested (even if by accident) as less developers
 would be working on that branch. Only when a feature was blessed to make it
 to the release-tracked branch, would it become exposed to the majority of
 developers/testers, etc doing normal running/playing/testing.
 
 This isn't to knock the idea in anyway, just wanted to mention what i
 think the outcome would be.
 
 dave
 
 
 
 
 On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
 Cassandra 2.1 was released in September, which means that if we were
 on
 track with our stated goal of six month releases, 3.0 would be done
 about
 now.  Instead, we haven't even delivered a beta.  The immediate cause
 this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the
 reality
 is
 that nobody should really be surprised.  Something always comes up --
 we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.
 
 We could make theory align with reality by acknowledging, if nine
 months
 is our 'natural' release schedule, then so be it.  But I think we
 can
 do
 better.
 
 Broadly speaking, we have two constituencies with Cassandra releases:
 
 First, we have the users who are building or 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Pavel Yaskevich
+1

On Wed, Mar 18, 2015 at 3:50 PM, Michael Kjellman 
mkjell...@internalcircle.com wrote:

 For most of my life I’ve lived on the software bleeding edge both
 personally and professionally. Maybe it’s a personal weakness, but I guess
 I get a thrill out of the problem solving aspect?

 Recently I came to a bit of an epiphany — the closer I keep to the daily
 build — generally the happier I am on a daily basis. Bugs happen, but for
 the most part (aside from show stopper bugs), pain points for myself in a
 given daily build can generally be debugged to 1 or maybe 2 root
 causes, fixed in ~24 hours, and then life is better the next day again. In
 comparison, the old waterfall model generally means taking an “official”
 release at some point and waiting for some poor soul (or developer) to
 actually run the thing. No matter how good the QA team is, until it’s
 actually used in the real world, most bugs aren’t found.

 If you and your organization can wait 24 hours * number of bugs discovered
 after people actually started using the thing, you end up with a “usable
 build” around the holy-grail minor X.X.5 release of Cassandra.

 I love the idea of the LTS model Jonathan describes because it means more
 code can get real testing and “bake” for longer instead of sitting largely
 unused on some git repository in a datacenter far far away. A lot of code
 has changed between 2.0 and trunk today. The code has diverged to the point
 that if you write something for 2.0 (as the most stable major branch
 currently available), merging it forward to 3.0 or after generally means
 rewriting it. If the only thing that comes out of this is a smaller delta
 of LOC between the deployable version/branch and what we can develop
 against and what QA is focused on I think that’s a massive win.

 Something like CASSANDRA-8099 will need 2x the baking time of even many of
 the more risky changes the project has made. While I wouldn’t want to run a
 build with CASSANDRA-8099 in it anytime soon, there are now hundreds of
 other changes blocked, most likely many containing new bugs of their own,
 but have no exposure at all to even the most involved C* developers.

 I really think this will be a huge win for the project and I’m super
 thankful for Sylvain, Ariel, Jonathan, Aleksey, and Jake for guiding this
 change to a much more sustainable release model for the entire community.

 best,
 kjellman


  On Mar 18, 2015, at 3:02 PM, Ariel Weisberg ariel.weisb...@datastax.com
 wrote:
 
  Hi,
 
  Keep in mind it is a bug fix release every month and a feature release
 every two months.
 
  For development that is really a two month cycle with all bug fixes
 being backported one release. As a developer if you want to get something
 in a release you have two months and you should be sizing pieces of large
 tasks so they ship at least every two months.
 
  Ariel
  On Mar 18, 2015, at 5:58 PM, Terrance Shepherd tscana...@gmail.com
 wrote:
 
  I like the idea but I agree that every month is a bit aggressive. I
 have no
  say but:
 
  I would say 4 releases a year instead of 12. with 2 months of new
 features
  and 1 month of bug squashing per a release. With the 4th quarter just
 bugs.
 
  I would also proposed 2 year LTS releases for the releases after the 4th
  quarter. So everyone could get a new feature release every quarter and
 the
  stability of super major versions for 2 years.
 
  On Wed, Mar 18, 2015 at 2:34 PM, Dave Brosius dbros...@mebigfatguy.com
 
  wrote:
 
  It would seem the practical implications of this is that there would be
  significantly more development on branches, with potentially more
  significant delays on merging these branches. This would imply to me
 that
  more Jenkins servers would need to be set up to handle auto-testing of
 more
  branches, as if feature work spends more time on external branches, it
 is
  then likely to be less tested (even if by accident) as less
 developers
  would be working on that branch. Only when a feature was blessed to
 make it
  to the release-tracked branch, would it become exposed to the majority
 of
  developers/testers, etc doing normal running/playing/testing.
 
  This isn't to knock the idea in anyway, just wanted to mention what i
  think the outcome would be.
 
  dave
 
 
 
 
  On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com
  wrote:
  Cassandra 2.1 was released in September, which means that if we
 were
  on
  track with our stated goal of six month releases, 3.0 would be done
  about
  now.  Instead, we haven't even delivered a beta.  The immediate
 cause
  this
  time is blocking for 8099
  https://issues.apache.org/jira/browse/CASSANDRA-8099, but the
  reality
  is
  that nobody should really be surprised.  Something always comes up
 --
  we've
  averaged about nine months since 1.0, with 2.1 taking an entire
 year.
 
  We could make theory align with reality by acknowledging, if nine
  months
  is our 'natural' release schedule, then so be it.  

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Jonathan Haddad
If every other release is a bug fix release, would the versioning go:

3.1.0 -- feature release
3.1.1 -- bug fix release

Eventually it seems like it might be possible to push out a bug fix
release more frequently than once a month?
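
For concreteness, here is a minimal sketch (Python; the helper is hypothetical, 
not project tooling) of how the parity-based numbering in Jonathan's proposal 
would classify releases, the scheme that an x.y.z numbering like the above 
would map onto:

    # Hypothetical helper, for illustration only. Under the proposed
    # tick-tock scheme, odd 3.x releases carry new features ("tick") and
    # even 3.x releases take only bug fixes ("tock"); 3.0 is the baseline.
    def release_kind(version):
        major, minor = (int(p) for p in version.split(".")[:2])
        if major != 3:
            raise ValueError("sketch only covers the proposed 3.x series")
        if minor == 0:
            return "baseline (3.0)"
        return "feature (tick)" if minor % 2 == 1 else "bug-fix only (tock)"

    if __name__ == "__main__":
        for v in ("3.0", "3.1", "3.2", "3.3.1"):
            print(v, "->", release_kind(v))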

On Wed, Mar 18, 2015 at 7:59 AM Josh McKenzie josh.mcken...@datastax.com
wrote:

 +1

 On Wed, Mar 18, 2015 at 7:54 AM, Jake Luciani jak...@gmail.com wrote:

  +1
 
  On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
   Cassandra 2.1 was released in September, which means that if we were on
   track with our stated goal of six month releases, 3.0 would be done
 about
   now.  Instead, we haven't even delivered a beta.  The immediate cause
  this
   time is blocking for 8099
   https://issues.apache.org/jira/browse/CASSANDRA-8099, but the
 reality
  is
   that nobody should really be surprised.  Something always comes up --
  we've
   averaged about nine months since 1.0, with 2.1 taking an entire year.
  
   We could make theory align with reality by acknowledging, if nine
 months
   is our 'natural' release schedule, then so be it.  But I think we can
 do
   better.
  
   Broadly speaking, we have two constituencies with Cassandra releases:
  
   First, we have the users who are building or porting an application on
   Cassandra.  These users want the newest features to make their job
  easier.
   If 2.1.0 has a few bugs, it's not the end of the world.  They have time
  to
   wait for 2.1.x to stabilize while they write their code.  They would
 like
   to see us deliver on our six month schedule or even faster.
  
   Second, we have the users who have an application in production.  These
   users, or their bosses, want Cassandra to be as stable as possible.
   Assuming they deploy on a stable release like 2.0.12, they don't want
 to
   touch it.  They would like to see us release *less* often.  (Because
 that
   means they have to do less upgrades while remaining in our backwards
   compatibility window.)
  
   With our current big release every X months model, these users' needs
  are
   in tension.
  
   We discussed this six months ago, and ended up with this:
  
   What if we tried a [four month] release cycle, BUT we would guarantee
  that
   you could do a rolling upgrade until we bump the supermajor version?
 So
  2.0
   could upgrade to 3.0 without having to go through 2.1.  (But to go to
  3.1
   or 4.0 you would have to go through 3.0.)
  
  
   Crucially, I added
  
   Whether this is reasonable depends on how fast we can stabilize
 releases.
   2.1.0 will be a good test of this.
  
  
   Unfortunately, even after DataStax hired half a dozen full-time test
   engineers, 2.1.0 continued the proud tradition of being unready for
   production use, with wait for .5 before upgrading once again looking
  like
   a good guideline.
  
   I’m starting to think that the entire model of “write a bunch of new
   features all at once and then try to stabilize it for release” is
 broken.
   We’ve been trying that for years and empirically speaking the evidence
 is
   that it just doesn’t work, either from a stability standpoint or even
  just
   shipping on time.
  
   A big reason that it takes us so long to stabilize new releases now is
   that, because our major release cycle is so long, it’s super tempting
 to
   slip in “just one” new feature into bugfix releases, and I’m as guilty
 of
   that as anyone.
  
   For similar reasons, it’s difficult to do a meaningful freeze with big
   feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
   also have significant work done (but not finished) on 6230, 7970, 6696,
  and
   6477, all of which are meaningful improvements that address
 demonstrated
   user pain.  So if we keep doing what we’ve been doing, our choices are
 to
   either delay 3.0 further while we finish and stabilize these, or we
 wait
   nine months to a year for the next release.  Either way, one of our
   constituencies gets disappointed.
  
   So, I’d like to try something different.  I think we were on the right
   track with shorter releases with more compatibility.  But I’d like to
  throw
   in a twist.  Intel cuts down on risk with a “tick-tock” schedule for
 new
   architectures and process shrinks instead of trying to do both at once.
  We
   can do something similar here:
  
   One month releases.  Period.  If it’s not done, it can wait.
   *Every other release only accepts bug fixes.*
  
   By itself, one-month releases are going to dramatically reduce the
   complexity of testing and debugging new releases -- and bugs that do
 slip
   past us will only affect a smaller percentage of users, avoiding the
 “big
   release has a bunch of bugs no one has seen before and pretty much
  everyone
   is hit by something” scenario.  But by adding in the second rule, I
 think
   we have a real chance to make a quantum leap here: stable,
  production-ready
   releases every two months.
  
   So here is my 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Terrance Shepherd
I like the idea but I agree that every month is a bit aggressive. I have no
say but:

I would say 4 releases a year instead of 12, with 2 months of new features
and 1 month of bug squashing per release, and the 4th quarter just bugs.

I would also propose 2-year LTS releases for the releases after the 4th
quarter. So everyone could get a new feature release every quarter and the
stability of supermajor versions for 2 years.

On Wed, Mar 18, 2015 at 2:34 PM, Dave Brosius dbros...@mebigfatguy.com
wrote:

 It would seem the practical implications of this is that there would be
 significantly more development on branches, with potentially more
 significant delays on merging these branches. This would imply to me that
 more Jenkins servers would need to be set up to handle auto-testing of more
 branches, as if feature work spends more time on external branches, it is
 then likely to be less tested (even if by accident) as less developers
 would be working on that branch. Only when a feature was blessed to make it
 to the release-tracked branch, would it become exposed to the majority of
 developers/testers, etc doing normal running/playing/testing.

 This isn't to knock the idea in anyway, just wanted to mention what i
 think the outcome would be.

 dave



  
  On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
   Cassandra 2.1 was released in September, which means that if we were
 on
   track with our stated goal of six month releases, 3.0 would be done
 about
   now.  Instead, we haven't even delivered a beta.  The immediate cause
  this
   time is blocking for 8099
   https://issues.apache.org/jira/browse/CASSANDRA-8099, but the
 reality
  is
   that nobody should really be surprised.  Something always comes up --
  we've
   averaged about nine months since 1.0, with 2.1 taking an entire year.
  
   We could make theory align with reality by acknowledging, if nine
 months
   is our 'natural' release schedule, then so be it.  But I think we
 can
 do
   better.
  
   Broadly speaking, we have two constituencies with Cassandra releases:
  
   First, we have the users who are building or porting an application
 on
   Cassandra.  These users want the newest features to make their job
  easier.
   If 2.1.0 has a few bugs, it's not the end of the world.  They have
 time
  to
   wait for 2.1.x to stabilize while they write their code.  They would
 like
   to see us deliver on our six month schedule or even faster.
  
   Second, we have the users who have an application in production.
 These
   users, or their bosses, want Cassandra to be as stable as possible.
   Assuming they deploy on a stable release like 2.0.12, they don't want
 to
   touch it.  They would like to see us release *less* often.  (Because
 that
   means they have to do less upgrades while remaining in our backwards
   compatibility window.)
  
   With our current big release every X months model, these users'
 needs
  are
   in tension.
  
   We discussed this six months ago, and ended up with this:
  
   What if we tried a [four month] release cycle, BUT we would guarantee
  that
   you could do a rolling upgrade until we bump the supermajor version?
 So
  2.0
   could upgrade to 3.0 without having to go through 2.1.  (But to go
 to
  3.1
   or 4.0 you would have to go through 3.0.)
  
  
   Crucially, I added
  
   Whether this is reasonable depends on how fast we can stabilize
 releases.
   2.1.0 will be a good test of this.
  
  
   Unfortunately, even after DataStax hired half a dozen full-time test
   engineers, 2.1.0 continued the proud tradition of being unready for
   production use, with wait for .5 before upgrading once again
 looking
  like
   a good guideline.
  
   I’m starting to think that the entire model of “write a bunch of new
   features all at once and then try to stabilize it for release” is
 broken.
   We’ve been trying that for years and empirically speaking the
 evidence
 is
   that it just doesn’t work, either from a stability standpoint or even
  just
   shipping on time.
  
   A big reason that it takes us so long to stabilize new releases now
 is
   that, because our major release cycle is so long, it’s super tempting
 to
   slip in “just one” new feature into bugfix releases, and I’m as
 guilty
 of
   that as anyone.
  
   For similar reasons, it’s difficult to do a meaningful freeze with
 big
   feature releases.  A look at 3.0 shows why: we have 8099 coming, but
 we
   also have significant work done (but not finished) on 6230, 7970,
 6696,
  and
   6477, all of which are meaningful improvements that address
 demonstrated
   user pain.  So if we keep doing what we’ve been doing, our choices
 are
 to
   either delay 3.0 further while we finish and stabilize these, or we
 wait
   nine months to a year for the next release.  Either way, one of our
   constituencies gets disappointed.
  
   So, I’d like to try something different.  I think we were on the
 right
   track with shorter 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Ariel Weisberg
Hi,

Long-lived feature branches are already a thing and, IMO, orthogonal to release 
frequency. The goal is that developers will implement larger features as 
smaller tested components that have already shipped. Sometimes this means 
working in a less destructive fashion so you can always ship a working 
implementation of everything (which is a mixed bag).

Developers should be able to put their work on trunk faster because they will 
know before the merge what the impact of their changes will be. That is why we 
are emphasizing having Jenkins run on all commits (trunk and branch). We want 
the testing that is performed on branches to be as close as possible to the 
testing performed on trunk. Once something is merged to trunk we want it to be 
about as tested as it is going to get within a day or two.

Part of releasing more frequently is getting away from relying on 
developers/testers running things and moving towards automated testing that 
exercises the database the same way users do, with the same expectations of 
correctness. We also have to address the process issues that regularly cause 
the tests we do have to show that trunk is not releasable.

Ariel
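
A minimal sketch of the per-branch job this implies (Python; it assumes a git 
working copy and an "ant test" target, and all names here are invented for 
illustration rather than actual project infrastructure):

    # Illustrative stand-in for a per-branch CI job: check out the branch,
    # build, and run the unit test suite, so branches get the same test run
    # that trunk does.
    import subprocess
    import sys

    def run(cmd):
        print("+", " ".join(cmd))
        return subprocess.call(cmd)

    def test_branch(branch):
        if run(["git", "checkout", branch]) != 0:
            return False
        if run(["ant", "clean", "jar"]) != 0:
            return False
        return run(["ant", "test"]) == 0

    if __name__ == "__main__":
        branch = sys.argv[1] if len(sys.argv) > 1 else "trunk"
        ok = test_branch(branch)
        print(branch, "PASS" if ok else "FAIL")
        sys.exit(0 if ok else 1)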

 On Mar 18, 2015, at 5:34 PM, Dave Brosius dbros...@mebigfatguy.com wrote:
 
 It would seem the practical implications of this is that there would be 
 significantly more development on branches, with potentially more significant 
 delays on merging these branches. This would imply to me that more Jenkins 
 servers would need to be set up to handle auto-testing of more branches, as 
 if feature work spends more time on external branches, it is then likely to 
 be less tested (even if by accident) as less developers would be working 
 on that branch. Only when a feature was blessed to make it to the 
 release-tracked branch, would it become exposed to the majority of 
 developers/testers, etc doing normal running/playing/testing.
 
 This isn't to knock the idea in anyway, just wanted to mention what i think 
 the outcome would be.
 
 dave
 
 
 
  On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
   Cassandra 2.1 was released in September, which means that if we were on
   track with our stated goal of six month releases, 3.0 would be done
 about
   now.  Instead, we haven't even delivered a beta.  The immediate cause
  this
   time is blocking for 8099
   https://issues.apache.org/jira/browse/CASSANDRA-8099, but the
 reality
  is
   that nobody should really be surprised.  Something always comes up --
  we've
   averaged about nine months since 1.0, with 2.1 taking an entire year.
  
   We could make theory align with reality by acknowledging, if nine
 months
   is our 'natural' release schedule, then so be it.  But I think we can
 do
   better.
  
   Broadly speaking, we have two constituencies with Cassandra releases:
  
   First, we have the users who are building or porting an application on
   Cassandra.  These users want the newest features to make their job
  easier.
   If 2.1.0 has a few bugs, it's not the end of the world.  They have time
  to
   wait for 2.1.x to stabilize while they write their code.  They would
 like
   to see us deliver on our six month schedule or even faster.
  
   Second, we have the users who have an application in production.  These
   users, or their bosses, want Cassandra to be as stable as possible.
   Assuming they deploy on a stable release like 2.0.12, they don't want
 to
   touch it.  They would like to see us release *less* often.  (Because
 that
   means they have to do less upgrades while remaining in our backwards
   compatibility window.)
  
   With our current big release every X months model, these users' needs
  are
   in tension.
  
   We discussed this six months ago, and ended up with this:
  
   What if we tried a [four month] release cycle, BUT we would guarantee
  that
   you could do a rolling upgrade until we bump the supermajor version?
 So
  2.0
   could upgrade to 3.0 without having to go through 2.1.  (But to go to
  3.1
   or 4.0 you would have to go through 3.0.)
  
  
   Crucially, I added
  
   Whether this is reasonable depends on how fast we can stabilize
 releases.
   2.1.0 will be a good test of this.
  
  
   Unfortunately, even after DataStax hired half a dozen full-time test
   engineers, 2.1.0 continued the proud tradition of being unready for
   production use, with wait for .5 before upgrading once again looking
  like
   a good guideline.
  
   I’m starting to think that the entire model of “write a bunch of new
   features all at once and then try to stabilize it for release” is
 broken.
   We’ve been trying that for years and empirically speaking the evidence
 is
   that it just doesn’t work, either from a stability standpoint or even
  just
   shipping on time.
  
   A big reason that it takes us so long to stabilize new releases now is
   that, because our major release cycle is so long, it’s super tempting
 to
   slip in “just one” new 

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Ariel Weisberg
Hi,

Keep in mind it is a bug fix release every month and a feature release every 
two months.

For development that is really a two-month cycle, with all bug fixes being 
backported one release. As a developer, if you want to get something into a 
release you have two months, and you should be sizing pieces of large tasks so 
they ship at least every two months.

Ariel
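
To make that cadence concrete, a small illustrative sketch (Python; the start 
date and version numbers are invented) of a monthly release train alternating 
feature and bug-fix releases:

    # Illustrative only: a release every month, odd ones carrying features
    # ("tick") and even ones bug fixes only ("tock"), so a feature release
    # ships every two months.
    def tick_tock_schedule(start_year, start_month, count):
        year, month = start_year, start_month
        for i in range(1, count + 1):
            kind = "feature (tick)" if i % 2 == 1 else "bug-fix (tock)"
            yield "%d-%02d" % (year, month), "3.%d" % i, kind
            month += 1
            if month > 12:
                month, year = 1, year + 1

    if __name__ == "__main__":
        for when, version, kind in tick_tock_schedule(2015, 11, 6):
            print(when, version, kind)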
 On Mar 18, 2015, at 5:58 PM, Terrance Shepherd tscana...@gmail.com wrote:
 
 I like the idea but I agree that every month is a bit aggressive. I have no
 say but:
 
 I would say 4 releases a year instead of 12. with 2 months of new features
 and 1 month of bug squashing per a release. With the 4th quarter just bugs.
 
 I would also proposed 2 year LTS releases for the releases after the 4th
 quarter. So everyone could get a new feature release every quarter and the
 stability of super major versions for 2 years.
 
 On Wed, Mar 18, 2015 at 2:34 PM, Dave Brosius dbros...@mebigfatguy.com
 wrote:
 
 It would seem the practical implications of this is that there would be
 significantly more development on branches, with potentially more
 significant delays on merging these branches. This would imply to me that
 more Jenkins servers would need to be set up to handle auto-testing of more
 branches, as if feature work spends more time on external branches, it is
 then likely to be less tested (even if by accident) as less developers
 would be working on that branch. Only when a feature was blessed to make it
 to the release-tracked branch, would it become exposed to the majority of
 developers/testers, etc doing normal running/playing/testing.
 
 This isn't to knock the idea in anyway, just wanted to mention what i
 think the outcome would be.
 
 dave
 
 
 
 
 On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.

 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.

 Broadly speaking, we have two constituencies with Cassandra releases:

 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.

 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do fewer upgrades while remaining in our backwards
 compatibility window.)

 With our current big release every X months model, these users' needs are
 in tension.

 We discussed this six months ago, and ended up with this:

 What if we tried a [four month] release cycle, BUT we would guarantee that
 you could do a rolling upgrade until we bump the supermajor version?  So
 2.0 could upgrade to 3.0 without having to go through 2.1.  (But to go to
 3.1 or 4.0 you would have to go through 3.0.)

 Crucially, I added

 Whether this is reasonable depends on how fast we can stabilize releases.
 2.1.0 will be a good test of this.

 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
 production use, with wait for .5 before upgrading once again looking like
 a good guideline.

 I’m starting to think that the entire model of “write a bunch of new
 features all at once and then try to stabilize it for release” is broken.
 We’ve been trying that for years and empirically speaking the evidence is
 that it just doesn’t work, either from a stability standpoint or even just
 shipping on time.

 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it’s super tempting to
 slip in “just one” new feature into bugfix releases, and I’m as guilty of
 that as anyone.

 For similar reasons, it’s difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep

Re: 3.0 and the Cassandra release process

2015-03-18 Thread Dave Brosius
It would seem the practical implication of this is that there would be 
significantly more development on branches, with potentially more 
significant delays in merging those branches. This would imply to me 
that more Jenkins servers would need to be set up to handle auto-testing 
of more branches, since if feature work spends more time on external 
branches it is then likely to be less tested (even if only incidentally), 
as fewer developers would be working on that branch. Only when a feature 
was blessed to make it to the release-tracked branch would it become 
exposed to the majority of developers/testers, etc. doing normal 
running/playing/testing.


This isn't to knock the idea in any way; I just wanted to mention what I 
think the outcome would be.


dave




 On Tue, Mar 17, 2015 at 5:06 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.

 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.

 Broadly speaking, we have two constituencies with Cassandra releases:

 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.

 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do fewer upgrades while remaining in our backwards
 compatibility window.)

 With our current big release every X months model, these users' needs are
 in tension.

 We discussed this six months ago, and ended up with this:

 What if we tried a [four month] release cycle, BUT we would guarantee that
 you could do a rolling upgrade until we bump the supermajor version?  So
 2.0 could upgrade to 3.0 without having to go through 2.1.  (But to go to
 3.1 or 4.0 you would have to go through 3.0.)

 Crucially, I added

 Whether this is reasonable depends on how fast we can stabilize releases.
 2.1.0 will be a good test of this.

 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
 production use, with wait for .5 before upgrading once again looking like
 a good guideline.

 I’m starting to think that the entire model of “write a bunch of new
 features all at once and then try to stabilize it for release” is broken.
 We’ve been trying that for years and empirically speaking the evidence is
 that it just doesn’t work, either from a stability standpoint or even just
 shipping on time.

 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it’s super tempting to
 slip in “just one” new feature into bugfix releases, and I’m as guilty of
 that as anyone.

 For similar reasons, it’s difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep doing what we’ve been doing, our choices are to
 either delay 3.0 further while we finish and stabilize these, or we wait
 nine months to a year for the next release.  Either way, one of our
 constituencies gets disappointed.

 So, I’d like to try something different.  I think we were on the right
 track with shorter releases with more compatibility.  But I’d like to throw
 in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
 architectures and process shrinks instead of trying to do both at once.  We
 can do something similar here:

 One month releases.  Period.  If it’s not done, it can wait.
 *Every other release only accepts bug fixes.*

 By itself, one-month releases are going to dramatically reduce the
 complexity of testing and debugging new releases -- and bugs that do slip
 past us will only affect a smaller percentage of users, avoiding the “big
 release has a bunch of bugs no one has seen before

3.0 and the Cassandra release process

2015-03-17 Thread Jonathan Ellis
Cassandra 2.1 was released in September, which means that if we were on
track with our stated goal of six month releases, 3.0 would be done about
now.  Instead, we haven't even delivered a beta.  The immediate cause this
time is blocking for 8099
https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
that nobody should really be surprised.  Something always comes up -- we've
averaged about nine months since 1.0, with 2.1 taking an entire year.

We could make theory align with reality by acknowledging, if nine months
is our 'natural' release schedule, then so be it.  But I think we can do
better.

Broadly speaking, we have two constituencies with Cassandra releases:

First, we have the users who are building or porting an application on
Cassandra.  These users want the newest features to make their job easier.
If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
wait for 2.1.x to stabilize while they write their code.  They would like
to see us deliver on our six month schedule or even faster.

Second, we have the users who have an application in production.  These
users, or their bosses, want Cassandra to be as stable as possible.
Assuming they deploy on a stable release like 2.0.12, they don't want to
touch it.  They would like to see us release *less* often.  (Because that
means they have to do fewer upgrades while remaining in our backwards
compatibility window.)

With our current big release every X months model, these users' needs are
in tension.

We discussed this six months ago, and ended up with this:

What if we tried a [four month] release cycle, BUT we would guarantee that
 you could do a rolling upgrade until we bump the supermajor version? So 2.0
 could upgrade to 3.0 without having to go through 2.1.  (But to go to 3.1
 or 4.0 you would have to go through 3.0.)


Crucially, I added

Whether this is reasonable depends on how fast we can stabilize releases.
 2.1.0 will be a good test of this.


Unfortunately, even after DataStax hired half a dozen full-time test
engineers, 2.1.0 continued the proud tradition of being unready for
production use, with wait for .5 before upgrading once again looking like
a good guideline.

I’m starting to think that the entire model of “write a bunch of new
features all at once and then try to stabilize it for release” is broken.
We’ve been trying that for years and empirically speaking the evidence is
that it just doesn’t work, either from a stability standpoint or even just
shipping on time.

A big reason that it takes us so long to stabilize new releases now is
that, because our major release cycle is so long, it’s super tempting to
slip in “just one” new feature into bugfix releases, and I’m as guilty of
that as anyone.

For similar reasons, it’s difficult to do a meaningful freeze with big
feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
also have significant work done (but not finished) on 6230, 7970, 6696, and
6477, all of which are meaningful improvements that address demonstrated
user pain.  So if we keep doing what we’ve been doing, our choices are to
either delay 3.0 further while we finish and stabilize these, or we wait
nine months to a year for the next release.  Either way, one of our
constituencies gets disappointed.

So, I’d like to try something different.  I think we were on the right
track with shorter releases with more compatibility.  But I’d like to throw
in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
architectures and process shrinks instead of trying to do both at once.  We
can do something similar here:

One month releases.  Period.  If it’s not done, it can wait.
*Every other release only accepts bug fixes.*

By itself, one-month releases are going to dramatically reduce the
complexity of testing and debugging new releases -- and bugs that do slip
past us will only affect a smaller percentage of users, avoiding the “big
release has a bunch of bugs no one has seen before and pretty much everyone
is hit by something” scenario.  But by adding in the second rule, I think
we have a real chance to make a quantum leap here: stable, production-ready
releases every two months.
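
As a purely illustrative sketch of that alternating rule -- assuming, as in
the plan laid out below, that odd 3.x releases carry new features while even
ones take only bug fixes -- a small helper (hypothetical, not part of any
Cassandra tooling) could look like:

    # Illustrative only: which tick-tock releases would accept new features,
    # assuming odd minor versions take features and even ones are bug-fix only.
    def accepts_new_features(version: str) -> bool:
        major, minor = (int(part) for part in version.split(".")[:2])
        if major < 3:
            raise ValueError("tick-tock is proposed to start with the 3.x series")
        return minor % 2 == 1  # e.g. 3.1 and 3.3 take features; 3.0 and 3.2 do not

    assert accepts_new_features("3.1") and not accepts_new_features("3.2")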

So here is my proposal for 3.0:

We’re just about ready to start serious review of 8099.  When that’s done,
we branch 3.0 and cut a beta and then release candidates.  Whatever isn’t
done by then, has to wait; unlike prior betas, we will only accept bug
fixes into 3.0 after branching.
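
Put differently, once the branch exists the policy is just a routing rule for
patches; a rough sketch, with the branch names assumed purely for
illustration:

    # Rough sketch of the merge policy once the 3.0 branch is cut; the branch
    # names here are assumptions for illustration only.
    def merge_targets(change_kind: str) -> list[str]:
        if change_kind == "bugfix":
            return ["cassandra-3.0", "trunk"]  # fixes go to the 3.0 branch and trunk
        if change_kind == "feature":
            return ["trunk"]                   # features wait for the next feature release
        raise ValueError(f"unknown change kind: {change_kind}")

    print(merge_targets("bugfix"))   # ['cassandra-3.0', 'trunk']
    print(merge_targets("feature"))  # ['trunk']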

One month after 3.0, we will ship 3.1 (with new features).  At the same
time, we will branch 3.2.  New features in trunk will go into 3.3.  The 3.2
branch will only get bug fixes.  We will maintain backwards compatibility
for all of 3.x; eventually (no less than a year) we will pick a release to
be 4.0, and drop deprecated features and old backwards compatibilities.
Otherwise there will be nothing special about the 4.0 designation.  (Note
that with an “odd releases have new features, even releases only 

Re: 3.0 and the Cassandra release process

2015-03-17 Thread Jacob Rhoden
Thanks for everyone's hard work and perseverance; Cassandra is truly 
amazing. It really does make redundancy so much easier, making my life far less 
stressful (: It is surely this awesomeness that creates the demand for features 
in the first place, so this is a great problem to have.

Certainly having a product where the user base continually encourages people 
not to use the current major version is a situation that could be improved.

Doing something to attempt to improve the current process is better than (for 
example) doing nothing. Modelling a process on another company's proven 
strategy seems better than making it up as you go.

I suggest that anyone who would minus-one this should also include an 
alternate proposal for changing the status quo.

Thanks,
Jacob



__
Sent from iPhone

 On 18 Mar 2015, at 8:06 am, Jonathan Ellis jbel...@gmail.com wrote:
 
 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.
 
 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.
 
 Broadly speaking, we have two constituencies with Cassandra releases:
 
 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.
 
 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do fewer upgrades while remaining in our backwards
 compatibility window.)
 
 With our current big release every X months model, these users' needs are
 in tension.
 
 We discussed this six months ago, and ended up with this:
 
 What if we tried a [four month] release cycle, BUT we would guarantee that
 you could do a rolling upgrade until we bump the supermajor version? So 2.0
 could upgrade to 3.0 without having to go through 2.1.  (But to go to 3.1
 or 4.0 you would have to go through 3.0.)
 
 Crucially, I added
 
 Whether this is reasonable depends on how fast we can stabilize releases.
 2.1.0 will be a good test of this.
 
 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
 production use, with wait for .5 before upgrading once again looking like
 a good guideline.
 
 I’m starting to think that the entire model of “write a bunch of new
 features all at once and then try to stabilize it for release” is broken.
 We’ve been trying that for years and empirically speaking the evidence is
 that it just doesn’t work, either from a stability standpoint or even just
 shipping on time.
 
 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it’s super tempting to
 slip in “just one” new feature into bugfix releases, and I’m as guilty of
 that as anyone.
 
 For similar reasons, it’s difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep doing what we’ve been doing, our choices are to
 either delay 3.0 further while we finish and stabilize these, or we wait
 nine months to a year for the next release.  Either way, one of our
 constituencies gets disappointed.
 
 So, I’d like to try something different.  I think we were on the right
 track with shorter releases with more compatibility.  But I’d like to throw
 in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
 architectures and process shrinks instead of trying to do both at once.  We
 can do something similar here:
 
 One month releases.  Period.  If it’s not done, it can wait.
 *Every other release only accepts bug fixes.*
 
 By itself, one-month releases are going to dramatically reduce the
 complexity of testing and debugging new releases -- and bugs that do slip
 past us will only affect a smaller percentage of users, avoiding the “big
 release has a bunch of bugs no one has seen before and pretty 

Re: 3.0 and the Cassandra release process

2015-03-17 Thread Michael Kjellman
❤️ it. +1

-kjellman

 On Mar 17, 2015, at 2:06 PM, Jonathan Ellis jbel...@gmail.com wrote:
 
 Cassandra 2.1 was released in September, which means that if we were on
 track with our stated goal of six month releases, 3.0 would be done about
 now.  Instead, we haven't even delivered a beta.  The immediate cause this
 time is blocking for 8099
 https://issues.apache.org/jira/browse/CASSANDRA-8099, but the reality is
 that nobody should really be surprised.  Something always comes up -- we've
 averaged about nine months since 1.0, with 2.1 taking an entire year.
 
 We could make theory align with reality by acknowledging, if nine months
 is our 'natural' release schedule, then so be it.  But I think we can do
 better.
 
 Broadly speaking, we have two constituencies with Cassandra releases:
 
 First, we have the users who are building or porting an application on
 Cassandra.  These users want the newest features to make their job easier.
 If 2.1.0 has a few bugs, it's not the end of the world.  They have time to
 wait for 2.1.x to stabilize while they write their code.  They would like
 to see us deliver on our six month schedule or even faster.
 
 Second, we have the users who have an application in production.  These
 users, or their bosses, want Cassandra to be as stable as possible.
 Assuming they deploy on a stable release like 2.0.12, they don't want to
 touch it.  They would like to see us release *less* often.  (Because that
 means they have to do fewer upgrades while remaining in our backwards
 compatibility window.)
 
 With our current big release every X months model, these users' needs are
 in tension.
 
 We discussed this six months ago, and ended up with this:
 
 What if we tried a [four month] release cycle, BUT we would guarantee that
 you could do a rolling upgrade until we bump the supermajor version? So 2.0
 could upgrade to 3.0 without having to go through 2.1.  (But to go to 3.1
 or 4.0 you would have to go through 3.0.)
 
 
 Crucially, I added
 
 Whether this is reasonable depends on how fast we can stabilize releases.
 2.1.0 will be a good test of this.
 
 
 Unfortunately, even after DataStax hired half a dozen full-time test
 engineers, 2.1.0 continued the proud tradition of being unready for
 production use, with wait for .5 before upgrading once again looking like
 a good guideline.
 
 I’m starting to think that the entire model of “write a bunch of new
 features all at once and then try to stabilize it for release” is broken.
 We’ve been trying that for years and empirically speaking the evidence is
 that it just doesn’t work, either from a stability standpoint or even just
 shipping on time.
 
 A big reason that it takes us so long to stabilize new releases now is
 that, because our major release cycle is so long, it’s super tempting to
 slip in “just one” new feature into bugfix releases, and I’m as guilty of
 that as anyone.
 
 For similar reasons, it’s difficult to do a meaningful freeze with big
 feature releases.  A look at 3.0 shows why: we have 8099 coming, but we
 also have significant work done (but not finished) on 6230, 7970, 6696, and
 6477, all of which are meaningful improvements that address demonstrated
 user pain.  So if we keep doing what we’ve been doing, our choices are to
 either delay 3.0 further while we finish and stabilize these, or we wait
 nine months to a year for the next release.  Either way, one of our
 constituencies gets disappointed.
 
 So, I’d like to try something different.  I think we were on the right
 track with shorter releases with more compatibility.  But I’d like to throw
 in a twist.  Intel cuts down on risk with a “tick-tock” schedule for new
 architectures and process shrinks instead of trying to do both at once.  We
 can do something similar here:
 
 One month releases.  Period.  If it’s not done, it can wait.
 *Every other release only accepts bug fixes.*
 
 By itself, one-month releases are going to dramatically reduce the
 complexity of testing and debugging new releases -- and bugs that do slip
 past us will only affect a smaller percentage of users, avoiding the “big
 release has a bunch of bugs no one has seen before and pretty much everyone
 is hit by something” scenario.  But by adding in the second rule, I think
 we have a real chance to make a quantum leap here: stable, production-ready
 releases every two months.
 
 So here is my proposal for 3.0:
 
 We’re just about ready to start serious review of 8099.  When that’s done,
 we branch 3.0 and cut a beta and then release candidates.  Whatever isn’t
 done by then, has to wait; unlike prior betas, we will only accept bug
 fixes into 3.0 after branching.
 
 One month after 3.0, we will ship 3.1 (with new features).  At the same
 time, we will branch 3.2.  New features in trunk will go into 3.3.  The 3.2
 branch will only get bug fixes.  We will maintain backwards compatibility
 for all of 3.x; eventually (no less than a year) we will pick a release to
 be 4.0, and drop