Re: Wrapping up tick-tock

2017-01-11 Thread Stefan Podkowinski
I honestly don't understand the release cadence discussion. The 3.x branch
is far from production ready. Is this really the time to plan the next
major feature releases on top of it, instead of focusing on stabilizing 3.x
first? Who knows how long that would take, even if everyone worked
exclusively on bug fixing (which I think should happen).

On Tue, Jan 10, 2017 at 4:29 PM, Jonathan Haddad  wrote:

> I don't see why it has to be one extreme (yearly) or another (monthly).
> When you had originally proposed Tick Tock, you wrote:
>
> "The primary goal is to improve release quality.  Our current major “dot
> zero” releases require another five or six months to make them stable
> enough for production.  This is directly related to how we pile features in
> for 9 to 12 months and release all at once.  The interactions between the
> new features are complex and not always obvious.  2.1 was no exception,
> despite DataStax hiring a full time test engineering team specifically for
> Apache Cassandra."
>
> I agreed with you at the time that the yearly cycle was too long to be
> adding features before cutting a release, and still do now.  Instead of
> elastic banding all the way back to a process which wasn't working before,
> why not try somewhere in the middle?  A release every 6 months (with
> monthly bug fixes for a year) gives:
>
> 1. long enough time to stabilize (1 year vs 1 month)
> 2. not so long things sit around untested forever
> 3. only 2 releases (current and previous) to do bug fix support at any
> given time.
>
> Jon
>
> On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis  wrote:
>
> > Hi all,
> >
> > We’ve had a few threads now about the successes and failures of the
> > tick-tock release process and what to do to replace it, but they all died
> > out without reaching a robust consensus.
> >
> > In those threads we saw several reasonable options proposed, but from my
> > perspective they all operated in a kind of theoretical fantasy land of
> > testing and development resources.  In particular, it takes around a
> > person-week of effort to verify that a release is ready.  That is, going
> > through all the test suites, inspecting and re-running failing tests to see
> > if there is a product problem or a flaky test.
> >
> > (I agree that in a perfect world this wouldn’t be necessary because your
> > test ci is always green, but see my previous framing of the perfect world
> > as a fantasy land.  It’s also worth noting that this is a common problem
> > for large OSS projects, not necessarily something to beat ourselves up
> > over, but in any case, that's our reality right now.)
> >
> > I submit that any process that assumes a monthly release cadence is not
> > realistic from a resourcing standpoint for this validation.  Notably, we
> > have struggled to marshal this for 3.10 for two months now.
> >
> > Therefore, I suggest first that we collectively roll up our sleeves to vet
> > 3.10 as the last tick-tock release.  Stick a fork in it, it’s done.  No
> > more tick-tock.
> >
> > I further suggest that in place of tick tock we go back to our old model of
> > yearly-ish releases with as-needed bug fix releases on stable branches,
> > probably bi-monthly.  This amortizes the release validation problem over a
> > longer development period.  And of course we remain free to ramp back up to
> > the more rapid cadence envisioned by the other proposals if we increase our
> > pool of QA effort or we are able to eliminate flaky tests to the point
> > that a long validation process becomes unnecessary.
> >
> > (While a longer dev period could mean a correspondingly more painful test
> > validation process at the end, my experience is that most of the validation
> > cost is “fixed” in the form of flaky tests and thus does not increase
> > proportionally to development time.)
> >
> > Thoughts?
> >
> > --
> > Jonathan Ellis
> > co-founder, http://www.datastax.com
> > @spyced
> >
>


Re: Per blocking release on dtest

2017-01-11 Thread Benjamin Lerer
Regarding CASSANDRA-12620, it has been committed in the 3.0 branch at
c612cd8d7dbd24888c216ad53f974686b88dd601 and merged into 3.11. Since, if I am
not mistaken, 3.11 should become the new 3.10 release, I do not think that
there is a problem.

Did I miss something, Ariel?

On Tue, Jan 10, 2017 at 6:45 PM, Jeff Jirsa  wrote:

> +1
>
>
> On Tue, Jan 10, 2017 at 9:23 AM, Aleksey Yeschenko 
> wrote:
>
> > That’s a good point.
> >
> > So 3.11 after 3.10, then move on to 3.11.x further bug fix releases?
> >
> > +1 to that.
> >
> > --
> > AY
> >
> > On 10 January 2017 at 17:22:09, Michael Shuler (mich...@pbandjelly.org)
> > wrote:
> >
> > I had the same thought. 3.10 is the tick, so a 3.11 bugfix tock follows
> > the intended final fix release for closing out tick-tock. Throwing a
> > 3.10.1 out there would add more user confusion and would be the exact
> > same contents as a 3.11 release versioned package set anyway.
> >
> > --
> > Michael
> >
> > On 01/10/2017 11:18 AM, Josh McKenzie wrote:
> > > | If someone tries to upgrade 3.10 to whatever 4.0 ends up being I
> > > think they will hit the wrong answer bug. So I would advocate for
> > > having the fix brought
> > > into 3.10, but it was broken in 3.9 as well.
> > >
> > > Seems like we'd just release that as 3.10.1 (instead of 3.11) and just
> > > tell people "you can upgrade to 4.0 w/latest version of 3.10". This
> > > does violate the "even releases features, odd releases bugfix", so
> > > maybe a 3.11 as final 3.X line would help keep that consistent?
> > >
> > > I'd rather not open the can of worms of back-porting this to 3.9 as
> > > well to hold to our claim of "any 3.X can go to 4.0".
> > >
> > > > On Tue, Jan 10, 2017 at 12:13 PM, Ariel Weisberg wrote:
> > >> Hi,
> > >>
> > >>
> > >>
> > > >> The upgrade tests are tricky because they upgrade from an existing
> > > >> release to a current release. The bug is in 3.9 and won't be fixed until
> > > >> 3.11 because the test checks out and builds 3.9 right now. 3.10 doesn't
> > > >> include the commit that fixes the issue so it will fail after 3.10 is
> > > >> released and the test is updated to check out 3.10.
> > >>
> > >>
> > >> We claim to support upgrade from any 3.x version to 4.0. If someone
> > >> tries to upgrade 3.10 to whatever 4.0 ends up being I think they will
> > > >> hit the wrong answer bug. So I would advocate for having the fix brought
> > > >> into 3.10, but it was broken in 3.9 as well.
> > >>
> > >>
> > > >> Some of the tests fail because trunk complains of unreadable sstables, and
> > > >> I suspect that isn't a bug; it's just something that is no longer
> > > >> supported due to thrift removal, but I haven't fixed those yet. Those
> > > >> are probably issues with trunk or the tests.
> > >>
> > >>
> > >> Others fail for reasons I haven't triaged yet. I'm struggling with my
> > >> own issues getting the tests to run locally.
> > >>
> > >>
> > >> Ariel
> > >>
> > >>
> > >>
> > > >> On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:
> > > >>>> I concede it would be fine to do it gradually. Once the pace of
> > > >>>> issues introduced by new development is beaten by the pace at which
> > > >>>> they are addressed I think things will go well.
> > > >>>
> > > >>> So from Michael's JIRA query:
> > > >>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
> > > >>>
> > > >>> Are we good for 3.10 after we get those cleaned up?
> > > >>>
> > > >>> Ariel, you made reference to:
> > > >>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
> > > >>>
> > > >>> Do we need to re-open an issue to have this applied to 3.10 and add it
> > > >>> to the above list?
> > > >>>
> > > >>>> On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
> > > >>>>> Sankalp's proposal of us progressively tightening up our standards
> > > >>>>> allows us to get code out the door and regain some lost momentum on
> > > >>>>> the 3.10 release failures and blocking, and gives us time as a
> > > >>>>> community to adjust our behavior without the burden of an ever-later
> > > >>>>> slipped release hanging over our heads. There's plenty of bugfixes in
> > > >>>>> the 3.X line; the more time people can have to kick the tires on that
> > > >>>>> code, the more things we can find and the better future releases
> > > >>>>> will be.
> > > >>>
> > > >>> +1 On gradually moving to this. Dropping releases with huge change
> > > >>> lists has never gone well for us in the past.
> >
> >
>