Feature releases don't have to be on the same cadence as bug fixes. They're
naturally different beasts.

Why not stick with monthly feature releases, but mark every third (or
sixth) as a supported release that gets quarterly updates for 2-3 quarters?
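
To make the arithmetic concrete, here is a rough sketch of what that schedule could look like. Everything in it is hypothetical: the `plan_releases` name, the month numbering, and the defaults (every third release supported, two quarterly updates each) are illustration only, not a proposal for actual tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Release:
    month: int                 # months since the first release (0-based)
    supported: bool            # True if this release gets bug-fix updates
    update_months: List[int] = field(default_factory=list)  # when its updates ship


def plan_releases(months: int, every: int = 3, quarters: int = 2) -> List[Release]:
    """Monthly feature releases; every `every`-th one is marked supported
    and receives one bug-fix update per quarter for `quarters` quarters."""
    schedule = []
    for month in range(months):
        supported = month % every == 0
        # Quarterly updates land 3, 6, ... months after the supported release.
        updates = [month + 3 * q for q in range(1, quarters + 1)] if supported else []
        schedule.append(Release(month, supported, updates))
    return schedule


if __name__ == "__main__":
    for release in plan_releases(7):
        tag = "supported" if release.supported else "feature-only"
        print(release.month, tag, release.update_months)
```

With those defaults, at most two or three supported releases are receiving updates at any one time, which is roughly the oldstable/newstable load Jon describes below.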


On Thursday, 15 September 2016, Tyler Hobbs <ty...@datastax.com> wrote:

> I agree that regular (monthly) releases, and smaller, more frequent feature
> releases are the best part of tick/tock.  The downside of tick/tock, as
> mentioned above, is that there isn't enough time for user feedback and
> testing to catch new bugs before the next feature release.
>
> I would personally like to see a hybrid.  The proposal that Jon mentions of
> doing a new feature release every three months plus 6 months of bugfixes
> for any release seems like a good balance to me.
>
> On Thu, Sep 15, 2016 at 1:59 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
>
> > I don't think it's binary - we don't have to do year long insanity or
> > bleeding edge craziness.
> >
> > How about a release every 3 months, with each release accepting 6 months
> > of patches?  (oldstable & newstable)  Also provide nightly builds & stick
> > to the idea of stable trunk.
> >
> > The issue is the number of bug fixes a given release gets.  1 bug fix
> > release for a new feature is just terrible.  The community as a whole
> > despises this system, and it is lowering confidence in the project.
> >
> > Jon
> >
> >
> > On Thu, Sep 15, 2016 at 11:48 AM Jake Luciani <jak...@gmail.com> wrote:
> >
> > > I'm pretty sure everyone will agree Tick-Tock didn't go well and needs
> > > to change.
> > >
> > > The problem for me is going back to the old way doesn't sound great.
> > > There are parts of tick-tock I really like, for example, the cadence
> > > and limited scope per release.
> > >
> > > I know at the summit there were a lot of ideas thrown around I can
> > > regurgitate but perhaps people who have been thinking about this would
> > > like to chime in and present ideas?
> > >
> > > -Jake
> > >
> > > On Thu, Sep 15, 2016 at 2:28 PM, Benedict Elliott Smith <bened...@apache.org> wrote:
> > >
> > > > I agree tick-tock is a failure.  But for two reasons IMO:
> > > >
> > > > 1) Ultimately, the users are the real testers and it takes a while
> > > > for a release to percolate into the wild for feedback.  The reality
> > > > is that a release doesn't have its tires properly kicked for at least
> > > > three months after it's cut.  So if we are to have any tocks, they
> > > > should be completely unwed from the ticks, and should probably happen
> > > > on a ~3M cadence to keep the labour down but the utility up (and
> > > > there should probably still be more than one tock per tick)
> > > >
> > > > 2) Those promised resources to improve the process never
> > > > materialized.  We didn't even reach parity with the 2.1 release until
> > > > very recently, i.e. no failing u/dtests.
> > > >
> > > >
> > > > On 15 September 2016 at 19:08, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
> > > >
> > > > > I know we’ve got a lot of folks following the dev list without a lot
> > > > > of background, so let’s make sure we get some context here so
> > > > > everyone can be on the same page.
> > > > >
> > > > > Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and
> > > > > 3.3.1, etc) if it’s done AFTER 3.9 (I think we need to get 3.9 out
> > > > > first before the RE manpower is spent on backporting fixes, even
> > > > > critical fixes, because 3.9 has multiple critical fixes for people
> > > > > running 3.7).
> > > > >
> > > > > Now some background:
> > > > >
> > > > > For many years, Cassandra used to have a dev process that kept 3
> > > > > active branches - “bleeding edge”, a “stable”, and an “old stable”
> > > > > branch, where developers would be committing ALL new contributions
> > > > > to the bleeding edge, non-api-breaking changes to stable, and
> > > > > bugfixes only to old stable. While the api changed and major
> > > > > features were added, that bleeding edge would just be ‘trunk’, and
> > > > > it’d get cut into a major version when it was ready to ship. We saw
> > > > > that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 / 1.2, and
> > > > > before that 2.0 / 1.2 / 1.1 ). When that bleeding edge got released
> > > > > as a major x.y.0, the third, oldest, most stable branch went EOL,
> > > > > and new features would go into trunk for the next major version.
> > > > >
> > > > > There were two big negatives observed with this:
> > > > >
> > > > > The first big negative is that if multiple major new features were
> > > > > in flight, releases were prone to delay. Nobody wants to break an
> > > > > API on a x.y.1 release, and nobody wants to add a new feature to a
> > > > > x.y.2 release, so the project would delay the x.y releases if major
> > > > > features were close, and then there’d be pressure to slip them in
> > > > > before they were fully tested, or cut features to avoid delaying
> > > > > the release. This pressure was observed to be bad for the project –
> > > > > it forced technical compromises.
> > > > >
> > > > > The second downside that was observed was that nobody would try to
> > > > > run the new versions when they launched, because they were buggy
> > > > > because they were filled with new features. 2.2, for example,
> > > > > introduced RBAC, commitlog compression, and user defined functions
> > > > > – major features that needed to be tested. Unfortunately, because
> > > > > there were few real-world testers, there were still major bugs
> > > > > being found for months – the first production-ready version of 2.2
> > > > > is probably in the 2.2.5 or 2.2.6 range.
> > > > >
> > > > > For version 3, we moved to an alternate release model, patterned on
> > > > > Intel’s tick/tock: https://en.wikipedia.org/wiki/Tick-Tock_model
> > > > >
> > > > > The intention was to allow new features into 3.even releases (3.0,
> > > > > 3.2, 3.4, 3.6, and so on), with bugfixes in 3.odd releases (3.1, … ).
> > > > > The hope was to allow more frequent releases to address the first
> > > > > big negative (flood of new features that blocked releases), while
> > > > > also helping to address the second – with fewer major features in a
> > > > > release, each feature would get more and better test coverage.
> > > > >
> > > > > In the tick/tock model, anyone running 3.odd (like 3.5) should be
> > > > > looking for bugfixes in 3.7. It’s certainly true that 3.5 is
> > > > > horribly broken (as is 3.3, and 3.4, etc), but with this release
> > > > > model, the bugfix SHOULD BE in 3.7. As I mentioned previously, we
> > > > > have precedent for backporting critical fixes, but we don’t have a
> > > > > well defined bar (that I see) for what’s critical enough for a
> > > > > backport.
> > > > >
> > > > > What Jon is noting (and what many of us who run Cassandra in
> > > > > production have really known for a very long time) is that nobody
> > > > > wants to run 3.newest (even or odd), because 3.newest is likely
> > > > > broken (because it’s a complex distributed database, and testing is
> > > > > hard, and it takes time and complex workloads to find bugs). In the
> > > > > tick/tock model, because new features went into 3.6, 3.7 carries
> > > > > new features that may not be adequately tested/validated; a user of
> > > > > 3.5 doesn’t want them, and isn’t willing to accept the risk.
> > > > >
> > > > > The bottom line here is that tick/tock is probably a well
> > > > > intentioned but failed attempt to bring stability to Cassandra’s
> > > > > releases. The problems tick/tock was meant to solve are real
> > > > > problems, but tick/tock doesn’t seem to be addressing them – new
> > > > > features invalidate old testing, which makes it difficult/impossible
> > > > > for real users to sit on the 3.odd versions.
> > > > >
> > > > > We’re due for cutting 3.9 and 3.0.9, and we have limited RE manpower
> > > > > to get those out. Only after those are out would I be +1 on a 3.5.1,
> > > > > and then only because if I were running 3.5, and I hit this bug, I
> > > > > wouldn’t want to spend the ~$100k it would cost my organization to
> > > > > validate 3.7 prior to upgrading, and I don’t think it’s reasonable
> > > > > to ask users to recompile a release for a ~10 line fix for a very
> > > > > nasty bug.
> > > > >
> > > > > I also very strongly recommend we (committers/PMC) reconsider
> > > > > tick/tock for 4.x releases, because this is exactly the type of
> > > > > problem that will continue to happen as we move forward. I suggest
> > > > > that we either need to go back to the old model and do a better job
> > > > > of dealing with feature creep and testing, or we need to better
> > > > > define what gets backported, because the community needs a stable
> > > > > version to run, and running the latest odd release of tick/tock
> > > > > isn’t it.
> > > > >
> > > > > - Jeff
> > > > >
> > > > >
> > > > > On 9/15/16, 10:31 AM, "dave_les...@apple.com on behalf of Dave Lester" <dave_les...@apple.com> wrote:
> > > > >
> > > > > > >How would cutting a 3.5.1 release possibly confuse users of the
> > > > > > >software? It would be easy to document the change and to send
> > > > > > >release notes.
> > > > > > >
> > > > > > >Given the bug’s critical nature and that it's a minor fix, I’m +1
> > > > > > >(non-binding) to a new release.
> > > > > >
> > > > > >Dave
> > > > > >
> > > > > > >> On Sep 15, 2016, at 7:18 AM, Jeremiah D Jordan <jeremiah.jordan@gmail.com> wrote:
> > > > > >>
> > > > > > >> I’m with Jeff on this, 3.7 (bug fixes on 3.6) has already been
> > > > > > >> released with the fix.  Since the fix applies cleanly anyone is
> > > > > > >> free to put it on top of 3.5 on their own if they like, but I
> > > > > > >> see no reason to put out a 3.5.1 right now and confuse people
> > > > > > >> further.
> > > > > >>
> > > > > >> -Jeremiah
> > > > > >>
> > > > > >>
> > > > > > >>> On Sep 15, 2016, at 9:07 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> > > > > >>>
> > > > > > >>> As a follow up, I suppose I'm only advocating for a fix to the
> > > > > > >>> odd releases.  Sadly, Tick Tock versioning is misleading.
> > > > > > >>>
> > > > > > >>> If tick tock were to continue (and I'm very much against how it
> > > > > > >>> currently works) the whole even-features odd-fixes thing needs
> > > > > > >>> to stop ASAP, all it does is confuse people.
> > > > > > >>>
> > > > > > >>> The follow up to 3.4 (3.5) should have been 3.4.1, following
> > > > > > >>> semver, so people know it's bug fixes only to 3.4.
> > > > > >>>
> > > > > >>> Jon
> > > > > >>>
> > > > > > >>> On Wed, Sep 14, 2016 at 10:37 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
> > > > > >>>
> > > > > > >>>> In this particular case, I'd say adding a bug fix release for
> > > > > > >>>> every version that's affected would be the right thing.  The
> > > > > > >>>> issue is so easily reproducible and will likely result in
> > > > > > >>>> massive data loss for anyone on 3.X where X < 6 who uses the
> > > > > > >>>> "date" type.
> > > > > >>>>
> > > > > >>>> This is how easy it is to reproduce:
> > > > > >>>>
> > > > > >>>> 1. Start Cassandra 3.5
> > > > > > >>>> 2. create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
> > > > > >>>> 3. use test;
> > > > > >>>> 4. create table fail (id int primary key, d date);
> > > > > >>>> 5. delete d from fail where id = 1;
> > > > > >>>> 6. Stop Cassandra
> > > > > >>>> 7. Start Cassandra
> > > > > >>>>
> > > > > >>>> You will get this, and startup will fail:
> > > > > >>>>
> > > > > > >>>> ERROR 05:32:09 Exiting due to error while processing commit log during initialization.
> > > > > > >>>> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
> > > > > > >>>> Unexpected error deserializing mutation; saved to
> > > > > > >>>> /var/folders/0l/g2p6cnyd5kx_1wkl83nd3y4r0000gn/T/mutation6313332720566971713dat.
> > > > > > >>>> This may be caused by replaying a mutation against a table with the
> > > > > > >>>> same name but incompatible schema.  Exception follows:
> > > > > > >>>> org.apache.cassandra.serializers.MarshalException: Expected 4 byte long for date (0)
> > > > > >>>>
> > > > > > >>>> I mean.. come on.  It's an easy fix.  It cleanly merges against
> > > > > > >>>> 3.5 (and probably the other releases) and requires very little
> > > > > > >>>> investment from anyone.
> > > > > >>>>
> > > > > >>>>
> > > > > > >>>> On Wed, Sep 14, 2016 at 9:40 PM Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
> > > > > >>>>
> > > > > > >>>>> We did 3.1.1 and 3.2.1, so there’s SOME precedent for emergency
> > > > > > >>>>> fixes, but we certainly didn’t/won’t go back and cut new
> > > > > > >>>>> releases from every branch for every critical bug in future
> > > > > > >>>>> releases, so I think we need to draw the line somewhere. If
> > > > > > >>>>> it’s fixed in 3.7 and 3.0.x (x >= 6), it seems like you’ve got
> > > > > > >>>>> options (either stay on the tick and go up to 3.7, or bail
> > > > > > >>>>> down to 3.0.x)
> > > > > > >>>>>
> > > > > > >>>>> Perhaps, though, this highlights the fact that tick/tock may not
> > > > > > >>>>> be the best option long term. We’ve tried it for a year, perhaps
> > > > > > >>>>> we should instead discuss whether or not it should continue, or
> > > > > > >>>>> if there’s another process that gives us a better way to get
> > > > > > >>>>> useful patches into versions people are willing to run in
> > > > > > >>>>> production.
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > > >>>>> On 9/14/16, 8:55 PM, "Jonathan Haddad" <j...@jonhaddad.com> wrote:
> > > > > >>>>>
> > > > > > >>>>>> Common sense is what prevents someone from upgrading to yet
> > > > > > >>>>>> another completely unknown version with new features which
> > > > > > >>>>>> have probably broken even more stuff that nobody is aware of.
> > > > > > >>>>>> The folks I'm helping right now deployed 3.5 when they got
> > > > > > >>>>>> started because http://cassandra.apache.org suggests it's
> > > > > > >>>>>> acceptable for production.  It turns out using 4 of the built
> > > > > > >>>>>> in datatypes of the database results in the server being
> > > > > > >>>>>> unable to restart without clearing out the commit logs and
> > > > > > >>>>>> running a repair.  That screams critical to me.  You shouldn't
> > > > > > >>>>>> even be able to install 3.5 without the patch I've supplied -
> > > > > > >>>>>> that bug is a ticking time bomb for anyone that installs it.
> > > > > >>>>>>
> > > > > > >>>>>> On Wed, Sep 14, 2016 at 8:12 PM Michael Shuler <mich...@pbandjelly.org> wrote:
> > > > > >>>>>>
> > > > > > >>>>>>> What's preventing the use of the 3.6 or 3.7 releases where
> > > > > > >>>>>>> this bug is already fixed? This is also fixed in the
> > > > > > >>>>>>> 3.0.6/7/8 releases.
> > > > > >>>>>>>
> > > > > >>>>>>> Michael
> > > > > >>>>>>>
> > > > > >>>>>>> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
> > > > > > >>>>>>>> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not
> > > > > > >>>>>>>> back ported to 3.5 as well, and it makes Cassandra
> > > > > > >>>>>>>> effectively unusable if someone is using any of the 4 types
> > > > > > >>>>>>>> affected in any of their schema.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> I have cherry picked & merged the patch back to here and will
> > > > > > >>>>>>>> put it in a JIRA as well tonight, I just wanted to get the
> > > > > > >>>>>>>> ball rolling asap on this.
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.
> > > > > com_rustyrazorblade_cassandra_tree_fix-5Fcommitlog-
> > > > 5Fexception&d=DQIBaQ&c=
> > > > > 08AGY6txKsvMOP6lYkHQpPMRA1U6kqhAwGa8-0QCg3M&r=
> > > > > yfYEBHVkX6l0zImlOIBID0gmhluYPD5Jje-3CtaT3ow&m=
> > > > > MZ9nLcNNhQZkuXyH0NBbP1kSEE2M-SYgyVqZ88IJcXY&s=ktY5tkT-
> > > > > nO1jtyc0EicbgZHXJYl03DvzuxqzyyOgzII&e=
> > > > > >>>>>>>>
> > > > > >>>>>>>> Jon
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > http://twitter.com/tjake
> > >
> >
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
