Re: Proposal - 3.5.1

2016-10-20 Thread Jonathan Haddad
From a community perspective a supported release every 6 months would be
much more attractive than yearly, as having to wait ~9-10 months for
something like SASI is kind of frustrating.

Monthly dev releases are awesome.

On Thu, Oct 20, 2016 at 3:18 PM Nate McCall  wrote:

> > I’m not sure it makes sense to have separate features/stability releases
> > in that world. 4.0.x will be stable, every 4.x will be a dev release on the
> > road to 5.0.
> >
> +1. Much easier to understand and it's 'backwards compatible' (sort
> of) with wherever we leave 3.x.
>
> Still keeping 4.x on monthly cadence should keep the small, incremental
> focus.
>


Re: Proposal - 3.5.1

2016-10-20 Thread Jonathan Haddad
And +1 to ditching the "tick tock" alternating release thing; nobody
understands it anyway.

On Thu, Oct 20, 2016 at 3:38 PM Jonathan Haddad  wrote:

> From a community perspective a supported release every 6 months would be
> much more attractive than yearly, as having to wait ~9-10 months for
> something like SASI is kind of frustrating.
>
> Monthly dev releases is awesome.
>
> On Thu, Oct 20, 2016 at 3:18 PM Nate McCall  wrote:
>
> > I’m not sure it makes sense to have separate features/stability releases
> in that world. 4.0.x will be stable, every 4.x will be a dev release on the
> road to 5.0.
> >
> +1. Much easier to understand and it's 'backwards compatible' (sort
> of) with wherever we leave 3.x.
>
> Still keeping 4.x on monthly cadence should keep the small, incremental
> focus.
>
>


Re: Proposal - 3.5.1

2016-10-20 Thread Nate McCall
> I’m not sure it makes sense to have separate features/stability releases in 
> that world. 4.0.x will be stable, every 4.x will be a dev release on the road 
> to 5.0.
>
+1. Much easier to understand and it's 'backwards compatible' (sort
of) with wherever we leave 3.x.

Still keeping 4.x on monthly cadence should keep the small, incremental focus.


Re: Proposal - 3.5.1

2016-10-20 Thread Aleksey Yeschenko
I’m not sure it makes sense to have separate features/stability releases in 
that world. 4.0.x will be stable, every 4.x will be a dev release on the road 
to 5.0.

-- 
AY

On 20 October 2016 at 22:43:19, Jeff Jirsa (jji...@apache.org) wrote:



On 2016-10-20 14:21 (-0700), Jeremiah Jordan  wrote:  
> In the original tick tock plan we would not have kept 4.0.x around. So I am 
> proposing a change for that and then we label the 3.x and 4.x releases as 
> "development releases" or some other thing and have "yearly" LTS releases 
> with .0.x.  
> Those are similar to the previous 1.2/2.0/2.1/2.2 and we are adding semi 
> stable development releases as well which give people an easier way to try 
> out new stuff than "build it yourself", which was the only way to do that in 
> between the previous Big Bang releases.  
>  

This sounds reasonable to me. Would 4.(even) still be features and 4.(odd) 
still be stability fixes? Or everything in 4.x is features and/or stability?  



Re: Proposal - 3.5.1

2016-10-20 Thread Jeremiah D Jordan
My thinking was we keep doing tick/tock for the 4.x.  Basically continue on for 
4.0.x / 4.x like we have been with 3.0.x / 3.x, just with some added guidance 
to people that 4.x is “development releases”.  The main problem I hear with the 
tick tock stuff is that we won’t ever have “LTS” branches any more.  So let’s 
change that and make the .0 releases LTS branches.

-Jeremiah

> On Oct 20, 2016, at 4:42 PM, Jeff Jirsa  wrote:
> 
> 
> 
> On 2016-10-20 14:21 (-0700), Jeremiah Jordan  wrote: 
>> In the original tick tock plan we would not have kept 4.0.x around.  So I am 
>> proposing a change for that and then we label the 3.x and 4.x releases as 
>> "development releases" or some other thing and have "yearly" LTS releases 
>> with .0.x.
>> Those are similar to the previous 1.2/2.0/2.1/2.2 and we are adding semi 
>> stable development releases as well which give people an easier way to try 
>> out new stuff than "build it yourself", which was the only way to do that in 
>> between the previous Big Bang releases.
>> 
> 
> This sounds reasonable to me. Would 4.(even) still be features and 4.(odd) 
> still be stability fixes? Or everything in 4.x is features and/or stability? 
> 



Re: Proposal - 3.5.1

2016-10-20 Thread Jeff Jirsa


On 2016-10-20 14:21 (-0700), Jeremiah Jordan  wrote: 
> In the original tick tock plan we would not have kept 4.0.x around.  So I am 
> proposing a change for that and then we label the 3.x and 4.x releases as 
> "development releases" or some other thing and have "yearly" LTS releases 
> with .0.x.
> Those are similar to the previous 1.2/2.0/2.1/2.2 and we are adding semi 
> stable development releases as well which give people an easier way to try 
> out new stuff than "build it yourself", which was the only way to do that in 
> between the previous Big Bang releases.
> 

This sounds reasonable to me. Would 4.(even) still be features and 4.(odd) 
still be stability fixes? Or everything in 4.x is features and/or stability? 
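
For readers less familiar with the numbering behind this question, here is a minimal sketch of how the scheme under discussion could be expressed. It assumes the even/odd convention exactly as stated above and the ".0.x is LTS" idea from earlier in the thread; the function name and the labels it returns are hypothetical and not part of any Cassandra tooling.

```python
def classify_release(version: str) -> str:
    """Classify a version string such as '4.0.3' or '4.5'.

    Hypothetical labels: 'lts', 'feature', 'stability'.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) < 2:
        raise ValueError(f"expected at least MAJOR.MINOR, got {version!r}")
    minor = parts[1]
    if minor == 0:
        # X.0.x: the long-lived line proposed in this thread.
        return "lts"
    # Even minor = features, odd minor = stability fixes, per the question above.
    return "feature" if minor % 2 == 0 else "stability"
```

Under the alternative raised in the same question (everything in 4.x is features and/or stability), the last line would simply return a single "development" label.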



Re: Proposal - 3.5.1

2016-10-20 Thread Jeremiah Jordan
In the original tick tock plan we would not have kept 4.0.x around.  So I am 
proposing a change for that and then we label the 3.x and 4.x releases as 
"development releases" or some other thing and have "yearly" LTS releases with 
.0.x.
Those are similar to the previous 1.2/2.0/2.1/2.2 and we are adding semi stable 
development releases as well which give people an easier way to try out new 
stuff than "build it yourself", which was the only way to do that in between 
the previous Big Bang releases.



> On Oct 20, 2016, at 3:59 PM, Jeff Jirsa  wrote:
> 
> 
> 
>> On 2016-10-20 13:26 (-0700), "J. D. Jordan"  
>> wrote: 
>> If you think of the tick tock releases as interim development releases I 
>> actually think they have been working pretty well. What if we continue with 
>> the same process and do 4.0.x as LTS like we have 3.0.x LTS.
>> 
>> So you get 4.x releases that are trickling out new features which will 
>> eventually be in the 5.0.x LTS and you get 4.0.x as an LTS release of all 
>> the 3.x built up features.
>> 
>> This seems like a fairly straight forward process to me.  It gives people 
>> monthly releases that they can test new features with, but it also provides 
>> a stable line for those that want one.
>> 
> 
> So just tick/tock with new labels? How do we stop users from getting into the 
> situation where they're running 4.5, there's a critical flaw in 4.5, and 
> there's no 4.5.1 ever going to be released? Real users still won't want to 
> jump to 4.7, because there's added risk from stuff that went into 4.6 and 4.7 
> ? Or is it simply "if you want to run bleeding edge, you better be willing to 
> stay on that bleeding edge for up to a year"? 
> 
> 
> 
> 


Re: Proposal - 3.5.1

2016-10-20 Thread Jeff Jirsa


On 2016-10-20 13:26 (-0700), "J. D. Jordan"  wrote: 
> If you think of the tick tock releases as interim development releases I 
> actually think they have been working pretty well. What if we continue with 
> the same process and do 4.0.x as LTS like we have 3.0.x LTS.
> 
> So you get 4.x releases that are trickling out new features which will 
> eventually be in the 5.0.x LTS and you get 4.0.x as an LTS release of all the 
> 3.x built up features.
> 
> This seems like a fairly straight forward process to me.  It gives people 
> monthly releases that they can test new features with, but it also provides a 
> stable line for those that want one.
> 

So just tick/tock with new labels? How do we stop users from getting into the 
situation where they're running 4.5, there's a critical flaw in 4.5, and 
there's no 4.5.1 ever going to be released? Real users still won't want to jump 
to 4.7, because there's added risk from stuff that went into 4.6 and 4.7? Or 
is it simply "if you want to run bleeding edge, you better be willing to stay 
on that bleeding edge for up to a year"? 






Re: Proposal - 3.5.1

2016-10-20 Thread J. D. Jordan
If you think of the tick tock releases as interim development releases I 
actually think they have been working pretty well. What if we continue with the 
same process and do 4.0.x as LTS like we have 3.0.x LTS?

So you get 4.x releases that are trickling out new features which will 
eventually be in the 5.0.x LTS and you get 4.0.x as an LTS release of all the 
3.x built up features.

This seems like a fairly straightforward process to me.  It gives people 
monthly releases that they can test new features with, but it also provides a 
stable line for those that want one.

-Jeremiah

> On Oct 20, 2016, at 11:57 AM, Jeremy Hanna  wrote:
> 
> Thanks Ben.  It’s great to have a 3.x LTS option as things work themselves 
> out.  I just wanted to revive this thread in parallel so that it could 
> hopefully come to a way forward for the project as well.  Is the 3 branch 
> strategy that Sylvain proposed the way forward?
> 
>> On Oct 20, 2016, at 11:52 AM, Ben Bromhead  wrote:
>> 
>> For reference we have released https://github.com/instaclustr/cassandra ,
>> with the end goal that people have a stable target on the 3.x branch while
>> this is all worked out.
>> 
>> We are likely to continue our releases even with a release cadence change,
>> but we would track official versions much more closely and our repository
>> will end up just being a public view of what we do internally rather than
>> something we advocate over official releases.
>> 
>> For further details on our thoughts around this see:
>> 
>>  - https://www.instaclustr.com/blog/2016/10/19/patched-cassandra-3-7/
>>  - https://github.com/instaclustr/cassandra#faq
>> 
>> 
>> On Thu, 20 Oct 2016 at 09:38 Jeremy Hanna 
>> wrote:
>> 
>>> Is there consensus on a way forward with this?  Is there going to be a
>>> three branch plan with “features”, “testing”, and “stable” starting with
>>> 4.0?  Or is this still in the discussion mode?  External to this thread
>>> there have been decisions made to create third party LTS releases and hopes
>>> that the project would decide to address the concerns in this thread.  It
>>> seems like this is the place to complete the discussion.
>>> 
>>>> On Sep 26, 2016, at 10:52 AM, Jonathan Haddad  wrote:
>>>> 
>>>> Not yet. I hadn't seen any Jiras before to release a specific version,
>>>> only discussion on the ML.
>>>> 
>>>> I'll put up a Jira with my patch that back ports the bug fix.
>>>> On Mon, Sep 26, 2016 at 8:26 AM Michael Shuler  wrote:
>>>> 
>>>>> Jon, is there a JIRA ticket for this request? I appreciate everyone's
>>>>> input, and I think this is a fine proposal.
>>>>> 
>>>>> --
>>>>> Kind regards,
>>>>> Michael
>>>>> 
>>>>>> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
>>>>>> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back ported to
>>>>>> 3.5 as well, and it makes Cassandra effectively unusable if someone is
>>>>>> using any of the 4 types affected in any of their schema.
>>>>>> 
>>>>>> I have cherry picked & merged the patch back to here and will put it in a
>>>>>> JIRA as well tonight, I just wanted to get the ball rolling asap on this.
>>>>>> 
>>>>>> https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
>>>>>> 
>>>>>> Jon
>>>>>> 
>>> 
>>> --
>> Ben Bromhead
>> CTO | Instaclustr 
>> +1 650 284 9692
>> Managed Cassandra / Spark on AWS, Azure and Softlayer
> 


Re: Proposal - 3.5.1

2016-10-20 Thread Ben Bromhead
For reference we have released https://github.com/instaclustr/cassandra ,
with the end goal that people have a stable target on the 3.x branch while
this is all worked out.

We are likely to continue our releases even with a release cadence change,
but we would track official versions much more closely and our repository
will end up just being a public view of what we do internally rather than
something we advocate over official releases.

For further details on our thoughts around this see:

   - https://www.instaclustr.com/blog/2016/10/19/patched-cassandra-3-7/
   - https://github.com/instaclustr/cassandra#faq


On Thu, 20 Oct 2016 at 09:38 Jeremy Hanna 
wrote:

> Is there consensus on a way forward with this?  Is there going to be a
> three branch plan with “features”, “testing”, and “stable” starting with
> 4.0?  Or is this still in the discussion mode?  External to this thread
> there have been decisions made to create third party LTS releases and hopes
> that the project would decide to address the concerns in this thread.  It
> seems like this is the place to complete the discussion.
>
> > On Sep 26, 2016, at 10:52 AM, Jonathan Haddad  wrote:
> >
> > Not yet. I hadn't seen any Jiras before to release a specific version,
> > only discussion on the ML.
> >
> > I'll put up a Jira with my patch that back ports the bug fix.
> > On Mon, Sep 26, 2016 at 8:26 AM Michael Shuler 
> > wrote:
> >
> >> Jon, is there a JIRA ticket for this request? I appreciate everyone's
> >> input, and I think this is a fine proposal.
> >>
> >> --
> >> Kind regards,
> >> Michael
> >>
> >> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
> >>> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back ported to
> >>> 3.5 as well, and it makes Cassandra effectively unusable if someone is
> >>> using any of the 4 types affected in any of their schema.
> >>>
> >>> I have cherry picked & merged the patch back to here and will put it in a
> >>> JIRA as well tonight, I just wanted to get the ball rolling asap on this.
> >>>
> >>> https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
> >>>
> >>> Jon
> >>>
> >>
> >>
>
> --
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Proposal - 3.5.1

2016-09-26 Thread Michael Shuler
Jon, is there a JIRA ticket for this request? I appreciate everyone's
input, and I think this is a fine proposal.

-- 
Kind regards,
Michael

On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back ported to
> 3.5 as well, and it makes Cassandra effectively unusable if someone is
> using any of the 4 types affected in any of their schema.
> 
> I have cherry picked & merged the patch back to here and will put it in a
> JIRA as well tonight, I just wanted to get the ball rolling asap on this.
> 
> https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
> 
> Jon
> 



Re: Proposal - 3.5.1

2016-09-20 Thread Jonathan Haddad
@Sylvain - I see what you're saying now on the branches.  I suppose a
branching strategy like that does give some additional flexibility to have
multiple things in the pipeline.


On Mon, Sep 19, 2016 at 9:06 AM Eric Evans 
wrote:

> On Fri, Sep 16, 2016 at 5:05 AM, Sylvain Lebresne  wrote:
> > In light of all this, my suggestion for a release cycle would be:
> > - To have 3 branches: 'features', 'testing' and 'stable', with an X month
> >   rotation: 'features' becomes 'testing' after X months and then 'stable'
> >   after X more, before getting EOL X months later.
> > - The feature branch gets everything. The testing branch only gets bug
> >   fixes. The stable branch only gets critical bug fixes. And imo, we should
> >   be very strict on this (I acknowledge that there is sometimes a bit of
> >   subjectivity on whether something is a bug or an improvement, and if it's
> >   critical or not, but I think it's not that hard to get consensus if we're
> >   all reasonable (though it might be worth agreeing on some rough but
> >   written guideline upfront)).
> > - We release on a short and fixed cadence of Y month(s) for both the
> >   feature and testing branch. For the stable branch, given that it already
> >   had X months of only bug fixes during the testing phase, one can hope
> >   critical fixes will be fairly rare (less than 1 per Y period on average).
> >   Further, it's supposed to be stable and fixes are supposed to be critical,
> >   so doing hot-fix releases probably makes the most sense (though it
> >   probably only works if we're indeed strict on what is considered
> >   critical).
>
> This seems pretty close to what Mck suggested; I think this could work.
>
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>


Re: Proposal - 3.5.1

2016-09-19 Thread Eric Evans
On Fri, Sep 16, 2016 at 5:05 AM, Sylvain Lebresne  wrote:
> In light of all this, my suggestion for a release cycle would be:
> - To have 3 branches: 'features', 'testing' and 'stable', with an X month
>   rotation: 'features' becomes 'testing' after X months and then 'stable'
>   after X more, before getting EOL X months later.
> - The feature branch gets everything. The testing branch only gets bug
>   fixes. The stable branch only gets critical bug fixes. And imo, we should
>   be very strict on this (I acknowledge that there is sometimes a bit of
>   subjectivity on whether something is a bug or an improvement, and if it's
>   critical or not, but I think it's not that hard to get consensus if we're
>   all reasonable (though it might be worth agreeing on some rough but
>   written guideline upfront)).
> - We release on a short and fixed cadence of Y month(s) for both the
>   feature and testing branch. For the stable branch, given that it already
>   had X months of only bug fixes during the testing phase, one can hope
>   critical fixes will be fairly rare (less than 1 per Y period on average).
>   Further, it's supposed to be stable and fixes are supposed to be critical,
>   so doing hot-fix releases probably makes the most sense (though it
>   probably only works if we're indeed strict on what is considered
>   critical).

This seems pretty close to what Mck suggested; I think this could work.
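
To make the rotation in the quoted proposal concrete, here is a minimal sketch of the branch lifecycle and the merge policy it implies. X is left as a parameter because the proposal does not fix it; the names `Stage`, `stage_of`, `allowed`, and the change labels 'feature', 'bugfix', 'critical' are all hypothetical, chosen only for illustration.

```python
from enum import Enum


class Stage(Enum):
    FEATURES = "features"
    TESTING = "testing"
    STABLE = "stable"
    EOL = "eol"


def stage_of(branch_age_months: int, x_months: int) -> Stage:
    """Return the stage of a branch that was cut branch_age_months ago."""
    if branch_age_months < 0 or x_months <= 0:
        raise ValueError("age must be >= 0 and X must be > 0")
    # Each stage lasts X months; after three stages the branch is EOL.
    stages = [Stage.FEATURES, Stage.TESTING, Stage.STABLE]
    index = branch_age_months // x_months
    return stages[index] if index < len(stages) else Stage.EOL


def allowed(stage: Stage, change: str) -> bool:
    """Say whether a change of the given kind may land on a branch in this stage."""
    policy = {
        Stage.FEATURES: {"feature", "bugfix", "critical"},  # gets everything
        Stage.TESTING: {"bugfix", "critical"},              # bug fixes only
        Stage.STABLE: {"critical"},                         # critical fixes only
        Stage.EOL: set(),                                   # nothing lands after EOL
    }
    return change in policy[stage]
```

With X = 6, for example, a branch cut 13 months ago would be in the 'stable' stage and would accept only critical fixes. The hard part the proposal points out, deciding what counts as critical, is of course not something code can settle.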


-- 
Eric Evans
john.eric.ev...@gmail.com


Re: Proposal - 3.5.1

2016-09-19 Thread Eric Evans
On Thu, Sep 15, 2016 at 9:33 PM, Mick Semb Wever  wrote:
>  - keep bimonthly feature releases,
>  - revert from tick-tock to SemVer numbering scheme,
>  - during the release vote also vote on the quality label (feature branches
>    start with an 'Alpha' and the first patch release as 'Beta'),
>  - accept that every feature release isn't by default initially supported,
>    and its branch might never be,
>  - maintain 3 'GA' branches at any one time,
>  - accept that it's not going to be the oldest GA branches that necessarily
>    reach EOL first.

I like it.

-- 
Eric Evans
john.eric.ev...@gmail.com


Re: Proposal - 3.5.1

2016-09-16 Thread Edward Capriolo
If you all have never seen the movie "Grandma's Boy" I suggest it.

https://www.youtube.com/watch?v=uJLQ5DHmw-U

There is one funny scene where the product/project person says something
like, "The game is ready. We have fixed ALL THE BUGS". The people who made
the movie probably think the coders doing Dance Dance Revolution is funny.
To me the funniest part of the movie is the summary statement that "all the
bugs are fixed".

I agree with Sylvain, that cutting branches really has nothing to do with
"quality". Quality like "production ready" is hard to define.

I am phrasing this next part as questions to encourage deep thought not to
be a jerk.

Someone jokingly said 3.0 was the "break everything" release. What if
4.0 was the "fix everything" release?
What would that mean?
What would we need?
No new features for 6 months?
A vast network of amazon machines to test things?
Jepsen ++?
24 hour integration tests that run CAS operations across a multi-node mixed
version cluster while we chaos monkey nodes?
Could we keep busy for 6 months just looking at the code and fix all the
bugs for Mr. Cheezle?
Could we fix ALL THE BUGS and then from that day it is just feature,
feature, feature?
We sit there and join and unjoin nodes for 2 days while running stress and
at the end use the map reduce export and prove that not a single datum was
lost?








On Fri, Sep 16, 2016 at 2:42 PM, Sylvain Lebresne 
wrote:

> On Fri, Sep 16, 2016 at 6:59 PM, Blake Eggleston 
> wrote:
>
> > Clearly, we won’t get to this point right away, but it should definitely
> > be a goal.
> >
>
> I'm not entirely clear on why anyone would read in what I'm saying that it
> shouldn't be a goal. I'm a huge proponent of this and of putting emphasis
> on quality in general, and because it's Friday night and I'm tired, I'm
> gonna add that I think I have a bigger track record of actually acting on
> improving quality for Cassandra than anyone else that is putting word in my
> mouth.
>
> Mainly, I'm suggesting that we don't have to tie the existence of a clearly
> labeled stable branch (useful to user, especially newcomers) to future
> improvement in the "releasability" of trunk in our design of a new release
> cycle. If we do so, but releasability don't improve as quickly as we'd
> hope, we penalize users in the end. Adopting a release cycle that ensure
> said clearly labeled stable branch does exist no matter the rate of
> improvement to the level of "trunk" releasibility is feels safer, and
> doesn't preclude any effort in improving said releasibilty, nor
> re-evaluating this in 1-2 year to move to release stable releases from
> trunk directly if we have proven we're there.
>
>
>
> >
> > On September 16, 2016 at 9:04:03 AM, Sylvain Lebresne (
> > sylv...@datastax.com) wrote:
> >
> > On Fri, Sep 16, 2016 at 5:18 PM, Jonathan Haddad 
> > wrote:
> >
> > >
> > > This is a different mentality from having a "features" branch, where
> it's
> > > implied that at times it's acceptable that it not be stable.
> >
> >
> > I absolutely never implied that, though I willingly admit my choice of
> > branch
> > names may be to blame. I 100% agree that no releases should be done
> > without a green test board moving forward and if something was implicit
> > in my 'feature' branch proposal, it was that.
> >
> > Where we might not be in the same page is that I just don't believe it's
> > reasonable to expect the project will get any time soon in a state where
> > even a green test board release (with new features) meets the "can be
> > confidently put into production". I'm not even sure it's reasonable to
> > expect from *any* software, and even less so for an open-source
> > project based on volunteering. Not saying it wouldn't be amazing, it
> > would, I just don't believe it's realistic. In a way, the reason why I
> > think
> > tick-tock doesn't work is *exactly* because it's based on that
> unrealistic
> > assumption.
> >
> > Of course, I suppose that's kind of my opinion. I'm sure some will think
> > that the "historical trend" of release instability is simply due to a
> lack
> > of
> > effort (obviously Cassandra developers don't give a shit about users,
> that
> > must the simplest explanation).
> >
>


Re: Proposal - 3.5.1

2016-09-16 Thread Sylvain Lebresne
On Fri, Sep 16, 2016 at 6:59 PM, Blake Eggleston 
wrote:

> Clearly, we won’t get to this point right away, but it should definitely
> be a goal.
>

I'm not entirely clear on why anyone would read in what I'm saying that it
shouldn't be a goal. I'm a huge proponent of this and of putting emphasis
on quality in general, and because it's Friday night and I'm tired, I'm
gonna add that I think I have a bigger track record of actually acting on
improving quality for Cassandra than anyone else that is putting words in my
mouth.

Mainly, I'm suggesting that we don't have to tie the existence of a clearly
labeled stable branch (useful to users, especially newcomers) to future
improvement in the "releasability" of trunk in our design of a new release
cycle. If we do so, but releasability doesn't improve as quickly as we'd
hope, we penalize users in the end. Adopting a release cycle that ensures
said clearly labeled stable branch does exist no matter the rate of
improvement to the level of "trunk" releasability feels safer, and
doesn't preclude any effort in improving said releasability, nor
re-evaluating this in 1-2 years to move to releasing stable releases from
trunk directly if we have proven we're there.



>
> On September 16, 2016 at 9:04:03 AM, Sylvain Lebresne (
> sylv...@datastax.com) wrote:
>
> On Fri, Sep 16, 2016 at 5:18 PM, Jonathan Haddad 
> wrote:
>
> >
> > This is a different mentality from having a "features" branch, where it's
> > implied that at times it's acceptable that it not be stable.
>
>
> I absolutely never implied that, though I willingly admit my choice of
> branch
> names may be to blame. I 100% agree that no releases should be done
> without a green test board moving forward and if something was implicit
> in my 'feature' branch proposal, it was that.
>
> Where we might not be in the same page is that I just don't believe it's
> reasonable to expect the project will get any time soon in a state where
> even a green test board release (with new features) meets the "can be
> confidently put into production". I'm not even sure it's reasonable to
> expect from *any* software, and even less so for an open-source
> project based on volunteering. Not saying it wouldn't be amazing, it
> would, I just don't believe it's realistic. In a way, the reason why I
> think
> tick-tock doesn't work is *exactly* because it's based on that unrealistic
> assumption.
>
> Of course, I suppose that's kind of my opinion. I'm sure some will think
> that the "historical trend" of release instability is simply due to a lack
> of
> effort (obviously Cassandra developers don't give a shit about users, that
> must the simplest explanation).
>


Re: Proposal - 3.5.1

2016-09-16 Thread Jonathan Haddad
Yep - the progress that's been made on trunk recently has been excellent
and should continue.  The spirit of tick tock - stable trunk - should not
change, just that the release cycle did not support what humans are
comfortable with maintaining or deploying.

On Fri, Sep 16, 2016 at 10:08 AM Jonathan Ellis  wrote:

> On Fri, Sep 16, 2016 at 11:36 AM, Jonathan Haddad 
> wrote:
>
> > What I was trying to suggest is that the *goal* of trunk should always be
> > releasable, and the alpha releases would be the means of testing that.
> If
> > the goal is to always be releasable, we move towards achieving that goal
> by
> > improving modularity, test coverage and test granularity.
> >
> > Yes, it's very difficult to prove a piece of software is completely free
> of
> > bugs and I wouldn't expect NASA to put Cassandra on the space shuttle.
> > That said, by prioritizing stability in the software development process
> up
> > front, the cost of maintaining older branches over time will decrease and
> > the velocity of the project will increase - which was the original goal
> of
> > Tick Tock.
> >
>
> And we *did* make substantial progress on this.  Not nearly as quickly as I
> originally hoped, but our CI is worlds cleaner and more useful than it was
> this time last year.
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
>


Re: Proposal - 3.5.1

2016-09-16 Thread Jonathan Ellis
On Fri, Sep 16, 2016 at 11:36 AM, Jonathan Haddad  wrote:

> What I was trying to suggest is that the *goal* of trunk should always be
> releasable, and the alpha releases would be the means of testing that.  If
> the goal is to always be releasable, we move towards achieving that goal by
> improving modularity, test coverage and test granularity.
>
> Yes, it's very difficult to prove a piece of software is completely free of
> bugs and I wouldn't expect NASA to put Cassandra on the space shuttle.
> That said, by prioritizing stability in the software development process up
> front, the cost of maintaining older branches over time will decrease and
> the velocity of the project will increase - which was the original goal of
> Tick Tock.
>

And we *did* make substantial progress on this.  Not nearly as quickly as I
originally hoped, but our CI is worlds cleaner and more useful than it was
this time last year.

-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced


Re: Proposal - 3.5.1

2016-09-16 Thread Blake Eggleston
> I'm not even sure it's reasonable to 
> expect from *any* software, and even less so for an open-source 
> project based on volunteering. Not saying it wouldn't be amazing, it 
> would, I just don't believe it's realistic.

Postgres does a pretty good job of this. This sort of thinking is a 
self-fulfilling prophecy imo. Clearly, we won’t get to this point right away, but it 
should definitely be a goal.

On September 16, 2016 at 9:04:03 AM, Sylvain Lebresne (sylv...@datastax.com) 
wrote:

On Fri, Sep 16, 2016 at 5:18 PM, Jonathan Haddad  wrote:  

>  
> This is a different mentality from having a "features" branch, where it's  
> implied that at times it's acceptable that it not be stable.  


I absolutely never implied that, though I willingly admit my choice of  
branch  
names may be to blame. I 100% agree that no releases should be done  
without a green test board moving forward and if something was implicit  
in my 'feature' branch proposal, it was that.  

Where we might not be in the same page is that I just don't believe it's  
reasonable to expect the project will get any time soon in a state where  
even a green test board release (with new features) meets the "can be  
confidently put into production". I'm not even sure it's reasonable to  
expect from *any* software, and even less so for an open-source  
project based on volunteering. Not saying it wouldn't be amazing, it  
would, I just don't believe it's realistic. In a way, the reason why I think  
tick-tock doesn't work is *exactly* because it's based on that unrealistic  
assumption.  

Of course, I suppose that's kind of my opinion. I'm sure some will think  
that the "historical trend" of release instability is simply due to a lack  
of  
effort (obviously Cassandra developers don't give a shit about users, that  
must the simplest explanation).  


Re: Proposal - 3.5.1

2016-09-16 Thread Jonathan Haddad
What I was trying to suggest is that the *goal* should be for trunk to always
be releasable, and the alpha releases would be the means of testing that.  If
the goal is to always be releasable, we move towards achieving that goal by
improving modularity, test coverage and test granularity.

Yes, it's very difficult to prove a piece of software is completely free of
bugs and I wouldn't expect NASA to put Cassandra on the space shuttle.
That said, by prioritizing stability in the software development process up
front, the cost of maintaining older branches over time will decrease and
the velocity of the project will increase - which was the original goal of
Tick Tock.

Jon

On Fri, Sep 16, 2016 at 9:04 AM Sylvain Lebresne 
wrote:

> On Fri, Sep 16, 2016 at 5:18 PM, Jonathan Haddad 
> wrote:
>
> >
> > This is a different mentality from having a "features" branch, where it's
> > implied that at times it's acceptable that it not be stable.
>
>
> I absolutely never implied that, though I willingly admit my choice of
> branch
> names may be to blame. I 100% agree that no releases should be done
> without a green test board moving forward and if something was implicit
> in my 'feature' branch proposal, it was that.
>
> Where we might not be in the same page is that I just don't believe it's
> reasonable to expect the project will get any time soon in a state where
> even a green test board release (with new features) meets the "can be
> confidently put into production". I'm not even sure it's reasonable to
> expect from *any* software, and even less so for an open-source
> project based on volunteering. Not saying it wouldn't be amazing, it
> would, I just don't believe it's realistic. In a way, the reason why I
> think
> tick-tock doesn't work is *exactly* because it's based on that unrealistic
> assumption.
>
> Of course, I suppose that's kind of my opinion. I'm sure some will think
> that the "historical trend" of release instability is simply due to a lack
> of
> effort (obviously Cassandra developers don't give a shit about users, that
> must the simplest explanation).
>


Re: Proposal - 3.5.1

2016-09-16 Thread Sylvain Lebresne
On Fri, Sep 16, 2016 at 5:18 PM, Jonathan Haddad  wrote:

>
> This is a different mentality from having a "features" branch, where it's
> implied that at times it's acceptable that it not be stable.


I absolutely never implied that, though I willingly admit my choice of
branch
names may be to blame. I 100% agree that no releases should be done
without a green test board moving forward and if something was implicit
in my 'feature' branch proposal, it was that.

Where we might not be on the same page is that I just don't believe it's
reasonable to expect the project will get any time soon to a state where
even a green test board release (with new features) meets the "can be
confidently put into production". I'm not even sure it's reasonable to
expect from *any* software, and even less so for an open-source
project based on volunteering. Not saying it wouldn't be amazing, it
would, I just don't believe it's realistic. In a way, the reason why I think
tick-tock doesn't work is *exactly* because it's based on that unrealistic
assumption.

Of course, I suppose that's kind of my opinion. I'm sure some will think
that the "historical trend" of release instability is simply due to a lack
of
effort (obviously Cassandra developers don't give a shit about users, that
must be the simplest explanation).


Re: Proposal - 3.5.1

2016-09-16 Thread Jonathan Ellis
On Fri, Sep 16, 2016 at 10:18 AM, Jonathan Haddad  wrote:

> TL;DR:
> Release every 3 months
> Support for 6
> Keep a stable trunk
> New features get merged into trunk but the standard for code quality and
> testing needs to be properly defined as something closer to "production
> ready" rather than "let the poor user figure it out"
>

I like it.  I think one of the data points from tick-tock is that monthly
releases are just too often.  Quarterly is a better cadence.

-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced


Re: Proposal - 3.5.1

2016-09-16 Thread Edward Capriolo
"The historical trend with the Cassandra codebase has been to test
minimally,
throw the code over the wall, and get feedback from people putting it in
prod who run into issues."

At the summit Brandon and a couple of others were making fun of range
tombstones from thrift
https://issues.apache.org/jira/browse/CASSANDRA-5435

I added the thrift support based on code already in trunk. But there was
some ugly bit in there, and far down the line someone else got stuck with
an edge case and had to fix it. Now, I actually added a number of tests,
both unit tests and nosetests. I am sure the range tombstones also had
their own set of tests at the storage level.

So as Brandon was making fun of me, I was thinking to myself, "Well I did
not make the bug, I just made it possible for others to find it! So I am
helping!"

The next time I submit a thrift patch I am going to write 5x the unit tests
jk :)

On Fri, Sep 16, 2016 at 11:18 AM, Jonathan Haddad  wrote:

> I've worked on a few projects where we've had a branch that new stuff went
> in before merging to master / trunk.  What you've described reminds me a
> lot of git-flow (http://nvie.com/posts/a-successful-git-branching-model/)
> although not quite the same.  I'll be verbose in this email to minimize the
> reader's assumptions.
>
> The goals of the release cycle should be (in descending order of priority):
>
> 1. Minimize bugs introduced through change
> 2. Allow the codebase to iterate quickly
> 3. Not get caught up in a ton of back porting bug fixes
>
> There is significant benefit to having a releasable trunk.  This is
> different from a trunk which is constantly released.  A releasable trunk
> simply means all tests should *always* pass and PMC & committers should
> feel confident that they could actually put it in prod for a project that
> actually matters.  Having it always be releasable (all tests pass, etc)
> means people can at least test the DB on sample data or evaluate it before
> the release happens, and get feedback to the team when there are bugs.
>
> This is a different mentality from having a "features" branch, where it's
> implied that at times it's acceptable that it not be stable.  The
> historical trend with the Cassandra codebase has been to test minimally,
> throw the code over the wall, and get feedback from people putting it in
> prod who run into issues.  In my experience I have found a general purpose
> "features" branch to result in poor quality codebases.  It shares a lot
> of the same problems as the 1+ year release cycle did previously, with
> things getting merged in and then an attempt to stabilize later.
>
> Improving the state of testing in trunk will catch more bugs, satisfying
> #1, which naturally leads to #2, and by reducing bugs before they get
> released #3 will happen over time.
>
> My suggestion is a *supported* feature release every 3 months (could just
> as well be 4 or 6) mixed with Benedict's idea of frequent non-supported
> releases (tagged as alpha).  Supported releases should get ~6 months worth
> of bug fixes, which if done right, will decrease over time due to a
> hopefully more stable codebase.  I 100% agree with Mick that semver makes
> sense here, it's not just for frameworks.  Major.Minor.Patch is well
> understood and is pretty standard throughout the world, I don't think we
> need to reinvent versioning.
>
> TL;DR:
> Release every 3 months
> Support for 6
> Keep a stable trunk
> New features get merged into trunk but the standard for code quality and
> testing needs to be properly defined as something closer to "production
> ready" rather than "let the poor user figure it out"
>
> Jon
>
>
>
>
>
>
>
> On Fri, Sep 16, 2016 at 3:05 AM Sylvain Lebresne 
> wrote:
>
> > As probably pretty much everyone at this point, I agree the tick-tock
> > experiment
> > isn't working as well as it should and that it's probably worth course
> > correcting. I happen to have been thinking about this quite a bit already
> > as it
> > turns out so I'm going to share my reasoning and suggestion below, even
> > though
> > it's going to be pretty long, in the hope it can be useful (and if it
> > isn't, so
> > be it).
> >
> > My current thinking is that a good cycle should accommodate 2 main
> > constraints:
> >   1) be useful for users
> >   2) be realistic/limit friction on the development side
> > and let me develop what I mean by both points slightly first.
> >
> > I think users mostly want 2 things out of the release schedule: they
> want a
> > clearly labeled stable branch to know what they should run into
> production,
> > and
> > they want new features and improvements. Let me clarify that different
> > users
> > will want those 2 in different degrees and with variation over time, but
> I
> > believe it's mainly some combination of those. On the development side, I
> > don't
> > think it's realistic to expect more than 2/3 branches/series to be
> > supported at
> > any one time 

Re: Proposal - 3.5.1

2016-09-16 Thread Jonathan Haddad
I've worked on a few projects where we've had a branch that new stuff went
in before merging to master / trunk.  What you've described reminds me a
lot of git-flow (http://nvie.com/posts/a-successful-git-branching-model/)
although not quite the same.  I'll be verbose in this email to minimize the
reader's assumptions.

The goals of the release cycle should be (in descending order of priority):

1. Minimize bugs introduced through change
2. Allow the codebase to iterate quickly
3. Not get caught up in a ton of back porting bug fixes

There is significant benefit to having a releasable trunk.  This is
different from a trunk which is constantly released.  A releasable trunk
simply means all tests should *always* pass and PMC & committers should
feel confident that they could actually put it in prod for a project that
actually matters.  Having it always be releasable (all tests pass, etc)
means people can at least test the DB on sample data or evaluate it before
the release happens, and get feedback to the team when there are bugs.

This is a different mentality from having a "features" branch, where it's
implied that at times it's acceptable that it not be stable.  The
historical trend with the Cassandra codebase has been to test minimally,
throw the code over the wall, and get feedback from people putting it in
prod who run into issues.  In my experience I have found a general purpose
"features" branch to result in poor quality codebases.  It shares a lot
of the same problems as the 1+ year release cycle did previously, with
things getting merged in and then an attempt to stabilize later.

Improving the state of testing in trunk will catch more bugs, satisfying
#1, which naturally leads to #2, and by reducing bugs before they get
released #3 will happen over time.

My suggestion is a *supported* feature release every 3 months (could just
as well be 4 or 6) mixed with Benedict's idea of frequent non-supported
releases (tagged as alpha).  Supported releases should get ~6 months worth
of bug fixes, which if done right, will decrease over time due to a
hopefully more stable codebase.  I 100% agree with Mick that semver makes
sense here, it's not just for frameworks.  Major.Minor.Patch is well
understood and is pretty standard throughout the world, I don't think we
need to reinvent versioning.

TL;DR:
Release every 3 months
Support for 6
Keep a stable trunk
New features get merged into trunk but the standard for code quality and
testing needs to be properly defined as something closer to "production
ready" rather than "let the poor user figure it out"

Jon







On Fri, Sep 16, 2016 at 3:05 AM Sylvain Lebresne 
wrote:

> As probably pretty much everyone at this point, I agree the tick-tock
> experiment
> isn't working as well as it should and that it's probably worth course
> correcting. I happen to have been thinking about this quite a bit already
> as it
> turns out so I'm going to share my reasoning and suggestion below, even
> though
> it's going to be pretty long, in the hope it can be useful (and if it
> isn't, so
> be it).
>
> My current thinking is that a good cycle should accommodate 2 main
> constraints:
>   1) be useful for users
>   2) be realistic/limit friction on the development side
> and let me develop what I mean by both points slightly first.
>
> I think users mostly want 2 things out of the release schedule: they want a
> clearly labeled stable branch to know what they should run into production,
> and
> they want new features and improvements. Let me clarify that different
> users
> will want those 2 in different degrees and with variation over time, but I
> believe it's mainly some combination of those. On the development side, I
> don't
> think it's realistic to expect more than 2/3 branches/series to be
> supported at
> any one time (not going to argue that, let's call it a professional
> opinion). I
> also think accumulating new work for any meaningful length of time before
> releasing, as we used to do, is bad as it pushes devs to rush things to
> meet a
> given release deadline as they don't want to wait for the next one. This in
> turn
> impacts quality and creates unnecessary drama. It's also good imo to have a
> clear policy regarding where a given work can go (having to debate on each
> ticket on which branch it should go is a waste of dev time).
>
> With those "goals" in mind, I'll note that:
> - the fixed _and_ short cadence of tick-tock is imo very good, in
> particular in
>   (but not limited to) avoiding the 'accumulate unreleased stuffs' problem.
> - we have ample evidence that stuffs don't get truly stable until they get
> only
>   bug fixes for a few months. Which doesn't mean at all that we shouldn't
>   continue to make progress on increasing the quality of new code btw.
> - Simple is also a great quality of a release cycle. I think we should try
> to
>   define what's truly important to achieve (my opinion on that is above)
> and do
>   the 

Re: Proposal - 3.5.1

2016-09-16 Thread Sylvain Lebresne
As probably pretty much everyone at this point, I agree the tick-tock
experiment
isn't working as well as it should and that it's probably worth course
correcting. I happen to have been thinking about this quite a bit already
as it
turns out so I'm going to share my reasoning and suggestion below, even
though
it's going to be pretty long, in the hope it can be useful (and if it
isn't, so
be it).

My current thinking is that a good cycle should accommodate 2 main
constraints:
  1) be useful for users
  2) be realistic/limit friction on the development side
and let me develop what I mean by both points slightly first.

I think users mostly want 2 things out of the release schedule: they want a
clearly labeled stable branch to know what they should run into production,
and
they want new features and improvements. Let me clarify that different users
will want those 2 in different degrees and with variation over time, but I
believe it's mainly some combination of those. On the development side, I
don't
think it's realistic to expect more than 2/3 branches/series to be
supported at
any one time (not going to argue that, let's call it a professional
opinion). I
also think accumulating new work for any meaningful length of time before
releasing, as we used to do, is bad as it pushes devs to rush things to
meet a
given release deadline as they don't want to wait for the next one. This in
turn
impacts quality and creates unnecessary drama. It's also good imo to have a
clear policy regarding where a given work can go (having to debate on each
ticket on which branch it should go is a waste of dev time).

With those "goals" in mind, I'll note that:
- the fixed _and_ short cadence of tick-tock is imo very good, in
particular in
  (but not limited to) avoiding the 'accumulate unreleased stuffs' problem.
- we have ample evidence that stuffs don't get truly stable until they get
only
  bug fixes for a few months. Which doesn't mean at all that we shouldn't
  continue to make progress on increasing the quality of new code btw.
- Simple is also a great quality of a release cycle. I think we should try
to
  define what's truly important to achieve (my opinion on that is above)
and do
  the simplest thing that achieves that. This does imply the release cycle
  won't make the coffee, but that's alright, it probably shouldn't anyway.

In light of all this, my suggestion for a release cycle would be:
- To have 3 branches: 'features', 'testing' and 'stable', with an X month
  rotation: 'features' becomes 'testing' after X months and then 'stable'
after
  X more, before getting EOL X months later.
- The feature branch gets everything. The testing branch only gets bug
fixes.
  The stable branch only gets critical bug fixes. And imo, we should be very
  strict on this (I acknowledge that there is sometimes a bit of
subjectivity on
  whether something is a bug or an improvement, and if it's critical or
not, but
  I think it's not that hard to get consensus if we're all reasonable
(though it
  might be worth agreeing on some rough but written guideline upfront)).
- We release on a short and fixed cadence of Y month(s) for both the
feature and
  testing branch. For the stable branch, given that it already had X months
of
  only bug fixes during the testing phase, one can hope critical fixes will
be
  fairly rare (less than 1 per Y period on average). Further, it's supposed
to
  be stable and fixes are supposed to be critical, so doing hot-fix releases
  probably makes the most sense (though it probably only works if we're
indeed
  strict on what is considered critical).
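
As an illustration only, the rotation and acceptance rules in the list above
can be sketched in Python. The names `Stage`, `stage_for_age` and `accepts`,
and the change kinds used as strings, are hypothetical and do not come from
any Cassandra tooling; X is passed in as `x_months`.

```python
from enum import Enum


class Stage(Enum):
    FEATURES = "features"  # gets everything
    TESTING = "testing"    # gets bug fixes only
    STABLE = "stable"      # gets critical bug fixes only
    EOL = "eol"            # gets nothing


# Which kinds of change each stage accepts (kinds are illustrative labels).
ACCEPTED = {
    Stage.FEATURES: {"feature", "improvement", "bugfix", "critical"},
    Stage.TESTING: {"bugfix", "critical"},
    Stage.STABLE: {"critical"},
    Stage.EOL: set(),
}


def stage_for_age(age_months: int, x_months: int) -> Stage:
    """Return the stage of a branch that was cut `age_months` ago."""
    if age_months < 0 or x_months <= 0:
        raise ValueError("age must be >= 0 and X must be > 0")
    rotation = [Stage.FEATURES, Stage.TESTING, Stage.STABLE]
    index = age_months // x_months  # one rotation step every X months
    return rotation[index] if index < len(rotation) else Stage.EOL


def accepts(stage: Stage, change_kind: str) -> bool:
    """True if a change of this kind may be committed to a branch in `stage`."""
    return change_kind in ACCEPTED[stage]
```

With X=6, a branch is 'features' for months 0-5, 'testing' for months 6-11,
'stable' for months 12-17, and EOL from month 18 onward.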

And that's about it. I think it would believably achieve stability (with a
clear
label on which releases are stable), but also provide new features and
improvements quickly for those who want that. The testing phase is imo a
necessary intermediate step to get the stable one.

One thing to define is X and Y. For Y (the cadence of feature/testing), I
think 1
or 2 months are the only options that make sense (less than 1 month is too
fast,
and more than 2 months is imo starting to get too long). For X, that's more
debatable but it's open-source and we should recognize volunteers generally
don't want to maintain things for too long either. My 2 cents is that 6 or 8
months
are probably the best options here.

We'd also have to put some numbering scheme on top of that, but that's not
really the important part (the meaning is in the branch labels, not the
numbers). To give just one possible option (and assuming X=6, Y=1), in
January
2017 we could cut 4.0 as the start of both 'feature' and 'testing'. We'd
then
have 4.1, 4.2, ... on the 'feature' branch, and 4.0.1, 4.0.2, ... on the
testing
branch for the next 6 months. In July, we'd switch from 4.5 to 5.0, with
that
becoming the new 'feature' and 'testing' base. At the same time, we'd cut
4.0.6 from 4.0.5 as the new 'stable' branch. Hot-fix on that stable branch
would
be versioned 4.0.6.1, 4.0.6.2 and so on.
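
To make the example numbering concrete, here is a minimal Python sketch of
that one possible option, assuming X=6 and Y=1 as above. The function names
`numbering_schedule` and `rollover` are made up for illustration; only the
version strings themselves come from the example.

```python
def numbering_schedule(major: int, x_months: int = 6):
    """List (month_offset, feature_version, testing_version) for one cycle.

    Month 0 is the cut of <major>.0, which starts both branches.
    """
    rows = []
    for month in range(x_months):
        feature = f"{major}.{month}"
        testing = f"{major}.0" if month == 0 else f"{major}.0.{month}"
        rows.append((month, feature, testing))
    return rows


def rollover(major: int, x_months: int = 6):
    """Versions cut at the end of the cycle (July in the example)."""
    return {
        "new_feature_and_testing_base": f"{major + 1}.0",
        "new_stable": f"{major}.0.{x_months}",
        "first_hotfix": f"{major}.0.{x_months}.1",
    }


for month, feature, testing in numbering_schedule(4):
    print(month, feature, testing)
print(rollover(4))
```

The loop walks from 4.0/4.0 up to 4.5/4.0.5, and the rollover gives 5.0 as the
new base, 4.0.6 as the new 'stable', and 4.0.6.1 as its first hot-fix.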

Of course, there can be 

Re: Proposal - 3.5.1

2016-09-15 Thread Mick Semb Wever
Totally agree with all the frustrations felt by Jon here.


TL;DR
Here's a proposal for 4.0 and beyond that puts together the comments
from Benedict, Jon, Tyler, Jeremy, and Ed:

 - keep bimonthly feature releases,
 - revert from tick-tock to SemVer numbering scheme,
 - during the release vote also vote on the quality label (feature branches
start as 'Alpha' and the first patch release as 'Beta'),
 - accept that every feature release isn't by default initially supported,
and its branch might never be,
 - maintain 3 'GA' branches at any one time,
 - accept that it's not going to be the oldest GA branches that necessarily
reach EOL first.


Background and rationale…

IMO the problem with Tick-Tock is that it introduces two separate concepts:
   - incremental development, and
   - limiting patch releases.

The first concept: having bimonthly tocks; made C* development more
incremental. A needed improvement.
No coincidence, at the same time as tick-tock was introduced, there was
also a lot of effort being put into testing and a QA framework.
From this we've seen a lot of fantastic features incrementally added to C*!

The second concept: having bimonthly ticks; limited C* to having only one
patch release per tock release.
The only real benefit to this was to reduce the effort involved in
maintenance, required because of the more frequent tock releases.
The consequence is instability has gone bananas, as Jon clearly
demonstrates. Someone went and let the monkey out.

A quick comparison of before to tick-tock:

   * Before tick-tock: against 6-12 months of development it took a
time-frame of 3-6 months and 6+ patch releases to stabilise C*.

   * After tick-tock: against 2 months of development we could have
expected the same time-frame of 3-6 months (because adoption is dictated by
users, not developers) and *over* this period 1-2 patch releases to
stabilise. It seems to have been a fool's errand to force this to 1 patch
release after only one month. It seems that the notion of incremental
development was applied for the developers whereas the waterfall model was
applied to QA in production for the users. (note: all this is not taking
into account advantages of incremental development, an improved QA
framework, and a move towards a stable-master.)

The question remains how many of these releases the community can afford
to support. And being realistic, much of this effort relies upon the
commercial entities around the community. For example having 1 year of
support means having to support 6 feature releases, and there's probably
not the people power to do that. It also means that in effect any release
is actually only supported for 6-9 months, since it took 3-6 for it to get
to production-ready.
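
The arithmetic behind those numbers can be checked with a small Python sketch.
It assumes the bimonthly feature cadence from the TL;DR; the function names
are hypothetical and only serve this illustration.

```python
def concurrent_supported_releases(support_months: int, cadence_months: int) -> int:
    """How many feature releases are under support at once (rounded up)."""
    return -(-support_months // cadence_months)  # ceiling division


def effective_support_window(support_months: int,
                             stabilise_min: int,
                             stabilise_max: int) -> tuple:
    """Months of support left once a release has become production-ready.

    Returns (worst case, best case).
    """
    return (support_months - stabilise_max, support_months - stabilise_min)


# 1 year of support with a release every 2 months.
print(concurrent_supported_releases(12, 2))
# 1 year of support, 3-6 months needed to reach production-ready.
print(effective_support_window(12, 3, 6))
```

This reproduces the 6 concurrently supported feature releases and the
effective 6-9 months of support mentioned above.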

A typical Apache release process is that each new major release gets voted
on as only 'Alpha' or 'Beta'. As patch releases are made it is ascertained
whether enough people are using it (eg in production) and the quality label
appropriately raised to either 'Beta' or 'GA'.  The quality label can be
proposed in the vote or left to be voted upon by everyone. The quality
label is itself not part of the version number, so that the version number
can follow strict SemVer.
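
A minimal Python sketch of keeping the quality label outside the version
number follows. The `Release` and `Quality` types are invented for this
illustration and are not an existing Cassandra or Apache API.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Quality(Enum):
    ALPHA = "Alpha"
    BETA = "Beta"
    GA = "GA"


@dataclass(frozen=True)
class Release:
    major: int
    minor: int
    patch: int
    quality: Quality  # decided by vote; deliberately not part of the version

    @property
    def version(self) -> str:
        """Strict Major.Minor.Patch string, with no label attached."""
        return f"{self.major}.{self.minor}.{self.patch}"


first = Release(4, 1, 0, Quality.ALPHA)
# A later vote raises the label; the version string stays the same.
promoted = replace(first, quality=Quality.BETA)
print(first.version, first.quality.value)
print(promoted.version, promoted.quality.value)
```

Because the label lives in a separate field, raising it from 'Alpha' to 'Beta'
or 'GA' never changes the SemVer string that users and tooling compare.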

Then the community can say, for example, it supports 3 'GA' branches. This
permits some major releases to never make it to GA, and others to hang over
for a bit longer. It's something that the community gets a feel for by
appreciating the users and actors around it. The number of branches
supported depends on what the community can sustain (including the new
non-GA branches). The community also becomes a bit more honest about the
quality of x.y.0 releases.

The proposal is an example that embraces incremental development and the
release-often mentality, while keeping a realistic and flexible approach to
how many branches can be supported. The cost of supporting branches is
still very real, and pushing for a stable master means no feature branch is
cut without passing everything in the QA framework and 100% belief that it
can be put into a user's production. That is, there's no return to
thinking about feature branches as a place for ongoing stabilisation
efforts, just because they have an 'Alpha/Beta' label. The onus of work is
put upon the developer having to maintain branches for features targeted
for master, and not on the community having to stabilise and support
feature branches.

BTW has anyone figured out whether it's the tick or the tock that
represents the feature release??   I probably got it wrong here :-)


~mck


Re: Proposal - 3.5.1

2016-09-15 Thread Jonathan Haddad
If the releases can be tagged as alpha / beta so that people don't
accidentally put it in prod (or at least, will do so less), that would be
totally reasonable.

On Thu, Sep 15, 2016 at 12:27 PM Tyler Hobbs  wrote:

> On Thu, Sep 15, 2016 at 2:22 PM, Benedict Elliott Smith <
> bened...@apache.org
> > wrote:
>
> > Feature releases don't have to be on the same cadence as bug fixes.
> They're
> > naturally different beasts.
> >
>
> With the exception of critical bug fixes (which can warrant an immediate
> release), I think keeping a regular cadence makes us less likely to slip
> and fall behind on releases.
>
>
> >
> > Why not stick with monthly feature releases, but mark every third (or
> > sixth) as a supported release that gets quarterly updates for 2-3
> quarters?
> >
>
> That's also a good idea.
>
> --
> Tyler Hobbs
> DataStax 
>


Re: Proposal - 3.5.1

2016-09-15 Thread Benedict Elliott Smith
Yes, agreed. I'm advocating a different cadence, not a random cadence.

On Thursday, 15 September 2016, Tyler Hobbs  wrote:

> On Thu, Sep 15, 2016 at 2:22 PM, Benedict Elliott Smith <
> bened...@apache.org 
> > wrote:
>
> > Feature releases don't have to be on the same cadence as bug fixes.
> They're
> > naturally different beasts.
> >
>
> With the exception of critical bug fixes (which can warrant an immediate
> release), I think keeping a regular cadence makes us less likely to slip
> and fall behind on releases.
>
>
> >
> > Why not stick with monthly feature releases, but mark every third (or
> > sixth) as a supported release that gets quarterly updates for 2-3
> quarters?
> >
>
> That's also a good idea.
>
> --
> Tyler Hobbs
> DataStax 
>


Re: Proposal - 3.5.1

2016-09-15 Thread Tyler Hobbs
On Thu, Sep 15, 2016 at 2:22 PM, Benedict Elliott Smith  wrote:

> Feature releases don't have to be on the same cadence as bug fixes. They're
> naturally different beasts.
>

With the exception of critical bug fixes (which can warrant an immediate
release), I think keeping a regular cadence makes us less likely to slip
and fall behind on releases.


>
> Why not stick with monthly feature releases, but mark every third (or
> sixth) as a supported release that gets quarterly updates for 2-3 quarters?
>

That's also a good idea.

-- 
Tyler Hobbs
DataStax 


Re: Proposal - 3.5.1

2016-09-15 Thread Tyler Hobbs
I agree that regular (monthly) releases, and smaller, more frequent feature
releases are the best part of tick/tock.  The downside of tick/tock, as
mentioned above, is that there isn't enough time for user feedback and
testing to catch new bugs before the next feature release.

I would personally like to see a hybrid.  The proposal that Jon mentions of
doing a new feature release every three months plus 6 months of bugfixes
for any release seems like a good balance to me.

On Thu, Sep 15, 2016 at 1:59 PM, Jonathan Haddad  wrote:

> I don't think it's binary - we don't have to do year long insanity or
> bleeding edge craziness.
>
> How about a release every 3 months, with each release accepting 6 months of
> patches?  (oldstable & newstable)  Also provide nightly builds & stick to
> the idea of stable trunk.
>
> The issue is the number of bug fixes a given release gets.  1 bug fix
> release for a new feature is just terrible.  The community as a whole
> despises this system and is lowering confidence in the project.
>
> Jon
>
>
> On Thu, Sep 15, 2016 at 11:48 AM Jake Luciani  wrote:
>
> > I'm pretty sure everyone will agree Tick-Tock didn't go well and needs to
> > change.
> >
> > The problem for me is going back to the old way doesn't sound great.
> There
> > are parts of tick-tock I really like,
> > for example, the cadence and limited scope per release.
> >
> > I know at the summit there were a lot of ideas thrown around I can
> > regurgitate but perhaps people
> > who have been thinking about this would like to chime in and present
> ideas?
> >
> > -Jake
> >
> > On Thu, Sep 15, 2016 at 2:28 PM, Benedict Elliott Smith <
> > bened...@apache.org
> > > wrote:
> >
> > > I agree tick-tock is a failure.  But for two reasons IMO:
> > >
> > > 1) Ultimately, the users are the real testers and it takes a while for
> a
> > > release to percolate into the wild for feedback.  The reality is that a
> > > release doesn't have its tires properly kicked for at least three
> months
> > > after it's cut.  So if we are to have any tocks, they should be
> > completely
> > > unwed from the ticks, and should probably happen on a ~3M cadence to
> keep
> > > the labour down but the utility up (and there should probably still be
> > more
> > > than one tock per tick)
> > >
> > > 2) Those promised resources to improve process never happened.  We
> > haven't
> > > even reached parity with the 2.1 release until very recently, i.e. no
> > > failing u/dtests.
> > >
> > >
> > > On 15 September 2016 at 19:08, Jeff Jirsa 
> > > wrote:
> > >
> > > > I know we’ve got a lot of folks following the dev list without a lot
> of
> > > > background, so let’s make sure we get some context here so everyone
> can
> > > be
> > > > on the same page.
> > > >
> > > > Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and
> > > 3.3.1,
> > > > etc) if it’s done AFTER 3.9 (I think we need to get 3.9 out first
> > before
> > > > the RE manpower is spent on backporting fixes, even critical fixes,
> > > because
> > > > 3.9 has multiple critical fixes for people running 3.7).
> > > >
> > > > Now some background:
> > > >
> > > > For many years, Cassandra used to have a dev process that kept 3
> active
> > > > branches - “bleeding edge”, a “stable”, and an “old stable” branch,
> > where
> > > > developers would be committing ALL new contributions to the bleeding
> > > edge,
> > > > non-api-breaking changes to stable, and bugfixes only to old stable.
> > > While
> > > > the api changed and major features were added, that bleeding edge
> would
> > > > just be ‘trunk’, and it’d get cut into a major version when it was
> > ready
> > > to
> > > > ship. We saw that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 /
> > 1.2,
> > > > and before that 2.0 / 1.2 / 1.1 ). When that bleeding edge got
> released
> > > as
> > > > a major x.y.0, the third, oldest, most stable branch went EOL, and
> new
> > > > features would go into trunk for the next major version.
> > > >
> > > > There were two big negatives observed with this:
> > > >
> > > > The first big negative is that if multiple major new features were in
> > > > flight, releases were prone to delay. Nobody wants to break an API
> on a
> > > > x.y.1 release, and nobody wants to add a new feature to a x.y.2
> > release,
> > > so
> > > > the project would delay the x.y releases if major features were
> close,
> > > and
> > > > then there’d be pressure to slip them in before they were fully
> tested,
> > > or
> > > > cut features to avoid delaying the release. This pressure was
> observed
> > to
> > > > be bad for the project – it forced technical compromises.
> > > >
> > > > The second downside that was observed was that nobody would try to
> run
> > > the
> > > > new versions when they launched, because they were buggy because they
> > > were
> > > > filled with new features. 2.2, for example, introduced RBAC,
> commitlog
> > > > 

Re: Proposal - 3.5.1

2016-09-15 Thread Jeremy Hanna
Right - I think like Jake and others have said, it seems appropriate to do 
something at this point.  Would a clearer, more liberal backport policy to the 
odd versions be worthwhile until we find our footing?  As Jeremiah said, it 
does seem like the big bang 3.0 release has caused much of the baggage that 
we’re facing.  Combine that with the slow uptake of any specific version so far,
which is at least partly due to the newness of the release model.

To me, the hard thing about 3 month releases is that then you get into 
the larger untested feature releases which is what it was originally supposed 
to get away from.

So in essence, would we
1) do nothing and see it through
2) have a more liberal backport policy in the 3.x line and revisit once we get 
to 4
3) do a tick-tock(-tock-tock) sort of model
4) do some sort of LTS
5) go back to the drawing board
6) go back to the old model

I think the earlier numbers imply some confidence in the thinking behind 
tick-tock.  Would 2 be acceptable to see the 3.x line through with the current 
release model?  Or do we need to do something more extensive at this stage?

> On Sep 15, 2016, at 1:59 PM, Jonathan Haddad  wrote:
> 
> I don't think it's binary - we don't have to do year long insanity or
> bleeding edge craziness.
> 
> How about a release every 3 months, with each release accepting 6 months of
> patches?  (oldstable & newstable)  Also provide nightly builds & stick to
> the idea of stable trunk.
> 
> The issue is the number of bug fixes a given release gets.  1 bug fix
> release for a new feature is just terrible.  The community as a whole
> despises this system and is lowering confidence in the project.
> 
> Jon
> 
> 
> On Thu, Sep 15, 2016 at 11:48 AM Jake Luciani  wrote:
> 
>> I'm pretty sure everyone will agree Tick-Tock didn't go well and needs to
>> change.
>> 
>> The problem for me is going back to the old way doesn't sound great. There
>> are parts of tick-tock I really like,
>> for example, the cadence and limited scope per release.
>> 
>> I know at the summit there were a lot of ideas thrown around I can
>> regurgitate but perhaps people
>> who have been thinking about this would like to chime in and present ideas?
>> 
>> -Jake
>> 
>> On Thu, Sep 15, 2016 at 2:28 PM, Benedict Elliott Smith <
>> bened...@apache.org
>>> wrote:
>> 
>>> I agree tick-tock is a failure.  But for two reasons IMO:
>>> 
>>> 1) Ultimately, the users are the real testers and it takes a while for a
>>> release to percolate into the wild for feedback.  The reality is that a
>>> release doesn't have its tires properly kicked for at least three months
>>> after it's cut.  So if we are to have any tocks, they should be
>> completely
>>> unwed from the ticks, and should probably happen on a ~3M cadence to keep
>>> the labour down but the utility up (and there should probably still be
>> more
>>> than one tock per tick)
>>> 
>>> 2) Those promised resources to improve process never happened.  We
>> haven't
>>> even reached parity with the 2.1 release until very recently, i.e. no
>>> failing u/dtests.
>>> 
>>> 
>>> On 15 September 2016 at 19:08, Jeff Jirsa 
>>> wrote:
>>> 
>>>> I know we’ve got a lot of folks following the dev list without a lot of
>>>> background, so let’s make sure we get some context here so everyone can be
>>>> on the same page.
>>>>
>>>> Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and 3.3.1,
>>>> etc) if it’s done AFTER 3.9 (I think we need to get 3.9 out first before
>>>> the RE manpower is spent on backporting fixes, even critical fixes, because
>>>> 3.9 has multiple critical fixes for people running 3.7).
>>>>
>>>> Now some background:
>>>>
>>>> For many years, Cassandra used to have a dev process that kept 3 active
>>>> branches - “bleeding edge”, a “stable”, and an “old stable” branch, where
>>>> developers would be committing ALL new contributions to the bleeding edge,
>>>> non-api-breaking changes to stable, and bugfixes only to old stable. While
>>>> the api changed and major features were added, that bleeding edge would
>>>> just be ‘trunk’, and it’d get cut into a major version when it was ready to
>>>> ship. We saw that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 / 1.2,
>>>> and before that 2.0 / 1.2 / 1.1 ). When that bleeding edge got released as
>>>> a major x.y.0, the third, oldest, most stable branch went EOL, and new
>>>> features would go into trunk for the next major version.
>>>>
>>>> There were two big negatives observed with this:
>>>>
>>>> The first big negative is that if multiple major new features were in
>>>> flight, releases were prone to delay. Nobody wants to break an API on a
>>>> x.y.1 release, and nobody wants to add a new feature to a x.y.2 release, so
>>>> the project would delay the x.y releases if major features were close, and
>>>> then there’d be pressure to slip them in

Re: Proposal - 3.5.1

2016-09-15 Thread Jonathan Haddad
I don't think it's binary - we don't have to do year long insanity or
bleeding edge craziness.

How about a release every 3 months, with each release accepting 6 months of
patches?  (oldstable & newstable)  Also provide nightly builds & stick to
the idea of stable trunk.
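
To make the arithmetic concrete, here is a rough sketch of which releases
would still be accepting patches on a given day under that proposal.  The
start date, the first-of-the-month cut dates, and the helper names are all
assumptions for illustration, not an agreed schedule.

```python
from datetime import date

RELEASE_INTERVAL_MONTHS = 3  # proposed: one release every 3 months
SUPPORT_WINDOW_MONTHS = 6    # proposed: each release accepts 6 months of patches


def add_months(d: date, months: int) -> date:
    """Shift a first-of-the-month date forward by a whole number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def supported_releases(first_release: date, today: date) -> list:
    """Cut dates of the releases that are still inside their patch window."""
    supported = []
    cut = first_release
    while cut <= today:
        if today < add_months(cut, SUPPORT_WINDOW_MONTHS):
            supported.append(cut)
        cut = add_months(cut, RELEASE_INTERVAL_MONTHS)
    return supported


# Hypothetical schedule starting 2017-01-01: by mid-August the January release
# has aged out, leaving April (oldstable) and July (newstable).
print(supported_releases(date(2017, 1, 1), date(2017, 8, 15)))
```

With these numbers there are never more than two releases taking patches at
once, which is the oldstable/newstable pair.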

The issue is the number of bug fixes a given release gets.  One bug fix
release for a new feature is just terrible.  The community as a whole
despises this system, and it is lowering confidence in the project.

Jon


On Thu, Sep 15, 2016 at 11:48 AM Jake Luciani  wrote:

> I'm pretty sure everyone will agree Tick-Tock didn't go well and needs to
> change.
>
> The problem for me is going back to the old way doesn't sound great. There
> are parts of tick-tock I really like,
> for example, the cadence and limited scope per release.
>
> I know at the summit there were a lot of ideas thrown around I can
> regurgitate but perhaps people
> who have been thinking about this would like to chime in and present ideas?
>
> -Jake
>
> On Thu, Sep 15, 2016 at 2:28 PM, Benedict Elliott Smith <
> bened...@apache.org
> > wrote:
>
> > I agree tick-tock is a failure.  But for two reasons IMO:
> >
> > 1) Ultimately, the users are the real testers and it takes a while for a
> > release to percolate into the wild for feedback.  The reality is that a
> > release doesn't have its tires properly kicked for at least three months
> > after it's cut.  So if we are to have any tocks, they should be
> completely
> > unwed from the ticks, and should probably happen on a ~3M cadence to keep
> > the labour down but the utility up (and there should probably still be
> more
> > than one tock per tick)
> >
> > 2) Those promised resources to improved process never happened.  We
> haven't
> > even reached parity with the 2.1 release until very recently, i.e. no
> > failing u/dtests.
> >
> >
> > On 15 September 2016 at 19:08, Jeff Jirsa 
> > wrote:
> >
> > > I know we’ve got a lot of folks following the dev list without a lot of
> > > background, so let’s make sure we get some context here so everyone can
> > be
> > > on the same page.
> > >
> > > Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and
> > 3.3.1,
> > > etc) if it’s done AFTER 3.9 (I think we need to get 3.9 out first
> before
> > > the RE manpower is spent on backporting fixes, even critical fixes,
> > because
> > > 3.9 has multiple critical fixes for people running 3.7).
> > >
> > > Now some background:
> > >
> > > For many years, Cassandra used to have a dev process that kept 3 active
> > > branches - “bleeding edge”, a “stable”, and an “old stable” branch,
> where
> > > developers would be committing ALL new contributions to the bleeding
> > edge,
> > > non-api-breaking changes to stable, and bugfixes only to old stable.
> > While
> > > the api changed and major features were added, that bleeding edge would
> > > just be ‘trunk’, and it’d get cut into a major version when it was
> ready
> > to
> > > ship. We saw that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 /
> 1.2,
> > > and before that 2.0 / 1.2 / 1.1 ). When that bleeding edge got released
> > as
> > > a major x.y.0, the third, oldest, most stable branch went EOL, and new
> > > features would go into trunk for the next major version.
> > >
> > > There were two big negatives observed with this:
> > >
> > > The first big negative is that if multiple major new features were in
> > > flight, releases were prone to delay. Nobody wants to break an API on a
> > > x.y.1 release, and nobody wants to add a new feature to a x.y.2
> release,
> > so
> > > the project would delay the x.y releases if major features were close,
> > and
> > > then there’d be pressure to slip them in before they were fully tested,
> > or
> > > cut features to avoid delaying the release. This pressure was observed
> to
> > > be bad for the project – it forced technical compromises.
> > >
> > > The second downside that was observed was that nobody would try to run
> > the
> > > new versions when they launched, because they were buggy because they
> > were
> > > filled with new features. 2.2, for example, introduced RBAC, commitlog
> > > compression, and user defined functions – major features that needed to
> > be
> > > tested. Unfortunately, because there were few real-world testers, there
> > > were still major bugs being found for months – the first
> production-ready
> > > version of 2.2 is probably in the 2.2.5 or 2.2.6 range.
> > >
> > > For version 3, we moved to an alternate release, modeled on Intel’s
> > > tick/tock https://en.wikipedia.org/wiki/Tick-Tock_model
> > >
> > > The intention was to allow new features into 3.even releases (3.0, 3.2,
> > > 3.4, 3.6, and so on), with bugfixes in 3.odd releases (3.1, … ). The
> hope
> > > was to allow more frequent releases to address the first big negative
> > > (flood of new features that blocked releases), while also helping to
> > > address the second – with fewer major 

Re: Proposal - 3.5.1

2016-09-15 Thread Jeremiah D Jordan
Because tick-tock started based off of the 3.0 big bang “we broke everything”
release I don’t think we can judge whether or not it is working until we are
another 6 months in, i.e. when we would have been releasing the next big bang
release.  Right now a lot if not most of the bugs in a given tick tock release
are bugs that were introduced in 3.0.  Even the bug mentioned here is not a
tick tock bug, it is a 3.0 bug.


> On Sep 15, 2016, at 1:48 PM, Jake Luciani  wrote:
> 
> I'm pretty sure everyone will agree Tick-Tock didn't go well and needs to
> change.
> 
> The problem for me is going back to the old way doesn't sound great. There
> are parts of tick-tock I really like,
> for example, the cadence and limited scope per release.
> 
> I know at the summit there were a lot of ideas thrown around I can
> regurgitate but perhaps people
> who have been thinking about this would like to chime in and present ideas?
> 
> -Jake
> 
> On Thu, Sep 15, 2016 at 2:28 PM, Benedict Elliott Smith wrote:
> 
>> I agree tick-tock is a failure.  But for two reasons IMO:
>> 
>> 1) Ultimately, the users are the real testers and it takes a while for a
>> release to percolate into the wild for feedback.  The reality is that a
>> release doesn't have its tires properly kicked for at least three months
>> after it's cut.  So if we are to have any tocks, they should be completely
>> unwed from the ticks, and should probably happen on a ~3M cadence to keep
>> the labour down but the utility up (and there should probably still be more
>> than one tock per tick)
>> 
>> 2) Those promised resources to improved process never happened.  We haven't
>> even reached parity with the 2.1 release until very recently, i.e. no
>> failing u/dtests.
>> 
>> 
>> On 15 September 2016 at 19:08, Jeff Jirsa 
>> wrote:
>> 
>>> I know we’ve got a lot of folks following the dev list without a lot of
>>> background, so let’s make sure we get some context here so everyone can
>> be
>>> on the same page.
>>> 
>>> Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and
>> 3.3.1,
>>> etc) if it’s done AFTER 3.9 (I think we need to get 3.9 out first before
>>> the RE manpower is spent on backporting fixes, even critical fixes,
>> because
>>> 3.9 has multiple critical fixes for people running 3.7).
>>> 
>>> Now some background:
>>> 
>>> For many years, Cassandra used to have a dev process that kept 3 active
>>> branches - “bleeding edge”, a “stable”, and an “old stable” branch, where
>>> developers would be committing ALL new contributions to the bleeding
>> edge,
>>> non-api-breaking changes to stable, and bugfixes only to old stable.
>> While
>>> the api changed and major features were added, that bleeding edge would
>>> just be ‘trunk’, and it’d get cut into a major version when it was ready
>> to
>>> ship. We saw that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 / 1.2,
>>> and before that 2.0 / 1.2 / 1.1 ). When that bleeding edge got released
>> as
>>> a major x.y.0, the third, oldest, most stable branch went EOL, and new
>>> features would go into trunk for the next major version.
>>> 
>>> There were two big negatives observed with this:
>>> 
>>> The first big negative is that if multiple major new features were in
>>> flight, releases were prone to delay. Nobody wants to break an API on a
>>> x.y.1 release, and nobody wants to add a new feature to a x.y.2 release,
>> so
>>> the project would delay the x.y releases if major features were close,
>> and
>>> then there’d be pressure to slip them in before they were fully tested,
>> or
>>> cut features to avoid delaying the release. This pressure was observed to
>>> be bad for the project – it forced technical compromises.
>>> 
>>> The second downside that was observed was that nobody would try to run
>> the
>>> new versions when they launched, because they were buggy because they
>> were
>>> filled with new features. 2.2, for example, introduced RBAC, commitlog
>>> compression, and user defined functions – major features that needed to
>> be
>>> tested. Unfortunately, because there were few real-world testers, there
>>> were still major bugs being found for months – the first production-ready
>>> version of 2.2 is probably in the 2.2.5 or 2.2.6 range.
>>> 
>>> For version 3, we moved to an alternate release, modeled on Intel’s
>>> tick/tock https://en.wikipedia.org/wiki/Tick-Tock_model
>>> 
>>> The intention was to allow new features into 3.even releases (3.0, 3.2,
>>> 3.4, 3.6, and so on), with bugfixes in 3.odd releases (3.1, … ). The hope
>>> was to allow more frequent releases to address the first big negative
>>> (flood of new features that blocked releases), while also helping to
>>> address the second – with fewer major features in a release, they better
>>> get more/better test coverage.
>>> 
>>> In the tick/tock model, anyone running 3.odd (like 3.5) should be looking
>>> for bugfixes in 3.7. It’s certainly 

Re: Proposal - 3.5.1

2016-09-15 Thread Jake Luciani
I'm pretty sure everyone will agree Tick-Tock didn't go well and needs to
change.

The problem for me is that going back to the old way doesn't sound great.
There are parts of tick-tock I really like, for example the cadence and
limited scope per release.

I know a lot of ideas were thrown around at the summit that I can
regurgitate, but perhaps people who have been thinking about this would
like to chime in and present ideas?

-Jake

On Thu, Sep 15, 2016 at 2:28 PM, Benedict Elliott Smith  wrote:

> I agree tick-tock is a failure.  But for two reasons IMO:
>
> 1) Ultimately, the users are the real testers and it takes a while for a
> release to percolate into the wild for feedback.  The reality is that a
> release doesn't have its tires properly kicked for at least three months
> after it's cut.  So if we are to have any tocks, they should be completely
> unwed from the ticks, and should probably happen on a ~3M cadence to keep
> the labour down but the utility up (and there should probably still be more
> than one tock per tick)
>
> 2) Those promised resources to improved process never happened.  We haven't
> even reached parity with the 2.1 release until very recently, i.e. no
> failing u/dtests.
>
>
> On 15 September 2016 at 19:08, Jeff Jirsa 
> wrote:
>
> > I know we’ve got a lot of folks following the dev list without a lot of
> > background, so let’s make sure we get some context here so everyone can
> be
> > on the same page.
> >
> > Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and
> 3.3.1,
> > etc) if it’s done AFTER 3.9 (I think we need to get 3.9 out first before
> > the RE manpower is spent on backporting fixes, even critical fixes,
> because
> > 3.9 has multiple critical fixes for people running 3.7).
> >
> > Now some background:
> >
> > For many years, Cassandra used to have a dev process that kept 3 active
> > branches - “bleeding edge”, a “stable”, and an “old stable” branch, where
> > developers would be committing ALL new contributions to the bleeding
> edge,
> > non-api-breaking changes to stable, and bugfixes only to old stable.
> While
> > the api changed and major features were added, that bleeding edge would
> > just be ‘trunk’, and it’d get cut into a major version when it was ready
> to
> > ship. We saw that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 / 1.2,
> > and before that 2.0 / 1.2 / 1.1 ). When that bleeding edge got released
> as
> > a major x.y.0, the third, oldest, most stable branch went EOL, and new
> > features would go into trunk for the next major version.
> >
> > There were two big negatives observed with this:
> >
> > The first big negative is that if multiple major new features were in
> > flight, releases were prone to delay. Nobody wants to break an API on a
> > x.y.1 release, and nobody wants to add a new feature to a x.y.2 release,
> so
> > the project would delay the x.y releases if major features were close,
> and
> > then there’d be pressure to slip them in before they were fully tested,
> or
> > cut features to avoid delaying the release. This pressure was observed to
> > be bad for the project – it forced technical compromises.
> >
> > The second downside that was observed was that nobody would try to run
> the
> > new versions when they launched, because they were buggy because they
> were
> > filled with new features. 2.2, for example, introduced RBAC, commitlog
> > compression, and user defined functions – major features that needed to
> be
> > tested. Unfortunately, because there were few real-world testers, there
> > were still major bugs being found for months – the first production-ready
> > version of 2.2 is probably in the 2.2.5 or 2.2.6 range.
> >
> > For version 3, we moved to an alternate release, modeled on Intel’s
> > tick/tock https://en.wikipedia.org/wiki/Tick-Tock_model
> >
> > The intention was to allow new features into 3.even releases (3.0, 3.2,
> > 3.4, 3.6, and so on), with bugfixes in 3.odd releases (3.1, … ). The hope
> > was to allow more frequent releases to address the first big negative
> > (flood of new features that blocked releases), while also helping to
> > address the second – with fewer major features in a release, they better
> > get more/better test coverage.
> >
> > In the tick/tock model, anyone running 3.odd (like 3.5) should be looking
> > for bugfixes in 3.7. It’s certainly true that 3.5 is horribly broken (as
> is
> > 3.3, and 3.4, etc), but with this release model, the bugfix SHOULD BE in
> > 3.7. As I mentioned previously, we have precedent for backporting
> critical
> > fixes, but we don’t have a well defined bar (that I see) for what’s
> > critical enough for a backport.
> >
> > Jon is noting (and what many of us who run Cassandra in production have
> > really known for a very long time) is that nobody wants to run 3.newest
> > (even or odd), because 3.newest is likely broken (because it’s a complex
> > distributed database, and testing is 

Re: Proposal - 3.5.1

2016-09-15 Thread Benedict Elliott Smith
I agree tick-tock is a failure.  But for two reasons IMO:

1) Ultimately, the users are the real testers and it takes a while for a
release to percolate into the wild for feedback.  The reality is that a
release doesn't have its tires properly kicked for at least three months
after it's cut.  So if we are to have any tocks, they should be completely
unwed from the ticks, and should probably happen on a ~3M cadence to keep
the labour down but the utility up (and there should probably still be more
than one tock per tick)

2) Those promised resources to improve process never happened.  We haven't
even reached parity with the 2.1 release until very recently, i.e. no
failing u/dtests.


On 15 September 2016 at 19:08, Jeff Jirsa 
wrote:

> I know we’ve got a lot of folks following the dev list without a lot of
> background, so let’s make sure we get some context here so everyone can be
> on the same page.
>
> Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and 3.3.1,
> etc) if it’s done AFTER 3.9 (I think we need to get 3.9 out first before
> the RE manpower is spent on backporting fixes, even critical fixes, because
> 3.9 has multiple critical fixes for people running 3.7).
>
> Now some background:
>
> For many years, Cassandra used to have a dev process that kept 3 active
> branches - “bleeding edge”, a “stable”, and an “old stable” branch, where
> developers would be committing ALL new contributions to the bleeding edge,
> non-api-breaking changes to stable, and bugfixes only to old stable. While
> the api changed and major features were added, that bleeding edge would
> just be ‘trunk’, and it’d get cut into a major version when it was ready to
> ship. We saw that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 / 1.2,
> and before that 2.0 / 1.2 / 1.1 ). When that bleeding edge got released as
> a major x.y.0, the third, oldest, most stable branch went EOL, and new
> features would go into trunk for the next major version.
>
> There were two big negatives observed with this:
>
> The first big negative is that if multiple major new features were in
> flight, releases were prone to delay. Nobody wants to break an API on a
> x.y.1 release, and nobody wants to add a new feature to a x.y.2 release, so
> the project would delay the x.y releases if major features were close, and
> then there’d be pressure to slip them in before they were fully tested, or
> cut features to avoid delaying the release. This pressure was observed to
> be bad for the project – it forced technical compromises.
>
> The second downside that was observed was that nobody would try to run the
> new versions when they launched, because they were buggy because they were
> filled with new features. 2.2, for example, introduced RBAC, commitlog
> compression, and user defined functions – major features that needed to be
> tested. Unfortunately, because there were few real-world testers, there
> were still major bugs being found for months – the first production-ready
> version of 2.2 is probably in the 2.2.5 or 2.2.6 range.
>
> For version 3, we moved to an alternate release, modeled on Intel’s
> tick/tock https://en.wikipedia.org/wiki/Tick-Tock_model
>
> The intention was to allow new features into 3.even releases (3.0, 3.2,
> 3.4, 3.6, and so on), with bugfixes in 3.odd releases (3.1, … ). The hope
> was to allow more frequent releases to address the first big negative
> (flood of new features that blocked releases), while also helping to
> address the second – with fewer major features in a release, they better
> get more/better test coverage.
>
> In the tick/tock model, anyone running 3.odd (like 3.5) should be looking
> for bugfixes in 3.7. It’s certainly true that 3.5 is horribly broken (as is
> 3.3, and 3.4, etc), but with this release model, the bugfix SHOULD BE in
> 3.7. As I mentioned previously, we have precedent for backporting critical
> fixes, but we don’t have a well defined bar (that I see) for what’s
> critical enough for a backport.
>
> Jon is noting (and what many of us who run Cassandra in production have
> really known for a very long time) is that nobody wants to run 3.newest
> (even or odd), because 3.newest is likely broken (because it’s a complex
> distributed database, and testing is hard, and it takes time and complex
> workloads to find bugs). In the tick/tock model, because new features went
> into 3.6, there are new features that may not be adequately
> tested/validated in 3.7 a user of 3.5 doesn’t want, and isn’t willing to
> accept the risk.
>
> The bottom line here is that tick/tock is probably a well intentioned but
> failed attempt to bring stability to Cassandra’s releases. The problems
> tick/tock was meant to solve are real problems, but tick/tock doesn’t seem
> to be addressing them – new features invalidate old testing, which makes it
> difficult/impossible for real users to sit on the 3.odd versions.
>
> We’re due for cutting 3.9 and 3.0.9, and we have limited 

Re: Proposal - 3.5.1

2016-09-15 Thread Jeff Jirsa
I know we’ve got a lot of folks following the dev list without a lot of 
background, so let’s make sure we get some context here so everyone can be on 
the same page. 

Going to preface this wall of text by saying I’m +1 on a 3.5.1 (and 3.3.1, etc) 
if it’s done AFTER 3.9 (I think we need to get 3.9 out first before the RE 
manpower is spent on backporting fixes, even critical fixes, because 3.9 has 
multiple critical fixes for people running 3.7). 

Now some background: 

For many years, Cassandra used to have a dev process that kept 3 active 
branches - “bleeding edge”, a “stable”, and an “old stable” branch, where 
developers would be committing ALL new contributions to the bleeding edge, 
non-api-breaking changes to stable, and bugfixes only to old stable. While the 
api changed and major features were added, that bleeding edge would just be 
‘trunk’, and it’d get cut into a major version when it was ready to ship. We 
saw that with 2.2 / 2.1 / 2.0 (and before that, 2.1 / 2.0 / 1.2, and before 
that 2.0 / 1.2 / 1.1 ). When that bleeding edge got released as a major x.y.0, 
the third, oldest, most stable branch went EOL, and new features would go into 
trunk for the next major version. 

There were two big negatives observed with this:

The first big negative is that if multiple major new features were in flight, 
releases were prone to delay. Nobody wants to break an API on a x.y.1 release, 
and nobody wants to add a new feature to a x.y.2 release, so the project would 
delay the x.y releases if major features were close, and then there’d be 
pressure to slip them in before they were fully tested, or cut features to 
avoid delaying the release. This pressure was observed to be bad for the 
project – it forced technical compromises. 

The second downside that was observed was that nobody would try to run the new
versions when they launched, because they were filled with new features and
therefore buggy. 2.2, for example, introduced RBAC, commitlog compression,
and user defined functions – major features that needed to be tested. 
Unfortunately, because there were few real-world testers, there were still 
major bugs being found for months – the first production-ready version of 2.2 
is probably in the 2.2.5 or 2.2.6 range. 

For version 3, we moved to an alternate release model, based on Intel’s
tick/tock: https://en.wikipedia.org/wiki/Tick-Tock_model

The intention was to allow new features into 3.even releases (3.0, 3.2, 3.4, 
3.6, and so on), with bugfixes in 3.odd releases (3.1, … ). The hope was to 
allow more frequent releases to address the first big negative (flood of new 
features that blocked releases), while also helping to address the second – 
with fewer major features in a release, they better get more/better test 
coverage.
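
As a rough illustration of that numbering rule (a sketch only; the function
names are made up for this example and are not part of any Cassandra tooling):

```python
def parse_minor(version: str) -> int:
    """Return the minor number of a 3.x tick-tock version string such as '3.5'."""
    major, minor = (int(part) for part in version.split(".")[:2])
    if major != 3:
        raise ValueError("tick-tock numbering only applied to the 3.x series")
    return minor


def release_kind(version: str) -> str:
    """Even minors were meant to carry new features, odd minors bug fixes only."""
    return "feature" if parse_minor(version) % 2 == 0 else "bugfix"


def next_bugfix_release(version: str) -> str:
    """Next odd release after `version`, where a fix would be expected to land."""
    minor = parse_minor(version)
    step = 1 if minor % 2 == 0 else 2
    return f"3.{minor + step}"


for v in ("3.4", "3.5", "3.6"):
    print(v, release_kind(v), "-> fixes expected in", next_bugfix_release(v))
```

Note that a user on an odd release has to skip over an even (feature) release
to reach the next set of fixes.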

In the tick/tock model, anyone running 3.odd (like 3.5) should be looking for 
bugfixes in 3.7. It’s certainly true that 3.5 is horribly broken (as is 3.3, 
and 3.4, etc), but with this release model, the bugfix SHOULD BE in 3.7. As I 
mentioned previously, we have precedent for backporting critical fixes, but we 
don’t have a well defined bar (that I see) for what’s critical enough for a 
backport. 

What Jon is noting (and what many of us who run Cassandra in production have
known for a very long time) is that nobody wants to run 3.newest (even or odd),
because 3.newest is likely broken (because it’s a complex distributed database,
and testing is hard, and it takes time and complex workloads to find bugs). In
the tick/tock model, because new features went into 3.6, 3.7 contains new
features that may not be adequately tested/validated. A user of 3.5 doesn’t
want those features and isn’t willing to accept the risk.

The bottom line here is that tick/tock is probably a well intentioned but 
failed attempt to bring stability to Cassandra’s releases. The problems 
tick/tock was meant to solve are real problems, but tick/tock doesn’t seem to 
be addressing them – new features invalidate old testing, which makes it 
difficult/impossible for real users to sit on the 3.odd versions.   

We’re due for cutting 3.9 and 3.0.9, and we have limited RE manpower to get 
those out. Only after those are out would I be +1 on a 3.5.1, and then only 
because if I were running 3.5, and I hit this bug, I wouldn’t want to spend the 
~$100k it would cost my organization to validate 3.7 prior to upgrading, and I 
don’t think it’s reasonable to ask users to recompile a release for a ~10 line 
fix for a very nasty bug. 

I also very strongly recommend we (committers/PMC) reconsider tick/tock for
4.x releases, because this is exactly the type of problem that will continue to
happen as we move forward. I suggest that we either need to go back to the old
model and do a better job of dealing with feature creep and testing, or we need
to better define what gets backported, because the community needs a stable
version to run, and running the latest odd release of tick/tock isn’t it.

- Jeff


On 

Re: Proposal - 3.5.1

2016-09-15 Thread Benedict Elliott Smith
It's worth noting more clearly that 3.5 is an arbitrary point in time.  All
3.X releases < 3.6 are affected.

If we backport to 3.5, it seems like 3.1 and 3.3 should get the same
treatment.  I do recall commitments to backport critical fixes, but exactly
what the bar is was never well defined.

I also cannot see how there would be any added confusion.


On 15 September 2016 at 18:31, Dave Lester  wrote:

> How would cutting a 3.5.1 release possibly confuse users of the software?
> It would be easy to document the change and to send release notes.
>
> Given the bug’s critical nature and that it's a minor fix, I’m +1
> (non-binding) to a new release.
>
> Dave
>
> > On Sep 15, 2016, at 7:18 AM, Jeremiah D Jordan <
> jeremiah.jor...@gmail.com> wrote:
> >
> > I’m with Jeff on this, 3.7 (bug fixes on 3.6) has already been released
> with the fix.  Since the fix applies cleanly anyone is free to put it on
> top of 3.5 on their own if they like, but I see no reason to put out a
> 3.5.1 right now and confuse people further.
> >
> > -Jeremiah
> >
> >
> >> On Sep 15, 2016, at 9:07 AM, Jonathan Haddad  wrote:
> >>
> >> As a follow up, I suppose I'm only advocating for a fix to the odd
> >> releases.  Sadly, Tick Tock versioning is misleading.
> >>
> >> If tick tock were to continue (and I'm very much against how it currently
> >> works) the whole even-features odd-fixes thing needs to stop ASAP, all it
> >> does is confuse people.
> >>
> >> The follow up to 3.4 (3.5) should have been 3.4.1, following semver, so
> >> people know it's bug fixes only to 3.4.
> >>
> >> Jon
> >>
> >> On Wed, Sep 14, 2016 at 10:37 PM Jonathan Haddad 
> wrote:
> >>
> >>> In this particular case, I'd say adding a bug fix release for every
> >>> version that's affected would be the right thing.  The issue is so easily
> >>> reproducible and will likely result in massive data loss for anyone who is
> >>> on 3.X WHERE X < 6 and uses the "date" type.
> >>>
> >>> This is how easy it is to reproduce:
> >>>
> >>> 1. Start Cassandra 3.5
> >>> 2. create KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> >>> 'replication_factor': 1};
> >>> 3. use test;
> >>> 4. create table fail (id int primary key, d date);
> >>> 5. delete d from fail where id = 1;
> >>> 6. Stop Cassandra
> >>> 7. Start Cassandra
> >>>
> >>> You will get this, and startup will fail:
> >>>
> >>> ERROR 05:32:09 Exiting due to error while processing commit log during
> >>> initialization.
> >>> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
> >>> Unexpected error deserializing mutation; saved to
> >>> /var/folders/0l/g2p6cnyd5kx_1wkl83nd3y4rgn/T/mutation6313332720566971713dat.
> >>> This may be caused by replaying a mutation against a table with the same
> >>> name but incompatible schema.  Exception follows:
> >>> org.apache.cassandra.serializers.MarshalException: Expected 4 byte long for
> >>> date (0)
> >>>
> >>> I mean.. come on.  It's an easy fix.  It cleanly merges against 3.5 (and
> >>> probably the other releases) and requires very little investment from
> >>> anyone.
> >>>
> >>>
> >>> On Wed, Sep 14, 2016 at 9:40 PM Jeff Jirsa wrote:
> >>>
> >>>> We did 3.1.1 and 3.2.1, so there’s SOME precedent for emergency fixes,
> >>>> but we certainly didn’t/won’t go back and cut new releases from every
> >>>> branch for every critical bug in future releases, so I think we need to
> >>>> draw the line somewhere. If it’s fixed in 3.7 and 3.0.x (x >= 6), it seems
> >>>> like you’ve got options (either stay on the tick and go up to 3.7, or bail
> >>>> down to 3.0.x)
> >>>>
> >>>> Perhaps, though, this highlights the fact that tick/tock may not be the
> >>>> best option long term. We’ve tried it for a year, perhaps we should instead
> >>>> discuss whether or not it should continue, or if there’s another process
> >>>> that gives us a better way to get useful patches into versions people are
> >>>> willing to run in production.
> 
> 
> 
>  On 9/14/16, 8:55 PM, "Jonathan Haddad"  wrote:
> 
> > Common sense is what prevents someone from upgrading to yet another
> > completely unknown version with new features which have probably broken
> > even more stuff that nobody is aware of.  The folks I'm helping right now
> > deployed 3.5 when they got started because http://cassandra.apache.org
> > suggests it's acceptable for production.  It turns out using 4 of the built
> > in datatypes of the database results in the server being unable to restart
> > without clearing out the commit logs and running a repair.  That

Re: Proposal - 3.5.1

2016-09-15 Thread Edward Capriolo
Where did we come from?

We came from a place where we would say, "You probably do not want to run
2.0.X until it reaches 2.0.6"

One thing about Cassandra is we get into a situation where we can only go
forward. For example, when you update from version X to version Y, version
Y might start writing a new version of sstables.

X - sstables-v1
Y - sstables-v2

This is very scary on the operations side because you cannot bring the system
back to running version X, as the data written by Y is unreadable to it.
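
A minimal sketch of why this upgrade is one-way, assuming a simplified model
in which each release can read only the sstable formats that existed when it
shipped. The release names X and Y and the format labels come from the example
above; READABLE, WRITES, can_run and upgrade are hypothetical names used for
illustration, not Cassandra APIs.

```python
# Hypothetical model: which sstable formats each release can read.
READABLE = {
    "X": {"sstables-v1"},
    "Y": {"sstables-v1", "sstables-v2"},  # Y still reads old data
}
# Hypothetical model: which format each release writes.
WRITES = {"X": "sstables-v1", "Y": "sstables-v2"}


def can_run(release, formats_on_disk):
    """True if `release` can read every sstable format present on disk."""
    return set(formats_on_disk) <= READABLE[release]


def upgrade(formats_on_disk, release):
    """Return the formats on disk after `release` has run and written data."""
    if not can_run(release, formats_on_disk):
        raise ValueError(f"{release} cannot read {sorted(formats_on_disk)}")
    return set(formats_on_disk) | {WRITES[release]}


on_disk = {"sstables-v1"}        # cluster running X
on_disk = upgrade(on_disk, "Y")  # Y starts writing sstables-v2
print(can_run("Y", on_disk))     # Y can keep running
print(can_run("X", on_disk))     # X can no longer read everything on disk
```

In this model the rollback check fails as soon as Y has written a single
sstables-v2 file, which is the operational trap described above.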

Where are we at now?

We now seem to be in a place where you say, "Problem in 3.5 (trunk at a
given day)? Go to 3.9 (trunk at the last tick-tock release)."

http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/

"To get there, we are investing significant effort in making trunk “always
releasable,” with the goal that each release, or at least each odd-numbered
bugfix release, should be usable in production. "

I support releasable trunk, but the qualifying statement "or at least each
odd-numbered bugfix release" undoes the assertion of "always releasable". I am
not trying to nitpick here. I realize it may be hard to get to the desired
state of releasable trunk in a short time.

Anecdotally I notice a lot of "movement" in class names/names of functions.
Generally, I can look at a stack trace of a piece of software and I can
bring up the line number in github and it is dead on, or fairly close to
the line of code. Recently I have tried this in versions fairly close
together and seen some drastic changes.

There are some things I personally do not like:
1) lack of stable-ish APIs in the codebase
2) use of singletons rather than simple dependency injection (like even
constructor-based injection)

IMHO these do not fit well with the goals of 'release often' and 'always
produce a high quality release'.

I do not love the concept of a 'bug fix release'. I would not mind waiting
longer for a feature as long as I could have a high trust factor in it
working right the first time.

Take a feature like trickle_fsync. By the description it sounds like a clear
optimization win, yet it is off by default. The description says to turn it
on for SSDs, but elsewhere the configuration has
"# disk_optimization_strategy: ssd". Are we tuning for SSD by default or not?

Because it defaults to false, it is not tested in the wild. How is it covered
and trusted during tests? How many tests run with it off vs. on?

The idea that trickle_fsync can be added as a feature, defaulted to false, and
only possibly gain real-world coverage is not comforting to me. I do not
want to turn it on and hit some weird issue because no one else is running
it. I would rather it be on by default, added with extreme confidence, or
not added at all.



On Thu, Sep 15, 2016 at 1:37 AM, Jonathan Haddad  wrote:

> In this particular case, I'd say adding a bug fix release for every version
> that's affected would be the right thing.  The issue is so easily
> reproducible and will likely result in massive data loss for anyone on 3.X
> WHERE X < 6 and uses the "date" type.
>
> This is how easy it is to reproduce:
>
> 1. Start Cassandra 3.5
> 2. create KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 1};
> 3. use test;
> 4. create table fail (id int primary key, d date);
> 5. delete d from fail where id = 1;
> 6. Stop Cassandra
> 7. Start Cassandra
>
> You will get this, and startup will fail:
>
> ERROR 05:32:09 Exiting due to error while processing commit log during
> initialization.
> org.apache.cassandra.db.commitlog.CommitLogReplayer$
> CommitLogReplayException:
> Unexpected error deserializing mutation; saved to
> /var/folders/0l/g2p6cnyd5kx_1wkl83nd3y4rgn/T/
> mutation6313332720566971713dat.
> This may be caused by replaying a mutation against a table with the same
> name but incompatible schema.  Exception follows:
> org.apache.cassandra.serializers.MarshalException: Expected 4 byte long
> for
> date (0)
>
> I mean.. come on.  It's an easy fix.  It cleanly merges against 3.5 (and
> probably the other releases) and requires very little investment from
> anyone.
>
>
> On Wed, Sep 14, 2016 at 9:40 PM Jeff Jirsa 
> wrote:
>
> > We did 3.1.1 and 3.2.1, so there’s SOME precedent for emergency fixes,
> but
> > we certainly didn’t/won’t go back and cut new releases from every branch
> > for every critical bug in future releases, so I think we need to draw the
> > line somewhere. If it’s fixed in 3.7 and 3.0.x (x >= 6), it seems like
> > you’ve got options (either stay on the tick and go up to 3.7, or bail
> down
> > to 3.0.x)
> >
> > Perhaps, though, this highlights the fact that tick/tock may not be the
> > best option long term. We’ve tried it for a year, perhaps we should
> instead
> > discuss whether or not it should continue, or if there’s another process
> > that gives us a better way to get useful patches into versions people are
> > willing to run in production.
> >
> >
> >
> > On 9/14/16, 8:55 PM, "Jonathan Haddad"  wrote:
> >
> > >Common 

Re: Proposal - 3.5.1

2016-09-15 Thread Jeremiah D Jordan
I’m with Jeff on this: 3.7 (bug fixes on 3.6) has already been released with
the fix.  Since the fix applies cleanly, anyone is free to put it on top of 3.5
on their own if they like, but I see no reason to put out a 3.5.1 right now and
confuse people further.

-Jeremiah


> On Sep 15, 2016, at 9:07 AM, Jonathan Haddad  wrote:
> 
> As I follow up, I suppose I'm only advocating for a fix to the odd
> releases.  Sadly, Tick Tock versioning is misleading.
> 
> If tick tock were to continue (and I'm very much against how it currently
> works) the whole even-features odd-fixes thing needs to stop ASAP, all it
> does it confuse people.
> 
> The follow up to 3.4 (3.5) should have been 3.4.1, following semver, so
> people know it's bug fixes only to 3.4.
> 
> Jon
> 
> On Wed, Sep 14, 2016 at 10:37 PM Jonathan Haddad  wrote:
> 
>> In this particular case, I'd say adding a bug fix release for every
>> version that's affected would be the right thing.  The issue is so easily
>> reproducible and will likely result in massive data loss for anyone on 3.X
>> WHERE X < 6 and uses the "date" type.
>> 
>> This is how easy it is to reproduce:
>> 
>> 1. Start Cassandra 3.5
>> 2. create KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
>> 'replication_factor': 1};
>> 3. use test;
>> 4. create table fail (id int primary key, d date);
>> 5. delete d from fail where id = 1;
>> 6. Stop Cassandra
>> 7. Start Cassandra
>> 
>> You will get this, and startup will fail:
>> 
>> ERROR 05:32:09 Exiting due to error while processing commit log during
>> initialization.
>> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
>> Unexpected error deserializing mutation; saved to
>> /var/folders/0l/g2p6cnyd5kx_1wkl83nd3y4rgn/T/mutation6313332720566971713dat.
>> This may be caused by replaying a mutation against a table with the same
>> name but incompatible schema.  Exception follows:
>> org.apache.cassandra.serializers.MarshalException: Expected 4 byte long for
>> date (0)
>> 
>> I mean.. come on.  It's an easy fix.  It cleanly merges against 3.5 (and
>> probably the other releases) and requires very little investment from
>> anyone.
>> 
>> 
>> On Wed, Sep 14, 2016 at 9:40 PM Jeff Jirsa 
>> wrote:
>> 
>>> We did 3.1.1 and 3.2.1, so there’s SOME precedent for emergency fixes,
>>> but we certainly didn’t/won’t go back and cut new releases from every
>>> branch for every critical bug in future releases, so I think we need to
>>> draw the line somewhere. If it’s fixed in 3.7 and 3.0.x (x >= 6), it seems
>>> like you’ve got options (either stay on the tick and go up to 3.7, or bail
>>> down to 3.0.x)
>>> 
>>> Perhaps, though, this highlights the fact that tick/tock may not be the
>>> best option long term. We’ve tried it for a year, perhaps we should instead
>>> discuss whether or not it should continue, or if there’s another process
>>> that gives us a better way to get useful patches into versions people are
>>> willing to run in production.
>>> 
>>> 
>>> 
>>> On 9/14/16, 8:55 PM, "Jonathan Haddad"  wrote:
>>> 
 Common sense is what prevents someone from upgrading to yet another
 completely unknown version with new features which have probably broken
 even more stuff that nobody is aware of.  The folks I'm helping right
 deployed 3.5 when they got started because cassandra.apache.org suggests
 it's acceptable for production.  It turns out using 4 of the built in
 datatypes of the database result in the server being unable to restart
 without clearing out the commit logs and running a repair.  That screams
 critical to me.  You shouldn't even be able to install 3.5 without the
 patch I've supplied - that bug is a ticking time bomb for anyone that
 installs it.
 
 On Wed, Sep 14, 2016 at 8:12 PM Michael Shuler 
 wrote:
 
> What's preventing the use of the 3.6 or 3.7 releases where this bug is
> already fixed? This is also fixed in the 3.0.6/7/8 releases.
> 
> Michael
> 
> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
>> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back
>>> ported to
>> 3.5 as well, and it makes Cassandra effectively unusable if someone
>>> is
>> using any of the 4 types affected in any of their schema.
>> 
>> I have cherry picked & merged the patch back to here and will put it
>>> in a
>> JIRA as well tonight, I just wanted to get the ball rolling asap on
>>> this.
>> 
>> 
> 
>>> 

Re: Proposal - 3.5.1

2016-09-15 Thread Jonathan Haddad
As a follow up, I suppose I'm only advocating for a fix to the odd
releases.  Sadly, Tick Tock versioning is misleading.

If tick tock were to continue (and I'm very much against how it currently
works), the whole even-features odd-fixes thing needs to stop ASAP. All it
does is confuse people.

The follow up to 3.4 (3.5) should have been 3.4.1, following semver, so
people know it's bug fixes only to 3.4.
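
To make the naming point concrete, here is a rough sketch of the
even-features/odd-fixes rule next to the semver-style name suggested above.
Both function names are hypothetical and exist only for illustration; the
mapping simply encodes "3.5 should have been 3.4.1" and is not an official
versioning scheme.

```python
def describe_tick_tock(version):
    """Classify a tick-tock release: even minor = features, odd = fixes."""
    major, minor = (int(p) for p in version.split(".")[:2])
    if minor % 2 == 0:
        return "feature release"
    return f"bug fixes only to {major}.{minor - 1}"


def semver_style_name(version):
    """Rename an odd (bug fix) release as a patch of the preceding even one."""
    major, minor = (int(p) for p in version.split(".")[:2])
    if minor % 2 == 0:
        return f"{major}.{minor}"          # feature releases keep their name
    return f"{major}.{minor - 1}.1"        # e.g. 3.5 becomes 3.4.1


for v in ("3.4", "3.5", "3.6", "3.7"):
    print(v, "->", semver_style_name(v), "|", describe_tick_tock(v))
```

With the second naming, nobody has to remember the odd/even rule to see that
3.4.1 is a fix release on top of 3.4.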

Jon

On Wed, Sep 14, 2016 at 10:37 PM Jonathan Haddad  wrote:

> In this particular case, I'd say adding a bug fix release for every
> version that's affected would be the right thing.  The issue is so easily
> reproducible and will likely result in massive data loss for anyone on 3.X
> WHERE X < 6 and uses the "date" type.
>
> This is how easy it is to reproduce:
>
> 1. Start Cassandra 3.5
> 2. create KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': 1};
> 3. use test;
> 4. create table fail (id int primary key, d date);
> 5. delete d from fail where id = 1;
> 6. Stop Cassandra
> 7. Start Cassandra
>
> You will get this, and startup will fail:
>
> ERROR 05:32:09 Exiting due to error while processing commit log during
> initialization.
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
> Unexpected error deserializing mutation; saved to
> /var/folders/0l/g2p6cnyd5kx_1wkl83nd3y4rgn/T/mutation6313332720566971713dat.
> This may be caused by replaying a mutation against a table with the same
> name but incompatible schema.  Exception follows:
> org.apache.cassandra.serializers.MarshalException: Expected 4 byte long for
> date (0)
>
> I mean.. come on.  It's an easy fix.  It cleanly merges against 3.5 (and
> probably the other releases) and requires very little investment from
> anyone.
>
>
> On Wed, Sep 14, 2016 at 9:40 PM Jeff Jirsa 
> wrote:
>
>> We did 3.1.1 and 3.2.1, so there’s SOME precedent for emergency fixes,
>> but we certainly didn’t/won’t go back and cut new releases from every
>> branch for every critical bug in future releases, so I think we need to
>> draw the line somewhere. If it’s fixed in 3.7 and 3.0.x (x >= 6), it seems
>> like you’ve got options (either stay on the tick and go up to 3.7, or bail
>> down to 3.0.x)
>>
>> Perhaps, though, this highlights the fact that tick/tock may not be the
>> best option long term. We’ve tried it for a year, perhaps we should instead
>> discuss whether or not it should continue, or if there’s another process
>> that gives us a better way to get useful patches into versions people are
>> willing to run in production.
>>
>>
>>
>> On 9/14/16, 8:55 PM, "Jonathan Haddad"  wrote:
>>
>> >Common sense is what prevents someone from upgrading to yet another
>> >completely unknown version with new features which have probably broken
>> >even more stuff that nobody is aware of.  The folks I'm helping right
>> >deployed 3.5 when they got started because cassandra.apache.org suggests
>> >it's acceptable for production.  It turns out using 4 of the built in
>> >datatypes of the database result in the server being unable to restart
>> >without clearing out the commit logs and running a repair.  That screams
>> >critical to me.  You shouldn't even be able to install 3.5 without the
>> >patch I've supplied - that bug is a ticking time bomb for anyone that
>> >installs it.
>> >
>> >On Wed, Sep 14, 2016 at 8:12 PM Michael Shuler 
>> >wrote:
>> >
>> >> What's preventing the use of the 3.6 or 3.7 releases where this bug is
>> >> already fixed? This is also fixed in the 3.0.6/7/8 releases.
>> >>
>> >> Michael
>> >>
>> >> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
>> >> > Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back
>> ported to
>> >> > 3.5 as well, and it makes Cassandra effectively unusable if someone
>> is
>> >> > using any of the 4 types affected in any of their schema.
>> >> >
>> >> > I have cherry picked & merged the patch back to here and will put it
>> in a
>> >> > JIRA as well tonight, I just wanted to get the ball rolling asap on
>> this.
>> >> >
>> >> >
>> >> > https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
>> >> >
>> >> > Jon
>> >> >
>> >>
>> >>
>>
>


Re: Proposal - 3.5.1

2016-09-14 Thread Jeff Jirsa
We did 3.1.1 and 3.2.1, so there’s SOME precedent for emergency fixes, but we 
certainly didn’t/won’t go back and cut new releases from every branch for every 
critical bug in future releases, so I think we need to draw the line somewhere. 
If it’s fixed in 3.7 and 3.0.x (x >= 6), it seems like you’ve got options 
(either stay on the tick and go up to 3.7, or bail down to 3.0.x)
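
As a rough illustration of where that line falls, the sketch below encodes
only what has been said in this thread: CASSANDRA-11618 is fixed in 3.6 and
later on the tick-tock line, and in 3.0.x for x >= 6. The helper names
parse_version and has_fix are made up for this example, and anything outside
the 3.x series is deliberately rejected rather than guessed at.

```python
def parse_version(version):
    """Split '3.5' or '3.0.6' into a (major, minor, patch) tuple of ints."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)  # treat a missing patch number as 0
    return tuple(parts[:3])


def has_fix(version):
    """True if the release carries the CASSANDRA-11618 fix, per this thread."""
    major, minor, patch = parse_version(version)
    if major != 3:
        raise ValueError("only the 3.x series is discussed in this thread")
    if minor == 0:
        return patch >= 6   # 3.0.x line: fixed in 3.0.6 and later
    return minor >= 6       # tick-tock line: fixed in 3.6 and later


for v in ("3.0.5", "3.0.6", "3.5", "3.6", "3.7"):
    print(v, "fixed" if has_fix(v) else "affected")
```

Anyone on a release reported as affected has the two options above: move up
to 3.7 or move down to a fixed 3.0.x.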

Perhaps, though, this highlights the fact that tick/tock may not be the best 
option long term. We’ve tried it for a year, perhaps we should instead discuss 
whether or not it should continue, or if there’s another process that gives us 
a better way to get useful patches into versions people are willing to run in 
production.



On 9/14/16, 8:55 PM, "Jonathan Haddad"  wrote:

>Common sense is what prevents someone from upgrading to yet another
>completely unknown version with new features which have probably broken
>even more stuff that nobody is aware of.  The folks I'm helping right
>deployed 3.5 when they got started because cassandra.apache.org suggests
>it's acceptable for production.  It turns out using 4 of the built in
>datatypes of the database result in the server being unable to restart
>without clearing out the commit logs and running a repair.  That screams
>critical to me.  You shouldn't even be able to install 3.5 without the
>patch I've supplied - that bug is a ticking time bomb for anyone that
>installs it.
>
>On Wed, Sep 14, 2016 at 8:12 PM Michael Shuler 
>wrote:
>
>> What's preventing the use of the 3.6 or 3.7 releases where this bug is
>> already fixed? This is also fixed in the 3.0.6/7/8 releases.
>>
>> Michael
>>
>> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
>> > Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back ported to
>> > 3.5 as well, and it makes Cassandra effectively unusable if someone is
>> > using any of the 4 types affected in any of their schema.
>> >
>> > I have cherry picked & merged the patch back to here and will put it in a
>> > JIRA as well tonight, I just wanted to get the ball rolling asap on this.
>> >
>> >
>> > https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
>> >
>> > Jon
>> >
>>
>>



Re: Proposal - 3.5.1

2016-09-14 Thread Jonathan Haddad
Common sense is what prevents someone from upgrading to yet another
completely unknown version with new features which have probably broken
even more stuff that nobody is aware of.  The folks I'm helping right now
deployed 3.5 when they got started because cassandra.apache.org suggests
it's acceptable for production.  It turns out using 4 of the built-in
datatypes of the database results in the server being unable to restart
without clearing out the commit logs and running a repair.  That screams
critical to me.  You shouldn't even be able to install 3.5 without the
patch I've supplied - that bug is a ticking time bomb for anyone that
installs it.

On Wed, Sep 14, 2016 at 8:12 PM Michael Shuler 
wrote:

> What's preventing the use of the 3.6 or 3.7 releases where this bug is
> already fixed? This is also fixed in the 3.0.6/7/8 releases.
>
> Michael
>
> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
> > Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back ported to
> > 3.5 as well, and it makes Cassandra effectively unusable if someone is
> > using any of the 4 types affected in any of their schema.
> >
> > I have cherry picked & merged the patch back to here and will put it in a
> > JIRA as well tonight, I just wanted to get the ball rolling asap on this.
> >
> >
> https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
> >
> > Jon
> >
>
>


Re: Proposal - 3.5.1

2016-09-14 Thread Michael Shuler
What's preventing the use of the 3.6 or 3.7 releases where this bug is
already fixed? This is also fixed in the 3.0.6/7/8 releases.

Michael

On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back ported to
> 3.5 as well, and it makes Cassandra effectively unusable if someone is
> using any of the 4 types affected in any of their schema.
> 
> I have cherry picked & merged the patch back to here and will put it in a
> JIRA as well tonight, I just wanted to get the ball rolling asap on this.
> 
> https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception
> 
> Jon
>