Re: Wrapping up tick-tock

2017-01-10 Thread Dikang Gu
+1 to 6 months *major* release.

I think we still need *minor* releases containing bug fixes (or maybe small
features?), and it would make sense to ship those more frequently, e.g.
monthly. That way we won't have to wait 6 months for bug fixes or maintain
a lot of patches internally.

On Tue, Jan 10, 2017 at 1:56 PM, sankalp kohli 
wrote:

> +1 to 6 month release and ending tick/tock
>
> On Tue, Jan 10, 2017 at 9:44 AM, Nate McCall  wrote:
>
> > >
> > > If this question is too far outside the topic and more appropriate for a
> > > different thread I'm happy to put a hold on it until the release
> > > cadence is agreed.
> > >
> >
> > Let's please do put this on another thread. Thanks for bringing it up
> > though as it is important and needs discussion.
> >
>



-- 
Dikang


Re: Rollback procedure for Cassandra Upgrade.

2017-01-10 Thread Edward Capriolo
On Tuesday, January 10, 2017, Romain Hardouin 
wrote:

> To be able to downgrade we should be able to pin both commitlog and
> sstables versions, e.g. -Dcassandra.commitlog_version=3
> -Dcassandra.sstable_version=jb
> That would be awesome because it would decorrelate binaries version and
> data version. Upgrades would be much less risky so I guess that adoption of
> new C* versions would increase.
> Best,
> Romain
>
> On Tuesday, 10 January 2017 at 6:03, Brandon Williams wrote:
>
>
>  However, it's good to determine *how* it failed.  If nodetool just died or
> timed out, that's no big deal, it'll finish.
>
> On Mon, Jan 9, 2017 at 11:00 PM, Jonathan Haddad wrote:
>
> > There's no downgrade procedure. You either upgrade or you go back to a
> > snapshot from the previous version.
> > On Mon, Jan 9, 2017 at 8:13 PM Prakash Chauhan <
> > prakash.chau...@ericsson.com >
> > wrote:
> >
> > > Hi All ,
> > >
> > > Do we have an official procedure to roll back the upgrade of C* from
> > > 2.0.x to 2.1.x?
> > >
> > >
> > > Description:
> > > I have upgraded C* from 2.0.x to 2.1.x. As part of the upgrade
> > > procedure, I have to run nodetool upgradesstables.
> > > What if the command fails in the middle? Some of the sstables will be
> > > in the newer format (*-ka-*) whereas others might be in the older
> > > format (*-jb-*).
> > >
> > > Do we have a standard procedure to roll back in such cases?
> > >
> > >
> > >
> > > Regards,
> > > Prakash Chauhan.
> > >
> > >
> >
>
>
>


It would be amazing if a newer version could write commitlog and sstables in
a specific older format so that rollbacks are possible.
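
As a rough illustration of the mixed-format state described in the quoted
question, the following sketch counts sstable data files per format version.
It assumes the pre-3.0 file naming <keyspace>-<table>-<format>-<generation>-Data.db
(for example -jb- for 2.0.x and -ka- for 2.1.x, as mentioned above). The
function name and the example path are hypothetical, and this is not an
official rollback or verification tool.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed pre-3.0 naming: <keyspace>-<table>-<format>-<generation>-Data.db
DATA_FILE = re.compile(r"^.+-([a-z]{2})-(\d+)-Data\.db$")


def count_sstable_formats(data_dir):
    """Count *-Data.db files per two-letter format version under data_dir.

    Files inside 'snapshots' or 'backups' subdirectories are skipped so that
    only live sstables are counted.
    """
    root = Path(data_dir)
    counts = Counter()
    for path in root.rglob("*-Data.db"):
        relative_parts = path.relative_to(root).parts
        if "snapshots" in relative_parts or "backups" in relative_parts:
            continue
        match = DATA_FILE.match(path.name)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example call (hypothetical path):
#   count_sstable_formats("/var/lib/cassandra/data")
# More than one key in the result would suggest that upgradesstables
# has not finished rewriting every sstable.
```

Seeing both jb and ka in the result would only indicate that some files are
still in the older format; it does not by itself say whether the run failed
or is still in progress.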


-- 
Sorry this was sent from mobile. Will do less grammar and spell check than
usual.


Re: Wrapping up tick-tock

2017-01-10 Thread sankalp kohli
+1 to 6 month release and ending tick/tock

On Tue, Jan 10, 2017 at 9:44 AM, Nate McCall  wrote:

> >
> > If this question is too far outside the topic and more appropriate for a
> > different thread I'm happy to put a hold on it until the release cadence
> > is agreed.
> >
>
> Let's please do put this on another thread. Thanks for bringing it up
> though as it is important and needs discussion.
>


Re: Per blocking release on dtest

2017-01-10 Thread Jeff Jirsa
+1


On Tue, Jan 10, 2017 at 9:23 AM, Aleksey Yeschenko 
wrote:

> That’s a good point.
>
> So 3.11 after 3.10, then move on to 3.11.x further bug fix releases?
>
> +1 to that.
>
> --
> AY
>
> On 10 January 2017 at 17:22:09, Michael Shuler (mich...@pbandjelly.org)
> wrote:
>
> I had the same thought. 3.10 is the tick, so a 3.11 bugfix tock follows
> the intended final fix release for closing out tick-tock. Throwing a
> 3.10.1 out there would add more user confusion and would be the exact
> same contents as a 3.11 release versioned package set anyway.
>
> --
> Michael
>
> On 01/10/2017 11:18 AM, Josh McKenzie wrote:
> > | If someone tries to upgrade 3.10 to whatever 4.0 ends up being I
> > think they will hit the wrong answer bug. So I would advocate for
> > having the fix brought
> > into 3.10, but it was broken in 3.9 as well.
> >
> > Seems like we'd just release that as 3.10.1 (instead of 3.11) and just
> > tell people "you can upgrade to 4.0 w/latest version of 3.10". This
> > does violate the "even releases features, odd releases bugfix", so
> > maybe a 3.11 as final 3.X line would help keep that consistent?
> >
> > I'd rather not open the can of worms of back-porting this to 3.9 as
> > well to hold to our claim of "any 3.X can go to 4.0".
> >
> > On Tue, Jan 10, 2017 at 12:13 PM, Ariel Weisberg 
> wrote:
> >> Hi,
> >>
> >>
> >>
> >> The upgrade tests are tricky because they upgrade from an existing
> >> release to a current release. The bug is in 3.9 and won't be fixed until
> >> 3.11 because the test checks out and builds 3.9 right now. 3.10 doesn't
> >> include the commit that fixes the issue so it will fail after 3.10 is
> >> released and the test is updated to check out 3.10.
> >>
> >>
> >> We claim to support upgrade from any 3.x version to 4.0. If someone
> >> tries to upgrade 3.10 to whatever 4.0 ends up being I think they will
> >> hit the wrong answer bug. So I would advocate for having the fix brought
> >> into 3.10, but it was broken in 3.9 as well.
> >>
> >>
> >> Some of the tests fail because trunk complains of unreadable sstables and
> >> I suspect that isn't a bug it's just something that is no longer
> >> supported due to thrift removal, but I haven't fixed those yet. Those
> >> are probably issues with trunk or the tests.
> >>
> >>
> >> Others fail for reasons I haven't triaged yet. I'm struggling with my
> >> own issues getting the tests to run locally.
> >>
> >>
> >> Ariel
> >>
> >>
> >>
> >> On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:
> >>
> 
> >>
>  I concede it would be fine to do it gradually. Once the pace of
>  issues
>  introduced by new development is beaten by the pace at which
>  they are
>  addressed I think things will go well.
> >>
> >>>
> >>
> >>> So from Michael's JIRA query:
> >>
> >>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
> >>>
> >>
> >>> Are we good for 3.10 after we get those cleaned up?
> >>
> >>>
> >>
> >>> Ariel, you made reference to:
> >>
> >>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
> >>>
> >>
> >>> Do we need to re-open an issue to have this applied to 3.10 and add it
> >>> to the above list?
> >>
> >>>
> >>
> 
> >>
>  On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
> >>
> >
> >>
> > Sankalp's proposal of us progressively tightening up our standards
> > allows
> > us to get code out the door and regain some lost momentum on
> > the 3.10
> > release failures and blocking, and gives us time as a community to
> > adjust
> > our behavior without the burden of an ever-later slipped release
> > hanging
> > over our heads. There's plenty of bugfixes in the 3.X line; the
> > more time
> > people can have to kick the tires on that code, the more things
> > we can
> > find
> >>
> > and the better future releases will be.
> >>
> >>>
> >>
> >>>
> >>
> >>> +1 On gradually moving to this. Dropping releases with huge change
> >>
> >>> lists has never gone well for us in the past.
> >>
> >>
>
>


Re: Wrapping up tick-tock

2017-01-10 Thread Nate McCall
>
> If this question is too far outside the topic and more appropriate for a
> different thread I'm happy to put a hold on it until the release cadence is
> agreed.
>

Let's please do put this on another thread. Thanks for bringing it up
though as it is important and needs discussion.


Re: Wrapping up tick-tock

2017-01-10 Thread Ben Bromhead
+1 on killing tick/tock
+1 on six months

What is the appetite for a longer bug fix period for some releases (e.g.
every second release gets 18 - 24 months of critical bug fixes)?

Currently only vendors / large users are maintaining long-running releases.
Given that this work is already happening, I would rather the effort happen
under the Apache umbrella and be available to all users, if existing
long-term release maintainers are happy to do so.

If this question is too far outside the topic and more appropriate for a
different thread, I'm happy to put a hold on it until the release cadence is
agreed.



On Tue, 10 Jan 2017 at 09:27 Nate McCall  wrote:

> > I agreed with you at the time that the yearly cycle was too long to be
> > adding features before cutting a release, and still do now.  Instead of
> > elastic banding all the way back to a process which wasn't working
> > before, why not try somewhere in the middle?  A release every 6 months
> > (with monthly bug fixes for a year) gives:
> >
> > 1. long enough time to stabilize (1 year vs 1 month)
> > 2. not so long things sit around untested forever
> > 3. only 2 releases (current and previous) to do bug fix support at any
> > given time.
>
> The third reason is particularly appealing.
>
> +1 on six months.
> +1 on killing tick/tock at 3.10 (with a potential bugfix follow up per
> the other thread).
>
-- 
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer


Re: Per blocking release on dtest

2017-01-10 Thread Nate McCall
>
> So 3.11 after 3.10, then move on to 3.11.x further bug fix releases?
>

+1


Re: Per blocking release on dtest

2017-01-10 Thread Jonathan Ellis
+1

On Tue, Jan 10, 2017 at 11:23 AM, Aleksey Yeschenko 
wrote:

> That’s a good point.
>
> So 3.11 after 3.10, then move on to 3.11.x further bug fix releases?
>
> +1 to that.
>
> --
> AY
>
> On 10 January 2017 at 17:22:09, Michael Shuler (mich...@pbandjelly.org)
> wrote:
>
> I had the same thought. 3.10 is the tick, so a 3.11 bugfix tock follows
> the intended final fix release for closing out tick-tock. Throwing a
> 3.10.1 out there would add more user confusion and would be the exact
> same contents as a 3.11 release versioned package set anyway.
>
> --
> Michael
>
> On 01/10/2017 11:18 AM, Josh McKenzie wrote:
> > | If someone tries to upgrade 3.10 to whatever 4.0 ends up being I
> > think they will hit the wrong answer bug. So I would advocate for
> > having the fix brought
> > into 3.10, but it was broken in 3.9 as well.
> >
> > Seems like we'd just release that as 3.10.1 (instead of 3.11) and just
> > tell people "you can upgrade to 4.0 w/latest version of 3.10". This
> > does violate the "even releases features, odd releases bugfix", so
> > maybe a 3.11 as final 3.X line would help keep that consistent?
> >
> > I'd rather not open the can of worms of back-porting this to 3.9 as
> > well to hold to our claim of "any 3.X can go to 4.0".
> >
> > On Tue, Jan 10, 2017 at 12:13 PM, Ariel Weisberg 
> wrote:
> >> Hi,
> >>
> >>
> >>
> >> The upgrade tests are tricky because they upgrade from an existing
> >> release to a current release. The bug is in 3.9 and won't be fixed until
> >> 3.11 because the test checks out and builds 3.9 right now. 3.10 doesn't
> >> include the commit that fixes the issue so it will fail after 3.10 is
> >> released and the test is updated to check out 3.10.
> >>
> >>
> >> We claim to support upgrade from any 3.x version to 4.0. If someone
> >> tries to upgrade 3.10 to whatever 4.0 ends up being I think they will
> >> hit the wrong answer bug. So I would advocate for having the fix brought
> >> into 3.10, but it was broken in 3.9 as well.
> >>
> >>
> >> Some of the tests fail because trunk complains of unreadable sstables and
> >> I suspect that isn't a bug it's just something that is no longer
> >> supported due to thrift removal, but I haven't fixed those yet. Those
> >> are probably issues with trunk or the tests.
> >>
> >>
> >> Others fail for reasons I haven't triaged yet. I'm struggling with my
> >> own issues getting the tests to run locally.
> >>
> >>
> >> Ariel
> >>
> >>
> >>
> >> On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:
> >>
> 
> >>
>  I concede it would be fine to do it gradually. Once the pace of
>  issues
>  introduced by new development is beaten by the pace at which
>  they are
>  addressed I think things will go well.
> >>
> >>>
> >>
> >>> So from Michael's JIRA query:
> >>
> >>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
> >>>
> >>
> >>> Are we good for 3.10 after we get those cleaned up?
> >>
> >>>
> >>
> >>> Ariel, you made reference to:
> >>
> >>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
> >>>
> >>
> >>> Do we need to re-open an issue to have this applied to 3.10 and add it
> >>> to the above list?
> >>
> >>>
> >>
> 
> >>
>  On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
> >>
> >
> >>
> > Sankalp's proposal of us progressively tightening up our standards
> > allows
> > us to get code out the door and regain some lost momentum on
> > the 3.10
> > release failures and blocking, and gives us time as a community to
> > adjust
> > our behavior without the burden of an ever-later slipped release
> > hanging
> > over our heads. There's plenty of bugfixes in the 3.X line; the
> > more time
> > people can have to kick the tires on that code, the more things
> > we can
> > find
> >>
> > and the better future releases will be.
> >>
> >>>
> >>
> >>>
> >>
> >>> +1 On gradually moving to this. Dropping releases with huge change
> >>
> >>> lists has never gone well for us in the past.
> >>
> >>
>
>


-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced


Re: Wrapping up tick-tock

2017-01-10 Thread Nate McCall
> I agreed with you at the time that the yearly cycle was too long to be
> adding features before cutting a release, and still do now.  Instead of
> elastic banding all the way back to a process which wasn't working before,
> why not try somewhere in the middle?  A release every 6 months (with
> monthly bug fixes for a year) gives:
>
> 1. long enough time to stabilize (1 year vs 1 month)
> 2. not so long things sit around untested forever
> 3. only 2 releases (current and previous) to do bug fix support at any
> given time.

The third reason is particularly appealing.

+1 on six months.
+1 on killing tick/tock at 3.10 (with a potential bugfix follow up per
the other thread).
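
The arithmetic behind the third point in the quoted list can be checked with
a small sketch. It assumes a simplified model in which a release ships every
6 months and each release receives bug fixes for 12 months; the function name
and the month-based timeline are hypothetical and only illustrate the
proposal, not any agreed policy.

```python
def supported_releases(month, cadence=6, support=12):
    """Return the start months of releases still receiving bug fixes.

    Releases are assumed to ship at months 0, cadence, 2 * cadence, ...
    and to be supported for `support` months after they ship.
    """
    return [
        start
        for start in range(0, month + 1, cadence)
        if month - start < support
    ]


# Largest number of releases under bug fix support over a five-year window.
max_supported = max(len(supported_releases(m)) for m in range(0, 61))
```

Under these assumptions max_supported comes out as 2 (the current and the
previous release), which matches point 3 of the quoted proposal. Lengthening
the support window to 18 or 24 months in this model raises that number
accordingly.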


Re: Per blocking release on dtest

2017-01-10 Thread Josh McKenzie
> So 3.11 after 3.10, then move on to 3.11.x further bug fix releases?
+1

On Tue, Jan 10, 2017 at 12:23 PM, Aleksey Yeschenko  wrote:
> That’s a good point.
>
> So 3.11 after 3.10, then move on to 3.11.x further bug fix releases?
>
> +1 to that.
>
> --
> AY
>
> On 10 January 2017 at 17:22:09, Michael Shuler (mich...@pbandjelly.org) wrote:
>
> I had the same thought. 3.10 is the tick, so a 3.11 bugfix tock follows
> the intended final fix release for closing out tick-tock. Throwing a
> 3.10.1 out there would add more user confusion and would be the exact
> same contents as a 3.11 release versioned package set anyway.
>
> --
> Michael
>
> On 01/10/2017 11:18 AM, Josh McKenzie wrote:
>> | If someone tries to upgrade 3.10 to whatever 4.0 ends up being I
>> think they will hit the wrong answer bug. So I would advocate for
>> having the fix brought
>> into 3.10, but it was broken in 3.9 as well.
>>
>> Seems like we'd just release that as 3.10.1 (instead of 3.11) and just
>> tell people "you can upgrade to 4.0 w/latest version of 3.10". This
>> does violate the "even releases features, odd releases bugfix", so
>> maybe a 3.11 as final 3.X line would help keep that consistent?
>>
>> I'd rather not open the can of worms of back-porting this to 3.9 as
>> well to hold to our claim of "any 3.X can go to 4.0".
>>
>> On Tue, Jan 10, 2017 at 12:13 PM, Ariel Weisberg  wrote:
>>> Hi,
>>>
>>>
>>>
>>> The upgrade tests are tricky because they upgrade from an existing
>>> release to a current release. The bug is in 3.9 and won't be fixed until
>>> 3.11 because the test checks out and builds 3.9 right now. 3.10 doesn't
>>> include the commit that fixes the issue so it will fail after 3.10 is
>>> released and the test is updated to check out 3.10.
>>>
>>>
>>> We claim to support upgrade from any 3.x version to 4.0. If someone
>>> tries to upgrade 3.10 to whatever 4.0 ends up being I think they will
>>> hit the wrong answer bug. So I would advocate for having the fix brought
>>> into 3.10, but it was broken in 3.9 as well.
>>>
>>>
>>> Some of the tests fail because trunk complains of unreadable sstables and
>>> I suspect that isn't a bug it's just something that is no longer
>>> supported due to thrift removal, but I haven't fixed those yet. Those
>>> are probably issues with trunk or the tests.
>>>
>>>
>>> Others fail for reasons I haven't triaged yet. I'm struggling with my
>>> own issues getting the tests to run locally.
>>>
>>>
>>> Ariel
>>>
>>>
>>>
>>> On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:
>>>
>
>>>
> I concede it would be fine to do it gradually. Once the pace of
> issues
> introduced by new development is beaten by the pace at which
> they are
> addressed I think things will go well.
>>>

>>>
 So from Michael's JIRA query:
>>>
 https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved

>>>
 Are we good for 3.10 after we get those cleaned up?
>>>

>>>
 Ariel, you made reference to:
>>>
 https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601

>>>
 Do we need to re-open an issue to have this applied to 3.10 and add it
 to the above list?
>>>

>>>
>
>>>
> On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
>>>
>>
>>>
>> Sankalp's proposal of us progressively tightening up our standards
>> allows
>> us to get code out the door and regain some lost momentum on
>> the 3.10
>> release failures and blocking, and gives us time as a community to
>> adjust
>> our behavior without the burden of an ever-later slipped release
>> hanging
>> over our heads. There's plenty of bugfixes in the 3.X line; the
>> more time
>> people can have to kick the tires on that code, the more things
>> we can
>> find
>>>
>> and the better future releases will be.
>>>

>>>

>>>
 +1 On gradually moving to this. Dropping releases with huge change
>>>
 lists has never gone well for us in the past.
>>>
>>>
>


Re: Per blocking release on dtest

2017-01-10 Thread Aleksey Yeschenko
That’s a good point.

So 3.11 after 3.10, then move on to 3.11.x further bug fix releases?

+1 to that.

-- 
AY

On 10 January 2017 at 17:22:09, Michael Shuler (mich...@pbandjelly.org) wrote:

I had the same thought. 3.10 is the tick, so a 3.11 bugfix tock follows  
the intended final fix release for closing out tick-tock. Throwing a  
3.10.1 out there would add more user confusion and would be the exact  
same contents as a 3.11 release versioned package set anyway.  

--  
Michael  

On 01/10/2017 11:18 AM, Josh McKenzie wrote:  
> | If someone tries to upgrade 3.10 to whatever 4.0 ends up being I  
> think they will hit the wrong answer bug. So I would advocate for  
> having the fix brought  
> into 3.10, but it was broken in 3.9 as well.  
>  
> Seems like we'd just release that as 3.10.1 (instead of 3.11) and just  
> tell people "you can upgrade to 4.0 w/latest version of 3.10". This  
> does violate the "even releases features, odd releases bugfix", so  
> maybe a 3.11 as final 3.X line would help keep that consistent?  
>  
> I'd rather not open the can of worms of back-porting this to 3.9 as  
> well to hold to our claim of "any 3.X can go to 4.0".  
>  
> On Tue, Jan 10, 2017 at 12:13 PM, Ariel Weisberg  wrote:  
>> Hi,  
>>  
>>  
>>  
>> The upgrade tests are tricky because they upgrade from an existing  
>> release to a current release. The bug is in 3.9 and won't be fixed until  
>> 3.11 because the test checks out and builds 3.9 right now. 3.10 doesn't  
>> include the commit that fixes the issue so it will fail after 3.10 is  
>> released and the test is updated to check out 3.10.  
>>  
>>  
>> We claim to support upgrade from any 3.x version to 4.0. If someone  
>> tries to upgrade 3.10 to whatever 4.0 ends up being I think they will  
>> hit the wrong answer bug. So I would advocate for having the fix brought  
>> into 3.10, but it was broken in 3.9 as well.  
>>  
>>  
>> Some of the tests fail because trunk complains of unreadable sstables and
>> I suspect that isn't a bug it's just something that is no longer  
>> supported due to thrift removal, but I haven't fixed those yet. Those  
>> are probably issues with trunk or the tests.  
>>  
>>  
>> Others fail for reasons I haven't triaged yet. I'm struggling with my  
>> own issues getting the tests to run locally.  
>>  
>>  
>> Ariel  
>>  
>>  
>>  
>> On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:  
>>  
  
>>  
 I concede it would be fine to do it gradually. Once the pace of  
 issues  
 introduced by new development is beaten by the pace at which  
 they are  
 addressed I think things will go well.  
>>  
>>>  
>>  
>>> So from Michael's JIRA query:  
>>  
>>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>>>   
>>>  
>>  
>>> Are we good for 3.10 after we get those cleaned up?  
>>  
>>>  
>>  
>>> Ariel, you made reference to:  
>>  
>>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
>>>   
>>>  
>>  
>>> Do we need to re-open an issue to have this applied to 3.10 and add it  
>>> to the above list?  
>>  
>>>  
>>  
  
>>  
 On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:  
>>  
>  
>>  
> Sankalp's proposal of us progressively tightening up our standards  
> allows  
> us to get code out the door and regain some lost momentum on  
> the 3.10  
> release failures and blocking, and gives us time as a community to  
> adjust  
> our behavior without the burden of an ever-later slipped release  
> hanging  
> over our heads. There's plenty of bugfixes in the 3.X line; the  
> more time  
> people can have to kick the tires on that code, the more things  
> we can  
> find  
>>  
> and the better future releases will be.  
>>  
>>>  
>>  
>>>  
>>  
>>> +1 On gradually moving to this. Dropping releases with huge change  
>>  
>>> lists has never gone well for us in the past.  
>>  
>>  



Re: Per blocking release on dtest

2017-01-10 Thread Michael Shuler
I had the same thought. 3.10 is the tick, so a 3.11 bugfix tock follows
the intended final fix release for closing out tick-tock. Throwing a
3.10.1 out there would add more user confusion and would be the exact
same contents as a 3.11 release versioned package set anyway.

-- 
Michael

On 01/10/2017 11:18 AM, Josh McKenzie wrote:
> | If someone tries to upgrade 3.10 to whatever 4.0 ends up being I
> think they will hit the wrong answer bug. So I would advocate for
> having the fix brought
> into 3.10, but it was broken in 3.9 as well.
> 
> Seems like we'd just release that as 3.10.1 (instead of 3.11) and just
> tell people "you can upgrade to 4.0 w/latest version of 3.10". This
> does violate the "even releases features, odd releases bugfix", so
> maybe a 3.11 as final 3.X line would help keep that consistent?
> 
> I'd rather not open the can of worms of back-porting this to 3.9 as
> well to hold to our claim of "any 3.X can go to 4.0".
> 
> On Tue, Jan 10, 2017 at 12:13 PM, Ariel Weisberg  wrote:
>> Hi,
>>
>>
>>
>> The upgrade tests are tricky because they upgrade from an existing
>> release to a current release. The bug is in 3.9 and won't be fixed until
>> 3.11 because the test  checks out and builds 3.9 right now. 3.10 doesn't
>> include the commit that fixes the issue so it will fail after 3.10 is
>> released and the test is updated to check out 3.10.
>>
>>
>> We claim to support upgrade from any 3.x version to 4.0. If someone
>> tries to upgrade 3.10 to whatever 4.0 ends up being I think they will
>> hit the wrong answer bug. So I would advocate for having the fix brought
>> into 3.10, but it was broken in 3.9 as well.
>>
>>
>> Some of the tests fail because trunk complains of unreadable sstables and
>> I suspect that isn't a bug it's just something that is no longer
>> supported due to thrift removal, but I haven't fixed those yet. Those
>> are probably issues with trunk or the tests.
>>
>>
>> Others fail for reasons I haven't triaged yet. I'm struggling with my
>> own issues getting the tests to run locally.
>>
>>
>> Ariel
>>
>>
>>
>> On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:
>>

>>
 I concede it would be fine to do it gradually. Once the pace of
 issues
 introduced by new development is beaten by the pace at which
 they are
 addressed I think things will go well.
>>
>>>
>>
>>> So from Michael's JIRA query:
>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>>>
>>
>>> Are we good for 3.10 after we get those cleaned up?
>>
>>>
>>
>>> Ariel, you made reference to:
>>
>>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
>>>
>>
>>> Do we need to re-open an issue to have this applied to 3.10 and add it
>>> to the above list?
>>
>>>
>>

>>
 On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
>>
>
>>
> Sankalp's proposal of us progressively tightening up our standards
> allows
> us to get code out the door and regain some lost momentum on
> the 3.10
> release failures and blocking, and gives us time as a community to
> adjust
> our behavior without the burden of an ever-later slipped release
> hanging
> over our heads. There's plenty of bugfixes in the 3.X line; the
> more time
> people can have to kick the tires on that code, the more things
> we can
> find
>>
> and the better future releases will be.
>>
>>>
>>
>>>
>>
>>> +1 On gradually moving to this. Dropping releases with huge change
>>
>>> lists has never gone well for us in the past.
>>
>>



Re: Per blocking release on dtest

2017-01-10 Thread Nate McCall
> Seems like we'd just release that as 3.10.1 (instead of 3.11) and just
> tell people "you can upgrade to 4.0 w/latest version of 3.10". This
> does violate the "even releases features, odd releases bugfix", so
> maybe a 3.11 as final 3.X line would help keep that consistent?

This feels like a decent compromise to me.


Re: Per blocking release on dtest

2017-01-10 Thread Josh McKenzie
| If someone tries to upgrade 3.10 to whatever 4.0 ends up being I
think they will hit the wrong answer bug. So I would advocate for
having the fix brought
into 3.10, but it was broken in 3.9 as well.

Seems like we'd just release that as 3.10.1 (instead of 3.11) and just
tell people "you can upgrade to 4.0 w/latest version of 3.10". This
does violate the "even releases features, odd releases bugfix", so
maybe a 3.11 as final 3.X line would help keep that consistent?

I'd rather not open the can of worms of back-porting this to 3.9 as
well to hold to our claim of "any 3.X can go to 4.0".
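
The "even releases features, odd releases bugfix" rule referred to above can
be written down as a tiny sketch. It assumes the rule applies only to the 3.x
line and looks only at the minor number; the function name is hypothetical
and the code is just a restatement of the convention as described in this
thread.

```python
def tick_tock_kind(version):
    """Classify a 3.x version string under the tick-tock convention.

    Even minor numbers are treated as feature releases and odd minor
    numbers as bugfix releases, as described in the thread.
    """
    parts = version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if major != 3:
        raise ValueError("tick-tock is only modelled here for the 3.x line")
    return "feature" if minor % 2 == 0 else "bugfix"


kinds = {v: tick_tock_kind(v) for v in ("3.9", "3.10", "3.11")}
```

With this rule 3.10 is classified as a feature release and 3.11 as a bugfix
release, which is why a final 3.11 keeps the numbering consistent where a
3.10.1 would not fit the scheme.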

On Tue, Jan 10, 2017 at 12:13 PM, Ariel Weisberg  wrote:
> Hi,
>
>
>
> The upgrade tests are tricky because they upgrade from an existing
> release to a current release. The bug is in 3.9 and won't be fixed until
> 3.11 because the test  checks out and builds 3.9 right now. 3.10 doesn't
> include the commit that fixes the issue so it will fail after 3.10 is
> released and the test is updated to check out 3.10.
>
>
> We claim to support upgrade from any 3.x version to 4.0. If someone
> tries to upgrade 3.10 to whatever 4.0 ends up being I think they will
> hit the wrong answer bug. So I would advocate for having the fix brought
> into 3.10, but it was broken in 3.9 as well.
>
>
> Some of the tests fail because trunk complains of unreadable sstables and
> I suspect that isn't a bug it's just something that is no longer
> supported due to thrift removal, but I haven't fixed those yet. Those
> are probably issues with trunk or the tests.
>
>
> Others fail for reasons I haven't triaged yet. I'm struggling with my
> own issues getting the tests to run locally.
>
>
> Ariel
>
>
>
> On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:
>
>> >
>
>> > I concede it would be fine to do it gradually. Once the pace of
>> > issues
>> > introduced by new development is beaten by the pace at which
>> > they are
>> > addressed I think things will go well.
>
>>
>
>> So from Michael's JIRA query:
>
>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>>
>
>> Are we good for 3.10 after we get those cleaned up?
>
>>
>
>> Ariel, you made reference to:
>
>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
>>
>
>> Do we need to re-open an issue to have this applied to 3.10 and add it
>> to the above list?
>
>>
>
>> >
>
>> > On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
>
>> >>
>
>> >> Sankalp's proposal of us progressively tightening up our standards
>> >> allows
>> >> us to get code out the door and regain some lost momentum on
>> >> the 3.10
>> >> release failures and blocking, and gives us time as a community to
>> >> adjust
>> >> our behavior without the burden of an ever-later slipped release
>> >> hanging
>> >> over our heads. There's plenty of bugfixes in the 3.X line; the
>> >> more time
>> >> people can have to kick the tires on that code, the more things
>> >> we can
>> >> find
>
>> >> and the better future releases will be.
>
>>
>
>>
>
>> +1 On gradually moving to this. Dropping releases with huge change
>
>> lists has never gone well for us in the past.
>
>


Re: Per blocking release on dtest

2017-01-10 Thread Ariel Weisberg
Hi,



The upgrade tests are tricky because they upgrade from an existing
release to a current release. The bug is in 3.9 and won't be fixed until
3.11 because the test  checks out and builds 3.9 right now. 3.10 doesn't
include the commit that fixes the issue so it will fail after 3.10 is
released and the test is updated to check out 3.10.


We claim to support upgrade from any 3.x version to 4.0. If someone
tries to upgrade 3.10 to whatever 4.0 ends up being I think they will
hit the wrong answer bug. So I would advocate for having the fix brought
into 3.10, but it was broken in 3.9 as well.


Some of the tests fail because trunk complains of unreadable sstables and
I suspect that isn't a bug; it's just something that is no longer
supported due to thrift removal, but I haven't fixed those yet. Those
are probably issues with trunk or the tests.


Others fail for reasons I haven't triaged yet. I'm struggling with my
own issues getting the tests to run locally.


Ariel



On Tue, Jan 10, 2017, at 11:49 AM, Nate McCall wrote:
> >
> > I concede it would be fine to do it gradually. Once the pace of issues
> > introduced by new development is beaten by the pace at which they are
> > addressed I think things will go well.
>
> So from Michael's JIRA query:
> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>
> Are we good for 3.10 after we get those cleaned up?
>
> Ariel, you made reference to:
> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
>
> Do we need to re-open an issue to have this applied to 3.10 and add it
> to the above list?
>
> >
> > On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
> >>
> >> Sankalp's proposal of us progressively tightening up our standards allows
> >> us to get code out the door and regain some lost momentum on the 3.10
> >> release failures and blocking, and gives us time as a community to adjust
> >> our behavior without the burden of an ever-later slipped release hanging
> >> over our heads. There's plenty of bugfixes in the 3.X line; the more time
> >> people can have to kick the tires on that code, the more things we can
> >> find
> >> and the better future releases will be.
>
>
> +1 On gradually moving to this. Dropping releases with huge change
> lists has never gone well for us in the past.




Re: Per blockng release on dtest

2017-01-10 Thread Aleksey Yeschenko
I would personally favour pushing 3.10 out without waiting for the pretty
innocent #13113 resolution.

With the amount of bug fixes accumulated in the 3.X branch it’s borderline
irresponsible to not release them out to the users.

-- 
AY

On 10 January 2017 at 17:05:57, Michael Shuler (mich...@pbandjelly.org) wrote:

Generally, fixver has only been set during commits - I only marked 3.10  
and blocker status to highlight the few that failed votes, in order to  
sort of cheerlead "fix me so we can release!" JIRA tickets. The full  
test-failure list is probably the more "realistic" view, since any of  
those may occur. As I also just replied, an auth_test method is the  
current failure on c-3.11 branch. Mark it as a blocker? Re-run the job  
and hope for green? Unmark the current 3.10 fixver blockers, since they  
didn't fail? (Likely to get some other failure or maybe a full pass)  

--  
Michael  

On 01/10/2017 10:56 AM, Josh McKenzie wrote:  
> I assume you meant the query w/out 12617 embedded?  
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>   
>  
> Do we have confidence that all test failures have fixVersion attached  
> correctly? The list of test failures w/out fixVersion is pretty daunting:  
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20labels%20in%20(test%2C%20test-failure%2C%20dtest%2C%20unittest)%20AND%20resolution%20%3D%20Unresolved%20AND%20fixversion%20%3D%20null%20and%20labels%20!%3D%20windows%20ORDER%20BY%20created%20asc
>   
>  
> On Tue, Jan 10, 2017 at 11:49 AM, Nate McCall  wrote:  
>  
>>>  
>>> I concede it would be fine to do it gradually. Once the pace of issues  
>>> introduced by new development is beaten by the pace at which they are  
>>> addressed I think things will go well.  
>>  
>> So from Michael's JIRA query:  
>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>>  
>> Are we good for 3.10 after we get those cleaned up?  
>>  
>> Ariel, you made reference to:  
>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
>>  
>> Do we need to re-open an issue to have this applied to 3.10 and add it  
>> to the above list?  
>>  
>>>  
>>> On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
>>>>
>>>> Sankalp's proposal of us progressively tightening up our standards allows
>>>> us to get code out the door and regain some lost momentum on the 3.10
>>>> release failures and blocking, and gives us time as a community to adjust
>>>> our behavior without the burden of an ever-later slipped release hanging
>>>> over our heads. There's plenty of bugfixes in the 3.X line; the more time
>>>> people can have to kick the tires on that code, the more things we can
>>>> find
>>>> and the better future releases will be.
>>  
>>  
>> +1 On gradually moving to this. Dropping releases with huge change  
>> lists has never gone well for us in the past.  
>>  
>  



Re: Per blockng release on dtest

2017-01-10 Thread Michael Shuler
Generally, fixver has only been set during commits - I only marked 3.10
and blocker status to highlight the few that failed votes, in order to
sort of cheerlead "fix me so we can release!" JIRA tickets. The full
test-failure list is probably the more "realistic" view, since any of
those may occur. As I also just replied, an auth_test method is the
current failure on c-3.11 branch. Mark it as a blocker? Re-run the job
and hope for green? Unmark the current 3.10 fixver blockers, since they
didn't fail? (Likely to get some other failure or maybe a full pass)

-- 
Michael

On 01/10/2017 10:56 AM, Josh McKenzie wrote:
> I assume you meant the query w/out 12617 embedded?
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
> 
> Do we have confidence that all test failures have fixVersion attached
> correctly? The list of test failures w/out fixVersion is pretty daunting:
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20labels%20in%20(test%2C%20test-failure%2C%20dtest%2C%20unittest)%20AND%20resolution%20%3D%20Unresolved%20AND%20fixversion%20%3D%20null%20and%20labels%20!%3D%20windows%20ORDER%20BY%20created%20asc
> 
> On Tue, Jan 10, 2017 at 11:49 AM, Nate McCall  wrote:
> 
>>>
>>> I concede it would be fine to do it gradually. Once the pace of issues
>>> introduced by new development is beaten by the pace at which they are
>>> addressed I think things will go well.
>>
>> So from Michael's JIRA query:
>> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>>
>> Are we good for 3.10 after we get those cleaned up?
>>
>> Ariel, you made reference to:
>> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
>>
>> Do we need to re-open an issue to have this applied to 3.10 and add it
>> to the above list?
>>
>>>
>>> On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
>>>>
>>>> Sankalp's proposal of us progressively tightening up our standards allows
>>>> us to get code out the door and regain some lost momentum on the 3.10
>>>> release failures and blocking, and gives us time as a community to adjust
>>>> our behavior without the burden of an ever-later slipped release hanging
>>>> over our heads. There's plenty of bugfixes in the 3.X line; the more time
>>>> people can have to kick the tires on that code, the more things we can
>>>> find
>>>> and the better future releases will be.
>>
>>
>> +1 On gradually moving to this. Dropping releases with huge change
>> lists has never gone well for us in the past.
>>
> 



Re: Per blockng release on dtest

2017-01-10 Thread Michael Shuler
Latest cassandra-3.11_dtest run failed on one test,
system_auth_ks_is_alterable_test:

https://issues.apache.org/jira/browse/CASSANDRA-13113

The dtest variations (novnode, offheap, upgrade, large) have other
failures, but if the green light for release is unit tests and the
default dtest, we're close.

http://cassci.datastax.com/view/cassandra-3.11/

-- 
Michael

On 01/10/2017 10:49 AM, Nate McCall wrote:
>>
>> I concede it would be fine to do it gradually. Once the pace of issues
>> introduced by new development is beaten by the pace at which they are
>> addressed I think things will go well.
> 
> So from Michael's JIRA query:
> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
> 
> Are we good for 3.10 after we get those cleaned up?
> 
> Ariel, you made reference to:
> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
> 
> Do we need to re-open an issue to have this applied to 3.10 and add it
> to the above list?
> 
>>
>> On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
>>>
>>> Sankalp's proposal of us progressively tightening up our standards allows
>>> us to get code out the door and regain some lost momentum on the 3.10
>>> release failures and blocking, and gives us time as a community to adjust
>>> our behavior without the burden of an ever-later slipped release hanging
>>> over our heads. There's plenty of bugfixes in the 3.X line; the more time
>>> people can have to kick the tires on that code, the more things we can
>>> find
>>> and the better future releases will be.
> 
> 
> +1 On gradually moving to this. Dropping releases with huge change
> lists has never gone well for us in the past.
> 



Re: Per blockng release on dtest

2017-01-10 Thread Josh McKenzie
I assume you meant the query w/out 12617 embedded?
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved

Do we have confidence that all test failures have fixVersion attached
correctly? The list of test failures w/out fixVersion is pretty daunting:
https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20AND%20labels%20in%20(test%2C%20test-failure%2C%20dtest%2C%20unittest)%20AND%20resolution%20%3D%20Unresolved%20AND%20fixversion%20%3D%20null%20and%20labels%20!%3D%20windows%20ORDER%20BY%20created%20asc
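
A minimal sketch for comparing the two "fixVersion = 3.10" links in this thread: it percent-decodes the jql parameter using Python's standard urllib.parse and checks that both URLs carry the same filter, differing only in the CASSANDRA-12617 path segment. The helper name jql_of and the two constant names are illustrative assumptions, not part of any JIRA tooling.

```python
from urllib.parse import parse_qs, urlsplit

# The two query links quoted in this thread (copied verbatim).
WITH_ISSUE = (
    "https://issues.apache.org/jira/browse/CASSANDRA-12617"
    "?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10"
    "%20AND%20resolution%20%3D%20Unresolved"
)
WITHOUT_ISSUE = (
    "https://issues.apache.org/jira/issues/"
    "?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10"
    "%20AND%20resolution%20%3D%20Unresolved"
)


def jql_of(url: str) -> str:
    """Return the percent-decoded 'jql' query parameter of a URL."""
    query = urlsplit(url).query
    return parse_qs(query)["jql"][0]


if __name__ == "__main__":
    # prints: project = CASSANDRA AND fixVersion = 3.10 AND resolution = Unresolved
    print(jql_of(WITHOUT_ISSUE))
    # The filter is identical; only the URL path differs.
    print(jql_of(WITH_ISSUE) == jql_of(WITHOUT_ISSUE))
```

Decoding only shows what the links ask for; it does not say which tickets the filter returns.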

On Tue, Jan 10, 2017 at 11:49 AM, Nate McCall  wrote:

> >
> > I concede it would be fine to do it gradually. Once the pace of issues
> > introduced by new development is beaten by the pace at which they are
> > addressed I think things will go well.
>
> So from Michael's JIRA query:
> https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved
>
> Are we good for 3.10 after we get those cleaned up?
>
> Ariel, you made reference to:
> https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601
>
> Do we need to re-open an issue to have this applied to 3.10 and add it
> to the above list?
>
> >
> > On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
> >>
> >> Sankalp's proposal of us progressively tightening up our standards allows
> >> us to get code out the door and regain some lost momentum on the 3.10
> >> release failures and blocking, and gives us time as a community to adjust
> >> our behavior without the burden of an ever-later slipped release hanging
> >> over our heads. There's plenty of bugfixes in the 3.X line; the more time
> >> people can have to kick the tires on that code, the more things we can
> >> find
> >> and the better future releases will be.
>
>
> +1 On gradually moving to this. Dropping releases with huge change
> lists has never gone well for us in the past.
>


Re: Per blockng release on dtest

2017-01-10 Thread Nate McCall
>
> I concede it would be fine to do it gradually. Once the pace of issues
> introduced by new development is beaten by the pace at which they are
> addressed I think things will go well.

So from Michael's JIRA query:
https://issues.apache.org/jira/browse/CASSANDRA-12617?jql=project%20%3D%20CASSANDRA%20AND%20fixVersion%20%3D%203.10%20AND%20resolution%20%3D%20Unresolved

Are we good for 3.10 after we get those cleaned up?

Ariel, you made reference to:
https://github.com/apache/cassandra/commit/c612cd8d7dbd24888c216ad53f974686b88dd601

Do we need to re-open an issue to have this applied to 3.10 and add it
to the above list?

>
> On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
>>
>> Sankalp's proposal of us progressively tightening up our standards allows
>> us to get code out the door and regain some lost momentum on the 3.10
>> release failures and blocking, and gives us time as a community to adjust
>> our behavior without the burden of an ever-later slipped release hanging
>> over our heads. There's plenty of bugfixes in the 3.X line; the more time
>> people can have to kick the tires on that code, the more things we can
>> find
>> and the better future releases will be.


+1 On gradually moving to this. Dropping releases with huge change
lists has never gone well for us in the past.


Re: Wrapping up tick-tock

2017-01-10 Thread Aleksey Yeschenko
I’m thinking put it on the same rails as 2.2.x and 3.0.x. As needed.

-- 
AY

On 10 January 2017 at 16:46:25, Josh McKenzie (jmcken...@apache.org) wrote:

>  
> I would also propose we move on to 3.10.x bugfix only releases from now  
> on, with all new feature development moving to trunk from now on.  

You thinking monthly release on that or "as needed"? In theory, monthly  
should be easier than previous tick-tock if we're only putting in bugfix or  
testfix on the branch.  

On Tue, Jan 10, 2017 at 11:41 AM, Aleksey Yeschenko   
wrote:  

> 6 months seems reasonable to me as well.  
>  
> There seems to be an agreement to halting 3.X on 3.10. I would also propose  
> we move on to 3.10.x bugfix only releases from now on, with all new feature  
> development moving to trunk from now on.  
>  
> This should allow us to finally stabilise 3.X so that we can get all test  
> jobs to green.  
>  
> --  
> AY  
>  
> On 10 January 2017 at 16:36:43, Josh McKenzie (jmcken...@apache.org)  
> wrote:  
>  
> +1 to 6 months.  
>  
> On Tue, Jan 10, 2017 at 11:32 AM, Jonathan Ellis   
> wrote:  
>  
> > I agree that 6 month seems like a reasonable compromise.  
> >  
> > On Tue, Jan 10, 2017 at 10:31 AM, Blake Eggleston   
> > wrote:  
> >  
> > > I agree that 3.10 should be the last tick-tock release, but I also  
> agree  
> > > with Jon that we shouldn't go back to yearly-ish releases.  
> > >  
> > > 6 months has come up several times now as a good cadence for feature  
> > > releases, and I think it's a good compromise between the competing  
> > > interests of long term support, regular release of features (to prevent  
> > > piling on), and effort to release. So +1 to 6 month releases.  
> > >  
> > > On January 10, 2017 at 10:14:12 AM, Ariel Weisberg (ar...@weisberg.ws)  
> > > wrote:  
> > >  
> > > Hi,  
> > >  
> > > With yearly releases trunk is going to be a mess when it comes time to  
> > > cut a release. Cutting releases is when people start caring whether all  
> > > the things in the release are in a finished state. It's when the state  
> > > of CI finally becomes relevant.  
> > >  
> > > > If we wait a year we are going to accumulate a year's worth of unfinished
> > > > stuff in a single release. It's more expensive to context switch back
> > > and then address those issues. If we put out large unstable releases it  
> > > means time until the features in the release are usable is pushed back  
> > > even further since it takes another 6-12 months for the release to  
> > > stabilize. Features introduced at the beginning of the cycle will have  
> > > to wait 18-24 months before anyone can benefit from them.  
> > >  
> > > Is the biggest pain point with tick-tock just the elimination of long  
> > > term support releases? What is the pain point around release frequency?  
> > > Right now people should be using 3.0 unless they need a bleeding edge  
> > > feature from 3.X and those people will have to give up something to get  
> > > something.  
> > >  
> > > Ariel  
> > >  
> > > On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:  
> > > > > I don't see why it has to be one extreme (yearly) or another (monthly).
> > > > > When you had originally proposed Tick Tock, you wrote:
> > > > >
> > > > > "The primary goal is to improve release quality. Our current major “dot
> > > > > zero” releases require another five or six months to make them stable
> > > > > enough for production. This is directly related to how we pile features
> > > > > in
> > > > > for 9 to 12 months and release all at once. The interactions between the
> > > > > new features are complex and not always obvious. 2.1 was no exception,
> > > > > despite DataStax hiring a full-time test engineering team specifically for
> > > > > Apache Cassandra."
> > > >  
> > > > I agreed with you at the time that the yearly cycle was too long to  
> be  
> > > > adding features before cutting a release, and still do now. Instead  
> of  
> > > > elastic banding all the way back to a process which wasn't working  
> > > > before,  
> > > > why not try somewhere in the middle? A release every 6 months (with  
> > > > monthly bug fixes for a year) gives:  
> > > >  
> > > > 1. long enough time to stabilize (1 year vs 1 month)  
> > > > 2. not so long things sit around untested forever  
> > > > 3. only 2 releases (current and previous) to do bug fix support at  
> any  
> > > > given time.  
> > > >  
> > > > Jon  
> > > >  
> > > > On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis   
> > > wrote:  
> > > >  
> > > > > Hi all,  
> > > > >  
> > > > > We’ve had a few threads now about the successes and failures of the  
> > > > > tick-tock release process and what to do to replace it, but they  
> all  
> > > died  
> > > > > out without reaching a robust consensus.  
> > > > >  
> > > > > In those threads we saw several reasonable 

Re: Wrapping up tick-tock

2017-01-10 Thread Josh McKenzie
>
> I would also propose we move on to 3.10.x bugfix only releases from now
> on, with all new feature development moving to trunk from now on.

You thinking monthly release on that or "as needed"? In theory, monthly
should be easier than previous tick-tock if we're only putting in bugfix or
testfix on the branch.

On Tue, Jan 10, 2017 at 11:41 AM, Aleksey Yeschenko 
wrote:

> 6 months seems reasonable to me as well.
>
> There seems to be an agreement to halting 3.X on 3.10. I would also propose
> we move on to 3.10.x bugfix only releases from now on, with all new feature
> development moving to trunk from now on.
>
> This should allow us to finally stabilise 3.X so that we can get all test
> jobs to green.
>
> --
> AY
>
> On 10 January 2017 at 16:36:43, Josh McKenzie (jmcken...@apache.org)
> wrote:
>
> +1 to 6 months.
>
> On Tue, Jan 10, 2017 at 11:32 AM, Jonathan Ellis 
> wrote:
>
> > I agree that 6 month seems like a reasonable compromise.
> >
> > On Tue, Jan 10, 2017 at 10:31 AM, Blake Eggleston 
> > wrote:
> >
> > > I agree that 3.10 should be the last tick-tock release, but I also
> agree
> > > with Jon that we shouldn't go back to yearly-ish releases.
> > >
> > > 6 months has come up several times now as a good cadence for feature
> > > releases, and I think it's a good compromise between the competing
> > > interests of long term support, regular release of features (to prevent
> > > piling on), and effort to release. So +1 to 6 month releases.
> > >
> > > On January 10, 2017 at 10:14:12 AM, Ariel Weisberg (ar...@weisberg.ws)
> > > wrote:
> > >
> > > Hi,
> > >
> > > With yearly releases trunk is going to be a mess when it comes time to
> > > cut a release. Cutting releases is when people start caring whether all
> > > the things in the release are in a finished state. It's when the state
> > > of CI finally becomes relevant.
> > >
> > > > If we wait a year we are going to accumulate a year's worth of unfinished
> > > > stuff in a single release. It's more expensive to context switch back
> > > and then address those issues. If we put out large unstable releases it
> > > means time until the features in the release are usable is pushed back
> > > even further since it takes another 6-12 months for the release to
> > > stabilize. Features introduced at the beginning of the cycle will have
> > > to wait 18-24 months before anyone can benefit from them.
> > >
> > > Is the biggest pain point with tick-tock just the elimination of long
> > > term support releases? What is the pain point around release frequency?
> > > Right now people should be using 3.0 unless they need a bleeding edge
> > > feature from 3.X and those people will have to give up something to get
> > > something.
> > >
> > > Ariel
> > >
> > > On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:
> > > > > I don't see why it has to be one extreme (yearly) or another (monthly).
> > > > > When you had originally proposed Tick Tock, you wrote:
> > > > >
> > > > > "The primary goal is to improve release quality. Our current major “dot
> > > > > zero” releases require another five or six months to make them stable
> > > > > enough for production. This is directly related to how we pile features
> > > > > in
> > > > > for 9 to 12 months and release all at once. The interactions between the
> > > > > new features are complex and not always obvious. 2.1 was no exception,
> > > > > despite DataStax hiring a full-time test engineering team specifically for
> > > > > Apache Cassandra."
> > > >
> > > > I agreed with you at the time that the yearly cycle was too long to
> be
> > > > adding features before cutting a release, and still do now. Instead
> of
> > > > elastic banding all the way back to a process which wasn't working
> > > > before,
> > > > why not try somewhere in the middle? A release every 6 months (with
> > > > monthly bug fixes for a year) gives:
> > > >
> > > > 1. long enough time to stabilize (1 year vs 1 month)
> > > > 2. not so long things sit around untested forever
> > > > 3. only 2 releases (current and previous) to do bug fix support at
> any
> > > > given time.
> > > >
> > > > Jon
> > > >
> > > > On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis 
> > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > We’ve had a few threads now about the successes and failures of the
> > > > > tick-tock release process and what to do to replace it, but they
> all
> > > died
> > > > > out without reaching a robust consensus.
> > > > >
> > > > > In those threads we saw several reasonable options proposed, but
> from
> > > my
> > > > > perspective they all operated in a kind of theoretical fantasy land
> > of
> > > > > testing and development resources. In particular, it takes around a
> > > > > person-week of effort to verify that a release is ready. That is,
> > going
> > > > > through all the test suites, inspecting and re-running failing
> tests
> > > to see
> > > > > 

Re: Wrapping up tick-tock

2017-01-10 Thread Aleksey Yeschenko
6 months seems reasonable to me as well.

There seems to be an agreement to halting 3.X on 3.10. I would also propose
we move on to 3.10.x bugfix only releases from now on, with all new feature
development moving to trunk from now on.

This should allow us to finally stabilise 3.X so that we can get all test
jobs to green.

-- 
AY

On 10 January 2017 at 16:36:43, Josh McKenzie (jmcken...@apache.org) wrote:

+1 to 6 months.  

On Tue, Jan 10, 2017 at 11:32 AM, Jonathan Ellis  wrote:  

> I agree that 6 month seems like a reasonable compromise.  
>  
> On Tue, Jan 10, 2017 at 10:31 AM, Blake Eggleston   
> wrote:  
>  
> > I agree that 3.10 should be the last tick-tock release, but I also agree  
> > with Jon that we shouldn't go back to yearly-ish releases.  
> >  
> > 6 months has come up several times now as a good cadence for feature  
> > releases, and I think it's a good compromise between the competing  
> > interests of long term support, regular release of features (to prevent  
> > piling on), and effort to release. So +1 to 6 month releases.  
> >  
> > On January 10, 2017 at 10:14:12 AM, Ariel Weisberg (ar...@weisberg.ws)  
> > wrote:  
> >  
> > Hi,  
> >  
> > With yearly releases trunk is going to be a mess when it comes time to  
> > cut a release. Cutting releases is when people start caring whether all  
> > the things in the release are in a finished state. It's when the state  
> > of CI finally becomes relevant.  
> >  
> > > If we wait a year we are going to accumulate a year's worth of unfinished
> > stuff in a single release. It's more expensive to context switch back  
> > and then address those issues. If we put out large unstable releases it  
> > means time until the features in the release are usable is pushed back  
> > even further since it takes another 6-12 months for the release to  
> > stabilize. Features introduced at the beginning of the cycle will have  
> > to wait 18-24 months before anyone can benefit from them.  
> >  
> > Is the biggest pain point with tick-tock just the elimination of long  
> > term support releases? What is the pain point around release frequency?  
> > Right now people should be using 3.0 unless they need a bleeding edge  
> > feature from 3.X and those people will have to give up something to get  
> > something.  
> >  
> > Ariel  
> >  
> > On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:  
> > > > I don't see why it has to be one extreme (yearly) or another (monthly).
> > > > When you had originally proposed Tick Tock, you wrote:
> > > >
> > > > "The primary goal is to improve release quality. Our current major “dot
> > > > zero” releases require another five or six months to make them stable
> > > > enough for production. This is directly related to how we pile features
> > > > in
> > > > for 9 to 12 months and release all at once. The interactions between the
> > > > new features are complex and not always obvious. 2.1 was no exception,
> > > > despite DataStax hiring a full-time test engineering team specifically for
> > > > Apache Cassandra."
> > >  
> > > I agreed with you at the time that the yearly cycle was too long to be  
> > > adding features before cutting a release, and still do now. Instead of  
> > > elastic banding all the way back to a process which wasn't working  
> > > before,  
> > > why not try somewhere in the middle? A release every 6 months (with  
> > > monthly bug fixes for a year) gives:  
> > >  
> > > 1. long enough time to stabilize (1 year vs 1 month)  
> > > 2. not so long things sit around untested forever  
> > > 3. only 2 releases (current and previous) to do bug fix support at any  
> > > given time.  
> > >  
> > > Jon  
> > >  
> > > On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis   
> > wrote:  
> > >  
> > > > Hi all,  
> > > >  
> > > > We’ve had a few threads now about the successes and failures of the  
> > > > tick-tock release process and what to do to replace it, but they all  
> > died  
> > > > out without reaching a robust consensus.  
> > > >  
> > > > In those threads we saw several reasonable options proposed, but from  
> > my  
> > > > perspective they all operated in a kind of theoretical fantasy land  
> of  
> > > > testing and development resources. In particular, it takes around a  
> > > > person-week of effort to verify that a release is ready. That is,  
> going  
> > > > through all the test suites, inspecting and re-running failing tests  
> > to see  
> > > > if there is a product problem or a flaky test.  
> > > >  
> > > > (I agree that in a perfect world this wouldn’t be necessary because  
> > your  
> > > > test ci is always green, but see my previous framing of the perfect  
> > world  
> > > > as a fantasy land. It’s also worth noting that this is a common  
> problem  
> > > > for large OSS projects, not necessarily something to beat ourselves  
> up  
> > > > over, but in any case, that's our 

Re: Wrapping up tick-tock

2017-01-10 Thread Josh McKenzie
+1 to 6 months.

On Tue, Jan 10, 2017 at 11:32 AM, Jonathan Ellis  wrote:

> I agree that 6 month seems like a reasonable compromise.
>
> On Tue, Jan 10, 2017 at 10:31 AM, Blake Eggleston 
> wrote:
>
> > I agree that 3.10 should be the last tick-tock release, but I also agree
> > with Jon that we shouldn't go back to yearly-ish releases.
> >
> > 6 months has come up several times now as a good cadence for feature
> > releases, and I think it's a good compromise between the competing
> > interests of long term support, regular release of features (to prevent
> > piling on), and effort to release. So +1 to 6 month releases.
> >
> > On January 10, 2017 at 10:14:12 AM, Ariel Weisberg (ar...@weisberg.ws)
> > wrote:
> >
> > Hi,
> >
> > With yearly releases trunk is going to be a mess when it comes time to
> > cut a release. Cutting releases is when people start caring whether all
> > the things in the release are in a finished state. It's when the state
> > of CI finally becomes relevant.
> >
> > > If we wait a year we are going to accumulate a year's worth of unfinished
> > stuff in a single release. It's more expensive to context switch back
> > and then address those issues. If we put out large unstable releases it
> > means time until the features in the release are usable is pushed back
> > even further since it takes another 6-12 months for the release to
> > stabilize. Features introduced at the beginning of the cycle will have
> > to wait 18-24 months before anyone can benefit from them.
> >
> > Is the biggest pain point with tick-tock just the elimination of long
> > term support releases? What is the pain point around release frequency?
> > Right now people should be using 3.0 unless they need a bleeding edge
> > feature from 3.X and those people will have to give up something to get
> > something.
> >
> > Ariel
> >
> > On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:
> > > > I don't see why it has to be one extreme (yearly) or another (monthly).
> > > > When you had originally proposed Tick Tock, you wrote:
> > > >
> > > > "The primary goal is to improve release quality. Our current major “dot
> > > > zero” releases require another five or six months to make them stable
> > > > enough for production. This is directly related to how we pile features
> > > > in
> > > > for 9 to 12 months and release all at once. The interactions between the
> > > > new features are complex and not always obvious. 2.1 was no exception,
> > > > despite DataStax hiring a full-time test engineering team specifically for
> > > > Apache Cassandra."
> > >
> > > I agreed with you at the time that the yearly cycle was too long to be
> > > adding features before cutting a release, and still do now. Instead of
> > > elastic banding all the way back to a process which wasn't working
> > > before,
> > > why not try somewhere in the middle? A release every 6 months (with
> > > monthly bug fixes for a year) gives:
> > >
> > > 1. long enough time to stabilize (1 year vs 1 month)
> > > 2. not so long things sit around untested forever
> > > 3. only 2 releases (current and previous) to do bug fix support at any
> > > given time.
> > >
> > > Jon
> > >
> > > On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > We’ve had a few threads now about the successes and failures of the
> > > > tick-tock release process and what to do to replace it, but they all
> > died
> > > > out without reaching a robust consensus.
> > > >
> > > > In those threads we saw several reasonable options proposed, but from
> > my
> > > > perspective they all operated in a kind of theoretical fantasy land
> of
> > > > testing and development resources. In particular, it takes around a
> > > > person-week of effort to verify that a release is ready. That is,
> going
> > > > through all the test suites, inspecting and re-running failing tests
> > to see
> > > > if there is a product problem or a flaky test.
> > > >
> > > > (I agree that in a perfect world this wouldn’t be necessary because
> > your
> > > > test ci is always green, but see my previous framing of the perfect
> > world
> > > > as a fantasy land. It’s also worth noting that this is a common
> problem
> > > > for large OSS projects, not necessarily something to beat ourselves
> up
> > > > over, but in any case, that's our reality right now.)
> > > >
> > > > I submit that any process that assumes a monthly release cadence is
> not
> > > > realistic from a resourcing standpoint for this validation. Notably,
> we
> > > > have struggled to marshal this for 3.10 for two months now.
> > > >
> > > > Therefore, I suggest first that we collectively roll up our sleeves
> to
> > vet
> > > > 3.10 as the last tick-tock release. Stick a fork in it, it’s done. No
> > > > more tick-tock.
> > > >
> > > > I further suggest that in place of tick tock we go back to our old
> > model of
> > > > yearly-ish releases with as-needed bug fix 

Re: Wrapping up tick-tock

2017-01-10 Thread Jonathan Ellis
I agree that 6 month seems like a reasonable compromise.

On Tue, Jan 10, 2017 at 10:31 AM, Blake Eggleston 
wrote:

> I agree that 3.10 should be the last tick-tock release, but I also agree
> with Jon that we shouldn't go back to yearly-ish releases.
>
> 6 months has come up several times now as a good cadence for feature
> releases, and I think it's a good compromise between the competing
> interests of long term support, regular release of features (to prevent
> piling on), and effort to release. So +1 to 6 month releases.
>
> On January 10, 2017 at 10:14:12 AM, Ariel Weisberg (ar...@weisberg.ws)
> wrote:
>
> Hi,
>
> With yearly releases trunk is going to be a mess when it comes time to
> cut a release. Cutting releases is when people start caring whether all
> the things in the release are in a finished state. It's when the state
> of CI finally becomes relevant.
>
> If we wait a year we are going to accumulate a year's worth of unfinished
> stuff in a single release. It's more expensive to context switch back
> and then address those issues. If we put out large unstable releases it
> means time until the features in the release are usable is pushed back
> even further since it takes another 6-12 months for the release to
> stabilize. Features introduced at the beginning of the cycle will have
> to wait 18-24 months before anyone can benefit from them.
>
> Is the biggest pain point with tick-tock just the elimination of long
> term support releases? What is the pain point around release frequency?
> Right now people should be using 3.0 unless they need a bleeding edge
> feature from 3.X and those people will have to give up something to get
> something.
>
> Ariel
>
> On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:
> > I don't see why it has to be one extreme (yearly) or another (monthly).
> > When you had originally proposed Tick Tock, you wrote:
> >
> > "The primary goal is to improve release quality. Our current major “dot
> > zero” releases require another five or six months to make them stable
> > enough for production. This is directly related to how we pile features
> > in for 9 to 12 months and release all at once. The interactions between the
> > new features are complex and not always obvious. 2.1 was no exception,
> > despite DataStax hiring a full time test engineering team specifically for
> > Apache Cassandra."
> >
> > I agreed with you at the time that the yearly cycle was too long to be
> > adding features before cutting a release, and still do now. Instead of
> > elastic banding all the way back to a process which wasn't working
> > before, why not try somewhere in the middle? A release every 6 months (with
> > monthly bug fixes for a year) gives:
> >
> > 1. long enough time to stabilize (1 year vs 1 month)
> > 2. not so long things sit around untested forever
> > 3. only 2 releases (current and previous) to do bug fix support at any
> > given time.
> >
> > Jon
> >
> > On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis wrote:
> >
> > > Hi all,
> > >
> > > We’ve had a few threads now about the successes and failures of the
> > > tick-tock release process and what to do to replace it, but they all died
> > > out without reaching a robust consensus.
> > >
> > > In those threads we saw several reasonable options proposed, but from my
> > > perspective they all operated in a kind of theoretical fantasy land of
> > > testing and development resources. In particular, it takes around a
> > > person-week of effort to verify that a release is ready. That is, going
> > > through all the test suites, inspecting and re-running failing tests to see
> > > if there is a product problem or a flaky test.
> > >
> > > (I agree that in a perfect world this wouldn’t be necessary because your
> > > test ci is always green, but see my previous framing of the perfect world
> > > as a fantasy land. It’s also worth noting that this is a common problem
> > > for large OSS projects, not necessarily something to beat ourselves up
> > > over, but in any case, that's our reality right now.)
> > >
> > > I submit that any process that assumes a monthly release cadence is not
> > > realistic from a resourcing standpoint for this validation. Notably, we
> > > have struggled to marshal this for 3.10 for two months now.
> > >
> > > Therefore, I suggest first that we collectively roll up our sleeves to vet
> > > 3.10 as the last tick-tock release. Stick a fork in it, it’s done. No
> > > more tick-tock.
> > >
> > > I further suggest that in place of tick tock we go back to our old model of
> > > yearly-ish releases with as-needed bug fix releases on stable branches,
> > > probably bi-monthly. This amortizes the release validation problem over a
> > > longer development period. And of course we remain free to ramp back up to
> > > the more rapid cadence envisioned by the other proposals if we increase our
> > > pool of QA effort or we are able to 

Re: Wrapping up tick-tock

2017-01-10 Thread Dave Brosius
The problem with long release cycles is that everything goes in, and you 
have potentially a mish-mash of features, some more done than others, 
causing instability. Quick releases attempt to fix this issue by keeping 
the number of commits down to a manageable size. The problem is that 
that commit list isn't necessarily cohesive and so you really haven't 
solved anything, most likely.


It seems to me, the answer is to move towards feature branches, without 
any concept of release branches. When a feature is truly done, and 
people and tests are happy, you merge it to a newly created release 
branch, and ship it, by itself. In that way the release branch doesn't 
have flotsam/jetsam in it.


This of course stresses the CI environment as now you are going to ask 
to build n branches at once, but not that big of a deal.


---


On 2017-01-10 11:13, Ariel Weisberg wrote:

Hi,

With yearly releases trunk is going to be a mess when it comes time to
cut a release. Cutting releases is when people start caring whether all
the things in the release are in a finished state. It's when the state
of CI finally becomes relevant.

If we wait a year we are going to accumulate a year's worth of unfinished
stuff in a single release. It's more expensive to context switch back
and then address those issues. If we put out large unstable releases it
means time until the features in the release are usable is pushed back
even further since it takes another 6-12 months for the release to
stabilize. Features introduced at the beginning of the cycle will have
to wait 18-24 months before anyone can benefit from them.

Is the biggest pain point with tick-tock just the elimination of long
term support releases? What is the pain point around release frequency?
Right now people should be using 3.0 unless they need a bleeding edge
feature from 3.X and those people will have to give up something to get
something.

Ariel

On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:
I don't see why it has to be one extreme (yearly) or another (monthly).
When you had originally proposed Tick Tock, you wrote:

"The primary goal is to improve release quality.  Our current major “dot
zero” releases require another five or six months to make them stable
enough for production.  This is directly related to how we pile features
in for 9 to 12 months and release all at once.  The interactions between
the new features are complex and not always obvious.  2.1 was no
exception, despite DataStax hiring a full time test engineering team
specifically for Apache Cassandra."

I agreed with you at the time that the yearly cycle was too long to be
adding features before cutting a release, and still do now.  Instead of
elastic banding all the way back to a process which wasn't working
before, why not try somewhere in the middle?  A release every 6 months
(with monthly bug fixes for a year) gives:

1. long enough time to stabilize (1 year vs 1 month)
2. not so long things sit around untested forever
3. only 2 releases (current and previous) to do bug fix support at any
given time.

Jon

On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis wrote:


> Hi all,
>
> We’ve had a few threads now about the successes and failures of the
> tick-tock release process and what to do to replace it, but they all died
> out without reaching a robust consensus.
>
> In those threads we saw several reasonable options proposed, but from my
> perspective they all operated in a kind of theoretical fantasy land of
> testing and development resources.  In particular, it takes around a
> person-week of effort to verify that a release is ready.  That is, going
> through all the test suites, inspecting and re-running failing tests to see
> if there is a product problem or a flaky test.
>
> (I agree that in a perfect world this wouldn’t be necessary because your
> test ci is always green, but see my previous framing of the perfect world
> as a fantasy land.  It’s also worth noting that this is a common problem
> for large OSS projects, not necessarily something to beat ourselves up
> over, but in any case, that's our reality right now.)
>
> I submit that any process that assumes a monthly release cadence is not
> realistic from a resourcing standpoint for this validation.  Notably, we
> have struggled to marshal this for 3.10 for two months now.
>
> Therefore, I suggest first that we collectively roll up our sleeves to vet
> 3.10 as the last tick-tock release.  Stick a fork in it, it’s done.  No
> more tick-tock.
>
> I further suggest that in place of tick tock we go back to our old model of
> yearly-ish releases with as-needed bug fix releases on stable branches,
> probably bi-monthly.  This amortizes the release validation problem over a
> longer development period.  And of course we remain free to ramp back up to
> the more rapid cadence envisioned by the other proposals if we increase our
> pool of QA effort or we are able to eliminate flakey tests to the 

Re: Wrapping up tick-tock

2017-01-10 Thread Blake Eggleston
I agree that 3.10 should be the last tick-tock release, but I also agree with 
Jon that we shouldn't go back to yearly-ish releases.

6 months has come up several times now as a good cadence for feature releases, 
and I think it's a good compromise between the competing interests of long term 
support, regular release of features (to prevent piling on), and effort to 
release. So +1 to 6 month releases.

On January 10, 2017 at 10:14:12 AM, Ariel Weisberg (ar...@weisberg.ws) wrote:

Hi,  

With yearly releases trunk is going to be a mess when it comes time to  
cut a release. Cutting releases is when people start caring whether all  
the things in the release are in a finished state. It's when the state  
of CI finally becomes relevant.  

If we wait a year we are going to accumulate a year's worth of unfinished  
stuff in a single release. It's more expensive to context switch back  
and then address those issues. If we put out large unstable releases it  
means time until the features in the release are usable is pushed back  
even further since it takes another 6-12 months for the release to  
stabilize. Features introduced at the beginning of the cycle will have  
to wait 18-24 months before anyone can benefit from them.  

Is the biggest pain point with tick-tock just the elimination of long  
term support releases? What is the pain point around release frequency?  
Right now people should be using 3.0 unless they need a bleeding edge  
feature from 3.X and those people will have to give up something to get  
something.  

Ariel  

On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:  
> I don't see why it has to be one extreme (yearly) or another (monthly).  
> When you had originally proposed Tick Tock, you wrote:  
>  
> "The primary goal is to improve release quality. Our current major “dot  
> zero” releases require another five or six months to make them stable  
> enough for production. This is directly related to how we pile features
> in for 9 to 12 months and release all at once. The interactions between the
> new features are complex and not always obvious. 2.1 was no exception,
> despite DataStax hiring a full time test engineering team specifically for
> Apache Cassandra."
>
> I agreed with you at the time that the yearly cycle was too long to be
> adding features before cutting a release, and still do now. Instead of
> elastic banding all the way back to a process which wasn't working
> before, why not try somewhere in the middle? A release every 6 months (with
> monthly bug fixes for a year) gives:  
>  
> 1. long enough time to stabilize (1 year vs 1 month)  
> 2. not so long things sit around untested forever  
> 3. only 2 releases (current and previous) to do bug fix support at any  
> given time.  
>  
> Jon  
>  
> On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis  wrote:  
>  
> > Hi all,  
> >  
> > We’ve had a few threads now about the successes and failures of the  
> > tick-tock release process and what to do to replace it, but they all died  
> > out without reaching a robust consensus.  
> >  
> > In those threads we saw several reasonable options proposed, but from my  
> > perspective they all operated in a kind of theoretical fantasy land of  
> > testing and development resources. In particular, it takes around a  
> > person-week of effort to verify that a release is ready. That is, going  
> > through all the test suites, inspecting and re-running failing tests to see
> > if there is a product problem or a flaky test.
> >
> > (I agree that in a perfect world this wouldn’t be necessary because your
> > test ci is always green, but see my previous framing of the perfect world
> > as a fantasy land. It’s also worth noting that this is a common problem
> > for large OSS projects, not necessarily something to beat ourselves up
> > over, but in any case, that's our reality right now.)
> >
> > I submit that any process that assumes a monthly release cadence is not
> > realistic from a resourcing standpoint for this validation. Notably, we
> > have struggled to marshal this for 3.10 for two months now.
> >
> > Therefore, I suggest first that we collectively roll up our sleeves to vet
> > 3.10 as the last tick-tock release. Stick a fork in it, it’s done. No
> > more tick-tock.
> >
> > I further suggest that in place of tick tock we go back to our old model of
> > yearly-ish releases with as-needed bug fix releases on stable branches,
> > probably bi-monthly. This amortizes the release validation problem over a
> > longer development period. And of course we remain free to ramp back up to
> > the more rapid cadence envisioned by the other proposals if we increase our
> > pool of QA effort or we are able to eliminate flakey tests to the point
> > that a long validation process becomes unnecessary.  
> >  
> > (While a longer dev period could mean a correspondingly more 

Re: Per blockng release on dtest

2017-01-10 Thread Ariel Weisberg
Hi,

I concede it would be fine to do it gradually. Once the pace of issues
introduced by new development is beaten by the pace at which they are
addressed I think things will go well.

Ariel

On Tue, Jan 10, 2017, at 11:17 AM, Josh McKenzie wrote:
> @ariel: you're letting the perfect be the enemy of the good here. We (as a
> project) have been releasing with a smattering of test failures and upgrade
> edge-cases back into perpetuity. While that doesn't make it ideal or
> justify continuing the behavior, getting a green testall + dtest for 3.10
> is a strong incremental improvement. Integrating other tests in the "block
> if not green" on subsequent releases is likewise an improvement.
> 
> I strongly advocate for incremental change in expectations of the
> community's behavior rather than a black-and-white, "this has to be perfect
> or we block" mentality.
> 
> Sankalp's proposal of us progressively tightening up our standards allows
> us to get code out the door and regain some lost momentum on the 3.10
> release failures and blocking, and gives us time as a community to adjust
> our behavior without the burden of an ever-later slipped release hanging
> over our heads. There's plenty of bugfixes in the 3.X line; the more time
> people can have to kick the tires on that code, the more things we can find
> and the better future releases will be.
> 
> 
> 
> 
> 
> On Tue, Jan 10, 2017 at 10:33 AM, Ariel Weisberg wrote:
> 
> > Hi,
> >
> > At least some of those failures are real. I don't think we should
> > release 3.10 until the real failures are addressed. As I said earlier
> > one of them is a wrong answer bug that is not going to be fixed in 3.10.
> >
> > Can we just ignore failures because we think they don't mean anything?
> > Who is going to check which of the 60 failures is real?
> >
> > These tests were passing just fine at the beginning of December and then
> > commits happened and now the tests are failing. That is exactly what
> > they're for. They are good tests. I don't think it matters if the failures
> > are "real" today because those are valid tests and they don't test
> > anything if they fail for spurious reasons. They are a critical part of
> > the Cassandra infrastructure as much as the storage engine or network
> > code.
> >
> > In my opinion the tests need to be fixed and people need to fix them as
> > they break them and we need to figure out how to get from people
> > breaking them and it going unnoticed to they break it and then fix it in
> > a time frame that fits the release schedule.
> >
> > My personal opinion is that releases are a reward for finishing the job.
> > Releasing without finishing the job creates the wrong incentive
> > structure for the community. If you break something you are no longer
> > the person that blocked the release you are just one of several people
> > breaking things without consequence.
> >
> > I think that rapid feedback and triaging combined with releases blocked
> > by the stuff individual contributors have broken is the way to more
> > consistent releases both schedule wise and quality wise.
> >
> > Regarding delaying 3.10? Who exactly is the consumer that is chomping at
> > the bit to get another release? One that doesn't reliably upgrade from a
> > previous version?
> >
> > Ariel
> >
> > On Tue, Jan 10, 2017, at 08:13 AM, Josh McKenzie wrote:
> > > First, I think we need to clarify if we're blocking on just testall +
> > > dtest or blocking on *all test jobs*.
> > >
> > > If the latter, upgrade tests are the elephant in the room:
> > > http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest_upgrade/lastCompletedBuild/testReport/
> > >
> > > Do we have confidence that the reported failures are all test problems
> > > and not w/Cassandra itself? If so, is that documented somewhere?
> > >
> > > On Mon, Jan 9, 2017 at 7:33 PM, Nate McCall wrote:
> > >
> > > > I'm not sure I understand the culmination of the past couple of threads on
> > > > this.
> > > >
> > > > With a situation like:
> > > > http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/
> > > >
> > > > We have some sense of stability on what might be flaky tests(?).
> > > > Again, I'm not sure what our criteria is specifically.
> > > >
> > > > Basically, it feels like we are in a stalemate right now. How do we
> > > > move forward?
> > > >
> > > > -Nate
> > > >
> >


Re: Per blockng release on dtest

2017-01-10 Thread Aleksey Yeschenko
If they aren’t regressions from 3.9, we should still push 3.10 out.
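
Checking that amounts to a set difference between the failing tests of the
two releases. A minimal Python sketch, with invented test names standing in
for the real CI reports (the function and the lists are illustrative only):

```python
def regressions(failing_now, failing_before):
    """Tests that fail in the new release but did not fail in the previous one."""
    return sorted(set(failing_now) - set(failing_before))

# Hypothetical failure lists; real ones would come from the CI test reports.
failing_3_9 = {"upgrade_test_a", "paging_test_b"}
failing_3_10 = {"upgrade_test_a", "paging_test_b", "repair_test_c"}

# Only failures that are new in 3.10 would count as regressions from 3.9.
new_failures = regressions(failing_3_10, failing_3_9)
```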

The branch has accumulated a lot of fixes, for problems that *are* real.
Just have a look at CHANGES.txt.

By holding 3.10 you are denying those (arguably few, but still) users fixes
for bugs that we know are in.

It’s been more than 3 months now, delaying it further is unreasonable. The
branch needs to be uncorked.

I would also prefer that people who -1, in particular bindingly, were
prepared to go and fix the offending tests, if they are blocking the vote on
the ground of tests. Can’t expect test failures to magically go away all by
themselves.

-- 
AY

On 10 January 2017 at 15:33:45, Ariel Weisberg (ar...@weisberg.ws) wrote:

Hi,  

At least some of those failures are real. I don't think we should  
release 3.10 until the real failures are addressed. As I said earlier  
one of them is a wrong answer bug that is not going to be fixed in 3.10.  

Can we just ignore failures because we think they don't mean anything?  
Who is going to check which of the 60 failures is real?  

These tests were passing just fine at the beginning of December and then  
commits happened and now the tests are failing. That is exactly what  
they're for. They are good tests. I don't think it matters if the failures  
are "real" today because those are valid tests and they don't test  
anything if they fail for spurious reasons. They are a critical part of  
the Cassandra infrastructure as much as the storage engine or network  
code.  

In my opinion the tests need to be fixed and people need to fix them as  
they break them and we need to figure out how to get from people  
breaking them and it going unnoticed to they break it and then fix it in  
a time frame that fits the release schedule.  

My personal opinion is that releases are a reward for finishing the job.  
Releasing without finishing the job creates the wrong incentive  
structure for the community. If you break something you are no longer  
the person that blocked the release you are just one of several people  
breaking things without consequence.  

I think that rapid feedback and triaging combined with releases blocked  
by the stuff individual contributors have broken is the way to more  
consistent releases both schedule wise and quality wise.  

Regarding delaying 3.10? Who exactly is the consumer that is chomping at  
the bit to get another release? One that doesn't reliably upgrade from a  
previous version?  

Ariel  

On Tue, Jan 10, 2017, at 08:13 AM, Josh McKenzie wrote:  
> First, I think we need to clarify if we're blocking on just testall +  
> dtest  
> or blocking on *all test jobs*.  
>  
> If the latter, upgrade tests are the elephant in the room:  
> http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest_upgrade/lastCompletedBuild/testReport/
>   
>  
> Do we have confidence that the reported failures are all test problems  
> and  
> not w/Cassandra itself? If so, is that documented somewhere?  
>  
> On Mon, Jan 9, 2017 at 7:33 PM, Nate McCall  wrote:  
>  
> > I'm not sure I understand the culmination of the past couple of threads on  
> > this.  
> >  
> > With a situation like:  
> > http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/
> >  
> > We have some sense of stability on what might be flaky tests(?).  
> > Again, I'm not sure what our criteria is specifically.  
> >  
> > Basically, it feels like we are in a stalemate right now. How do we  
> > move forward?  
> >  
> > -Nate  
> >  


Re: Per blockng release on dtest

2017-01-10 Thread Josh McKenzie
@ariel: you're letting the perfect be the enemy of the good here. We (as a
project) have been releasing with a smattering of test failures and upgrade
edge-cases back into perpetuity. While that doesn't make it ideal or
justify continuing the behavior, getting a green testall + dtest for 3.10
is a strong incremental improvement. Integrating other tests in the "block
if not green" on subsequent releases is likewise an improvement.

I strongly advocate for incremental change in expectations of the
community's behavior rather than a black-and-white, "this has to be perfect
or we block" mentality.

Sankalp's proposal of us progressively tightening up our standards allows
us to get code out the door and regain some lost momentum on the 3.10
release failures and blocking, and gives us time as a community to adjust
our behavior without the burden of an ever-later slipped release hanging
over our heads. There's plenty of bugfixes in the 3.X line; the more time
people can have to kick the tires on that code, the more things we can find
and the better future releases will be.
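
One way to picture the progressive tightening, as a Python sketch. The job
names are taken loosely from the cassci job URLs in this thread; the gate
table and the function are hypothetical, not an existing tool.

```python
def release_blockers(job_status, required_jobs):
    """Return the required jobs that are not green, sorted by name.

    A job missing from job_status counts as not green.
    """
    return sorted(job for job in required_jobs if job_status.get(job) != "green")

# Hypothetical gates: each later release blocks on more suites than the last.
GATES = {
    "3.10": {"testall", "dtest"},
    "next": {"testall", "dtest", "dtest_upgrade"},
}

# Illustrative CI state, not real results.
status = {"testall": "green", "dtest": "green", "dtest_upgrade": "red"}

blockers_3_10 = release_blockers(status, GATES["3.10"])
blockers_next = release_blockers(status, GATES["next"])
```

With that state 3.10 would ship, while the next release would block on the
upgrade suite until it is green.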





On Tue, Jan 10, 2017 at 10:33 AM, Ariel Weisberg  wrote:

> Hi,
>
> At least some of those failures are real. I don't think we should
> release 3.10 until the real failures are addressed. As I said earlier
> one of them is a wrong answer bug that is not going to be fixed in 3.10.
>
> Can we just ignore failures because we think they don't mean anything?
> Who is going to check which of the 60 failures is real?
>
> These tests were passing just fine at the beginning of December and then
> commits happened and now the tests are failing. That is exactly what
> they're for. They are good tests. I don't think it matters if the failures
> are "real" today because those are valid tests and they don't test
> anything if they fail for spurious reasons. They are a critical part of
> the Cassandra infrastructure as much as the storage engine or network
> code.
>
> In my opinion the tests need to be fixed and people need to fix them as
> they break them and we need to figure out how to get from people
> breaking them and it going unnoticed to they break it and then fix it in
> a time frame that fits the release schedule.
>
> My personal opinion is that releases are a reward for finishing the job.
> Releasing without finishing the job creates the wrong incentive
> structure for the community. If you break something you are no longer
> the person that blocked the release you are just one of several people
> breaking things without consequence.
>
> I think that rapid feedback and triaging combined with releases blocked
> by the stuff individual contributors have broken is the way to more
> consistent releases both schedule wise and quality wise.
>
> Regarding delaying 3.10? Who exactly is the consumer that is chomping at
> the bit to get another release? One that doesn't reliably upgrade from a
> previous version?
>
> Ariel
>
> On Tue, Jan 10, 2017, at 08:13 AM, Josh McKenzie wrote:
> > First, I think we need to clarify if we're blocking on just testall +
> > dtest
> > or blocking on *all test jobs*.
> >
> > If the latter, upgrade tests are the elephant in the room:
> > http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest_upgrade/lastCompletedBuild/testReport/
> >
> > Do we have confidence that the reported failures are all test problems
> > and
> > not w/Cassandra itself? If so, is that documented somewhere?
> >
> > On Mon, Jan 9, 2017 at 7:33 PM, Nate McCall  wrote:
> >
> > > I'm not sure I understand the culmination of the past couple of threads on
> > > this.
> > >
> > > With a situation like:
> > > http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/
> > >
> > > We have some sense of stability on what might be flaky tests(?).
> > > Again, I'm not sure what our criteria is specifically.
> > >
> > > Basically, it feels like we are in a stalemate right now. How do we
> > > move forward?
> > >
> > > -Nate
> > >
>


Re: Wrapping up tick-tock

2017-01-10 Thread Ariel Weisberg
Hi,

With yearly releases trunk is going to be a mess when it comes time to
cut a release. Cutting releases is when people start caring whether all
the things in the release are in a finished state. It's when the state
of CI finally becomes relevant.

If we wait a year we are going to accumulate a year's worth of unfinished
stuff in a single release. It's more expensive to context switch back
and then address those issues. If we put out large unstable releases it
means time until the features in the release are usable is pushed back
even further since it takes another 6-12 months for the release to
stabilize. Features introduced at the beginning of the cycle will have
to wait 18-24 months before anyone can benefit from them.
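
A rough sketch of that arithmetic, in Python. The function name and its
parameters are made up for illustration; the 12 month cycle and the 6-12
month stabilization window are the figures from the paragraph above.

```python
def months_until_usable(cycle_months, stabilization_months, merged_at_month=0):
    """Months from merging a feature until a stable release contains it.

    cycle_months: length of the release cycle
    stabilization_months: time after release before it is production-ready
    merged_at_month: month of the cycle in which the feature landed (0 = start)
    """
    if not 0 <= merged_at_month <= cycle_months:
        raise ValueError("merged_at_month must fall within the cycle")
    wait_for_release = cycle_months - merged_at_month
    return wait_for_release + stabilization_months

# Feature merged at the start of a yearly cycle.
yearly_best = months_until_usable(12, 6)
yearly_worst = months_until_usable(12, 12)

# Same feature under a 6 month cycle, assuming stabilization takes just as long.
half_year_best = months_until_usable(6, 6)
half_year_worst = months_until_usable(6, 12)
```

Under those assumptions the yearly cycle gives 18-24 months and the 6 month
cycle gives 12-18 months.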

Is the biggest pain point with tick-tock just the elimination of long
term support releases? What is the pain point around release frequency?
Right now people should be using 3.0 unless they need a bleeding edge
feature from 3.X and those people will have to give up something to get
something.

Ariel

On Tue, Jan 10, 2017, at 10:29 AM, Jonathan Haddad wrote:
> I don't see why it has to be one extreme (yearly) or another (monthly).
> When you had originally proposed Tick Tock, you wrote:
> 
> "The primary goal is to improve release quality.  Our current major “dot
> zero” releases require another five or six months to make them stable
> enough for production.  This is directly related to how we pile features
> in for 9 to 12 months and release all at once.  The interactions between the
> new features are complex and not always obvious.  2.1 was no exception,
> despite DataStax hiring a full time test engineering team specifically for
> Apache Cassandra."
> 
> I agreed with you at the time that the yearly cycle was too long to be
> adding features before cutting a release, and still do now.  Instead of
> elastic banding all the way back to a process which wasn't working
> before, why not try somewhere in the middle?  A release every 6 months (with
> monthly bug fixes for a year) gives:
> 
> 1. long enough time to stabilize (1 year vs 1 month)
> 2. not so long things sit around untested forever
> 3. only 2 releases (current and previous) to do bug fix support at any
> given time.
> 
> Jon
> 
> On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis  wrote:
> 
> > Hi all,
> >
> > We’ve had a few threads now about the successes and failures of the
> > tick-tock release process and what to do to replace it, but they all died
> > out without reaching a robust consensus.
> >
> > In those threads we saw several reasonable options proposed, but from my
> > perspective they all operated in a kind of theoretical fantasy land of
> > testing and development resources.  In particular, it takes around a
> > person-week of effort to verify that a release is ready.  That is, going
> > through all the test suites, inspecting and re-running failing tests to see
> > if there is a product problem or a flaky test.
> >
> > (I agree that in a perfect world this wouldn’t be necessary because your
> > test ci is always green, but see my previous framing of the perfect world
> > as a fantasy land.  It’s also worth noting that this is a common problem
> > for large OSS projects, not necessarily something to beat ourselves up
> > over, but in any case, that's our reality right now.)
> >
> > I submit that any process that assumes a monthly release cadence is not
> > realistic from a resourcing standpoint for this validation.  Notably, we
> > have struggled to marshal this for 3.10 for two months now.
> >
> > Therefore, I suggest first that we collectively roll up our sleeves to vet
> > 3.10 as the last tick-tock release.  Stick a fork in it, it’s done.  No
> > more tick-tock.
> >
> > I further suggest that in place of tick tock we go back to our old model of
> > yearly-ish releases with as-needed bug fix releases on stable branches,
> > probably bi-monthly.  This amortizes the release validation problem over a
> > longer development period.  And of course we remain free to ramp back up to
> > the more rapid cadence envisioned by the other proposals if we increase our
> > pool of QA effort or we are able to eliminate flakey tests to the point
> > that a long validation process becomes unnecessary.
> >
> > (While a longer dev period could mean a correspondingly more painful test
> > validation process at the end, my experience is that most of the validation
> > cost is “fixed” in the form of flaky tests and thus does not increase
> > proportionally to development time.)
> >
> > Thoughts?
> >
> > --
> > Jonathan Ellis
> > co-founder, http://www.datastax.com
> > @spyced
> >


Re: Per blockng release on dtest

2017-01-10 Thread Ariel Weisberg
Hi,

At least some of those failures are real. I don't think we should
release 3.10 until the real failures are addressed. As I said earlier
one of them is a wrong answer bug that is not going to be fixed in 3.10.

Can we just ignore failures because we think they don't mean anything?
Who is going to check which of the 60 failures is real?

These tests were passing just fine at the beginning of December and then
commits happened and now the tests are failing. That is exactly what
they're for. They are good tests. I don't think it matters if the failures
are "real" today because those are valid tests and they don't test
anything if they fail for spurious reasons. They are a critical part of
the Cassandra infrastructure as much as the storage engine or network
code.

In my opinion the tests need to be fixed, and people need to fix them as
they break them. We need to figure out how to get from people breaking
tests without anyone noticing to people breaking a test and then fixing it
in a time frame that fits the release schedule.

My personal opinion is that releases are a reward for finishing the job.
Releasing without finishing the job creates the wrong incentive
structure for the community. If you break something, you are no longer
the person who blocked the release; you are just one of several people
breaking things without consequence.

I think that rapid feedback and triaging combined with releases blocked
by the stuff individual contributors have broken is the way to more
consistent releases both schedule wise and quality wise.

Regarding delaying 3.10? Who exactly is the consumer that is chomping at
the bit to get another release? One that doesn't reliably upgrade from a
previous version?
 
Ariel

On Tue, Jan 10, 2017, at 08:13 AM, Josh McKenzie wrote:
> First, I think we need to clarify if we're blocking on just testall + dtest
> or blocking on *all test jobs*.
> 
> If the latter, upgrade tests are the elephant in the room:
> http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest_upgrade/lastCompletedBuild/testReport/
> 
> Do we have confidence that the reported failures are all test problems and
> not w/Cassandra itself? If so, is that documented somewhere?
> 
> On Mon, Jan 9, 2017 at 7:33 PM, Nate McCall  wrote:
> 
> > I'm not sure I understand the culmination of the past couple of threads on
> > this.
> >
> > With a situation like:
> > http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/
> >
> > We have some sense of stability on what might be flaky tests(?).
> > Again, I'm not sure what our criteria are specifically.
> >
> > Basically, it feels like we are in a stalemate right now. How do we
> > move forward?
> >
> > -Nate
> >


Re: Wrapping up tick-tock

2017-01-10 Thread Jonathan Haddad
I don't see why it has to be one extreme (yearly) or another (monthly).
When you had originally proposed Tick Tock, you wrote:

"The primary goal is to improve release quality.  Our current major “dot
zero” releases require another five or six months to make them stable
enough for production.  This is directly related to how we pile features in
for 9 to 12 months and release all at once.  The interactions between the
new features are complex and not always obvious.  2.1 was no exception,
despite DataStax hiring a full time test engineering team specifically for
Apache Cassandra."

I agreed with you at the time that the yearly cycle was too long to be
adding features before cutting a release, and still do now.  Instead of
elastic banding all the way back to a process which wasn't working before,
why not try somewhere in the middle?  A release every 6 months (with
monthly bug fixes for a year) gives:

1. long enough time to stabilize (1 year vs 1 month)
2. not so long things sit around untested forever
3. only 2 releases (current and previous) to do bug fix support at any
given time.
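
To make point 3 concrete, here is a rough sketch of the arithmetic. It
assumes a release ships every 6 months starting at month 0 and that each
release gets bug fixes for 12 months from its ship date. The function name
and parameters are made up for illustration and are not part of any release
tooling.

```python
def supported_releases(month, cadence_months=6, support_months=12):
    """Return the ship months of releases still getting bug fixes at `month`.

    Assumption: releases ship at months 0, cadence, 2*cadence, ... and each
    one is supported for `support_months` counted from its ship month.
    """
    shipped = range(0, month + 1, cadence_months)
    return [r for r in shipped if month - r < support_months]


# Walk through a few points in time and show which releases need support.
for m in (0, 6, 12, 17, 18):
    print(m, supported_releases(m))
```

Under these assumptions the list never holds more than two entries: by the
time a third release ships, the oldest one has just aged out of its 12 month
window.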

Jon

On Tue, Jan 10, 2017 at 6:56 AM Jonathan Ellis  wrote:

> Hi all,
>
> We’ve had a few threads now about the successes and failures of the
> tick-tock release process and what to do to replace it, but they all died
> out without reaching a robust consensus.
>
> In those threads we saw several reasonable options proposed, but from my
> perspective they all operated in a kind of theoretical fantasy land of
> testing and development resources.  In particular, it takes around a
> person-week of effort to verify that a release is ready.  That is, going
> through all the test suites, inspecting and re-running failing tests to see
> if there is a product problem or a flaky test.
>
> (I agree that in a perfect world this wouldn’t be necessary because your
> test ci is always green, but see my previous framing of the perfect world
> as a fantasy land.  It’s also worth noting that this is a common problem
> for large OSS projects, not necessarily something to beat ourselves up
> over, but in any case, that's our reality right now.)
>
> I submit that any process that assumes a monthly release cadence is not
> realistic from a resourcing standpoint for this validation.  Notably, we
> have struggled to marshal this for 3.10 for two months now.
>
> Therefore, I suggest first that we collectively roll up our sleeves to vet
> 3.10 as the last tick-tock release.  Stick a fork in it, it’s done.  No
> more tick-tock.
>
> I further suggest that in place of tick tock we go back to our old model of
> yearly-ish releases with as-needed bug fix releases on stable branches,
> probably bi-monthly.  This amortizes the release validation problem over a
> longer development period.  And of course we remain free to ramp back up to
> the more rapid cadence envisioned by the other proposals if we increase our
> pool of QA effort or we are able to eliminate flakey tests to the point
> that a long validation process becomes unnecessary.
>
> (While a longer dev period could mean a correspondingly more painful test
> validation process at the end, my experience is that most of the validation
> cost is “fixed” in the form of flaky tests and thus does not increase
> proportionally to development time.)
>
> Thoughts?
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
>


Wrapping up tick-tock

2017-01-10 Thread Jonathan Ellis
Hi all,

We’ve had a few threads now about the successes and failures of the
tick-tock release process and what to do to replace it, but they all died
out without reaching a robust consensus.

In those threads we saw several reasonable options proposed, but from my
perspective they all operated in a kind of theoretical fantasy land of
testing and development resources.  In particular, it takes around a
person-week of effort to verify that a release is ready.  That is, going
through all the test suites, inspecting and re-running failing tests to see
if there is a product problem or a flaky test.

(I agree that in a perfect world this wouldn’t be necessary because your
test ci is always green, but see my previous framing of the perfect world
as a fantasy land.  It’s also worth noting that this is a common problem
for large OSS projects, not necessarily something to beat ourselves up
over, but in any case, that's our reality right now.)

I submit that any process that assumes a monthly release cadence is not
realistic from a resourcing standpoint for this validation.  Notably, we
have struggled to marshal this for 3.10 for two months now.

Therefore, I suggest first that we collectively roll up our sleeves to vet
3.10 as the last tick-tock release.  Stick a fork in it, it’s done.  No
more tick-tock.

I further suggest that in place of tick tock we go back to our old model of
yearly-ish releases with as-needed bug fix releases on stable branches,
probably bi-monthly.  This amortizes the release validation problem over a
longer development period.  And of course we remain free to ramp back up to
the more rapid cadence envisioned by the other proposals if we increase our
pool of QA effort or we are able to eliminate flakey tests to the point
that a long validation process becomes unnecessary.

(While a longer dev period could mean a correspondingly more painful test
validation process at the end, my experience is that most of the validation
cost is “fixed” in the form of flaky tests and thus does not increase
proportionally to development time.)

Thoughts?

-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced


Re: Per blockng release on dtest

2017-01-10 Thread sankalp kohli
I think we should start by blocking the 3.10 release on testall + dtest.
After 3.10, we can start blocking each subsequent release on the other jobs
as well. This will make sure we make progress and don't cause 3.10 to sit
for a long time. Thoughts?

On Tue, Jan 10, 2017 at 5:13 AM, Josh McKenzie  wrote:

> First, I think we need to clarify if we're blocking on just testall + dtest
> or blocking on *all test jobs*.
>
> If the latter, upgrade tests are the elephant in the room:
> http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest_upgrade/lastCompletedBuild/testReport/
>
> Do we have confidence that the reported failures are all test problems and
> not w/Cassandra itself? If so, is that documented somewhere?
>
> On Mon, Jan 9, 2017 at 7:33 PM, Nate McCall  wrote:
>
> > I'm not sure I understand the culmination of the past couple of threads on
> > this.
> >
> > With a situation like:
> > http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/
> >
> > We have some sense of stability on what might be flaky tests(?).
> > Again, I'm not sure what our criteria are specifically.
> >
> > Basically, it feels like we are in a stalemate right now. How do we
> > move forward?
> >
> > -Nate
> >
>


Re: Per blockng release on dtest

2017-01-10 Thread Josh McKenzie
First, I think we need to clarify if we're blocking on just testall + dtest
or blocking on *all test jobs*.

If the latter, upgrade tests are the elephant in the room:
http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest_upgrade/lastCompletedBuild/testReport/

Do we have confidence that the reported failures are all test problems and
not w/Cassandra itself? If so, is that documented somewhere?

On Mon, Jan 9, 2017 at 7:33 PM, Nate McCall  wrote:

> I'm not sure I understand the culmination of the past couple of threads on
> this.
>
> With a situation like:
> http://cassci.datastax.com/view/cassandra-3.11/job/cassandra-3.11_dtest/lastCompletedBuild/testReport/
>
> We have some sense of stability on what might be flaky tests(?).
> Again, I'm not sure what our criteria are specifically.
>
> Basically, it feels like we are in a stalemate right now. How do we
> move forward?
>
> -Nate
>


Re: Rollback procedure for Cassandra Upgrade.

2017-01-10 Thread Jeremy Hanna
See the comment thread on https://issues.apache.org/jira/browse/CASSANDRA-8928
(add downgradesstables).

> On Jan 10, 2017, at 5:00 AM, Jonathan Haddad  wrote:
> 
> There's no downgrade procedure. You either upgrade or you go back to a
> snapshot from the previous version.
> On Mon, Jan 9, 2017 at 8:13 PM Prakash Chauhan 
> wrote:
> 
>> Hi All ,
>> 
>> Do we have an official procedure to rollback the upgrade of C* from 2.0.x
>> to 2.1.x ?
>> 
>> 
>> Description:
>> I have upgraded C* from 2.0.x to 2.1.x. As part of the upgrade procedure,
>> I have to run nodetool upgradesstables.
>> What if the command fails in the middle? Some of the sstables will be in the
>> newer format (*-ka-*) whereas others might be in the older format (*-jb-*).
>> 
>> Do we have a standard procedure to do rollback in such cases?
>> 
>> 
>> 
>> Regards,
>> Prakash Chauhan.
>> 
>> 



Re: Rollback procedure for Cassandra Upgrade.

2017-01-10 Thread Romain Hardouin
To be able to downgrade, we should be able to pin both the commitlog and sstable
versions, e.g. -Dcassandra.commitlog_version=3 -Dcassandra.sstable_version=jb.
That would be awesome because it would decouple the binary version from the data
version. Upgrades would be much less risky, so I guess that adoption of new C*
versions would increase.

Best,
Romain

On Tuesday, January 10, 2017 at 6:03, Brandon Williams wrote:

However, it's good to determine *how* it failed.  If nodetool just died or
timed out, that's no big deal, it'll finish.
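
One rough way to see how far an interrupted upgradesstables run got is to
count the Data.db files per format version. The sketch below rests on a few
assumptions: that 2.0/2.1 data files are named
<keyspace>-<table>-<version>-<generation>-Data.db, that anything under a
snapshots or backups directory should be ignored, and that temporary files
left by an interrupted rewrite carry "tmp" in their name.
count_sstable_versions is a made-up helper, not part of nodetool.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed 2.0/2.1 naming: <keyspace>-<table>-<version>-<generation>-Data.db
DATA_FILE = re.compile(r"-(?P<version>[a-z]{2})-(?P<generation>\d+)-Data\.db$")


def count_sstable_versions(data_dir):
    """Count live Data.db files under data_dir by sstable format version."""
    root = Path(data_dir)
    counts = Counter()
    for path in root.rglob("*-Data.db"):
        parts = path.relative_to(root).parts
        # Snapshots and incremental backups hold copies of older sstables.
        if "snapshots" in parts or "backups" in parts:
            continue
        # Assumption: in-progress files are marked with "tmp" in the name.
        if "-tmp-" in path.name or path.name.startswith("tmp"):
            continue
        match = DATA_FILE.search(path.name)
        if match:
            counts[match.group("version")] += 1
    return counts
```

Calling count_sstable_versions on the node's data directory and seeing only
"ka" would suggest the rewrite completed; a mix of "jb" and "ka" would
suggest it stopped part way and can be run again.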

On Mon, Jan 9, 2017 at 11:00 PM, Jonathan Haddad  wrote:

> There's no downgrade procedure. You either upgrade or you go back to a
> snapshot from the previous version.
> On Mon, Jan 9, 2017 at 8:13 PM Prakash Chauhan <
> prakash.chau...@ericsson.com>
> wrote:
>
> > Hi All ,
> >
> > Do we have an official procedure to rollback the upgrade of C* from 2.0.x
> > to 2.1.x ?
> >
> >
> > Description:
> > I have upgraded C* from 2.0.x to 2.1.x. As part of the upgrade procedure,
> > I have to run nodetool upgradesstables.
> > What if the command fails in the middle? Some of the sstables will be in the
> > newer format (*-ka-*) whereas others might be in the older format (*-jb-*).
> >
> > Do we have a standard procedure to do rollback in such cases?
> >
> >
> >
> > Regards,
> > Prakash Chauhan.
> >
> >
>