Re: Discussion: release quality

Joe Stein Tue, 29 Nov 2011 19:15:35 -0800

I need at least a week, maybe two, to promote anything to staging, mainly 
because we do weekly releases.  I could introduce a 2-day turnaround, but only 
with a more fixed schedule.  I am running 0.8.6 in production and REALLY want 
to upgrade for nothing more than getting compression (the cost of petabytes of 
uncompressed data is just stupid).  So however I can help, whether by changing 
my process or by better understanding the PMC here, I am game.


One thing I use C* for is holding days' worth of data and re-running those days 
for regression on our software... simulating production... It might not take 
much to reverse it.

/*
Joe Stein
http://www.medialets.com
Twitter: @allthingshadoop
*/

On Nov 29, 2011, at 10:04 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Tue, Nov 29, 2011 at 6:16 PM, Jeremy Hanna 
> <jeremy.hanna1...@gmail.com> wrote:
> 
>> I'd like to start a discussion about ideas to improve release quality for
>> Cassandra.  Specifically I wonder if the community can do more to help the
>> project as a whole become more solid.  Cassandra has an active and vibrant
>> community using Cassandra for a variety of things.  If we all pitch in a
>> little bit, it seems like we can make a difference here.
>> 
>> Release quality is difficult, especially for a distributed system like
>> Cassandra.  The core devs have done an amazing job with this considering
>> how complicated it is.  Currently, there are several things in place to
>> make sure that a release is generally usable:
>> - review-then-commit
>> - 72 hour voting period
>> - at least 3 binding +1 votes
>> - unit tests
>> - integration tests
>> Then there is the personal responsibility aspect - testing a release in a
>> staging environment before pushing it to production.
>> 
>> I wonder if more could be done here to give more confidence in releases.
>> I wanted to see if there might be ways that the community could help out
>> without being too burdensome on either the core devs or the community.
>> 
>> Some ideas:
>> More automation: run YCSB and stress with various setups.  Maybe people
>> can rotate donating cloud instances (or simply money for them) but have a
>> common set of scripts to do this in the source.
>> 
>> Dedicated distributed test suite: I know there has been work done on
>> various distributed test suites (which is great!) but none have really
>> caught on so far.
>> 
>> I know what the Apache guidelines say, but what if the community could
>> help out with the testing effort in a more formal way?  For example, for
>> each release to be finalized, what if 3 community members needed to try
>> it out in their own environment?
>> 
>> What if there was a post release +1 vote for the community to sign off on
>> - sort of a "works for me" kind of thing to reassure others that it's safe
>> to try.  So when the release email gets posted to the user list, start a
>> tradition of people saying +1 in reply if they've tested it out and it
>> works for them.  That's happening informally now when there are problems,
>> but it might be nice to see a vote of confidence.  Just another idea.
>> 
>> Any other ideas or variations?
> 
> 
> I am no software engineering guru, but whenever I +1 a Hive release I
> actually do check out the code and run a couple of queries. Mostly I do that
> because there are just so many things that are not unit testable, like those
> gosh darn bash scripts that launch Java applications. There have been times
> when, even after multiple patch revisions and passing unit tests, something
> just does not work in the real world. So I never +1 a binary release I don't
> spend an hour with, and if possible I try twisting the knobs on any new
> feature or at least just trying the basics. Hive is aiming for something like
> quarterly releases.
> 
> So it might be better to have Cassandra do time-based releases. It does not
> have to be quarterly, but if people want bleeding-edge features (something
> committed 2 days ago), they should really go out and build from trunk.
> 
> It seems like Cassandra devs have the voting and releasing down to a
> science, but from my world the types of bugs I worry about are data file
> corruption and any weird bug that would result in data faults, like
> read_repair not working, writes not going to the right nodes, or bloom
> filters giving a faulty result. New features are great and I love seeing
> them, but I can wait for those.
> 
> Updates now, even trivial ones, get political; you just never want to be the
> guy that champions an update and then not have it go well :)
> 
> Most users of Cassandra are going to have large clusters, and really the
> project should not outstrip the common user's ability to stay up to date.
> You have to figure that for a large cluster, like 20 nodes with maybe 200 GB
> data/node, doing a rolling restart without degrading performance is going
> to take some time. This is more than 'yum update cassandra' and
> '/etc/init.d/cassandra restart', and with the risk of something going wrong
> people need time to QA and time for ops. This type of person does not like
> to fall many releases behind and likewise cannot be updating too often
> either.
> 
> I have never had to roll back a release, but I usually wait a month
> before running one to make sure there is not another one following soon.
