I think we definitely need better release quality; I just looked at 1.0.3 and 1.0.4. I am willing to test release candidates and report my findings back on the mailing list. Hopefully more folks can do the same to make the testing more comprehensive, and folks with binding votes can take the test results into account before they vote.
Bill

On Tue, Nov 29, 2011 at 10:14 PM, Joe Stein <crypt...@gmail.com> wrote:
> I need at least a week, maybe two to promote anything to staging which is
> mainly because we do weekly releases. I could introduce a 2 day turn
> around but only with a more fixed type schedule. I am running 0.8.6 in
> production and REALLY want to upgrade for nothing more than getting
> compression ( the cost of petabytes of uncompressed data is just stupid ).
> So however I can help in changing my process OR better understanding the
> PMC here I am game for.
>
> One thing I use C* for is holding days worth of data and re-running those
> days for regression on our software... simulating production... It might
> not take much to reverse it.
>
> /*
> Joe Stein
> http://www.medialets.com
> Twitter: @allthingshadoop
> */
>
> On Nov 29, 2011, at 10:04 PM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
>
> > On Tue, Nov 29, 2011 at 6:16 PM, Jeremy Hanna
> > <jeremy.hanna1...@gmail.com> wrote:
> >
> >> I'd like to start a discussion about ideas to improve release quality
> >> for Cassandra. Specifically I wonder if the community can do more to
> >> help the project as a whole become more solid. Cassandra has an active
> >> and vibrant community using Cassandra for a variety of things. If we
> >> all pitch in a little bit, it seems like we can make a difference here.
> >>
> >> Release quality is difficult, especially for a distributed system like
> >> Cassandra. The core devs have done an amazing job with this considering
> >> how complicated it is. Currently, there are several things in place to
> >> make sure that a release is generally usable:
> >> - review-then-commit
> >> - 72 hour voting period
> >> - at least 3 binding +1 votes
> >> - unit tests
> >> - integration tests
> >> Then there is the personal responsibility aspect - testing a release in
> >> a staging environment before pushing it to production.
> >>
> >> I wonder if more could be done here to give more confidence in releases.
> >> I wanted to see if there might be ways that the community could help out
> >> without being too burdensome on either the core devs or the community.
> >>
> >> Some ideas:
> >> More automation: run YCSB and stress with various setups. Maybe people
> >> can rotate donating cloud instances (or simply money for them) but have
> >> a common set of scripts to do this in the source.
> >>
> >> Dedicated distributed test suite: I know there has been work done on
> >> various distributed test suites (which is great!) but none have really
> >> caught on so far.
> >>
> >> I know what the apache guidelines say, but what if the community could
> >> help out with the testing effort in a more formal way. For example, for
> >> each release to be finalized, what if there needed to be 3 community
> >> members that needed to try it out in their own environment?
> >>
> >> What if there was a post release +1 vote for the community to sign off
> >> on - sort of a "works for me" kind of thing to reassure others that it's
> >> safe to try. So when the release email gets posted to the user list,
> >> start a tradition of people saying +1 in reply if they've tested it out
> >> and it works for them. That's happening informally now when there are
> >> problems, but it might be nice to see a vote of confidence. Just another
> >> idea.
> >>
> >> Any other ideas or variations?
> >
> >
> > I am no software engineering guru, but whenever I +1 a hive release I
> > actually do checkout the code and run a couple queries. Mostly I find
> > that because there is just so many things not unit testable like those
> > gosh darn bash scripts that launch Java applications. There have been
> > times when even after multiple patch revisions and passing unit tests
> > something just does not work in the real world.
> > So I never +1 a binary release I don't spend an hour with, and if
> > possible I try twisting the knobs on any new feature or at least just
> > try the basics. Hive is aiming for something like quarterly releases.
> >
> > So possibly better to have Cassandra do time based releases. It does not
> > have to be quarterly but if people want bleeding edge features (something
> > committed 2 days ago) really they should go out and build something from
> > trunk.
> >
> > It seems like Cassandra devs have the voting and releasing down to a
> > science but from my world the types of bugs I worry about are data file
> > corruption, and any weird bug that would result in data faults like
> > read_repair not working or writes not going to the right nodes, or bloom
> > filters giving a faulty result. New features are great and I love seeing
> > them but I can wait for those.
> >
> > Updates now, even trivial ones, get political; you just never want to be
> > the guy that champions an update and then not have it go well :)
> >
> > Most users of Cassandra are going to have large clusters and really the
> > project should not outstrip the common user's ability to stay up to date.
> > You have to figure that a large cluster like 20 nodes with maybe 200GB
> > data/node, doing a rolling restart without degrading performance is going
> > to take some time. This is more than 'yum update cassandra' and
> > '/etc/init.d/cassandra restart', and with risk of something going wrong
> > people need time to QA and time for ops. This type of person does not
> > like to fall many releases behind and likewise can not be updating too
> > often either.
> >
> > I have never had to roll back a release but I do wait usually for a month
> > before running one to make sure there is not another one following soon.