Hi all,

I've been throwing this idea around to a few people individually over the last week, but wanted to raise it on the list for a wider discussion:

I'd like to propose inserting a "development release series" in between 0.20 and our next release (the one that will have full durability, significant master fixup, etc).

In terms of what this release series would be:

1) Frequent Releases. We should try to do a release once a month between now and the next major release.

2) Low Amount of QA. We should not make any guarantees that the releases have been heavily tested. They should certainly pass unit tests and basic functional tests (e.g., PE and other cluster smoke tests), but we would not go through multi-week rc cycles.

3) Clearly communicated "Development" status. We do not want new users on this release unless they are aware they are trying what is essentially a trunk snapshot. We should feel free to have serious known bugs in these development releases and not address them until the next release.

4) Given the "development" status of this build, I think we should drop the "no breaking changes between point releases" restriction. That way we're fine to make RPC/API fixups as the series progresses.

5) Normal policies regarding license cleanliness, release voting, etc. Kind of goes without saying, but we still need to be careful to maintain proper licensing, etc, and not let anything slip in that would be Apache 2 incompatible, etc.

Now, the reasons I am proposing this:

1) We have a vibrant user/contributor community which has a diverse set of workloads. The core developer community has a smaller (but growing) set of workloads. There have been a few cases where a release has worked great for the workloads of the core team, but after a release we've found a serious bug that only affects some feature or access pattern that hasn't been sufficiently tested during development. Doing frequent development releases gives reasonable milestones for users to try upcoming releases for their workloads and give feedback if there has been a regression.
Two particular examples of the above are the ExplicitColumnTracker infinite loop bug in 0.20.4, and the META caching regression in 0.20.3.

2) Given the recent switch to Maven, I think we will find that the first post-Maven release will bring new difficulties we haven't worked through in the past. We should make sure we know how to release off the new build system, and that all of the various shell scripts and integration points still work. Doing some "practice" releases before the real thing will allow us to start tackling these problems early.

3) It's been quite some time since the last major release. Doing a fast progression of a few development releases is a good way to highlight the recent surge of HBase development and build anticipation for what's coming.

4) On a selfish note, I would like to attempt to include a "beta" release of durable HBase packaged in CDH3 in June. We will certainly not be ready for the next major release at that point. I can certainly pick an svn revision and build my own release off that, but if the version number in CDH lines up with a version number of this development series, I think it will present less user confusion, etc.

I am also happy to volunteer myself to release manage at least the first of this series. Though I do not have committer access, the Apache rules generally allow anyone to propose a release tarball, so long as it's voted on by the PMC.

Thanks
-Todd

--
Todd Lipcon
Software Engineer, Cloudera