On Wed, Oct 30, 2013 at 2:23 PM, Gabriel Reid <[email protected]> wrote:

> Inlined below.
>
> On Wed, Oct 30, 2013 at 6:02 PM, Josh Wills <[email protected]> wrote:
> >
> > That said, the code changes that we'll need to make to get Crunch
> > working against the 0.96 APIs are different enough from the 0.94
> > APIs that I feel like maintaining some sort of compatibility layer
> > in our code will be pretty ugly.
>
> Yep, that's definitely something we don't want to do unless we've got
> a good reason to do it.
>
> > I'm thinking along these lines:
> >
> > 1) Release 0.8.0 in the next couple of days against our current set
> >    of dependencies (Hadoop and HBase).
> > 2) Upgrade the Hadoop 2 dependency to Hadoop 2.2.0, which will also
> >    require us to upgrade to protocol buffers 2.5.0 in the build. I've
> >    already done this and verified that everything works.
> > 3) Switch the HBase code to the 0.96 APIs, without trying to maintain
> >    backwards compatibility with 0.94, and get everything working.
> > 4) Do the 0.9.0 release with Hadoop 2 and HBase 0.96 as the default.
> >
> > I imagine that there will still be bugfixes against 0.8.0 (both core
> > and HBase) that will mean that we'll need to do 0.8.1, 0.8.2, etc.
> > releases to support, and I'm happy to keep those up at a regular
> > cadence.
>
> This works for me, but I wish we had a better idea of what the
> adoption of HBase 0.96 will be. I'm guessing it'll be pretty high, as
> people who are just using the normal client APIs have a less
> troublesome migration path than those working with MapReduce. On the
> other hand, it would be a bummer to shut out all the 0.94.x users if
> there isn't major adoption of 0.96 right away.
>
> Anyhow, like I said, I'm personally fine with just supporting 0.96 as
> I don't think it'll be a problem for me.

Yeah, it's a hard balance to strike. I fully expect that we will have 0.8.1, 0.8.2, etc. releases to bring some of the fixes we do in trunk to the HBase 0.94-based Crunch, which will still be the major version for a while.
The HBase folks consider 0.96 the future and the best version to use w/Hadoop 2.2.0, so I'd like to pay whatever cost we have to pay in terms of APIs and dependency changes all at once instead of piecemeal.

> - Gabriel

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
