On Dec 23, 2010, at 12:47 PM, Andrew Purtell wrote:

> I'm on the HBase PMC.
>
>> We will end up with Apache+Security release vs
>> Apache+Append release vs Apache+Avatar release,
>
> The current situation is pretty close to this.
Agreed, and I would like to make it better.

> HBase has no suitable binary ASF Hadoop release to work against, currently.
> Vanilla version 0.20 does not have sync/append support. We recommend users
> adopt Cloudera's CDH3 beta 2, or compile the 0.20-append branch from source.
> Version 0.21 is marked as unstable, was not tested at scale by Yahoo (unlike
> 0.20), and has been panned by many would-be adopters, if the various tweets
> and blog posts I have seen in that regard are any indication.

I'd like to make two points here:

1. There is no substitute for your own QA team. You can never rely on a single company to do your testing for you. While it was great that Yahoo tested the initial releases, you can see from their own distribution that what they were/are running is different from what other people are running. People should not blindly trust that just because company X claims to be running something, it will work for them, and we cannot rely on an individual or a single company to provide that service going forward. Communities don't work that way, and reliance on a single company to provide your core infrastructure gratis isn't going to end well either. That said, we are very lucky that Yahoo has chosen to contribute as openly as they have, and I look forward to their contributions and participation, and those of other large installations, going forward.

2. Hadoop is only one piece of the puzzle for most installations. One of the other issues with 0.21 (and with future releases going forward) is that third parties did not port/upgrade their software to run against our new APIs. Without major software like HBase, Pig, and Hive able to run on the platform, major installations won't even bother looking at it. I don't expect people to immediately upgrade to 0.22 when we release it.
I expect it will take a good 3-6 months until people have the software they run available on it, and possibly a point release with some of the problems people have found in their own testing fixed in our software and others'. Like I said, I don't mind getting 0.20.3 released with the append/sync patch applied to it (along with the other 20 or so patches), but I don't think the Hadoop team is large enough to support all the different releases as-is, let alone another one.

--Ian

>> That's why I think we should go to 0.22 ASAP and get
>> companies to build their new features on trunk against
>> that.
>
> If Hadoop 0.22 is not vetted at high scale as 0.20 was -- this is the current
> situation with 0.21 -- then I fear the current situation will not change and
> HBase will still refer would-be users to a non-ASF release or a
> source-only branch.
>
> Best regards,
>
>   - Andy
>
> Problems worthy of attack prove their worth by hitting back.
>   - Piet Hein (via Tom White)
>
>
> --- On Wed, 12/22/10, Ian Holsman <[email protected]> wrote:
>
>> From: Ian Holsman <[email protected]>
>> Subject: Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of
>> branch-0.20-append branch?
>> To: [email protected]
>> Date: Wednesday, December 22, 2010, 5:03 PM
>>
>> On Dec 23, 2010, at 11:33 AM, Stack wrote:
>>
>>> On Wed, Dec 22, 2010 at 4:05 PM, Ian Holsman <[email protected]> wrote:
>>>> There are already 5 Hadoop 20.x releases out
>>>> there, I don't think there is a need for another. (personal
>>>> opinion, not a veto or speaking as the chair)
>>>>
>>>
>>> Are you counting other than Apache releases? (I see only 4 here, two
>>> of which probably should be removed:
>>> http://www.gtlib.gatech.edu/pub/apache//hadoop/core/.)
>>
>> Yes, I was referring to the external companies who have
>> decided to release their own version, for their own business
>> purposes. (Please don't take that as a negative.)
>>
>>>
>>>> Is there a reason why we couldn't create a hadoop
>>>> 0.20.3 release that has this patch inside of it, as well as
>>>> other fixes that have been applied since 0.20.2 (~26
>>>> patches)? Would this be too much effort for you to RM?
>>>>
>>>
>>> I'd like that but my sense is the general populace of hadoopers would
>>> think the append/sync suite of patches destabilizing -- append/sync
>>> has a long 'history' in hadoop -- and a violation of the general
>>> principle that bug fixes only are added on a branch.
>>
>> I'm open to adding it, as lack of append/sync could be
>> seen as a bug by some. (Yes, I'm playing with words.)
>>>
>>>
>>>> I really don't want to come to a^h^h^h^hget out of
>>>> the situation where we have multiple releases of 0.20, each
>>>> with a unique feature.
>>>>
>>>
>>> Sure. The notion has been broached before up on these lists -- e.g.
>>> there was talk of a 0.20 Apache release that had security in it -- and
>>> at the time folks seemed amenable.
>>
>> I think that approach encourages groups of
>> individuals/companies to huddle up together to build large
>> features without taking the larger group into account, and
>> then 'drop' the feature off and wait for others to thank
>> them and port it to their releases. We then become
>> multiple communities instead of a single one.
>>
>> We will end up with Apache+Security release vs
>> Apache+Append release vs Apache+Avatar release, with various
>> bug fixes sprinkled into each.
>> And I'm not sure which release Pig or HBase would target to
>> develop against.
>>
>> That's why I think we should go to 0.22 ASAP and get
>> companies to build their new features on trunk against
>> that.
>>
>>>
>>> Thanks for getting the discussion off the ground,
>>> St.Ack
