Unfortunately the picture is a bit more confusing.

Yahoo!'s Hadoop team has been spun out as HortonWorks. Their stated goal is 
not to ship their own derivative release but to sell commercial support for 
the official Apache release.

So those selling commercial support are:
* Cloudera
* HortonWorks
* MapRTech
* EMC (reselling MapRTech, but had announced their own)
* IBM (not sure what they are selling exactly... still seems like smoke and 
  mirrors...)
* DataStax

So while you can use the Apache release, it may not make sense for your 
organization to do so. (Said as I don the flame retardant suit...)

The issue is that, apart from HortonWorks, which states that it will 
support the official Apache release, everything else is a derivative work of 
Apache Hadoop. From what I have seen, Cloudera's release is the closest to 
the Apache release.

Like I said, things are getting interesting.

HTH

-Mike



> From: ev...@yahoo-inc.com
> To: common-user@hadoop.apache.org
> Date: Fri, 15 Jul 2011 07:35:45 -0700
> Subject: Re: Which release to use?
> 
> Adarsh,
> 
> Yahoo! no longer has its own distribution of Hadoop.  It has been merged into 
> the 0.20.2XX line so 0.20.203 is what Yahoo is running internally right now, 
> and we are moving towards 0.20.204 which should be out soon.  I am not an 
> expert on Cloudera so I cannot really map its releases to the Apache 
> Releases, but their distro is based off of Apache Hadoop with a few bug fixes 
> and maybe a few features like append added in on top of it, but you need to 
> talk to Cloudera about the exact details.  For the most part they are all 
> very similar.  You need to think most about support; there are several 
> companies that can sell you support if you want/need it.  You also need to 
> think about features vs. stability.  The 0.20.203 release has been tested on 
> a lot of machines by many different groups, but may be missing some features 
> that are needed in some situations.
> 
> --Bobby
> 
> 
> On 7/14/11 11:49 PM, "Adarsh Sharma" <adarsh.sha...@orkash.com> wrote:
> 
> Hadoop releases are issued from time to time. But one more thing related to
> Hadoop usage:
> 
> There are many providers that distribute Hadoop:
> 
> 1. Apache Hadoop
> 2. Cloudera
> 3. Yahoo
> 
> etc.
> Which distribution is best among them for production use?
> I think Cloudera's is the best among them.
> 
> 
> Best Regards,
> Adarsh
> Owen O'Malley wrote:
> > On Jul 14, 2011, at 4:33 PM, Teruhiko Kurosaka wrote:
> >
> >
> >> I'm a newbie and I am confused by the Hadoop releases.
> >> I thought 0.21.0 is the latest & greatest release that I
> >> should be using but I noticed 0.20.203 has been released
> >> lately, and 0.21.X is marked "unstable, unsupported".
> >>
> >> Should I be using 0.20.203?
> >>
> >
> > Yes, I apologize for the confusing release numbering, but the best release 
> > to use is 0.20.203.0. It includes security, job limits, and many other 
> > improvements over 0.20.2 and 0.21.0. Unfortunately, it doesn't have the new 
> > sync support so it isn't suitable for using with HBase. Most large clusters 
> > use a separate version of HDFS for HBase.
> >
> > -- Owen
> >
> >
> 
> 
                                          
