Some patches that improve throughput for HBase, although you also need an HBase-side patch (HBASE-2467). They also backported stuff from 0.21 that's never going to be in 0.20-append. Those are our main reasons to use CDH3b2.
J-D

On Tue, Sep 28, 2010 at 10:02 AM, Renato Marroquín Mogrovejo
<[email protected]> wrote:
> Just a quick question that often intrigues me: why do you guys prefer
> CDH3b2 and not a regular hadoop-0.20.X?
> Thanks in advance.
>
>
> Renato M.
>
> 2010/9/28 Jean-Daniel Cryans <[email protected]>
>
>> > Will upgrading to 0.89 be a PITA?
>>
>> Unless you still use the deprecated APIs, it's actually just a matter
>> of replacing the distribution and restarting.
>>
>> >
>> > Should we expect to be able to upgrade the servers without losing data?
>>
>> Definitely, since no upgrade of the filesystem format is required. But
>> it's always good practice to back up your data before any upgrade.
>>
>> >
>> > Will there be tons of client code changes?
>>
>> See first answer.
>>
>> >
>> > What about configuration changes (especially little changes that will
>> bite
>> > us)?
>>
>> When we upgraded we only added dfs.support.append set to true.
>>
>> >
>> > Do we need/want to upgrade hadoop at all (we're on 0.20.2)?
>>
>> If you want data durability (e.g. no data loss), yes.
>>
>> >
>> > If we do upgrade, what is the recommended package to get it from?
>>
>> We use CDH3b2's hadoop, I'd recommend that, but you can also use the
>> head of the Hadoop 0.20-append branch. We have our own HBase "distro"
>> that we publish here http://github.com/stumbleupon/hbase, this is
>> what's currently in production. It's not much different from
>> 0.89.20100924 (which is still RC1), mainly fixes and improvements for
>> cluster replication that will eventually make it to core HBase, and
>> some neat changes to the ThriftServer to enable async ICVs.
>>
>> J-D
>>
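[Editor's sketch, not part of the original thread: the single configuration change J-D mentions, dfs.support.append set to true, would typically go into hdfs-site.xml on each node. The property name comes from the thread; the file layout is standard Hadoop XML configuration, and the description text is an assumption added for clarity.]

```xml
<!-- hdfs-site.xml: minimal sketch of the one setting mentioned in the
     thread. Requires an append-capable Hadoop build such as CDH3b2 or
     the head of branch-0.20-append; plain 0.20.2 ignores it. -->
<configuration>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
    <description>Enable HDFS append/sync so HBase can durably persist
    its write-ahead logs (avoids data loss on region server failure).
    </description>
  </property>
</configuration>
```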
