On 2/9/10 2:40 PM, Massoud Mazar wrote:
Since the most active HIVE users are (should be) in this list, I wanted
to ask your opinion about using the bare hadoop/hive distribution vs.
Cloudera’s distribution. What are the pros and cons of each?
Thanks
Generally, the Cloudera distros tend to receive patches for bugs /
features their customers have hit / requested faster than the mainline
release cycle. They come nicely packaged as rpms / debs for you and make
use of the alternatives system for configuration management (which is
nice for coordinated configuration changes across a cluster). The ASF
distro is what's blessed by the core Hive committers. The Cloudera distro
only has a few patches applied, and I don't remember what those patches
address. Ultimately, all patches from Cloudera (I think) are submitted
back to the ASF. Whether they get included is left to the ASF
committers, of course.
I've had great luck with the Cloudera distros of Hadoop and Hive.
In full disclosure, I do some training for Cloudera.
Hope this helps.
--
Eric Sammer
[email protected]
http://esammer.blogspot.com