I almost always want to pick the exact release of what gets installed. I have been using whirr.hadoop.tarball.url (and similar properties) to specify versions, with the side effect that I also specify where they are downloaded from. Obtaining the latest of everything might work fine for a cluster built for interactive, exploratory use, but I want Whirr to help me spawn something that I have previously characterized, so that its performance is predictable.

I thought that using:

whirr.hadoop.tarball.url=http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u1.tar.gz

would ensure that I get cdh3u1, but since the release of hadoop-0.20.2-cdh3u2.tar.gz on Oct. 20, I see many cdh3u2 components installed, as shown by the names of the jar files. Besides being untested for my purposes, this also breaks some of my post-processing scripts for setting up Hive + HBase.
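
To make the mismatch easy to reproduce, a check along the following lines could list the jars whose CDH tag differs from the one I asked for. This is only a sketch: the function names and the tag pattern are my own assumptions, not anything provided by Whirr, and the directory passed to scan_dir is whatever lib directory the node actually uses.

```python
import re
from pathlib import Path

# Assumed naming pattern: CDH jars carry a tag such as "cdh3u1" in the file name.
CDH_TAG = re.compile(r"cdh\d+u\d+")


def mismatched_jars(jar_names, expected="cdh3u1"):
    """Return the jar names whose CDH tag differs from `expected`.

    Jars without any CDH tag are ignored, since they say nothing
    about which CDH update was installed.
    """
    bad = []
    for name in jar_names:
        match = CDH_TAG.search(name)
        if match and match.group(0) != expected:
            bad.append(name)
    return bad


def scan_dir(lib_dir, expected="cdh3u1"):
    """Walk `lib_dir` (hypothetical path) and report mismatched jar files."""
    names = sorted(p.name for p in Path(lib_dir).rglob("*.jar"))
    return mismatched_jars(names, expected)
```

Running scan_dir on a node's Hadoop or HBase lib directory and getting a non-empty list would confirm that something other than the pinned tarball was installed.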

Is the above property insufficient for specifying exact releases? What else should I do or not do?


Paul
