I've concluded that the contrib/ec2 scripts from hadoop 0.17 are incompatible with the prebuilt hadoop-0.16.1 image currently available in the hadoop-ec2-images bucket (ami-461df82f, hadoop-ec2-images/hadoop-0.16.1.manifest.xml, to be precise).
The problem is that the user-data passed in by 0.17 has a different format than the hadoop-init script packaged with 0.16.1 expects. Specifically, 0.17's user-data is meant to be a comma-delimited list of bash variable settings of the form KEY=VAL, whereas 0.16.x seems to expect just a comma-delimited list of values whose keys are known by their ordinal placement (that is, the first value is the number of instances, the second value is the name of the master node).

So now I'm back to the idea that I'm going to have to build myself an ec2 AMI with hadoop 0.17 from "scratch" (using the create-instance scripts of course).

This isn't /too/ much more work than I'd have to do anyway. I plan on running hbase on my cluster as well as python hadoop-streaming jobs, which will require other libraries (like SQLAlchemy and Thrift). These items were going to necessitate creating my own images anyway :/

-- Jim

On Wed, May 7, 2008 at 3:51 PM, Jim R. Wilson <[EMAIL PROTECTED]> wrote:
> > keep the questions coming. will be glad to see HBase running on ec2, maybe
> > we can put your changes back into the tree.
>
> Thanks :)
>
> Just prior to this excursion, I found some things needing tweaking to
> work with my hosting provider. Now that I'm moving to ec2, I'll be on
> the lookout for similar issues. I'll submit any patches I end up
> making.
>
> -- Jim
>
> On Wed, May 7, 2008 at 3:39 PM, Chris K Wensel <[EMAIL PROTECTED]> wrote:
> >
> > > In the aforementioned EC2 wiki page, it has a configuration section
> > > labeled "(Pre 0.17) Hadoop cluster variables (GROUP, MASTER_HOST,
> > > NO_INSTANCES)". I had assumed that I would actually need hadoop-0.17
> > > on my ec2 instances in order to forgo those instructions. It sounds
> > > like you're telling me that all the "pre 0.17" or "0.17" specific
> > > instructions in the wiki page refer only to the ec2 creation scripts,
> > > not the actual running version in the cluster, is that correct?
> >
> > correct.
> >
> > the only reason these scripts are in the 0.17 branch is because they are
> > not backward compatible with themselves.
> >
> > > Also, how safe is it to run a different version of the ec2 scripts
> > > from the actual running hadoop instance? I'm guessing it's pretty
> > > safe since you suggested it :)
> >
> > there are no dependencies between the EC2 scripts and Hadoop core. you can
> > use them with any version, as long as you build EC2 Images for the versions
> > of Hadoop you are after with the 'new' scripts.
> >
> > > Thanks again for all the help - still wrapping my mind around this stuff.
> >
> > keep the questions coming. will be glad to see HBase running on ec2, maybe
> > we can put your changes back into the tree.
> >
> > Chris K Wensel
> > [EMAIL PROTECTED]
> > http://chris.wensel.net/
> > http://www.cascading.org/
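
P.S. To make the user-data mismatch at the top of this message concrete, here is a rough Python sketch of the two formats as I understand them. It is only an illustration, not the actual hadoop-init logic: the key names NO_INSTANCES and MASTER_HOST are borrowed from the wiki's "pre 0.17" variable list as stand-ins, and both function names and the host name are made up for this example.

```python
# Illustrative only: these key names are assumptions, not taken from the
# real hadoop-init scripts.
POSITIONAL_KEYS = ["NO_INSTANCES", "MASTER_HOST"]


def parse_keyed(user_data):
    """0.17-style user-data: comma-delimited KEY=VAL settings."""
    settings = {}
    for item in user_data.split(","):
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError("expected KEY=VAL, got %r" % item)
        settings[key.strip()] = value.strip()
    return settings


def parse_positional(user_data):
    """0.16.x-style user-data: comma-delimited values, keys implied by position."""
    values = [v.strip() for v in user_data.split(",")]
    return dict(zip(POSITIONAL_KEYS, values))


# Hypothetical user-data strings carrying the same information in each format.
new_style = "NO_INSTANCES=2,MASTER_HOST=master.example.com"
old_style = "2,master.example.com"

# Each parser recovers the same settings from its own format.
matched_new = parse_keyed(new_style)
matched_old = parse_positional(old_style)

# Feeding 0.17-style data to a positional reader keeps the "KEY=" prefix
# inside each value, which is the kind of breakage described above.
mismatched = parse_positional(new_style)
```

With a positional reader, the instance count comes back as the string "NO_INSTANCES=2" rather than "2", so anything downstream that expects a number or a bare host name would fail.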
