On 9/6/07, Tom White <[EMAIL PROTECTED]> wrote:
> > Yeah, I actually read all of the wiki and your article about using
> > Hadoop on EC2/S3 and I can't really find a reference to the S3 support
> > not being for "regular" S3 keys. Did I miss something or should I
> > update the wiki to make it more clear (or both)?
>
> I don't think this is explained clearly enough, so please do update
> the wiki. Thanks.
I just updated the page to add a Notes section explaining the issue and
referencing the JIRA issue # you mentioned earlier.

> > Also, the instructions on the EC2 page on the wiki no longer work, in
> > that due to the kind of NAT Amazon is using, the slaves can't connect
> > to the master using an externally-resolved IP address via a DNS name.
> > What I mean is, if you set DNS to the external IP of your master
> > instance, your slaves can resolve that address but cannot then connect
> > to it. So, I had to alter the launch-hadoop-cluster and start-hadoop
> > scripts and merge them to just pick the master and use its EC2-given
> > name as the $MASTER_HOST to make it work.
>
> This sounds like the problem fixed in
> https://issues.apache.org/jira/browse/HADOOP-1638 in 0.14.0, which is
> the version you're using, isn't it?
>
> Are you able to do 'bin/hadoop-ec2 launch-cluster' then (on your workstation)
>
> . bin/hadoop-ec2-env.sh
> ssh $SSH_OPTS "[EMAIL PROTECTED]" "sed -i -e \
>     \"s/$MASTER_HOST/\$(hostname)/g\" \
>     /usr/local/hadoop-$HADOOP_VERSION/conf/hadoop-site.xml"
>
> and then check to see if the master host has been set correctly (to
> the internal IP) in the master host's hadoop-site.xml.

Well, no, since my $MASTER_HOST is now just the external DNS name of the
first instance started in the reservation, but this is performed as part
of my launch-hadoop-cluster script. In any case, that value is not set to
the internal IP, but rather to the hostname portion of the internal DNS
name.

Currently, my MR jobs are failing because the reducers can't copy the map
output, and I'm thinking it might be because some kind of external address
is getting in there somehow. I see connections to external IPs (72.*
addresses) in netstat -tan. Any ideas about that? In the hadoop-site.xml's
on the slaves, the address is the external DNS name of the master (ec2-*),
but that resolves to the internal 10/8 address like it should.

> Also, what version of the EC2 tools are you using?
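For anyone following along, the substitution Tom's snippet performs can be tried locally. This is a minimal sketch, assuming hypothetical hostnames and a throwaway copy of hadoop-site.xml (the real script runs the sed over ssh on the master, where $(hostname) yields the internal name):

```shell
# Hypothetical names for illustration only:
MASTER_EXTERNAL="ec2-72-44-33-22.z-1.amazonaws.com"   # external DNS name ($MASTER_HOST)
MASTER_INTERNAL="domU-12-31-35-00-35-F2"              # what $(hostname) would print on the master

# Throwaway stand-in for /usr/local/hadoop-$HADOOP_VERSION/conf/hadoop-site.xml
SITE=$(mktemp)
cat > "$SITE" <<EOF
<property>
  <name>fs.default.name</name>
  <value>${MASTER_EXTERNAL}:50001</value>
</property>
EOF

# Replace every occurrence of the external name with the internal one, in place:
sed -i -e "s/${MASTER_EXTERNAL}/${MASTER_INTERNAL}/g" "$SITE"
grep "$MASTER_INTERNAL" "$SITE"
```

After the sed, the value element should name the internal host, which is what lets the slaves connect through Amazon's NAT.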
black:~/code/hadoop-0.14.0/src/contrib/ec2> ec2-version
1.2-11797 2007-03-01
black:~/code/hadoop-0.14.0/src/contrib/ec2>

> > I also updated the scripts
> > to only look for a given AMI ID and only start/manage/terminate
> > instances of that AMI ID (since I have others I'd rather not have
> > terminated just on the basis of their AMI launch index ;-)).
>
> Instances are terminated on the basis of their AMI ID since 0.14.0.
> See https://issues.apache.org/jira/browse/HADOOP-1504.

I felt this was unsafe as it was, since it looked up an image by name and
then reversed that to the AMI ID. I just hacked it so you have to put the
AMI ID in hadoop-ec2-env.sh directly. Also, the script as it is right now
doesn't grep for 'running', so it may potentially shut down some instances
starting up in another cluster. I may just be paranoid, however ;)

--
Toby DiPasquale
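The 'running' filter mentioned above could be sketched like this. The AMI ID and the sample ec2-describe-instances lines here are made-up assumptions standing in for real tool output, just to show the shape of the filter:

```shell
AMI_IMAGE="ami-12345678"   # hypothetical AMI ID, as it would appear in hadoop-ec2-env.sh

# Stand-in for `ec2-describe-instances` INSTANCE lines (assumed field layout:
# INSTANCE <instance-id> <ami-id> <public-dns> <private-dns> <state> ...):
DESCRIBE_OUTPUT='INSTANCE i-aaaa1111 ami-12345678 ec2-a.amazonaws.com dom-a running
INSTANCE i-bbbb2222 ami-12345678 ec2-b.amazonaws.com dom-b pending
INSTANCE i-cccc3333 ami-87654321 ec2-c.amazonaws.com dom-c running'

# Keep only instance IDs whose AMI matches AND whose state is "running",
# so pending instances from another cluster launch are left alone:
TO_TERMINATE=$(echo "$DESCRIBE_OUTPUT" | \
  awk -v ami="$AMI_IMAGE" '$1 == "INSTANCE" && $3 == ami && $6 == "running" {print $2}')
echo "$TO_TERMINATE"
# ...and only then: ec2-terminate-instances $TO_TERMINATE
```

With the sample data above, only i-aaaa1111 survives the filter; the pending instance of the same AMI and the running instance of the other AMI are both excluded.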
