> I just updated the page to add a Notes section explaining the issue
> and referencing the JIRA issue # you mentioned earlier.
Great - thanks.

> > Are you able to do 'bin/hadoop-ec2 launch-cluster' then (on your
> > workstation)
> >
> > . bin/hadoop-ec2-env.sh
> > ssh $SSH_OPTS "[EMAIL PROTECTED]" "sed -i -e
> > \"s/$MASTER_HOST/\$(hostname)/g\"
> > /usr/local/hadoop-$HADOOP_VERSION/conf/hadoop-site.xml"
> >
> > and then check to see if the master host has been set correctly (to
> > the internal IP) in the master host's hadoop-site.xml.
>
> Well, no, since my $MASTER_HOST is now just the external DNS name of
> the first instance started in the reservation, but this is performed
> as part of my launch-hadoop-cluster script. In any case, that value is
> not set to the internal IP, but rather to the hostname portion of the
> internal DNS name.

This is a bit of a mystery to me - I'll try to reproduce it on my
workstation.

> Currently, my MR jobs are failing because the reducers can't copy the
> map output, and I'm thinking it might be because there is some kind of
> external address getting in there somehow. I see connections to
> external IPs in netstat -tan (72.* addresses). Any ideas about that?
> In the hadoop-site.xml's on the slaves, the address is the external
> DNS name of the master (ec2-*), but that resolves to the internal 10/8
> address like it should.
>
> > Also, what version of the EC2 tools are you using?
>
> black:~/code/hadoop-0.14.0/src/contrib/ec2> ec2-version
> 1.2-11797 2007-03-01
> black:~/code/hadoop-0.14.0/src/contrib/ec2>

I'm using the same version, so that's not it.

> > Instances are terminated on the basis of their AMI ID since 0.14.0.
> > See https://issues.apache.org/jira/browse/HADOOP-1504.
>
> I felt this was unsafe as it was, since it looked up the name of an
> image and then reverse-resolved it to the AMI ID. I just hacked it so
> you have to put the AMI ID in hadoop-ec2-env.sh. Also, the script as
> it is right now doesn't grep for 'running', so it may potentially shut
> down some instances starting up in another cluster. I may just be
> paranoid, however ;)

Checking for 'running' is a good idea. I've relied on the version number
so folks can easily select the version of Hadoop they want on the
cluster. Perhaps the best solution would be to allow an optional
parameter to the terminate script to specify the AMI ID if you need
extra certainty (the script already prompts with a list of instances to
terminate).

Tom
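
P.S. For the 'running' check, something along these lines ought to do
it in the terminate script (an untested sketch - it assumes AMI_IMAGE
in hadoop-ec2-env.sh holds the AMI ID, and the usual INSTANCE line
format from ec2-describe-instances, with the instance ID in the second
field):

  # allow an optional AMI ID argument for extra certainty
  AMI_IMAGE=${1:-$AMI_IMAGE}

  # only pick instances of our image that are actually 'running'
  ec2-describe-instances | grep INSTANCE | grep "$AMI_IMAGE" \
    | grep running | awk '{print $2}' | xargs ec2-terminate-instances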

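P.P.S. To see exactly what the sed wrote on the master, something like
this should work (again just a sketch; it assumes root logins as the
contrib scripts use, and $SLAVE_HOST is only a stand-in for any slave's
address):

  . bin/hadoop-ec2-env.sh
  # the master host ends up in mapred.job.tracker (and fs.default.name)
  ssh $SSH_OPTS root@$MASTER_HOST \
    "grep -C 2 mapred.job.tracker /usr/local/hadoop-$HADOOP_VERSION/conf/hadoop-site.xml"
  # and, run from a slave, the master's ec2-* name should come back
  # as a 10/8 address
  ssh $SSH_OPTS root@$SLAVE_HOST "host $MASTER_HOST"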