
I'm trying to set up a Hadoop 0.16.0 cluster on EC2/S3 (manually, not
using the Hadoop AMIs).

I've got the S3-based HDFS working, but I'm stumped when I try to get a
test job running:

[EMAIL PROTECTED]:~/hadoop-0.16.0$ time bin/hadoop jar contrib/streaming/hadoop-0.16.0-streaming.jar -mapper /tmp/test.sh -reducer cat -input testlogs/* -output testlogs-output
packageJobJar: [/tmp/hadoop-hadoop/hadoop-unjar17969/] [] /tmp/streamjob17970.jar tmpDir=null
08/03/10 14:01:28 INFO mapred.FileInputFormat: Total input paths to process : 
08/03/10 14:02:58 INFO streaming.StreamJob: getLocalDirs(): 
08/03/10 14:02:58 INFO streaming.StreamJob: Running job: job_200803101400_0001
08/03/10 14:02:58 INFO streaming.StreamJob: To kill this job, run:
08/03/10 14:02:58 INFO streaming.StreamJob: /home/hadoop/hadoop-0.16.0/bin/../bin/hadoop job  -Dmapred.job.tracker=ec2-67-202-58-97.compute-1.amazonaws.com:9001 -kill
08/03/10 14:02:58 INFO streaming.StreamJob: Tracking URL: 
08/03/10 14:02:59 INFO streaming.StreamJob:  map 0%  reduce 0%

Furthermore, when I try to connect to port 9001 via telnet from
the master host itself, it connects:
[EMAIL PROTECTED]:~/hadoop-0.16.0$ telnet 9001
Connected to
Escape character is '^]'.
telnet> quit
Connection closed.

When I try to do this from other VMs in my cluster, it just hangs. 
(tcpdump on the masterhost shows no activity for tcp port 9001):

[EMAIL PROTECTED]:~/hadoop-0.16.0$ telnet ip-10-251-75-165.ec2.internal 9001

[EMAIL PROTECTED]:~/hadoop-0.16.0$ telnet ip-10-251-75-165.ec2.internal 22
Connected to ip-10-251-75-165.ec2.internal.
Escape character is '^]'.
SSH-2.0-OpenSSH_4.3p2 Debian-9
telnet> quit
Connection closed.
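
The manual telnet checks above can be repeated from every slave with a small script. This is only a minimal sketch: the function name port_open and the 5-second timeout are my own choices, not anything Hadoop provides. The timeout matters because a connection attempt that is silently dropped hangs, just like the telnet to port 9001 does, instead of being refused.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example, run from a slave (host name taken from the telnet session above):
#   port_open("ip-10-251-75-165.ec2.internal", 22)
#   port_open("ip-10-251-75-165.ec2.internal", 9001)
```

A False result does not distinguish between "refused" and "timed out"; for that, the exception type would have to be inspected separately.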

The same problem is visible when I connect to port 50030 (the JobTracker
web interface), which shows 0 nodes ready to process the job.

Furthermore, the slaves show the following messages:
2008-03-10 15:30:11,455 INFO org.apache.hadoop.ipc.RPC: Problem connecting to server: ec2-67-202-58-97.compute-1.amazonaws.com/
2008-03-10 15:31:12,465 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ec2-67-202-58-97.compute-1.amazonaws.com/ Already tried 1 time(s).
2008-03-10 15:32:13,475 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ec2-67-202-58-97.compute-1.amazonaws.com/ Already tried 2 time(s).
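
Note that the slaves dial the master by its ec2-67-202-58-97.compute-1.amazonaws.com name, while the telnet tests use the ip-10-251-75-165.ec2.internal name. One thing that may be worth checking from a slave is which address each name resolves to, and whether that address is private or public. The sketch below is only a diagnostic aid under that assumption; the helper names resolved_addresses and describe are made up for illustration.

```python
import ipaddress
import socket


def resolved_addresses(hostname):
    """Return the sorted set of IP addresses that hostname resolves to on this machine."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def describe(address):
    """Label an IP address as 'private' (e.g. 10.x.x.x) or 'public'."""
    return "private" if ipaddress.ip_address(address).is_private else "public"


# Example, run from a slave:
#   for name in ("ec2-67-202-58-97.compute-1.amazonaws.com",
#                "ip-10-251-75-165.ec2.internal"):
#       for addr in resolved_addresses(name):
#           print(name, addr, describe(addr))
```

If the two names resolve to different addresses, the telnet test and the Hadoop RPC client are not exercising the same network path.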

Last but not least, here is my site conf:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>



  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.</description>

The master node is not listening on localhost:
[EMAIL PROTECTED]:~/hadoop-0.16.0$ netstat -an | grep 9001
tcp        0      0*               LISTEN     
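
The local address in that netstat line is what decides who can connect: a listener bound to 127.0.0.1 only accepts connections from the same machine, while one bound to 0.0.0.0 accepts them on every interface. A minimal sketch for pulling that column out of `netstat -an` output follows; it assumes the usual Linux column layout (proto, recv-q, send-q, local address, foreign address, state), and listening_addresses is a hypothetical helper name.

```python
def listening_addresses(netstat_output, port):
    """Return the local addresses in LISTEN state for the given TCP port."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip anything that is not a TCP socket in LISTEN state.
        if len(fields) < 6 or not fields[0].startswith("tcp") or fields[5] != "LISTEN":
            continue
        # Split "address:port" at the last colon so IPv6 forms like ":::9001" work too.
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            found.append(host)
    return found
```

The output can be captured with subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout and passed in together with port 9001.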

My conclusions so far are:

1.) First, it's not a general connectivity problem, because I can connect to
port 22 without any problems.
2.) OTOH, on port 9001, inside the same group, the connectivity seems to be 
3.) All AWS docs tell me that VMs in one group have no firewalls in place.

So what is happening here? Any ideas?

