To use hadoop dfs, your client must be able to talk to all of your 
datanodes as well as the namenode.

So you should:
1. Make sure your client can reach every datanode (they listen on port 
50010 by default — the port shown in your log messages).
2. Make sure each datanode reports its public EC2 ip/dns name to the 
namenode, not its internal Amazon ip/dns name.  You can check this on the 
namenode status page (http://namenode:50070).
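
For step 2, one way to force a datanode to register under its public name is 
to set slave.host.name in hadoop-site.xml on each datanode — a sketch, with a 
placeholder value; substitute each instance's own public DNS name:

```xml
<!-- hadoop-site.xml on each datanode; the value below is a placeholder,
     replace it with that instance's own public EC2 DNS name -->
<property>
  <name>slave.host.name</name>
  <value>ec2-67-202-0-2.compute-1.amazonaws.com</value>
</property>
```

The datanode must be restarted for the change to take effect.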

You can also ssh to a machine inside your EC2 cluster and submit your jobs 
from there, but this will greatly reduce your bandwidth into and out of EC2.
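
That approach can be sketched as two commands run from your workstation — the 
hostname and paths below are hypothetical, and the echo prefixes are there so 
you can preview the commands; drop them to actually run:

```shell
# Sketch: copy a local file to the cluster master, then load it into HDFS
# from inside EC2. Hostname, user, and paths are placeholders.
LOCAL_FILE=mydata.txt
MASTER=ec2-67-202-0-1.compute-1.amazonaws.com   # your namenode's public DNS

# Preview the two commands; remove "echo" to execute them for real.
echo scp "$LOCAL_FILE" "root@$MASTER:/tmp/$LOCAL_FILE"
echo ssh "root@$MASTER" "/usr/local/hadoop/bin/hadoop dfs -put /tmp/$LOCAL_FILE /user/root/$LOCAL_FILE"
```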

-Michael

On 11/12/07 9:28 AM, "jonathan doklovic" <[EMAIL PROTECTED]> wrote:

Hi,

I just got an EC2 cluster up and running, and the first thing I need to
do is loop through a file on my local system and create a mapfile on the
hadoop cluster from it.

I tried doing this with a local java client, but when I call close() on
my MapFile.Writer, I get the following messages over and over:

07/11/12 11:07:24 INFO fs.DFSClient: Waiting to find target node: /10.252.27.223:50010
07/11/12 11:08:30 INFO fs.DFSClient: Waiting to find target node: /10.252.31.31:50010
07/11/12 11:09:36 INFO fs.DFSClient: Waiting to find target node: /10.252.31.31:50010
07/11/12 11:10:43 INFO fs.DFSClient: Waiting to find target node: /10.252.27.223:50010
07/11/12 11:11:49 INFO fs.DFSClient: Waiting to find target node: /10.252.31.31:50010
07/11/12 11:12:55 INFO fs.DFSClient: Waiting to find target node: /10.252.22.192:50010

I believe it's because the client is looking for those internal IPs on my
own network instead of inside the EC2 cluster.

Also, I thought maybe I could just copy the file to the cluster using
./hadoop dfs -put, but I can't seem to get that to work either.

any suggestions?

- Jonathan

