To use Hadoop DFS from outside the cluster, your client must be able to talk to the namenode and to all of your datanodes.
So you should:

1. Make sure your client can talk to the datanodes directly.
2. Make sure each datanode reports its public IP/DNS name to the namenode, not its internal Amazon IP/DNS name. You can check this on the namenode status page (http://namenode:50070).

You can also ssh to a machine inside your EC2 cluster and submit your jobs from there, but this will greatly reduce your bandwidth entering/leaving EC2.

-Michael

On 11/12/07 9:28 AM, "jonathan doklovic" <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I just got an EC2 cluster up and running, and the first thing I need to
> do is loop through a file on my local system and create a MapFile on
> the Hadoop cluster from it.
>
> I tried doing this with a local Java client, but when I call close() on
> my MapFile.Writer, I get the following messages over and over:
>
> 07/11/12 11:07:24 INFO fs.DFSClient: Waiting to find target node: /10.252.27.223:50010
> 07/11/12 11:08:30 INFO fs.DFSClient: Waiting to find target node: /10.252.31.31:50010
> 07/11/12 11:09:36 INFO fs.DFSClient: Waiting to find target node: /10.252.31.31:50010
> 07/11/12 11:10:43 INFO fs.DFSClient: Waiting to find target node: /10.252.27.223:50010
> 07/11/12 11:11:49 INFO fs.DFSClient: Waiting to find target node: /10.252.31.31:50010
> 07/11/12 11:12:55 INFO fs.DFSClient: Waiting to find target node: /10.252.22.192:50010
>
> I believe it's because it's looking on my network for the IP instead of
> the EC2 cluster.
>
> Also, I thought maybe I could just copy the file to the cluster using
> ./hadoop dfs -put, but I can't seem to get that to work either.
>
> any suggestions?
>
> - Jonathan
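As a sketch of the two points above: the client's view of the namenode and the name each datanode advertises are both set in conf/hadoop-site.xml. This is only an illustration, not a tested recipe — the property names follow the Hadoop defaults of that era (fs.default.name on the client, slave.host.name on each slave), and the EC2 hostnames and port are hypothetical placeholders you would replace with your own:

```
<!-- conf/hadoop-site.xml -- hostnames and port below are placeholders -->
<configuration>
  <!-- On the client: point at the namenode's *public* EC2 DNS name -->
  <property>
    <name>fs.default.name</name>
    <value>ec2-xx-xx-xx-xx.compute-1.amazonaws.com:9000</value>
  </property>
  <!-- On each datanode: the public name it should report to the namenode,
       instead of its internal Amazon address -->
  <property>
    <name>slave.host.name</name>
    <value>ec2-yy-yy-yy-yy.compute-1.amazonaws.com</value>
  </property>
</configuration>
```

With fs.default.name set on the local machine, `bin/hadoop dfs -put <localfile> <dfs path>` should then address the EC2 cluster — provided your EC2 security group also opens the namenode port and the datanode port (50010 in the logs above) to your client's address.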
