I have tried setting *slave.host.name* to the public address of my
data node. The node is now listed with its public address on
dfshealth.jsp, but when I try to send a file to HDFS from my
external server I still get:

*08/09/08 15:58:41 INFO dfs.DFSClient: Waiting to find target node: 10.251.75.177:50010
08/09/08 15:59:50 INFO dfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException
08/09/08 15:59:50 INFO dfs.DFSClient: Abandoning block blk_-8257572465338588575
08/09/08 15:59:50 INFO dfs.DFSClient: Waiting to find target node: 10.251.75.177:50010
08/09/08 15:59:56 WARN dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2246)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842)

08/09/08 15:59:56 WARN dfs.DFSClient: Error Recovery for block blk_-8257572465338588575 bad datanode[0]*
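
The target address in the log above, 10.251.75.177, falls in the 10.0.0.0/8 private range, which would explain a timeout from a machine outside EC2. A minimal sketch for checking whether a "host:port" target printed by DFSClient is a site-local address; the class and method names are made up for illustration and are not part of Hadoop:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; not part of Hadoop.
class DataNodeAddressCheck {

    // Takes a "host:port" string as printed by DFSClient and reports whether
    // the host part is a site-local (private) address such as 10.x.x.x.
    static boolean isSiteLocal(String hostAndPort) throws UnknownHostException {
        int colon = hostAndPort.lastIndexOf(':');
        String host = colon >= 0 ? hostAndPort.substring(0, colon) : hostAndPort;
        // For a literal IPv4 address, getByName only parses the text;
        // no DNS lookup is performed.
        return InetAddress.getByName(host).isSiteLocalAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Defaults to the target node taken from the log above.
        String target = args.length > 0 ? args[0] : "10.251.75.177:50010";
        System.out.println(target + " site-local: " + isSiteLocal(target));
    }
}
```

If this reports true for the target node, the client is being handed an internal address that it presumably cannot route to from outside the cluster.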

Is there another parameter I could specify to force the address of my
datanode? I have been searching the EC2 forums and documentation, and
apparently there is no way to use *dfs.datanode.dns.interface* or
*dfs.datanode.dns.nameserver* to specify the public IP of my instance.

Has anyone else managed to send/retrieve stuff from HDFS on an EC2 cluster
from an external machine?
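
In case it helps anyone reproduce this outside Hadoop, here is a rough sketch of a plain TCP probe that can be run from the external machine against a datanode address and port. It only tests whether a connection can be opened within a timeout; it does not speak the HDFS protocol. The class name, method name, and 5-second timeout are assumptions of mine:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration only; not part of Hadoop.
class DataNodeProbe {

    // Returns true if a TCP connection to host:port can be opened within
    // timeoutMillis, false on timeout, refusal, or any other I/O error.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // SocketTimeoutException and ConnectException are both IOExceptions.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the internal address and port shown in the log.
        String host = args.length > 0 ? args[0] : "10.251.75.177";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50010;
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 5000));
    }
}
```

Running it once with the internal address and once with the instance's public address should show whether the datanode port is reachable at all from the external machine.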

Thanks

Julien


2008/9/5 Julien Nioche <[EMAIL PROTECTED]>

> Hi guys,
>
> I am using Hadoop on an EC2 cluster and am trying to send files to
> HDFS from an external machine. It works up to the point where I get this
> error message:
> *Waiting to find target node: 10.250.7.148:50010*
>
> I've seen a discussion about a similar issue at
> http://thread.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/2446/focus=2449
> but there are no details on how to fix the problem.
>
> Any idea about how I can set up my EC2 instances so that they return their
> public IPs and not the internal Amazon ones? Anything I can specify for the
> parameters *dfs.datanode.dns.interface* and *dfs.datanode.dns.nameserver*?
>
>
> What I am trying to do is put my input data onto HDFS for processing and
> retrieve the output from there. What I am not entirely sure of is whether I
> can launch my job from the external machine. Most people seem to SSH to the
> master to do that.
>
> Thanks
>
> Julien
> --
> DigitalPebble Ltd
> http://www.digitalpebble.com
>



-- 
DigitalPebble Ltd
http://www.digitalpebble.com
