Hadoop version: 2.8.5

I have HDFS set up in a private data center (not exposed to the internet). In the same data center there is another node (a gateway node), whose purpose is to provide access to HDFS from an edge machine (outside the data center) over the public internet. To enable this setup I created an SSH tunnel from the edge machine to the NameNode host and port (9000) through the gateway node, something like
    ssh -N -L <local-port>:<namenode-private-ip>:<namenodeport> <gateway-user>@<gatewayhost> -i <ssh-keys> -vvvv

From the edge machine,

    hadoop fs -ls hdfs://localhost:<local-port>

works fine, but

    hadoop fs -put <some-file> hdfs://localhost:<local-port>

fails with the following error:

    org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/<private-ip-of-datanode>:50010]
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
        at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
        at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1725)
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)

It looks like the client is trying to write directly to the private IP address of the DataNode. That would explain the asymmetry: -ls only talks to the NameNode (which is tunneled), while -put has to open a connection to each DataNode at the address the NameNode hands back, and those private addresses are not reachable through the tunnel. How do I resolve this? Let me know if any other information is needed. Thanks
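In case it helps narrow things down: from what I have read (I have not verified this in my setup), the HDFS client has a setting, dfs.client.use.datanode.hostname, that makes it connect to DataNodes by hostname instead of by the IP the NameNode reports; the DataNode hostnames could then be pointed at local tunnel endpoints via /etc/hosts on the edge machine. A sketch of what I believe the client-side hdfs-site.xml fragment would look like:

```xml
<!-- Sketch only: client-side hdfs-site.xml on the edge machine.
     Makes the client dial DataNodes by hostname rather than the
     private IP returned by the NameNode. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```

Each DataNode hostname would still need its transfer port (50010 here) to be reachable, e.g. via additional tunnels, so I am not sure this alone is the whole answer.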