Thank you so much for your reply. I have a further question: there are some blogs that talk about a similar setup, like this one:
https://github.com/vkovalchuk/hadoop-2.6.0-windows/wiki/How-to-access-HDFS-behind-firewall-using-SOCKS-proxy

I am just curious how that works.

On Thu, Sep 12, 2019 at 11:05 PM Tony S. Wu <tonyswu....@gmail.com> wrote:

> You need connectivity from the edge node to the entire cluster, not just
> the namenode. Your topology, unfortunately, probably won’t work too well. A
> proper VPN / IPSec tunnel might be a better idea.
>
> On Thu, Sep 12, 2019 at 12:04 AM saurabh pratap singh <
> saurabh.cs...@gmail.com> wrote:
>
>> Hadoop version: 2.8.5
>> I have an HDFS setup in a private data center (which is not exposed to the
>> internet). In the same data center I have another node (gateway node).
>> The purpose of this gateway node is to provide access to HDFS from an edge
>> machine (which is outside the data center) over the public internet.
>> To enable this kind of setup I have set up an ssh tunnel from the edge
>> machine to the namenode host and port (9000) through the gateway node,
>> something like:
>>
>> ssh -N -L <local-port>:<namenode-private-ip>:<namenodeport> <gateway-user>@<gatewayhost> -i <ssh-keys> -vvvv
>>
>> When I run hadoop fs -ls hdfs://localhost:<local-port> from the edge
>> machine it works fine, but when I run
>> hadoop fs -put <some-file> hdfs://localhost:<local-port>
>> it fails with the following error message:
>>
>> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/<private-ip-of-datanode>:50010]
>>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
>>   at org.apache.hadoop.hdfs.DataStreamer.createSocketForPipeline(DataStreamer.java:253)
>>   at org.apache.hadoop.hdfs.DataStreamer.createBlockOutputStream(DataStreamer.java:1725)
>>   at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1679)
>>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
>>
>> It looks like it is trying to write directly to the private IP address of
>> the datanode. How do I resolve this?
>>
>> Do let me know if any other information is needed.
>>
>> Thanks
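
For context on the SOCKS question above: as far as I can tell, the approach in the linked wiki comes down to replacing the single-port forward (ssh -L) with a dynamic SOCKS forward (ssh -D) and pointing the Hadoop client at it. The following is only a rough sketch, not tested against 2.8.5. Port 1080 is an arbitrary choice, the <...> placeholders are the ones from the original post and must be filled in before running, and the two properties are the standard core-site.xml settings hadoop.rpc.socket.factory.class.default and hadoop.socks.server, passed here as generic -D options.

```shell
# 1) On the edge machine: open a local SOCKS proxy on port 1080 (assumed port)
#    that relays connections through the gateway node.
ssh -N -D 1080 <gateway-user>@<gatewayhost> -i <ssh-keys>

# 2) In another terminal: tell the Hadoop client to create its sockets through
#    that proxy. Note the URI now uses the namenode's private address,
#    not localhost, because the connection is made from the gateway side.
hadoop fs \
  -D hadoop.rpc.socket.factory.class.default=org.apache.hadoop.net.SocksSocketFactory \
  -D hadoop.socks.server=localhost:1080 \
  -put <some-file> hdfs://<namenode-private-ip>:<namenodeport>/
```

With -L only the namenode port is reachable, which matches the trace: -ls needs only the namenode, while -put has to open a connection to a datanode at its private address on port 50010. With the SOCKS socket factory the client's connections should go through the proxy and originate from the gateway node, which can reach those private addresses.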