Jay,

Using the hdfs:// scheme is the right way, as you have determined. However, a few details need to be right before read/write access will work.
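For reference, here is a minimal sketch of your snippet switched over to hdfs://, pointed at the NameNode's RPC port instead of the web port. The port 8020 here is an assumption (it is only the common default); substitute whatever your fs.default.name actually says:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HadoopRemote {
        public static void main(String[] args) throws Exception {
            // Assumption: 8020 is your NameNode's RPC port.
            // Check the fs.default.name value in your core-site config.
            String uri = "hdfs://172.16.112.131:8020/";
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(URI.create(uri), conf);

            // Write a small file and stat it back, proving read/write
            // access (hftp would fail on the create() call).
            Path p = new Path("/tmp/rw-test.txt");
            FSDataOutputStream out = fs.create(p);
            out.writeUTF("hello hdfs");
            out.close();
            System.out.println("wrote " + fs.getFileStatus(p).getLen() + " bytes");
            fs.close();
        }
    }

If a sketch like that still dies with the same EOFException, the points below are the usual suspects.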
In particular, a few things you need to ensure while using the Java FileSystem API for your HDFS tasks:

- Connect to the NameNode's RPC port, not the web port. The default RPC port is usually 8020, but your fs.default.name config will tell you the right one.
- Make sure your client and server Hadoop versions match exactly; otherwise you can run into RPC protocol incompatibilities between versions.
- Ensure your client can reach the RPC ports of both the NameNode and the DataNodes for reads and writes. If there's a firewall in between, you may need to configure it to allow this.

On Fri, Oct 28, 2011 at 11:22 AM, Jay Vyas <[email protected]> wrote:
> I found a way to connect to hadoop via hftp, and it works fine (read only):
>
>     uri = "hftp://172.16.xxx.xxx:50070/";
>
>     System.out.println( "uri: " + uri );
>     Configuration conf = new Configuration();
>
>     FileSystem fs = FileSystem.get( URI.create( uri ), conf );
>     fs.printStatistics();
>
> However, it appears that hftp is read only, and I want to read/write as
> well as copy files; that is, I want to connect over hdfs. How can I enable
> hdfs connections so that I can edit the actual, remote filesystem using the
> File/Path APIs? Are there SSH settings that have to be set before I can do
> this?
>
> I tried to change the protocol above from "hftp" -> "hdfs", but I got the
> following exception ...
>
> Exception in thread "main" java.io.IOException: Call to /172.16.112.131:50070 failed on local exception: java.io.EOFException
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1139)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1107)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
>     at $Proxy0.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:398)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:384)
>     at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:111)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:213)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:180)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1514)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1548)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1530)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
>     at sb.HadoopRemote.main(HadoopRemote.java:24)

--
Harsh J
