Good evening,

This topic seems very interesting.
To be sure I understood the case: do you mean that I can write a simple Java 
program and access a file stored in HDFS from within that Java application?

Assuming that I have, e.g., 10 files of 30 GB each stored in HDFS on a 
cluster of 15 nodes, how can I run a Java program that accesses these files and 
reads some blocks from them? Is it possible to do this without first copying the 
files via -copyToLocal?

If yes, could anyone give some directions on the general form of such Java 
code, and on how to run such a program?

Thank you in advance
Sofia





________________________________
From: Uma Maheswara Rao G 72686 <[email protected]>
To: [email protected]
Sent: Monday, September 5, 2011 6:04 PM
Subject: Re: Is it possible to access the HDFS via Java OUTSIDE the Cluster?

Hi,

It is very much possible. In fact, that is the main use case for Hadoop :-)

You need to put the hadoop-hdfs*.jar and hadoop-common*.jar files on the class path 
of the machine from which you want to run the client program.

On the client node, use the sample code below:

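import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

Configuration conf = new Configuration(); // set any required configuration here
FileSystem fs = new DistributedFileSystem();
fs.initialize(new URI("<Name_Node_URL>"), conf);

// srcPath and destPath are org.apache.hadoop.fs.Path instances
fs.copyToLocalFile(srcPath, destPath);   // HDFS -> local file system
fs.copyFromLocalFile(srcPath, destPath); // local file system -> HDFS
// ... etc.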
The FileSystem class exposes many more APIs than these, so you can make use of 
them.
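
For the case of reading only part of a large file rather than copying it, 
FileSystem.open(Path) returns an FSDataInputStream that supports seek(long), so a 
client can jump to a byte offset and read from there. The sketch below covers only 
the offset arithmetic for picking one block, assuming fixed-size 64 MB blocks as 
mentioned in the original question. The class BlockRange and its methods are 
hypothetical names for illustration, written in plain Java with no Hadoop 
dependency; the resulting offset and length would then be passed to seek and a 
read call on the stream.

```java
// Hypothetical helper (not part of Hadoop): byte range of one block in a file.
final class BlockRange {
    final long offset; // first byte of the block within the file
    final long length; // bytes in the block; the last block may be shorter

    BlockRange(long offset, long length) {
        this.offset = offset;
        this.length = length;
    }

    /** Number of blocks needed to hold fileSize bytes (ceiling division). */
    static long blockCount(long fileSize, long blockSize) {
        if (fileSize < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and blockSize > 0");
        }
        return (fileSize + blockSize - 1) / blockSize;
    }

    /** Byte range covered by the block at blockIndex (0-based). */
    static BlockRange of(long fileSize, long blockSize, long blockIndex) {
        long count = blockCount(fileSize, blockSize);
        if (blockIndex < 0 || blockIndex >= count) {
            throw new IndexOutOfBoundsException("block index " + blockIndex);
        }
        long offset = blockIndex * blockSize;
        // Clamp so the final block does not run past the end of the file.
        long length = Math.min(blockSize, fileSize - offset);
        return new BlockRange(offset, length);
    }
}
```

For example, BlockRange.of(fileSize, 64L * 1024 * 1024, 3) gives the offset and 
length of the fourth 64 MB block, which is the position to seek to before reading.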


Regards,
Uma


----- Original Message -----
From: Ralf Heyde <[email protected]>
Date: Monday, September 5, 2011 7:59 pm
Subject: Is it possible to access the HDFS via Java OUTSIDE the Cluster?
To: [email protected]

> Hello,
> 
> 
> 
> I have found an HDFSClient which shows me how to access my HDFS from inside
> the cluster (i.e. running on a node).
> 
> 
> 
> My idea is that different processes may write 64 MB chunks to HDFS from
> external sources/clients.
> 
> Is that possible?
> 
> How can that be done? Does anybody have some example code?
> 
> 
> 
> Thanks,
> 
> 
> 
> Ralf
> 
> 
> 
> 
