I would use the FileSystem API.
Here is a quick-and-dirty example:
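import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class dirc {
    public static void main(String[] args) {
        // The directory to list is the first command-line argument.
        if (args.length < 1) {
            System.err.println("Usage: dirc <directory>");
            return;
        }
        try {
            // Load the default Hadoop configuration (core-default.xml, core-site.xml).
            Configuration conf = new Configuration(true);
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path(args[0]);
            // listStatus returns one FileStatus per entry in the directory.
            FileStatus[] fstatus = fs.listStatus(path);
            if (fstatus == null) {
                // Some older releases may return null instead of throwing for a missing path.
                System.err.println("No such directory: " + args[0]);
                return;
            }
            for (FileStatus f : fstatus) {
                // Print only the path component, without scheme and authority.
                System.out.println(f.getPath().toUri().getPath());
            }
        } catch (IOException e) {
            // An I/O failure is not a usage error, so report what went wrong.
            System.err.println("Could not list " + args[0] + ": " + e.getMessage());
        }
    }
}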
________________________________
From: Steve Lewis <[email protected]>
To: [email protected]
Sent: Wed, August 25, 2010 9:04:41 AM
Subject: Re: How to enumerate files in the directories?
@Override
public HDFSFile[] getFiles(String directory) {
    // executeCommand, HDFSFile and MIN__FILE_LENGTH are not shown here.
    String result = executeCommand("hadoop fs -ls " + directory);
    String[] items = result.split("\n");
    List<HDFSFile> holder = new ArrayList<HDFSFile>();
    // Start at 1, presumably to skip the "Found N items" header line.
    for (int i = 1; i < items.length; i++) {
        String item = items[i];
        if (item.length() > MIN__FILE_LENGTH) {
            try {
                holder.add(new HDFSFile(item));
            }
            catch (Exception e) {
                // Lines that cannot be parsed as a file entry are skipped.
            }
        }
    }
    HDFSFile[] ret = new HDFSFile[holder.size()];
    holder.toArray(ret);
    return ret;
}
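The snippet above leaves out how each line of the "hadoop fs -ls" output is turned into a file entry. As a rough sketch only, here is one way the path could be pulled out of a line. It assumes the usual eight-column layout (permissions, replication, owner, group, size, date, time, path), which may differ between Hadoop versions. The class name LsOutputParser, its methods, and the sample user and paths are made up for illustration and are not part of the code above.

```java
import java.util.ArrayList;
import java.util.List;

public class LsOutputParser {

    // Assumed number of whitespace-separated columns in one listing entry.
    private static final int COLUMN_COUNT = 8;

    /**
     * Returns the path from a single listing line, or null when the line does
     * not look like an entry (for example the "Found N items" header).
     */
    public static String parsePath(String line) {
        // Limit the split so a path containing spaces stays in the last column.
        String[] columns = line.trim().split("\\s+", COLUMN_COUNT);
        if (columns.length < COLUMN_COUNT) {
            return null;
        }
        return columns[COLUMN_COUNT - 1];
    }

    /** Collects the paths from the full command output, skipping non-entry lines. */
    public static List<String> parsePaths(String lsOutput) {
        List<String> paths = new ArrayList<String>();
        for (String line : lsOutput.split("\n")) {
            String path = parsePath(line);
            if (path != null) {
                paths.add(path);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Hypothetical sample output; the user name and paths are invented.
        String sample = "Found 2 items\n"
            + "-rw-r--r--   3 steve supergroup       1234 2010-08-25 09:04 /user/steve/data.txt\n"
            + "drwxr-xr-x   - steve supergroup          0 2010-08-25 09:04 /user/steve/logs\n";
        for (String path : parsePaths(sample)) {
            System.out.println(path);
        }
    }
}
```

Run on the sample, this prints /user/steve/data.txt and /user/steve/logs. Parsing command output is fragile compared with calling the FileSystem API directly, since it depends on the exact text the shell prints.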
On Wed, Aug 25, 2010 at 12:36 AM, Denim Live <[email protected]> wrote:
> Hello, how can one determine the names of the files in a particular hadoop
> directory, programmatically?
--
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA