Hadoop Input File Directory

Boyu Zhang Fri, 11 Sep 2009 14:18:27 -0700

Dear all,


I have an input file hierarchy of depth 3, something like
/data/user/dir_0/file0, /data/user/dir_1/file0, /data/user/dir_2/file0. I
want to run a MapReduce job that processes all the files at the deepest level.

One way to do this is to list every subdirectory in the input path, e.g.
/data/user/dir_0, /data/user/dir_1, /data/user/dir_2, but this becomes
infeasible as the hierarchy grows.

I tried to specify the input path as /data/user, but I got errors like:
cannot open filename /data/user/dir_0.

My question is: is there any way to process all the files while specifying
only the top-level directory as the input path?

Thanks a lot!

Boyu Zhang

University of Delaware