What Joey said. What you'll want is:

  FileStatus[] fileStatuses = fs.listStatus(somePath);
  for (FileStatus fstat : fileStatuses) {
    Path file = fstat.getPath();
    // Do other read/etc. logic here with Path and FileSystem as you want.
  }

Also read the FileStatus API at
http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/fs/FileStatus.html
for more information.

On Mon, Mar 5, 2012 at 7:38 PM, Joey Echeverria <j...@cloudera.com> wrote:
> You don't need to call readFields(); the FileStatus objects are
> already initialized. You should just be able to call the various
> getters to get the fields that you're interested in.
>
> -Joey
>
> On Mon, Mar 5, 2012 at 9:03 AM, Piyush Kansal <piyush.kan...@gmail.com> wrote:
>> Harsh,
>>
>> When I try to call readFields as follows:
>>
>> FileStatus origFStatus[] = ipFs.listStatus( ip );
>> DataInput dataIp;
>> origFStatus[ 0 ].readFields( dataIp );
>>
>> I get the compilation error "variable dataIp might not have been
>> initialized".
>>
>> How do we initialize it? Is there a direct method by which I can
>> read the fields easily?
>>
>>
>> On Mon, Mar 5, 2012 at 7:49 AM, Piyush Kansal <piyush.kan...@gmail.com>
>> wrote:
>>>
>>> Thanks Harsh. It worked.
>>>
>>>
>>> On Mon, Mar 5, 2012 at 5:58 AM, Harsh J <ha...@cloudera.com> wrote:
>>>>
>>>> Piyush,
>>>>
>>>> On Mon, Mar 5, 2012 at 3:16 PM, Piyush Kansal <piyush.kan...@gmail.com>
>>>> wrote:
>>>> > Ques 1:
>>>> > ======
>>>> > I have an HDFS directory which contains the o/p files of the reducer.
>>>> > I want to read all the part-r-* files present in this directory.
>>>> >
>>>> > I have already tried the following option, but no luck:
>>>> > - FileSystem.listStatus
>>>> >
>>>> > Can you please suggest how I can do it?
>>>>
>>>> Iterate over the FileStatus objects returned by listStatus (they'll be
>>>> in the right order), and read them one by one. Does that not work for
>>>> you?
>>>>
>>>> > Ques 2:
>>>> > ======
>>>> > Since MultipleOutputs/MultipleOutputFormat is not available in
>>>> > 0.20.203, how can we achieve the functionality these classes provide?
>>>>
>>>> Upgrade to either 1.0.1 to get MultipleOutputs for the new API (it was
>>>> only recently released with that backport from 0.21), or to any
>>>> alternative distribution that offers it back-ported, or perhaps switch
>>>> back to using the stable (old) API, which is still recommended for MR.
>>>>
>>>> Alternatively, read
>>>>
>>>> http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
>>>>
>>>> --
>>>> Harsh J
>>>
>>> --
>>> Regards,
>>> Piyush Kansal
>>
>> --
>> Regards,
>> Piyush Kansal
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434

--
Harsh J