What Joey said.

What you'll want is:

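// fs is your FileSystem; somePath is the directory you want to list.
FileStatus[] fileStatuses = fs.listStatus(somePath);
for (FileStatus fstat : fileStatuses) {
  Path file = fstat.getPath();
  // Do other read/etc. logic here with Path and FileSystem as you want,
  // e.g. fs.open(file), or call the other FileStatus getters.
}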
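
To illustrate Joey's point about the getters, here is a minimal sketch.
The class and method names (FileStatusFields, describe, printAll) are
made up for illustration, and isDir() is the 1.0-era call (later
releases prefer isDirectory()):

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop.
public class FileStatusFields {

  // Builds a one-line summary using only FileStatus getters;
  // no readFields() call is needed because the object is already populated.
  static String describe(FileStatus fstat) {
    return fstat.getPath().getName()
        + " len=" + fstat.getLen()
        + " dir=" + fstat.isDir()
        + " mtime=" + fstat.getModificationTime();
  }

  // Prints a summary line for every entry directly under dir.
  static void printAll(FileSystem fs, Path dir) throws IOException {
    FileStatus[] statuses = fs.listStatus(dir);
    if (statuses == null) {
      return; // some releases return null for a missing directory
    }
    for (FileStatus fstat : statuses) {
      System.out.println(describe(fstat));
    }
  }
}
```

Other getters such as getOwner(), getPermission() and getReplication()
work the same way.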

Also read the FileStatus API at
http://hadoop.apache.org/common/docs/r1.0.0/api/org/apache/hadoop/fs/FileStatus.html
for more information.
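
If you only want the part-r-* files (listStatus on the directory returns
every entry in it, which may include non-data files such as _SUCCESS or
_logs), a glob may be simpler. A rough sketch, assuming the reducers
wrote plain UTF-8 text; PartFileReader and readPartFiles are
hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop.
public class PartFileReader {

  // Collects every line of every part-r-* file directly under outputDir.
  // Assumption: the files are plain text encoded as UTF-8.
  static List<String> readPartFiles(FileSystem fs, Path outputDir)
      throws IOException {
    List<String> lines = new ArrayList<String>();
    // globStatus returns its matches sorted by name.
    FileStatus[] parts = fs.globStatus(new Path(outputDir, "part-r-*"));
    if (parts == null) {
      return lines; // nothing matched
    }
    for (FileStatus part : parts) {
      if (part.isDir()) {
        continue; // skip anything that is not a regular file
      }
      BufferedReader reader = new BufferedReader(
          new InputStreamReader(fs.open(part.getPath()), "UTF-8"));
      try {
        String line;
        while ((line = reader.readLine()) != null) {
          lines.add(line);
        }
      } finally {
        reader.close(); // also closes the underlying HDFS stream
      }
    }
    return lines;
  }
}
```

For large outputs you would process each line inside the loop rather
than collecting everything into a list.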

On Mon, Mar 5, 2012 at 7:38 PM, Joey Echeverria <j...@cloudera.com> wrote:
> You don't need to call readFields(), the FileStatus objects are
> already initialized. You should just be able to call the various
> getters to get the fields that you're interested in.
>
> -Joey
>
> On Mon, Mar 5, 2012 at 9:03 AM, Piyush Kansal <piyush.kan...@gmail.com> wrote:
>> Harsh,
>>
>> When I try to call readFields as follows:
>>
>> FileStatus origFStatus[] = ipFs.listStatus( ip );
>> DataInput dataIp;
>> origFStatus[ 0 ].readFields( dataIp );
>>
>> I am getting a compilation error "variable dataIp might not have been
>> initialized".
>>
>> How do we initialize it? Is there a direct method by which I can read the
>> fields easily?
>>
>>
>> On Mon, Mar 5, 2012 at 7:49 AM, Piyush Kansal <piyush.kan...@gmail.com>
>> wrote:
>>>
>>> Thanks Harsh. It worked.
>>>
>>>
>>> On Mon, Mar 5, 2012 at 5:58 AM, Harsh J <ha...@cloudera.com> wrote:
>>>>
>>>> Piyush,
>>>>
>>>> On Mon, Mar 5, 2012 at 3:16 PM, Piyush Kansal <piyush.kan...@gmail.com>
>>>> wrote:
>>>> > Ques 1:
>>>> > ======
>>>> > I have an HDFS directory which contains the output files of the
>>>> > reducer. I want to read all the part-r-* files present in this
>>>> > directory.
>>>> >
>>>> > I have already tried the following, but with no luck:
>>>> > - FileSystem.listStatus
>>>> >
>>>> > Can you please suggest how I can do it?
>>>>
>>>> Iterate over the FileStatus objects returned by listStatus (they'll be
>>>> in the right order), and read them one by one. Does that not work for
>>>> you?
>>>>
>>>> > Ques 2:
>>>> > ======
>>>> > Since MultipleOutputs/MultipleOutputFormat is not available in
>>>> > 0.20.203, can we still achieve the functionality provided by these
>>>> > classes?
>>>>
>>>> Either upgrade to 1.0.1 to get MultipleOutputs for the new API (it was
>>>> only recently released with that backport from 0.21), or move to an
>>>> alternative distribution that offers it back-ported, or perhaps switch
>>>> back to the stable (old) API, which is still the recommended one for MR.
>>>>
>>>> Alternatively, read
>>>>
>>>> http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F
>>>>
>>>> --
>>>> Harsh J
>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Piyush Kansal
>>>
>>
>>
>>
>> --
>> Regards,
>> Piyush Kansal
>>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434



-- 
Harsh J
