The class responsible for reading records as lines off a file will seek into the next block in sequence until it hits a newline. This behavior, and how it affects the Map tasks, is documented here (see the TextInputFormat example): http://wiki.apache.org/hadoop/HadoopMapReduce
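To illustrate, here is a minimal sketch (in Python, not Hadoop's actual Java code) of the convention a line-oriented record reader follows: every split except the first skips the partial first line (the previous split finishes it), and every split reads past its nominal end to complete its last line. The function name and split sizes below are made up for the illustration; the point is that each line ends up with exactly one mapper even when it straddles a block boundary.

```python
def read_lines_for_split(data: bytes, start: int, length: int):
    """Return the lines 'owned' by the split [start, start+length).

    Mimics the line-reader convention: skip the partial first line
    unless this split starts at offset 0, and read past the split's
    end to finish the last line.
    """
    end = start + length
    pos = start
    if start != 0:
        # The previous split owns the line in progress at our start
        # offset; skip ahead to the character after the next newline.
        nl = data.find(b"\n", start)
        if nl == -1:
            return []
        pos = nl + 1
    lines = []
    while pos < end and pos < len(data):
        nl = data.find(b"\n", pos)
        if nl == -1:
            lines.append(data[pos:])  # last line, no trailing newline
            pos = len(data)
        else:
            # May read past `end` into the "next block" to finish the line.
            lines.append(data[pos:nl])
            pos = nl + 1
    return lines


# Three 5-byte "splits" over data where lines cross split boundaries:
data = b"aaa\nbbbbbb\ncc\n"
collected = []
for s in range(0, len(data), 5):
    collected += read_lines_for_split(data, s, 5)
# Each line is delivered whole, exactly once:
# collected == [b"aaa", b"bbbbbb", b"cc"]
```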
On Sat, Mar 5, 2011 at 1:54 AM, Kelly Burkhart <[email protected]> wrote:
> On Fri, Mar 4, 2011 at 1:42 PM, Harsh J <[email protected]> wrote:
>> HDFS does not operate with records in mind.
>
> So does that mean that HDFS will break a file at exactly <blocksize>
> bytes? Map/Reduce *does* operate with records in mind, so what
> happens to the split record? Does HDFS put the fragments back
> together and deliver the reconstructed record to one map? Or are both
> fragments and consequently the whole record discarded?
>
> Thanks,
>
> -Kelly

--
Harsh J
www.harshj.com
