Hi Tim,

Split computation doesn't look at newlines, at least not in TextInputFormat. So, as long as the number of computed splits exceeds the default number of maps, I think a perfect file of 10 blocks will spawn only 10 mappers. It is the mapper's record reader that reads up to a newline, even if that means reading past the last byte of its own block.
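
To make the division of labour concrete, here is a rough Python sketch of the idea. It is a simplified simulation, not Hadoop's actual code: the names compute_splits and read_records are made up for illustration, and it ignores details of the real FileInputFormat and record reader (such as any slack allowed on the last split). Splits are computed from byte counts alone, and each reader then skips a leading partial line and finishes the line that straddles the end of its split.

```python
MB = 1024 * 1024


def compute_splits(file_size, split_size):
    """Return (start, length) byte ranges. Pure arithmetic: the file
    content, and therefore any newline, is never inspected."""
    return [(start, min(split_size, file_size - start))
            for start in range(0, file_size, split_size)]


def read_records(data, start, length):
    """Simulate a line-oriented record reader working on one split."""
    end = start + length
    pos = start
    if start != 0:
        # Every split except the first discards everything up to and
        # including the first newline: that line belongs to the reader
        # of the previous split.
        nl = data.find(b"\n", pos)
        pos = len(data) if nl == -1 else nl + 1
    records = []
    # Keep reading while the next line starts at or before the split end,
    # so the final line may run past the end of the split.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        line_end = len(data) if nl == -1 else nl
        records.append(data[pos:line_end])
        pos = line_end + 1
    return records


# The case from the question: 640 MB file, 64 MB split size.
big_splits = compute_splits(640 * MB, 64 * MB)

# A small stand-in file: 64 lines of 10 bytes each = 640 bytes, split at
# 64 bytes, so lines straddle the split boundaries.
data = b"".join(b"rec-%05d\n" % i for i in range(64))
splits = compute_splits(len(data), 64)
per_split = [read_records(data, s, n) for s, n in splits]
all_records = [rec for recs in per_split for rec in recs]

print(len(big_splits), len(splits), len(all_records))
```

In this simulation the number of splits depends only on the sizes, while every line is still read exactly once, because a line that crosses a boundary is completed by the reader of the split in which it starts.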

On Wed, Sep 19, 2012 at 9:16 PM, Tim Robertson <timrobertson...@gmail.com> wrote:
> I think the splitting recognises the end of line, so you might get 11 but
> otherwise that looks correct.
>
> On Wed, Sep 19, 2012 at 5:42 PM, Pedro Sá da Costa <psdc1...@gmail.com>
> wrote:
>>
>> If I've an input file of 640MB in size, and a split size of 64Mb, this
>> file will be partitioned in 10 splits, and each split will be processed by a
>> map task, right?
>>
>> --
>> Best regards,
>

--
Harsh J