Hi again Jim and Ted, I understood that each mapper will be getting a block of lines, but even though I had only 2 mappers for a 16-line input file with TextInputFormat, the map function was called once for each of those 16 lines!
I wanted a block of lines per map, so that something like map1 gets 8 lines and map2 gets 8 lines.

So, first question: is there a difference between Mappers and maps?

Second: does that mean I need to write my own InputFormat to make the InputSplit equal to multiple lines?

Thank you,
Maha

On Feb 18, 2011, at 11:55 AM, Jim Falgout wrote:

> That's right. The TextInputFormat handles situations where records cross
> split boundaries. What your mapper will see is "whole" records.
>
> -----Original Message-----
> From: maha [mailto:[email protected]]
> Sent: Friday, February 18, 2011 1:14 PM
> To: common-user
> Subject: Quick question
>
> Hi all,
>
> I want to check if the following statement is right:
>
> If I use TextInputFormat to process a text file with 2000 lines (each ending
> with \n) with 20 mappers. Then each map will have a sequence of COMPLETE
> LINES .
>
> In other words, the input is not split byte-wise but by lines.
>
> Is that right?
>
>
> Thank you,
> Maha
>
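
The behaviour described at the top of this message (2 mappers, 16 lines, 16 calls to the map function) can be sketched in plain Java without Hadoop. This is only an illustrative simulation under stated assumptions, not Hadoop code: `MapTaskSketch`, `runMapTask`, `splitByLines`, and the upper-casing `map` are hypothetical names made up for the example. It assumes that a map task owns one split (a block of lines) and that the map function is invoked once per line inside that split.

```java
import java.util.ArrayList;
import java.util.List;

public class MapTaskSketch {

    // Hypothetical stand-in for the user's map function: handles ONE line.
    static void map(String line, List<String> output) {
        output.add(line.toUpperCase());
    }

    // Hypothetical stand-in for a map task: it owns one split (a block of
    // lines) and calls map() once per line. Returns the number of calls made.
    static int runMapTask(List<String> split, List<String> output) {
        int calls = 0;
        for (String line : split) {
            map(line, output);
            calls++;
        }
        return calls;
    }

    // Divide the lines into numTasks contiguous blocks of near-equal size.
    static List<List<String>> splitByLines(List<String> lines, int numTasks) {
        List<List<String>> splits = new ArrayList<>();
        int base = lines.size() / numTasks;
        int extra = lines.size() % numTasks;
        int start = 0;
        for (int i = 0; i < numTasks; i++) {
            int size = base + (i < extra ? 1 : 0);
            splits.add(new ArrayList<>(lines.subList(start, start + size)));
            start += size;
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= 16; i++) {
            lines.add("line " + i);
        }

        List<String> output = new ArrayList<>();
        int task = 0;
        for (List<String> split : splitByLines(lines, 2)) {
            int calls = runMapTask(split, output);
            System.out.println("map task " + task + ": " + calls + " map() calls");
            task++;
        }
        System.out.println("total map() calls: " + output.size());
    }
}
```

In this sketch each of the two tasks receives a block of 8 lines, yet the map function still runs 16 times in total, which matches the observation above. Handing a whole block of lines to a single map call would need a different record definition than one line per record.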
