Thanks for the fast reply! I've dug into the code a little, and it seems I can achieve my goal by overriding the Mapper.run method: iterate over the whole split using context.nextKeyValue(), and then call map only with the key-value pairs I need. Since I'm a Hadoop novice, I'd like to check: is this the wrong way to think about it?
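
Roughly, the idea looks like the sketch below. It is plain Java, not real Hadoop code: RecordSource is a hypothetical stand-in for the three context methods the loop would use (nextKeyValue, getCurrentKey, getCurrentValue), the key is just a record index, and the "keep only the longest lines" rule is a made-up example of a decision that needs the whole split first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch of "drain the split in run(), then call map() only for chosen records".
class WholeSplitSketch {

    // Hypothetical stand-in for the context methods used below; NOT a Hadoop type.
    interface RecordSource {
        boolean nextKeyValue();
        long getCurrentKey();
        String getCurrentValue();
    }

    // Wraps an in-memory list so the sketch can run without a cluster.
    static RecordSource fromLines(List<String> lines) {
        Iterator<String> it = lines.iterator();
        return new RecordSource() {
            long index = -1;
            String current;
            public boolean nextKeyValue() {
                if (!it.hasNext()) return false;
                current = it.next();
                index++;
                return true;
            }
            public long getCurrentKey() { return index; }
            public String getCurrentValue() { return current; }
        };
    }

    // Collects whatever map() emits; a real job would write to the context instead.
    final List<String> emitted = new ArrayList<>();

    void map(long key, String value) {
        emitted.add(key + "\t" + value);
    }

    void run(RecordSource context) {
        List<Long> keys = new ArrayList<>();
        List<String> values = new ArrayList<>();
        // Pass 1: read the whole split into memory.
        // Caveat: as far as I know, Hadoop may reuse key/value objects between
        // records, so a real Mapper would need to copy them before buffering.
        while (context.nextKeyValue()) {
            keys.add(context.getCurrentKey());
            values.add(context.getCurrentValue());
        }
        // Decision that needs the full split (made-up rule): find the longest line.
        int maxLen = 0;
        for (String v : values) maxLen = Math.max(maxLen, v.length());
        // Pass 2: call map() only for the records that were selected.
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i).length() == maxLen) map(keys.get(i), values.get(i));
        }
    }

    public static void main(String[] args) {
        WholeSplitSketch sketch = new WholeSplitSketch();
        sketch.run(fromLines(Arrays.asList("a", "bbb", "cc", "ddd")));
        // Prints the longest lines together with their record indexes.
        for (String line : sketch.emitted) System.out.println(line);
    }
}
```

One thing I'm aware of: buffering like this keeps the entire split in memory, so it would only work if a split fits in the task's heap.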

Thanks again,
Yaron

On Wed, Oct 12, 2011 at 12:44 PM, Harsh J <ha...@cloudera.com> wrote:
> Hello Yaron,
>
> Yes, this is possible to do.
>
> You need to plug in your own RecordReader implementation into the job,
> to control the emits and the action done before feeding key-value pair
> data into map(…).
>
> On Wed, Oct 12, 2011 at 2:42 PM, Yaron Gonen <yaron.go...@gmail.com>
> wrote:
> > Hi,
> > The map method in the Mapper gets as a parameter a single line from the
> > split. Is there a way for Mappers to get the whole split as input?
> > I'd like to scan the whole split before I decide which key-value pairs to
> > emit to the reducer.
> > Thanks
> > yaron
>
> --
> Harsh J