Thanks for the fast reply!
I've dug into the code a little, and it seems to me that I can achieve my
goal by overriding the Mapper.run method: iterate over the whole split
using context.nextKeyValue(), and then call map only with the values I need.
I'm a novice Hadooper, so is this the wrong way to think about it?
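
To make the question concrete, here is a minimal sketch of the idea. It
does not use the real Hadoop classes: Context, ListContext and
LongestLineMapper are hypothetical stand-ins that only mimic the
nextKeyValue/getCurrentKey/getCurrentValue/write calls, so that it runs
without a cluster. The selection rule (keep only the longest lines) is
just an example, and the approach assumes the whole split fits in memory.

```java
import java.util.ArrayList;
import java.util.List;

public class WholeSplitSketch {

    // Hypothetical stand-in for the few Mapper.Context methods used below.
    // The real Hadoop methods also declare checked exceptions.
    interface Context {
        boolean nextKeyValue();
        Long getCurrentKey();
        String getCurrentValue();
        void write(String key, Integer value);
    }

    // In-memory context: each list element plays the role of one line of the split.
    static class ListContext implements Context {
        private final List<String> lines;
        private int pos = -1;
        final List<String> emitted = new ArrayList<>();

        ListContext(List<String> lines) { this.lines = lines; }

        public boolean nextKeyValue() { return ++pos < lines.size(); }
        public Long getCurrentKey() { return (long) pos; }
        public String getCurrentValue() { return lines.get(pos); }
        public void write(String key, Integer value) { emitted.add(key + "=" + value); }
    }

    static class LongestLineMapper {
        // Ordinary per-record map: emits (line, length).
        void map(Long key, String value, Context context) {
            context.write(value, value.length());
        }

        // Overridden run: first scan the whole split, then call map selectively.
        void run(Context context) {
            List<Long> keys = new ArrayList<>();
            List<String> values = new ArrayList<>();
            int maxLen = 0;
            while (context.nextKeyValue()) {
                // With real Hadoop the reader may reuse key/value objects,
                // so they would probably need to be copied before buffering.
                keys.add(context.getCurrentKey());
                values.add(context.getCurrentValue());
                maxLen = Math.max(maxLen, context.getCurrentValue().length());
            }
            for (int i = 0; i < values.size(); i++) {
                if (values.get(i).length() == maxLen) {
                    map(keys.get(i), values.get(i), context);
                }
            }
        }
    }

    // Runs the mapper over an in-memory "split" and returns what it emitted.
    public static List<String> runOn(List<String> lines) {
        ListContext ctx = new ListContext(lines);
        new LongestLineMapper().run(ctx);
        return ctx.emitted;
    }

    public static void main(String[] args) {
        System.out.println(runOn(List.of("a", "hadoop", "map", "reduce")));
    }
}
```

In this sketch map is still called once per selected record, so the
per-record logic stays where it was; only run changes.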

thanks again,
yaron

On Wed, Oct 12, 2011 at 12:44 PM, Harsh J <ha...@cloudera.com> wrote:

> Hello Yaron,
>
> Yes, this is possible.
>
> You need to plug your own RecordReader implementation into the job,
> to control what is emitted and what happens before each key-value
> pair is fed into map(…).
>
> On Wed, Oct 12, 2011 at 2:42 PM, Yaron Gonen <yaron.go...@gmail.com>
> wrote:
> > Hi,
> > The map method in the Mapper receives a single line of the split as a
> > parameter. Is there a way for a Mapper to get the whole split as input?
> > I'd like to scan the whole split before I decide which key-value pairs to
> > emit to the reducer.
> > Thanks
> > yaron
> >
>
>
>
> --
> Harsh J
>
