Harsh,

Can you please explain how we can use MultipleInputs with a Job object on Hadoop 0.20.2? As you can see, MultipleInputs there takes a JobConf object. I want to use a Job object, as in the new Hadoop 0.21 API. I remember you talked about pulling things out of the new API and adding them to our project. Can you please shed more light on how we can do this?

Thanks,
Praveenesh

On Wed, Sep 7, 2011 at 2:57 AM, Harsh J <ha...@cloudera.com> wrote:

> Sahana,
>
> Yes this is possible as well. Please take a look at the MultipleInputs
> API @
> http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/lib/MultipleInputs.html
>
> It will allow you to add a path each with its own mapper
> implementation, and you can then have a common reducer since the key
> is what you'll be matching against.
>
> On Wed, Sep 7, 2011 at 3:02 PM, Sahana Bhat <sana.b...@gmail.com> wrote:
> > Hi,
> > I understand that given a file, the file is split across 'n' mapper
> > instances, which is the normal case.
> > The scenario i have is :
> > 1. Two files which are not totally identical in terms of number of columns
> > (but have data that is similar in a few columns) need to be processed and
> > after computation a single output file has to be generated.
> > Note : CV - computedvalue
> > File1 belonging to one dataset has data for :
> > Date,counter1,counter2, CV1,CV2
> > File2 belonging to another dataset has data for :
> > Date,counter1,counter2,CV3,CV4,CV5
> > Computation to be carried out on these two files is :
> > CV6 =(CV1*CV5)/100
> > And the final emitted output file should have data in the sequence:
> > Date,counter1,counter2,CV6
> > The idea is to have two mappers (not instances) run on each of the file, and
> > a single reducer that emits the final result file.
> > Thanks,
> > Sahana
> >
> > On Wed, Sep 7, 2011 at 2:40 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Sahana,
> >>
> >> Yes. But, isn't that how it is normally? What makes you question this
> >> capability?
> >>
> >> On Wed, Sep 7, 2011 at 2:37 PM, Sahana Bhat <sana.b...@gmail.com> wrote:
> >> > Hi,
> >> > Is it possible to have multiple mappers where each mapper is
> >> > operating on a different input file and whose result (which is a key
> >> > value pair from different mappers) is processed by a single reducer?
> >> > Regards,
> >> > Sahana
> >>
> >> --
> >> Harsh J
> >
> > --
> > Harsh J
>
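
For reference, the data flow discussed in the quoted thread (one mapper per input file, one common reducer matching on the key) can be sketched in plain Java without any Hadoop classes. This is only an illustration of the logic, not Hadoop code: the class name `TwoMapperJoinSketch`, the `A:`/`B:` value tags, the choice of `Date,counter1,counter2` as the join key, and the sample rows are all assumptions made for the example. In a real job the two map functions would be separate Mapper classes registered per path through MultipleInputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Hadoop-free simulation of "two mappers, one reducer".
public class TwoMapperJoinSketch {

    // Builds the assumed join key from the first three columns.
    static String joinKey(String[] f) {
        return f[0].trim() + "," + f[1].trim() + "," + f[2].trim();
    }

    // Stand-in for the File1 mapper: Date,counter1,counter2,CV1,CV2.
    // Emits (key, "A:" + CV1); the tag tells the reducer where the value came from.
    static String[] mapFile1(String line) {
        String[] f = line.split(",");
        return new String[] { joinKey(f), "A:" + f[3].trim() };
    }

    // Stand-in for the File2 mapper: Date,counter1,counter2,CV3,CV4,CV5.
    // Emits (key, "B:" + CV5).
    static String[] mapFile2(String line) {
        String[] f = line.split(",");
        return new String[] { joinKey(f), "B:" + f[5].trim() };
    }

    // Stand-in for the common reducer: computes CV6 = (CV1 * CV5) / 100.
    // Returns null when a key was seen in only one of the two files.
    static String reduce(String key, List<String> values) {
        Double cv1 = null;
        Double cv5 = null;
        for (String v : values) {
            if (v.startsWith("A:")) {
                cv1 = Double.parseDouble(v.substring(2));
            } else if (v.startsWith("B:")) {
                cv5 = Double.parseDouble(v.substring(2));
            }
        }
        if (cv1 == null || cv5 == null) {
            return null;
        }
        return key + "," + (cv1 * cv5) / 100;
    }

    // Runs both map functions, groups values by key (the "shuffle"), then reduces.
    static List<String> run(List<String> file1, List<String> file2) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String line : file1) {
            String[] kv = mapFile1(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        for (String line : file2) {
            String[] kv = mapFile2(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            String row = reduce(e.getKey(), e.getValue());
            if (row != null) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<String> file1 = Arrays.asList("2011-09-07,1,2,20,7");
        List<String> file2 = Arrays.asList("2011-09-07,1,2,3,4,50");
        for (String row : run(file1, file2)) {
            System.out.println(row);
        }
    }
}
```

Tagging each value with its source is one common way to let a single reducer tell the two inputs apart; whether unmatched keys should be dropped, as they are here, depends on the data.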