Sahana,

Yes, this is possible as well. Please take a look at the MultipleInputs API @
http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/lib/MultipleInputs.html

It lets you add each input path with its own mapper implementation, and you can
then have a common reducer, since the key is what you'll be matching against.
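A minimal, untested sketch of how that could look for your two files (the class
names, the "CV1:"/"CV5:" value tags and the argument layout are placeholders for
your own code, not anything mandated by the API):

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.MultipleInputs;

public class JoinJob {

  // Mapper for File1 (Date,counter1,counter2,CV1,CV2): key on the shared
  // columns and tag the value so the reducer knows which file it came from.
  public static class File1Mapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, Text> out, Reporter r) throws IOException {
      String[] f = line.toString().split(",");
      out.collect(new Text(f[0] + "," + f[1] + "," + f[2]), new Text("CV1:" + f[3]));
    }
  }

  // Mapper for File2 (Date,counter1,counter2,CV3,CV4,CV5): same key, emits CV5.
  public static class File2Mapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, Text> out, Reporter r) throws IOException {
      String[] f = line.toString().split(",");
      out.collect(new Text(f[0] + "," + f[1] + "," + f[2]), new Text("CV5:" + f[5]));
    }
  }

  // The single reducer sees the tagged values from both files for each key
  // and computes CV6 = (CV1*CV5)/100.
  public static class JoinReducer extends MapReduceBase
      implements Reducer<Text, Text, Text, Text> {
    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<Text, Text> out, Reporter r) throws IOException {
      double cv1 = 0, cv5 = 0;
      while (values.hasNext()) {
        String v = values.next().toString();
        if (v.startsWith("CV1:")) cv1 = Double.parseDouble(v.substring(4));
        else if (v.startsWith("CV5:")) cv5 = Double.parseDouble(v.substring(4));
      }
      out.collect(key, new Text(String.valueOf((cv1 * cv5) / 100)));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(JoinJob.class);
    conf.setJobName("two-file-join");

    // One call per input path, each with its own mapper implementation.
    MultipleInputs.addInputPath(conf, new Path(args[0]), TextInputFormat.class, File1Mapper.class);
    MultipleInputs.addInputPath(conf, new Path(args[1]), TextInputFormat.class, File2Mapper.class);

    conf.setReducerClass(JoinReducer.class);
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(conf, new Path(args[2]));

    JobClient.runJob(conf);
  }
}

One thing to watch: TextOutputFormat writes the key and value separated by a tab
by default, so if you need the output strictly as Date,counter1,counter2,CV6 you
can instead emit the whole comma-separated record as the key with a NullWritable
value, or change the key/value separator.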
On Wed, Sep 7, 2011 at 3:02 PM, Sahana Bhat <sana.b...@gmail.com> wrote:
> Hi,
>
> I understand that given a file, the file is split across 'n' mapper
> instances, which is the normal case.
>
> The scenario I have is:
>
> 1. Two files which are not totally identical in terms of number of columns
> (but have data that is similar in a few columns) need to be processed, and
> after computation a single output file has to be generated.
>
> Note: CV - computed value
>
> File1, belonging to one dataset, has data for:
> Date,counter1,counter2,CV1,CV2
>
> File2, belonging to another dataset, has data for:
> Date,counter1,counter2,CV3,CV4,CV5
>
> The computation to be carried out on these two files is:
> CV6 = (CV1*CV5)/100
>
> And the final emitted output file should have data in the sequence:
> Date,counter1,counter2,CV6
>
> The idea is to have two mappers (not instances) run on each of the files,
> and a single reducer that emits the final result file.
>
> Thanks,
> Sahana
>
> On Wed, Sep 7, 2011 at 2:40 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>> Sahana,
>>
>> Yes. But isn't that how it is normally? What makes you question this
>> capability?
>>
>> On Wed, Sep 7, 2011 at 2:37 PM, Sahana Bhat <sana.b...@gmail.com> wrote:
>> > Hi,
>> >
>> > Is it possible to have multiple mappers where each mapper is
>> > operating on a different input file and whose result (which is a
>> > key-value pair from different mappers) is processed by a single
>> > reducer?
>> >
>> > Regards,
>> > Sahana
>>
>> --
>> Harsh J

--
Harsh J