I believe I was able to get multiple reducers by setting the property
mapred.reduce.tasks = 10 in the MR code.


Try setting this. If it does not work, I will check my code and let you know.
But it is doable.
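
To illustrate why a higher mapred.reduce.tasks spreads the work: Hadoop's default HashPartitioner sends each key to reducer (hashCode & Integer.MAX_VALUE) % numReduceTasks. Below is a minimal Python sketch of that rule, assuming integer keys whose hash is the integer itself (as with IntWritable). The function names and the figure of 200 keys are hypothetical, chosen only for illustration.

```python
from collections import Counter

INT_MAX = 2**31 - 1  # Java's Integer.MAX_VALUE


def reducer_for(key_hash: int, num_reduce_tasks: int) -> int:
    """Mimic HashPartitioner: (hashCode & Integer.MAX_VALUE) % numReduceTasks."""
    return (key_hash & INT_MAX) % num_reduce_tasks


def keys_per_reducer(num_keys: int, num_reduce_tasks: int) -> Counter:
    """Count how many of the keys 0..num_keys-1 land on each reducer."""
    return Counter(reducer_for(k, num_reduce_tasks) for k in range(num_keys))


if __name__ == "__main__":
    # With one reducer every key goes to reducer 0; with ten they spread out.
    print(keys_per_reducer(200, 1))
    print(keys_per_reducer(200, 10))
```

With a single reducer all keys end up in one task, which matches the "only single reducer is working" symptom; with 10 reducers, consecutive integer keys are spread evenly.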

The multiplication was tricky on the mapper side. The reducer part was
easy.


On Tuesday, 29 January 2013, Stuti Awasthi wrote:

> Hey Satish,
> Thanks a ton. It worked for me too. Is there any way to increase the number
> of reducers as well? Currently only a single reducer is running.
>
> Thanks
> Stuti
>
> -----Original Message-----
> From: satish verma [mailto:[email protected]]
> Sent: Monday, January 28, 2013 7:13 PM
> To: [email protected]
> Subject: Re: MatrixMultiplicationJob runs with 1 mapper only ?
>
> I faced this problem too.
>
> Split the sequence file that holds your data into multiple files. Then run
> the matrix multiplication with the folder as input. If the folder contains
> N sequence files, N mappers will be created.
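
The splitting step can be sketched as follows. This is only an in-memory illustration in Python: the rows are plain (row_index, vector) tuples standing in for sequence-file records, and the helper name split_rows is hypothetical. Actually reading and writing the part files would go through Hadoop's SequenceFile reader and writer, which is not shown here. Because the job joins its two inputs with CompositeInputFormat, which expects its sources to be sorted and partitioned the same way, the sketch keeps the rows sorted and cuts contiguous ranges, on the assumption that both matrices would be split with the same boundaries.

```python
def split_rows(rows, num_parts):
    """Split (row_index, vector) pairs into num_parts contiguous, sorted chunks.

    Chunk sizes differ by at most one row; earlier chunks get the extras.
    """
    rows = sorted(rows, key=lambda kv: kv[0])
    base, extra = divmod(len(rows), num_parts)
    parts, start = [], 0
    for p in range(num_parts):
        size = base + (1 if p < extra else 0)
        parts.append(rows[start:start + size])
        start += size
    return parts


if __name__ == "__main__":
    # Ten toy rows split into three parts, each destined for its own file.
    toy_rows = [(i, [float(i)]) for i in range(10)]
    for n, part in enumerate(split_rows(toy_rows, 3)):
        print(n, [idx for idx, _ in part])
```

Each returned chunk would then be written to its own file in the input folder, giving one mapper per file.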
>
>
>
> On Monday, 28 January 2013, Sean Owen wrote:
>
> > These are settings to Hadoop, not Mahout. You may need to set them in
> > your cluster config. They are still only suggestions.
> >
> > The question still remains why you think you need several mappers. Why?
> >
> > On Mon, Jan 28, 2013 at 1:28 PM, Stuti Awasthi <[email protected]>
> > wrote:
> > > Hi,
> > > I would like to again consolidate all the steps which I performed.
> > >
> > > Issue: the MatrixMultiplication example is executed with only 1 map
> > > task.
> > >
> > > Steps :
> > > 1. I created a 104MB file, which is divided into 11 blocks of 10MB
> > > each. The file contains a 200x100000 matrix.
> > > 2. I exported $MAHOUT_OPTS as follows:
> > >           $   echo $MAHOUT_OPTS
> > >           -Dmapred.min.split.size=10485760 -Dmapred.map.tasks=7
> > > 3. I tried to execute the matrix multiplication example from the
> > > command line:
> > > mahout matrixmult --inputPathA /test/points/matrixA --numRowsA 200
> > > --numColsA 100000 --inputPathB /test/points/matrixA --numRowsB 200
> > > --numColsB 100000 --tempDir /test/temp
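
For reference, here is roughly what these settings would yield for a plain, splittable file read through the old-API FileInputFormat, which picks a split size of max(minSize, min(goalSize, blockSize)) with goalSize = totalSize / mapred.map.tasks. The Python sketch below assumes the file is exactly 104 MiB and the block size exactly 10 MiB; the function names are hypothetical and it models only that formula, not CompositeInputFormat.

```python
SPLIT_SLOP = 1.1  # Hadoop lets the final split run up to 10% over the split size
MB = 1024 * 1024


def compute_split_size(goal_size, min_size, block_size):
    """Old-API FileInputFormat rule: max(minSize, min(goalSize, blockSize))."""
    return max(min_size, min(goal_size, block_size))


def count_splits(file_size, requested_maps, min_size, block_size):
    """Return (split_size, number_of_splits) for one splittable file."""
    goal_size = file_size // max(requested_maps, 1)
    split_size = compute_split_size(goal_size, min_size, block_size)
    splits, remaining = 0, file_size
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # the tail becomes the last split
    return split_size, splits


if __name__ == "__main__":
    # Values from the steps above: 104MB file, min split 10485760, 7 map tasks.
    print(count_splits(104 * MB, 7, 10485760, 10 * MB))
```

Under those assumptions the split size comes out at 10MB and the file yields 11 splits, so mapred.map.tasks=7 is only a hint here. That a single map task still runs suggests the limit comes from how the job's input format builds its splits rather than from these two properties.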
> > >
> > > When I check the JobTracker UI, it shows the following for the running
> > > job:
> > > Running Map Tasks : 1
> > > Occupied Map Slots: 1
> > >
> > > How can I distribute the map work across several mappers for the
> > > MatrixMultiplication job dynamically?
> > > Is it even possible for MatrixMultiplication to run distributed on
> > > multiple mappers, given that it internally uses CompositeInputFormat?
> > >
> > > Please suggest.
> > >
> > > Thanks
> > > Stuti
> > >
> > >
> > > -----Original Message-----
> > > From: Sean Owen [mailto:[email protected]]
> > > Sent: Wednesday, January 23, 2013 6:42 PM
> > > To: Mahout User List
> > > Subject: Re: MatrixMultiplicationJob runs with 1 mapper only ?
> > >
> > > Mappers are usually extremely fast since they start themselves on top of
> > > the data and their job is usually just parsing and emitting key-value
> > > pairs. Hadoop's choices are usually fine.
> > >
> > > If not, it is usually because the mapper is emitting far more data than
> > > it ingests. Are you computing some kind of Cartesian product of the input?
> > >
> > > That's slow no matter what. More mappers may increase parallelism, but
> > > it's still a lot of I/O. Avoid it if you can by sampling or pruning
> > > unimportant values. Otherwise, try to implement a Combiner.
> > > On Jan 23, 2013 12:04 PM, "Jonas Grote" <[email protected]> wrote:
> > >
> > >> I'd play with the mapred.map.tasks option. Setting it to something
> > >> bigger than 1 gave me performance improvements for various hadoop
> > >> jobs on my cluster.
> > >>
> > >>
> > >> 2013/1/16 Ashish <[email protected]>
> > >>
> > >> > I am afraid I don't know the answer. Need to experiment a bit more. I
> > >> > have not used CompositeInputFormat so cannot comment.
> > >> >
> > >> > Probably someone else on the ML (mailing list) will be able to guide
> > >> > here.
> > >> >
> > >> >
> > >> > On Wed, Jan 16, 2013 at 6:01 PM, Stuti Awasthi
