I think I was able to create multiple reducers by setting the property mapred.reduce.tasks = 10 in the MR code.
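To illustrate why raising the reducer count spreads the work, here is a rough plain-Java sketch. It is not the Mahout or Hadoop code itself. It only mimics the arithmetic of Hadoop's default hash partitioner, (hash & Integer.MAX_VALUE) % numReduceTasks, on integer row keys. The class name ReducerSpread, its methods, and the choice of 200 row keys (borrowed from the row count mentioned in this thread) are all made up for the example. In a real job the count would be set through the job configuration, for instance with JobConf.setNumReduceTasks(10).

```java
import java.util.Arrays;

// Illustrative only: mimics the default hash partitioner rule
// (hash & Integer.MAX_VALUE) % numReduceTasks for integer row keys.
// ReducerSpread and its methods are hypothetical names, not Hadoop or Mahout APIs.
final class ReducerSpread {

    // For an integer key the hash code is the integer itself,
    // so row i lands on reducer i % numReduceTasks.
    static int partitionFor(int rowIndex, int numReduceTasks) {
        return (Integer.hashCode(rowIndex) & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many of the row keys 0..numRows-1 land on each reducer.
    static int[] rowsPerReducer(int numRows, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (int row = 0; row < numRows; row++) {
            counts[partitionFor(row, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // One reducer receives every key.
        System.out.println(Arrays.toString(rowsPerReducer(200, 1)));
        // Ten reducers receive 20 keys each.
        System.out.println(Arrays.toString(rowsPerReducer(200, 10)));
    }
}
```

With a single reducer every key ends up in the same place, which matches the "only single reducer is working" symptom. With ten, the same keys are divided evenly.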
Try setting this. If it does not work, I will check my code and let you know. But it is doable. The multiplication part was tricky on the mapper side. The reducer part was easy.

On Tuesday, 29 January 2013, Stuti Awasthi wrote:

> Hey Satish,
> Thanks a ton. It worked for me also. Is there any way to increase the
> number of reducers as well? Currently only a single reducer is working.
>
> Thanks
> Stuti
>
> -----Original Message-----
> From: satish verma [mailto:[email protected]]
> Sent: Monday, January 28, 2013 7:13 PM
> To: [email protected]
> Subject: Re: MatrixMultiplicationJob runs with 1 mapper only ?
>
> I faced this problem too.
>
> Split the seq file that holds your data into multiple files. Then run
> the matrix multiplication with the folder as input. If the folder contains
> N sequence files, N mappers will be created.
>
> On Monday, 28 January 2013, Sean Owen wrote:
>
> > These are settings to Hadoop, not Mahout. You may need to set them in
> > your cluster config. They are still only suggestions.
> >
> > The question still remains why you think you need several mappers. Why?
> >
> > On Mon, Jan 28, 2013 at 1:28 PM, Stuti Awasthi <[email protected]> wrote:
> > > Hi,
> > > I would like to again consolidate all the steps which I performed.
> > >
> > > Issue: The MatrixMultiplication example is getting executed with only 1 map task.
> > >
> > > Steps:
> > > 1. I created a file of size 104MB, which is divided into 11 blocks of
> > >    size 10MB each. The file contains a matrix of size 200x100000.
> > > 2. I exported $MAHOUT_OPTS as follows:
> > >    $ echo $MAHOUT_OPTS
> > >    -Dmapred.min.split.size=10485760 -Dmapred.map.tasks=7
> > > 3. Tried to execute the matrix multiplication example from the command line:
> > >    mahout matrixmult --inputPathA /test/points/matrixA --numRowsA 200 --numColsA 100000 --inputPathB /test/points/matrixA --numRowsB 200 --numColsB 100000 --tempDir /test/temp
> > >
> > > When I check the Jobtracker UI, it shows me the following for the running job:
> > > Running Map Tasks : 1
> > > Occupied Map Slots: 1
> > >
> > > How can I distribute the map task over different mappers for the
> > > MatrixMultiplication job dynamically?
> > > Is it even possible for MatrixMultiplication to run distributed on
> > > multiple mappers, given that it internally uses CompositeInputFormat?
> > >
> > > Please suggest.
> > >
> > > Thanks
> > > Stuti
> > >
> > > -----Original Message-----
> > > From: Sean Owen [mailto:[email protected]]
> > > Sent: Wednesday, January 23, 2013 6:42 PM
> > > To: Mahout User List
> > > Subject: Re: MatrixMultiplicationJob runs with 1 mapper only ?
> > >
> > > Mappers are usually extremely fast since they start themselves on top of
> > > the data and their job is usually just parsing and emitting key value
> > > pairs. Hadoop's choices are usually fine.
> > >
> > > If not, it is usually because the mapper is emitting far more data than
> > > it ingests. Are you computing some kind of Cartesian product of input?
> > >
> > > That's slow no matter what. More mappers may increase parallelism but
> > > it's still a lot of I/O. Avoid it if you can by sampling or pruning
> > > unimportant values. Otherwise, try to implement a Combiner.
> > >
> > > On Jan 23, 2013 12:04 PM, "Jonas Grote" <[email protected]> wrote:
> > >
> > >> I'd play with the mapred.map.tasks option. Setting it to something
> > >> bigger than 1 gave me performance improvements for various hadoop
> > >> jobs on my cluster.
> > >>
> > >> 2013/1/16 Ashish <[email protected]>
> > >>
> > >> > I am afraid I don't know the answer. Need to experiment a bit more.
> > >> > I have not used CompositeInputFormat so cannot comment.
> > >> >
> > >> > Probably, someone else on the ML (Mailing List) would be able to
> > >> > guide here.
> > >> >
> > >> > On Wed, Jan 16, 2013 at 6:01 PM, Stuti Awasthi wrote:
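A side note on the split settings quoted above, as a rough plain-Java sketch rather than actual Hadoop code. It assumes the old-API FileInputFormat rule splitSize = max(minSize, min(goalSize, blockSize)) and plugs in the numbers from Stuti's steps (104MB file, 10MB blocks, mapred.min.split.size=10485760, mapred.map.tasks=7). My understanding, which is an assumption from reading the Hadoop 1.x source and not something confirmed in this thread, is that CompositeInputFormat overrides the minimum split size with Long.MAX_VALUE, which would explain one mapper per input file regardless of those options. The class name SplitEstimate and its methods are made up for the example.

```java
// Illustrative only: approximates the old-API FileInputFormat split sizing.
// SplitEstimate and its methods are hypothetical names, not Hadoop APIs.
final class SplitEstimate {

    // splitSize = max(minSize, min(goalSize, blockSize)),
    // where goalSize = totalSize / requested number of map tasks.
    static long splitSize(long totalSize, int requestedMaps, long minSize, long blockSize) {
        long goalSize = totalSize / Math.max(requestedMaps, 1);
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Number of splits for one file, ignoring the small slack Hadoop
    // allows on the last split. Written to avoid overflow for huge split sizes.
    static long splitCount(long totalSize, long splitSize) {
        return totalSize / splitSize + (totalSize % splitSize == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long total = 104 * mb;
        long block = 10 * mb;

        // Settings from the thread: min split 10485760 bytes, 7 requested maps.
        long plain = splitSize(total, 7, 10485760L, block);
        System.out.println(splitCount(total, plain));   // prints 11

        // Assumed CompositeInputFormat behaviour: min split forced to Long.MAX_VALUE.
        long forced = splitSize(total, 7, Long.MAX_VALUE, block);
        System.out.println(splitCount(total, forced));  // prints 1
    }
}
```

If that assumption holds, it is consistent with the advice to split the sequence file into N files: each file becomes exactly one split, so N files give N mappers.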
