RE: MatrixMultiplicationJob runs with 1 mapper only ?

Stuti Awasthi Tue, 29 Jan 2013 05:48:34 -0800

Hey Satish,
Thanks a ton. It worked for me too. Is there any way to increase the number of 
reducers as well? Currently only a single reducer is running.


Thanks
Stuti

-----Original Message-----
From: satish verma [mailto:[email protected]] 
Sent: Monday, January 28, 2013 7:13 PM
To: [email protected]
Subject: Re: MatrixMultiplicationJob runs with 1 mapper only ?

I faced this problem too.

Split the sequence file that holds your data into multiple files, then run the 
matrix multiplication with the folder as input. If the folder contains N 
sequence files, N mappers will be created.
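
The splitting step can be sketched as below. This is a minimal illustration only, in plain Python rather than the Hadoop SequenceFile API: the (row index, values) tuples stand in for sequence-file records, and the names Row and split_rows are hypothetical. Each row keeps its original key, so the chunks together still describe the same matrix.

```python
from typing import List, Sequence, Tuple

# (row index, row values): a stand-in for one sequence-file record (assumption)
Row = Tuple[int, List[float]]


def split_rows(rows: Sequence[Row], n_parts: int) -> List[List[Row]]:
    """Partition rows into n_parts contiguous chunks whose sizes differ by at most one."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    base, extra = divmod(len(rows), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional row each.
        size = base + (1 if i < extra else 0)
        parts.append(list(rows[start:start + size]))
        start += size
    return parts


# Toy 5x2 matrix; each chunk would be written to its own file in the input folder.
matrix = [(i, [float(i), float(i) * 2]) for i in range(5)]
parts = split_rows(matrix, 3)
part_sizes = [len(p) for p in parts]
kept_keys = [key for p in parts for key, _ in p]
```

Contiguous chunks keep the row keys in order inside each file. Because the job reads matrix A and matrix B together, it seems safest to split both inputs at the same row boundaries; that is an assumption here, not something confirmed in this thread.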



On Monday, 28 January 2013, Sean Owen wrote:

> These are settings for Hadoop, not Mahout. You may need to set them in 
> your cluster config. They are still only suggestions.
>
> The question still remains why you think you need several mappers. Why?
>
> On Mon, Jan 28, 2013 at 1:28 PM, Stuti Awasthi <[email protected]>
> wrote:
> > Hi,
> > I would like to again consolidate all the steps which I performed.
> >
> > Issue: The MatrixMultiplication example runs with only 1 map task.
> >
> > Steps :
> > 1. I created a 104MB file, which is divided into 11 blocks of 10MB
> >    each. The file contains a 200x100000 matrix.
> > 2. I exported $MAHOUT_OPTS as follows:
> >           $   echo $MAHOUT_OPTS
> >           -Dmapred.min.split.size=10485760 -Dmapred.map.tasks=7
> > 3. I tried to execute the matrix multiplication example from the
> >    command line:
> >    mahout matrixmult --inputPathA /test/points/matrixA --numRowsA 200 \
> >      --numColsA 100000 --inputPathB /test/points/matrixA --numRowsB 200 \
> >      --numColsB 100000 --tempDir /test/temp
> >
> > When I check the JobTracker UI, it shows the following for the
> > running job:
> > Running Map Tasks : 1
> > Occupied Map Slots: 1
> >
> > How can I distribute the map work for the MatrixMultiplication job
> > across several mappers dynamically?
> > Is it even possible for MatrixMultiplication to run distributed over
> > multiple mappers, given that it internally uses CompositeInputFormat?
> >
> > Please suggest.
> >
> > Thanks
> > Stuti
> >
> >
> > -----Original Message-----
> > From: Sean Owen [mailto:[email protected]]
> > Sent: Wednesday, January 23, 2013 6:42 PM
> > To: Mahout User List
> > Subject: Re: MatrixMultiplicationJob runs with 1 mapper only ?
> >
> > Mappers are usually extremely fast since they start themselves on
> > top of the data and their job is usually just parsing and emitting
> > key-value pairs. Hadoop's choices are usually fine.
> >
> > If not, it is usually because the mapper is emitting far more data
> > than it ingests. Are you computing some kind of Cartesian product of
> > the input?
> >
> > That's slow no matter what. More mappers may increase parallelism,
> > but it's still a lot of I/O. Avoid it if you can by sampling or
> > pruning unimportant values. Otherwise, try to implement a Combiner.
> > On Jan 23, 2013 12:04 PM, "Jonas Grote" <[email protected]> wrote:
> >
> >> I'd play with the mapred.map.tasks option. Setting it to something 
> >> bigger than 1 gave me performance improvements for various hadoop 
> >> jobs on my cluster.
> >>
> >>
> >> 2013/1/16 Ashish <[email protected]>
> >>
> >> > I am afraid I don't know the answer. Need to experiment a bit more.
> >> > I have not used CompositeInputFormat so cannot comment.
> >> >
> >> > Probably someone else on the ML (mailing list) would be able to
> >> > guide here.
> >> >
> >> >
> >> > On Wed, Jan 16, 2013 at 6:01 PM, Stuti Awasthi 
> >> > <[email protected]>
> >> > wrote:
> >> >
> >> > > Thanks Ashish,
> >> > >
> >> > > So according to the link, if one is using CompositeInputFormat 
> >> > > then it will take the entire file as input to a mapper without 
> >> > > considering InputSplits/blocksize.
> >> > > If I understand correctly, it is asking us to break 
> >> > > [Original Input File] -> [file1, file2, ...].
> >> > >
> >> > > So my file would become [/test/MatrixA] --> [/test/smallfiles/file1, 
> >> > > /test/smallfiles/file2, /test/smallfiles/file3, ...]
> >> > >
> >> > > Will the input path in MatrixMultiplicationJob then be the 
> >> > > directory path: /test/smallfiles ?
> >> > >
> >> > > Will breaking the file up in this manner cause problems in the 
> >> > > algorithmic execution of the MR job? I'm not sure the output will be 
> >> > > correct.
> >> > >
> >> > > -----Original Message-----
> >> > > From: Ashish [mailto:[email protected]]
> >> > > Sent: Wednesday, Januar
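
The arithmetic behind the settings quoted in this thread (a 104MB file in 10MB blocks, mapred.min.split.size=10485760, mapred.map.tasks=7) can be sketched as follows. This is a rough Python model of the split-size rule used by Hadoop's classic file input format, max(minSize, min(totalSize / requestedMaps, blockSize)), not Mahout or Hadoop code; count_splits is a hypothetical name. The second call assumes, as the thread suggests, that CompositeInputFormat never splits a file, which is modelled here by a very large minimum split size.

```python
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than the split size


def count_splits(file_size: int, block_size: int,
                 min_split_size: int, requested_maps: int) -> int:
    """Count the input splits for a single file under the classic split-size rule."""
    goal_size = file_size // max(requested_maps, 1)
    split_size = max(min_split_size, min(goal_size, block_size))
    remaining, splits = file_size, 0
    while remaining / split_size > SPLIT_SLOP:
        remaining -= split_size
        splits += 1
    if remaining > 0:
        splits += 1  # whatever is left becomes the final split
    return splits


MB = 1024 * 1024
# Plain file input with the values from the thread.
plain = count_splits(104 * MB, 10 * MB, 10485760, 7)
# Assumption: the join-style input behaves as if the minimum split size were huge.
forced = count_splits(104 * MB, 10 * MB, 2**63 - 1, 7)
```

Under these assumptions the plain case yields 11 splits, while the forced case yields a single split no matter what mapred.map.tasks requests. That matches the one running map task reported above and is consistent with the advice to split the input into several files.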


