Hi Sean,

Yes, I tried using those parameters but they didn't seem to have any
effect.  What's more, the number of reducers never increased above 1, so
I never got to see any results when running matrix multiplication on
large data sets.

I looked in the code to find where these parameters were being read by the
jobs that I was using (i.e. MatrixMultiplicationJob and TransposeJob) but
couldn't find them.  As a result, I modified their builders and called the
setNumMapTasks and setNumReduceTasks methods on the conf objects.  This
now works from the command line using the parameters that you suggested.
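For reference, the change I made looks roughly like this (a simplified
sketch, not Mahout's actual builder code -- the real plumbing in
DistributedRowMatrix differs, and the task counts here are illustrative):

```java
import org.apache.hadoop.mapred.JobConf;

// Simplified sketch: set task counts on the job's conf before submission.
JobConf conf = new JobConf();
// ... existing setup for MatrixMultiplicationJob / TransposeJob ...
conf.setNumMapTasks(10);    // only a hint; the InputFormat decides the actual split count
conf.setNumReduceTasks(5);  // honored directly by the framework
```

As I understand it, setNumMapTasks is only a hint to Hadoop, whereas
setNumReduceTasks is respected directly, which may be why the reducer count
was the one stuck at 1.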

Please do let me know if I was just not calling them correctly or if you
think that there already exists an alternative way to do this.  I would like
to use Mahout as it was intended and not make lots of little changes myself
if they aren't necessary.

Thanks,
Kris



2010/6/11 Sean Owen <[email protected]>

> -Dmapred.map.tasks and same for reduce? These should be Hadoop params
> you set directly to Hadoop.
>
> On Fri, Jun 11, 2010 at 5:07 PM, Kris Jack <[email protected]> wrote:
> > Hi everyone,
> >
> > I am running code that uses some of the jobs defined in the
> > DistributedRowMatrix class and would like to know if I can define the
> > number of mappers and reducers that they use when running?  In
> > particular, with the jobs:
> >
> > - MatrixMultiplicationJob
> > - TransposeJob
> >
> > I am comfortable with changing the code to get this to work, but I was
> > wondering if the algorithmic logic being employed would allow multiple
> > mappers and reducers.
> >
> > Thanks,
> > Kris
> >
>



-- 
Dr Kris Jack,
http://www.mendeley.com/profiles/kris-jack/
