Over to dev list:

Sean, we currently have some jobs which accept the numbers of mappers and reducers as optional command arguments, and others that require -D arguments to control the same, as you have written. It seems our usability would improve if we adopted a consistent policy across all Mahout components. If so, would you argue that all should use -D arguments for this control? What about situations where our default is not whatever Hadoop does by default? Would this result in noticeable behavior changes? Also, some algorithms don't work with arbitrary numbers of reducers and some don't use reducers at all. What would you suggest?

Jeff


On 6/11/10 9:35 AM, Sean Owen wrote:
-Dmapred.map.tasks and same for reduce? These should be Hadoop params
you set directly to Hadoop.
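
For what it's worth, a sketch of how those -D options look on the command line, assuming the job is launched through Hadoop's ToolRunner so that GenericOptionsParser picks up the -D flags (they must precede the job-specific arguments). The jar name, class, and paths here are placeholders, not something from this thread:

```shell
# Sketch only: passing map/reduce task counts as generic Hadoop options.
# Jar, class, and paths are placeholders; adjust to your installation.
hadoop jar mahout-examples-job.jar \
  org.apache.mahout.math.hadoop.TransposeJob \
  -Dmapred.map.tasks=20 \
  -Dmapred.reduce.tasks=10 \
  --input /path/to/rowMatrix \
  --output /path/to/transpose
```

Note that mapred.map.tasks is only a hint to Hadoop (the InputFormat's split count usually wins), while mapred.reduce.tasks is honored directly; and the -D flags only take effect for jobs that parse generic options via ToolRunner.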

On Fri, Jun 11, 2010 at 5:07 PM, Kris Jack <[email protected]> wrote:
Hi everyone,

I am running code that uses some of the jobs defined in the
DistributedRowMatrix class and would like to know if I can define the number
of mappers and reducers that they use when running?  In particular, with the
jobs:

- MatrixMultiplicationJob
- TransposeJob

I am comfortable with changing the code to get this to work, but I
was wondering whether the algorithmic logic being employed would allow multiple
mappers and reducers.

Thanks,
Kris
