[ https://issues.apache.org/jira/browse/MAHOUT-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Eastman updated MAHOUT-749:
--------------------------------
Attachment: MAHOUT-749.patch
This patch changes the driver and mapper so that MeanShift can use multiple
reducers. The driver now decreases the number of reducers in each iteration
until only one remains. The mapper now sends each of its outputs to a different
reducer, depending on how many are deployed in that iteration. The unit tests
have been updated and run. The patch is ready for experimentation with larger
datasets and with multiple reducers specified via -Dmapred.reduce.tasks.
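
To illustrate the idea only (this is not the code in the attached patch), here is a
minimal sketch in plain Java with no Hadoop dependency. The class and method names
are hypothetical. Halving the reducer count each iteration and routing outputs
round-robin by index are both assumptions; the comment above only says that the
count decreases to 1 and that outputs go to different reducers.

```java
import java.util.ArrayList;
import java.util.List;

public class ReducerScheduleSketch {

  // Assumed schedule: halve the reducer count each iteration, never below 1.
  static int nextNumReducers(int current) {
    return Math.max(1, current / 2);
  }

  // Assumed routing: the n-th output of a mapper goes to reducer (n mod numReducers).
  static int reducerFor(int outputIndex, int numReducers) {
    return outputIndex % numReducers;
  }

  // Reducer counts used by successive iterations, starting from the
  // requested count and ending at a single reducer.
  static List<Integer> schedule(int initial) {
    List<Integer> counts = new ArrayList<>();
    int n = Math.max(1, initial);
    counts.add(n);
    while (n > 1) {
      n = nextNumReducers(n);
      counts.add(n);
    }
    return counts;
  }

  public static void main(String[] args) {
    System.out.println("reducers per iteration: " + schedule(8));
    for (int i = 0; i < 5; i++) {
      System.out.println("output " + i + " -> reducer " + reducerFor(i, 3));
    }
  }
}
```

In a real job the starting count would come from -Dmapred.reduce.tasks, and the
routing would presumably be expressed through the mapper's output keys rather than
an explicit index.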
> MeanShift Cannot Use Multiple Reducers
> --------------------------------------
>
> Key: MAHOUT-749
> URL: https://issues.apache.org/jira/browse/MAHOUT-749
> Project: Mahout
> Issue Type: Improvement
> Reporter: Jeff Eastman
> Assignee: Jeff Eastman
> Attachments: MAHOUT-749.patch
>
>
> The MeanShiftCanopy clustering job sets numReducers=1, which severely
> limits its scalability for larger jobs.