[
https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898343#action_12898343
]
Sean Owen commented on MAHOUT-473:
----------------------------------
I am not sure what you mean. Settings like "-Dmapred.reduce.tasks" are
parameters for Hadoop, not for the Mahout job. They are passed to Hadoop on
the command line and processed there. Hadoop also passes them on to Mahout,
although Mahout usually doesn't care about this value. So no, they are not
missing; they are available.
You'd have to provide a clear description and patch of what you think the
problem is if you think there is still an issue here.
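To illustrate the split described above, here is a minimal sketch of how "-Dkey=value" arguments can be separated from a job's own options. It is a simplified stand-in written for this message, not Hadoop or Mahout code: the class name GenericOptionSketch and its fields are hypothetical, and it only handles the joined "-Dkey=value" form. In Hadoop itself this kind of handling is done by the generic options parsing behind ToolRunner, which places the properties in the job Configuration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for generic option handling; not Hadoop code. */
public class GenericOptionSketch {

  /** Properties collected from -Dkey=value arguments, in the order given. */
  public final Map<String, String> properties = new LinkedHashMap<>();

  /** All other arguments, left for the job itself to interpret. */
  public final List<String> remaining = new ArrayList<>();

  public GenericOptionSketch(String[] args) {
    for (String arg : args) {
      int eq = arg.indexOf('=');
      // Treat only "-D<key>=<value>" with a non-empty key as a property.
      if (arg.startsWith("-D") && eq > 2) {
        properties.put(arg.substring(2, eq), arg.substring(eq + 1));
      } else {
        remaining.add(arg);
      }
    }
  }

  public static void main(String[] args) {
    GenericOptionSketch parsed = new GenericOptionSketch(new String[] {
        "-Dmapred.reduce.tasks=10", "--numberOfColumns", "100" });
    System.out.println("framework properties: " + parsed.properties);
    System.out.println("job arguments: " + parsed.remaining);
  }
}
```

In this sketch the reduce-task setting ends up in the property map, while "--numberOfColumns 100" is left for the job to parse, which mirrors the division of responsibility described above.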
> add parameter -Dmapred.reduce.tasks when call job RowSimilarityJob in
> RecommenderJob
> ------------------------------------------------------------------------------------
>
> Key: MAHOUT-473
> URL: https://issues.apache.org/jira/browse/MAHOUT-473
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Affects Versions: 0.4
> Reporter: Han Hui Wen
> Assignee: Sean Owen
> Attachments: screenshot-1.jpg
>
>
> In RecommenderJob
> {code:title=RecommenderJob.java|borderStyle=solid}
> int numberOfUsers = TasteHadoopUtils.readIntFromFile(getConf(), countUsersPath);
> if (shouldRunNextPhase(parsedArgs, currentPhase)) {
>   /* Once DistributedRowMatrix uses the hadoop 0.20 API, we should refactor this call to something like
>    * new DistributedRowMatrix(...).rowSimilarity(...) */
>   try {
>     RowSimilarityJob.main(new String[] {
>         "-Dmapred.input.dir=" + maybePruneItemUserMatrixPath.toString(),
>         "-Dmapred.output.dir=" + similarityMatrixPath.toString(),
>         "--numberOfColumns", String.valueOf(numberOfUsers),
>         "--similarityClassname", similarityClassname,
>         "--maxSimilaritiesPerRow", String.valueOf(maxSimilaritiesPerItemConsidered + 1),
>         "--tempDir", tempDirPath.toString() });
>   } catch (Exception e) {
>     throw new IllegalStateException("item-item-similarity computation failed", e);
>   }
> }
> {code}
> We do not pass the parameter -Dmapred.reduce.tasks when calling
> RowSimilarityJob. As a result, all three RowSimilarityJob sub-jobs run with
> 1 map task and 1 reduce task, so the sub-jobs cannot scale.