[ https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898343#action_12898343 ]

Sean Owen commented on MAHOUT-473:
----------------------------------

I am not sure what you mean. Settings like "-Dmapred.reduce.tasks" are 
parameters for Hadoop, not for the Mahout job. They are passed on the command 
line to Hadoop, which processes them. Hadoop then passes them on to Mahout, 
though Mahout usually doesn't care about this value. So no, they are not 
missing; they are available.
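
To illustrate the split between Hadoop settings and job arguments described 
above, here is a minimal, self-contained sketch. It is not Hadoop's own 
parser: the class GenericOptionsSketch and its parse method are hypothetical 
stand-ins that only handle the joined "-Dkey=value" form, and a plain Map 
stands in for the Hadoop configuration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for generic option handling.
// It only illustrates the idea; it is not Hadoop's implementation.
public class GenericOptionsSketch {

  /** Holds the parsed properties and the arguments left over for the job. */
  static final class Parsed {
    final Map<String, String> properties = new LinkedHashMap<>();
    final List<String> remainingArgs = new ArrayList<>();
  }

  /** Splits "-Dkey=value" settings from all other arguments. */
  static Parsed parse(String[] args) {
    Parsed parsed = new Parsed();
    for (String arg : args) {
      int eq = arg.indexOf('=');
      if (arg.startsWith("-D") && eq > 2) {
        // Setting for the framework: store it as a key/value property.
        parsed.properties.put(arg.substring(2, eq), arg.substring(eq + 1));
      } else {
        // Anything else is left for the job itself to interpret.
        parsed.remainingArgs.add(arg);
      }
    }
    return parsed;
  }

  public static void main(String[] args) {
    Parsed parsed = parse(new String[] {
        "-Dmapred.reduce.tasks=10", "--numberOfColumns", "100" });
    System.out.println(parsed.properties);    // {mapred.reduce.tasks=10}
    System.out.println(parsed.remainingArgs); // [--numberOfColumns, 100]
  }
}
```

In this sketch the job only ever sees the remaining arguments, while the 
property stays readable from the parsed settings, which is the sense in which 
such values are "available" without the job handling them explicitly.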

If you think there is still an issue here, you'd have to provide a clear 
description of the problem and a patch.

> add parameter -Dmapred.reduce.tasks when call job RowSimilarityJob in 
> RecommenderJob
> ------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-473
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-473
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.4
>            Reporter: Han Hui Wen 
>            Assignee: Sean Owen
>         Attachments: screenshot-1.jpg
>
>
> In RecommenderJob
> {code:title=RecommenderJob.java|borderStyle=solid}
>     int numberOfUsers = TasteHadoopUtils.readIntFromFile(getConf(), countUsersPath);
>     if (shouldRunNextPhase(parsedArgs, currentPhase)) {
>       /* Once DistributedRowMatrix uses the hadoop 0.20 API, we should refactor this call to something like
>        * new DistributedRowMatrix(...).rowSimilarity(...) */
>       try {
>         RowSimilarityJob.main(new String[] { "-Dmapred.input.dir=" + maybePruneItemUserMatrixPath.toString(),
>             "-Dmapred.output.dir=" + similarityMatrixPath.toString(), "--numberOfColumns",
>             String.valueOf(numberOfUsers), "--similarityClassname", similarityClassname, "--maxSimilaritiesPerRow",
>             String.valueOf(maxSimilaritiesPerItemConsidered + 1), "--tempDir", tempDirPath.toString() });
>       } catch (Exception e) {
>         throw new IllegalStateException("item-item-similarity computation failed", e);
>       }
>     }
> {code}
> We do not pass the parameter -Dmapred.reduce.tasks when calling RowSimilarityJob.
> As a result, all three RowSimilarityJob sub-jobs run with 1 map and 1 reduce, 
> so the sub-jobs do not scale.
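
For reference, the change requested in the quoted description could be 
sketched roughly as follows. This is an assumption-laden illustration, not a 
patch: ForwardReduceTasksSketch and withReduceTasks are hypothetical names, 
and a plain Map stands in for the parent job's Hadoop configuration. The 
setting is placed first, matching the position of the other -D arguments in 
the quoted code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: forwards a reducer-count setting to a nested job's
// argument array. A Map stands in for the parent job's configuration.
public class ForwardReduceTasksSketch {

  static final String REDUCE_TASKS_KEY = "mapred.reduce.tasks";

  /**
   * Returns the nested job's arguments with -Dmapred.reduce.tasks=<n>
   * prepended when the parent settings contain that key; otherwise returns
   * an unchanged copy of the arguments.
   */
  static String[] withReduceTasks(Map<String, String> parentConf, String[] jobArgs) {
    String reduceTasks = parentConf.get(REDUCE_TASKS_KEY);
    if (reduceTasks == null) {
      return jobArgs.clone();
    }
    List<String> args = new ArrayList<>();
    args.add("-D" + REDUCE_TASKS_KEY + "=" + reduceTasks);
    args.addAll(Arrays.asList(jobArgs));
    return args.toArray(new String[0]);
  }

  public static void main(String[] args) {
    Map<String, String> parentConf = new HashMap<>();
    parentConf.put(REDUCE_TASKS_KEY, "10");
    String[] nested = withReduceTasks(parentConf,
        new String[] { "--numberOfColumns", "100" });
    // prints [-Dmapred.reduce.tasks=10, --numberOfColumns, 100]
    System.out.println(Arrays.toString(nested));
  }
}
```

Whether such explicit forwarding is needed at all is exactly what the comment 
above questions, so treat this only as a picture of what the report asks for.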

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
