I'll give it a try this weekend :)
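If I read Sean's suggestion below correctly, the change could look roughly like this (just a sketch, not a tested patch — it assumes RowSimilarityJob implements Hadoop's Tool interface via AbstractJob, so it can be run through ToolRunner with the parent job's Configuration instead of via main()):

```java
// Sketch of the proposed fix: run RowSimilarityJob in-process through
// ToolRunner, handing it the parent RecommenderJob's Configuration so that
// settings such as mapred.reduce.tasks propagate to the child job's sub-jobs.
// (Assumes RowSimilarityJob implements Tool; same arguments as before.)
import org.apache.hadoop.util.ToolRunner;

// ...inside RecommenderJob, replacing the RowSimilarityJob.main(...) call:
try {
  ToolRunner.run(getConf(), new RowSimilarityJob(), new String[] {
      "-Dmapred.input.dir=" + maybePruneItemUserMatrixPath.toString(),
      "-Dmapred.output.dir=" + similarityMatrixPath.toString(),
      "--numberOfColumns", String.valueOf(numberOfUsers),
      "--similarityClassname", similarityClassname,
      "--maxSimilaritiesPerRow", String.valueOf(maxSimilaritiesPerItemConsidered + 1),
      "--tempDir", tempDirPath.toString() });
} catch (Exception e) {
  throw new IllegalStateException("item-item-similarity computation failed", e);
}
```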
On 16.08.2010 17:30, Sean Owen (JIRA) wrote:
> [
> https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898964#action_12898964
> ]
>
> Sean Owen commented on MAHOUT-473:
> ----------------------------------
>
> I understand. The better change is to actually instantiate and run
> RowSimilarityJob within RecommenderJob, but before running it, pass the
> parent's Configuration ("conf") to the child job. Then I think it all
> works. Sebastian, do you mind trying this?
>
>
>> Pass parameter -Dmapred.reduce.tasks when RecommenderJob calls
>> RowSimilarityJob
>> ------------------------------------------------------------------------------------
>>
>> Key: MAHOUT-473
>> URL: https://issues.apache.org/jira/browse/MAHOUT-473
>> Project: Mahout
>> Issue Type: Improvement
>> Components: Collaborative Filtering
>> Affects Versions: 0.4
>> Reporter: Han Hui Wen
>> Assignee: Sean Owen
>> Attachments: screenshot-1.jpg
>>
>>
>> In RecommenderJob
>> {code:title=RecommenderJob.java|borderStyle=solid}
>> int numberOfUsers = TasteHadoopUtils.readIntFromFile(getConf(), countUsersPath);
>> if (shouldRunNextPhase(parsedArgs, currentPhase)) {
>>   /* Once DistributedRowMatrix uses the hadoop 0.20 API, we should refactor
>>    * this call to something like new DistributedRowMatrix(...).rowSimilarity(...) */
>>   try {
>>     RowSimilarityJob.main(new String[] {
>>         "-Dmapred.input.dir=" + maybePruneItemUserMatrixPath.toString(),
>>         "-Dmapred.output.dir=" + similarityMatrixPath.toString(),
>>         "--numberOfColumns", String.valueOf(numberOfUsers),
>>         "--similarityClassname", similarityClassname,
>>         "--maxSimilaritiesPerRow", String.valueOf(maxSimilaritiesPerItemConsidered + 1),
>>         "--tempDir", tempDirPath.toString() });
>>   } catch (Exception e) {
>>     throw new IllegalStateException("item-item-similarity computation failed", e);
>>   }
>> }
>> {code}
>> We do not pass the parameter -Dmapred.reduce.tasks when invoking
>> RowSimilarityJob. As a result, all three of RowSimilarityJob's sub-jobs run
>> with 1 map and 1 reduce, so the sub-jobs cannot scale.
>>
>