I'll give it a try this weekend :)

On 16.08.2010 17:30, Sean Owen (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898964#action_12898964 ]
>
> Sean Owen commented on MAHOUT-473:
> ----------------------------------
>
> I understand. The better change is to actually instantiate and run
> RowSimilarityJob within RecommenderJob, and, before running it, pass the
> parent's Configuration ("conf") object to the child job. Then I think it
> all works. Sebastian, do you mind trying this?
>
>   
>> add parameter -Dmapred.reduce.tasks when calling RowSimilarityJob from
>> RecommenderJob
>> ------------------------------------------------------------------------------------
>>
>>                 Key: MAHOUT-473
>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-473
>>             Project: Mahout
>>          Issue Type: Improvement
>>          Components: Collaborative Filtering
>>    Affects Versions: 0.4
>>            Reporter: Han Hui Wen 
>>            Assignee: Sean Owen
>>         Attachments: screenshot-1.jpg
>>
>>
>> In RecommenderJob:
>> {code:title=RecommenderJob.java|borderStyle=solid}
>>     int numberOfUsers = TasteHadoopUtils.readIntFromFile(getConf(), countUsersPath);
>>     if (shouldRunNextPhase(parsedArgs, currentPhase)) {
>>       /* Once DistributedRowMatrix uses the hadoop 0.20 API, we should refactor this call to something like
>>        * new DistributedRowMatrix(...).rowSimilarity(...) */
>>       try {
>>         RowSimilarityJob.main(new String[] {
>>             "-Dmapred.input.dir=" + maybePruneItemUserMatrixPath.toString(),
>>             "-Dmapred.output.dir=" + similarityMatrixPath.toString(),
>>             "--numberOfColumns", String.valueOf(numberOfUsers),
>>             "--similarityClassname", similarityClassname,
>>             "--maxSimilaritiesPerRow", String.valueOf(maxSimilaritiesPerItemConsidered + 1),
>>             "--tempDir", tempDirPath.toString() });
>>       } catch (Exception e) {
>>         throw new IllegalStateException("item-item-similarity computation failed", e);
>>       }
>>     }
>> {code}
>> We do not pass the parameter -Dmapred.reduce.tasks when invoking RowSimilarityJob.
>> As a result, all three RowSimilarityJob sub-jobs run with 1 map and 1 reduce
>> task, so the sub-jobs do not scale.
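For what it's worth, the pattern Sean describes can be illustrated with a small, self-contained sketch. Note that `Configuration` and `ChildJob` below are plain-Java stand-ins I made up for illustration, not the real Hadoop/Mahout classes: launching a child job through its static main() gives it a fresh, empty configuration, whereas instantiating it and handing over the parent's Configuration object lets settings like mapred.reduce.tasks propagate.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not Mahout's actual classes) of why invoking a child
 * job via main() loses -D overrides, while passing the parent's configuration
 * object down preserves them.
 */
public class ConfPropagationSketch {

  /** Stand-in for Hadoop's Configuration: a simple key/value map. */
  static class Configuration {
    final Map<String, String> props = new HashMap<>();
    void set(String key, String value) { props.put(key, value); }
    String get(String key, String defaultValue) {
      return props.getOrDefault(key, defaultValue);
    }
  }

  /** Stand-in for a child job such as RowSimilarityJob. */
  static class ChildJob {
    // When started via main(), the child builds a fresh, empty conf.
    Configuration conf = new Configuration();

    // The suggested fix: let the parent hand over its own conf object.
    void setConf(Configuration conf) { this.conf = conf; }

    int reduceTasks() {
      return Integer.parseInt(conf.get("mapred.reduce.tasks", "1"));
    }
  }

  public static void main(String[] args) {
    Configuration parentConf = new Configuration();
    parentConf.set("mapred.reduce.tasks", "20"); // set via -D on the parent

    // Broken pattern: child started as if "via main()", never sees parentConf.
    ChildJob viaMain = new ChildJob();

    // Suggested pattern: instantiate the child and pass the parent's conf.
    ChildJob viaConf = new ChildJob();
    viaConf.setConf(parentConf);

    System.out.println(viaMain.reduceTasks()); // falls back to 1
    System.out.println(viaConf.reduceTasks()); // inherits 20
  }
}
```

In the real code the same effect is reached through Hadoop's Tool/Configuration machinery rather than a map, but the principle is identical: the child must receive the parent's configuration object before it runs.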
