[ 
https://issues.apache.org/jira/browse/MAHOUT-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhaskar Devireddy updated MAHOUT-1042:
--------------------------------------

    Attachment: Mahout_1042.patch
    
> Hotspot in RecommenderJob-PartialMultiplyMapper-Reducer
> -------------------------------------------------------
>
>                 Key: MAHOUT-1042
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1042
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.6, 0.7
>            Reporter: Bhaskar Devireddy
>            Assignee: Sean Owen
>            Priority: Minor
>         Attachments: Mahout_1042.patch
>
>
> While profiling the PartialMultiplyMapper-Reducer job, we noticed a hotspot 
> consuming more than 40% of the CPU time in the 
> org.apache.mahout.math.RandomAccessSparseVector.assign method of the reducer 
> task.  For profiling, we used the script provided in the Mahout examples for 
> running ASF Email recommendations.  The hotspot comes from the use of the 
> Vector.plus(Vector x) method in the AggregateAndRecommendReducer class.  The 
> pattern used is VectorA = VectorA.plus(VectorB).  Because the result is 
> assigned straight back to VectorA, VectorA doesn't have to be cloned using 
> the assign method.  The attached patch addresses the hotspot by eliminating 
> the cloning in this case for the plus and times methods.  The patch retains 
> functionality (we verified that the output is the same with and without it) 
> and speeds up execution of the PartialMultiplyMapper-Reducer job by more 
> than 10X on x86 architectures.
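
The difference between the two accumulation patterns described above can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch and not Mahout code: SparseVec, AccumulateDemo, addInPlace and the copiedEntries counter are hypothetical names, and a HashMap stands in for the sparse vector storage. The copying plus() clones the accumulator on every call, while the in-place version only visits the entries of the vector being added.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy sparse vector (hypothetical; NOT Mahout's RandomAccessSparseVector). */
final class SparseVec {
    final Map<Integer, Double> entries = new HashMap<>();
    /** Illustration only: counts how many entries plus() has cloned. */
    static long copiedEntries = 0;

    SparseVec set(int index, double value) {
        entries.put(index, value);
        return this;
    }

    double get(int index) {
        return entries.getOrDefault(index, 0.0);
    }

    /** Copying add: clones this vector first, then adds other to the clone. */
    SparseVec plus(SparseVec other) {
        SparseVec result = new SparseVec();
        result.entries.putAll(entries);   // the clone that dominates the cost
        copiedEntries += entries.size();
        return result.addInPlace(other);
    }

    /** In-place add: touches only the entries present in other. */
    SparseVec addInPlace(SparseVec other) {
        for (Map.Entry<Integer, Double> e : other.entries.entrySet()) {
            entries.merge(e.getKey(), e.getValue(), Double::sum);
        }
        return this;
    }
}

final class AccumulateDemo {
    /** Pattern from the report: acc = acc.plus(part), cloning acc each time. */
    static SparseVec sumCopying(List<SparseVec> parts) {
        SparseVec acc = new SparseVec();
        for (SparseVec part : parts) {
            acc = acc.plus(part);
        }
        return acc;
    }

    /** Same result without cloning the accumulator. */
    static SparseVec sumInPlace(List<SparseVec> parts) {
        SparseVec acc = new SparseVec();
        for (SparseVec part : parts) {
            acc.addInPlace(part);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<SparseVec> parts = Arrays.asList(
            new SparseVec().set(0, 1.0).set(2, 2.0),
            new SparseVec().set(2, 3.0).set(5, 4.0));
        SparseVec a = sumCopying(parts);
        SparseVec b = sumInPlace(parts);
        System.out.println("same result: " + a.entries.equals(b.entries));
        System.out.println("entries cloned by plus(): " + SparseVec.copiedEntries);
    }
}
```

With many partial vectors per key, the copying pattern clones an ever-growing accumulator on each step, which is consistent with the time reported in assign; the in-place pattern avoids that work while producing the same sum.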

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
