Bhaskar Devireddy created MAHOUT-1042:
-----------------------------------------

             Summary: Hotspot in RecommenderJob-PartialMultiplyMapper-Reducer
                 Key: MAHOUT-1042
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1042
             Project: Mahout
          Issue Type: Improvement
          Components: Collaborative Filtering
    Affects Versions: 0.7, 0.6
            Reporter: Bhaskar Devireddy
            Assignee: Sean Owen
            Priority: Minor


While profiling the PartialMultiplyMapper-Reducer job, we noticed a hotspot 
consuming more than 40% of the CPU time in the 
org.apache.mahout.math.RandomAccessSparseVector.assign method in the reducer 
task.  We profiled using the script provided in the Mahout examples for 
running ASF Email recommendations.  The hotspot comes from the use of the 
Vector.plus(Vector x) method in the AggregateAndRecommendReducer class.  The 
pattern used is VectorA = VectorA.plus(VectorB).  Because the result is 
assigned back to VectorA, VectorA does not have to be cloned with the assign 
method first.  The attached patch addresses the hotspot by eliminating the 
cloning in this case for the plus and times methods.  The patch retains 
functionality (we verified the output with and without the patch) and speeds 
up execution of the PartialMultiplyMapper-Reducer job by more than 10X on x86 
architectures.
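
To illustrate the pattern, here is a minimal, self-contained sketch.  It is 
not the attached patch and does not use Mahout's classes: SparseVectorSketch, 
plus and addInPlace are hypothetical names backed by a plain HashMap, chosen 
only to show why A = A.plus(B) pays for a full copy of A on every call while 
an in-place update touches only the entries of B.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a sparse vector; NOT Mahout's RandomAccessSparseVector. */
final class SparseVectorSketch {
    private final Map<Integer, Double> values = new HashMap<>();

    void set(int index, double value) {
        values.put(index, value);
    }

    double get(int index) {
        // Missing entries are implicit zeros.
        return values.getOrDefault(index, 0.0);
    }

    /** Copying addition: clones this vector, then adds other into the clone. */
    SparseVectorSketch plus(SparseVectorSketch other) {
        SparseVectorSketch result = new SparseVectorSketch();
        // This full copy is wasted work when the caller immediately
        // discards the old vector, as in a = a.plus(b).
        result.values.putAll(this.values);
        result.addInPlace(other);
        return result;
    }

    /** In-place addition: iterates only over the entries present in other. */
    SparseVectorSketch addInPlace(SparseVectorSketch other) {
        for (Map.Entry<Integer, Double> e : other.values.entrySet()) {
            values.merge(e.getKey(), e.getValue(), Double::sum);
        }
        return this;
    }

    public static void main(String[] args) {
        SparseVectorSketch copying = new SparseVectorSketch();
        SparseVectorSketch inPlace = new SparseVectorSketch();
        for (int i = 0; i < 1000; i++) {
            SparseVectorSketch partial = new SparseVectorSketch();
            partial.set(i, 1.0);
            partial.set(0, 0.5);
            copying = copying.plus(partial);   // copies the accumulator each time
            inPlace.addInPlace(partial);       // updates the accumulator directly
        }
        boolean same = copying.get(0) == inPlace.get(0)
                && copying.get(999) == inPlace.get(999);
        System.out.println("Both accumulators agree: " + same);
    }
}
```

In the copying loop the cost of each step grows with the size of the 
accumulator, whereas the in-place loop costs only as much as the partial 
vector being added.  That is the general shape of the saving described 
above; the actual speedup depends on the data and on how the patch 
implements it.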

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
