Nice spot! We have to use .assign and Functions.PLUS.
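
To make the difference concrete, here is a minimal self-contained sketch, not the actual patch. SparseVec and PLUS are hypothetical stand-ins written for illustration; they are not Mahout's RandomAccessSparseVector or Functions.PLUS. In Mahout itself the call would presumably be vectorA.assign(vectorB, Functions.PLUS), assuming the two-argument assign(Vector, DoubleDoubleFunction) overload.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;

final class InPlaceAddSketch {

    // Hypothetical stand-in for Functions.PLUS: adds its two arguments.
    static final DoubleBinaryOperator PLUS = Double::sum;

    // Hypothetical toy sparse vector, NOT Mahout's RandomAccessSparseVector.
    static final class SparseVec {
        final Map<Integer, Double> cells = new HashMap<>();

        SparseVec set(int index, double value) {
            cells.put(index, value);
            return this;
        }

        double get(int index) {
            return cells.getOrDefault(index, 0.0);
        }

        // Copying add: models a = a.plus(b), which clones the receiver first.
        SparseVec plus(SparseVec other) {
            SparseVec result = new SparseVec();
            result.cells.putAll(cells); // the copy that the profile flagged
            other.cells.forEach((i, v) -> result.cells.merge(i, v, Double::sum));
            return result;
        }

        // In-place combine: models a.assign(b, f), with no copy of the receiver.
        // Visiting only the non-zero cells of 'other' is enough for addition,
        // because f(x, 0) == x there; a general f would need every index.
        SparseVec assign(SparseVec other, DoubleBinaryOperator f) {
            other.cells.forEach((i, v) -> cells.put(i, f.applyAsDouble(get(i), v)));
            return this;
        }
    }

    public static void main(String[] args) {
        SparseVec a = new SparseVec().set(0, 1.0).set(3, 2.0);
        SparseVec b = new SparseVec().set(3, 5.0).set(7, 4.0);

        SparseVec copied = a.plus(b);     // allocates a new vector, a is unchanged
        SparseVec same = a.assign(b, PLUS); // updates a itself

        System.out.println("plus returned a new object:   " + (copied != a));
        System.out.println("assign returned the receiver: " + (same == a));
        System.out.println("a[3] after assign: " + a.get(3));
    }
}
```

Both calls produce the same sums; the only difference is that plus pays for a full copy of the accumulator on every call, which adds up when it runs once per input vector in the reducer.
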

2012/7/9 Sean Owen (JIRA) <[email protected]>:
>
> [ https://issues.apache.org/jira/browse/MAHOUT-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409895#comment-13409895 ]
>
> Sean Owen commented on MAHOUT-1042:
> -----------------------------------
>
> I like it but can this be done with the existing assign() method and a
> DoubleFunction that adds? If not, I think a separate method like addTo()
> would be better than a flag.
>
>> Hotspot in RecommenderJob-PartialMultiplyMapper-Reducer
>> -------------------------------------------------------
>>
>>                 Key: MAHOUT-1042
>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1042
>>             Project: Mahout
>>          Issue Type: Improvement
>>          Components: Collaborative Filtering
>>   Affects Versions: 0.6, 0.7
>>            Reporter: Bhaskar Devireddy
>>            Assignee: Sean Owen
>>            Priority: Minor
>>         Attachments: Mahout_1042.patch
>>
>>
>> While profiling the PartialMultiplyMapper-Reducer job we noticed a hotspot
>> consuming more than 40% of the CPU time in the
>> org.apache.mahout.math.RandomAccessSparseVector.assign method for the
>> reducer task. We used the script provided in the Mahout examples for
>> running ASF Email recommendations for profiling. The hotspot comes from
>> the use of the Vector.plus(Vector x) method in the
>> AggregateAndRecommendReducer class. The pattern used is
>> VectorA = VectorA.plus(VectorB). In this case VectorA doesn't have to be
>> cloned using the assign method. The attached patch addresses the hotspot
>> by eliminating cloning in the above case for the plus and times methods.
>> This patch, while retaining functionality (verified the output with and
>> without the patch), speeds up execution time of the
>> PartialMultiplyMapper-Reducer job by more than 10X on x86 architectures.
