[ https://issues.apache.org/jira/browse/MAHOUT-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399366#comment-13399366 ]

Hudson commented on MAHOUT-960:
-------------------------------

Integrated in Mahout-Quality #1556 (See [https://builds.apache.org/job/Mahout-Quality/1556/])
    MAHOUT-960 Reduce memory usage of ImplicitFeedbackAlternatingLeastSquaresSolver (Revision 1352796)

     Result = SUCCESS
ssc : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1352796
Files :
* /mahout/trunk/math/src/main/java/org/apache/mahout/math/als/ImplicitFeedbackAlternatingLeastSquaresSolver.java

                
> Reduce memory usage of ImplicitFeedbackAlternatingLeastSquaresSolver
> --------------------------------------------------------------------
>
>                 Key: MAHOUT-960
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-960
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.6
>            Reporter: Doug Mittendorf
>            Assignee: Sebastian Schelter
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: MAHOUT-960-1.patch, MAHOUT-960.patch
>
>
> One of the main limiting factors of the implicit ALS algorithm when 
> processing large datasets is that it must fit the entire U or M matrix 
> in memory.  This is compounded by the fact that the current 
> implementation holds the matrix in memory 3 times:
> 1. As an OpenIntObjectHashMap read in from disk
> 2. A sorted DenseMatrix representation of #1 to prepare for computing Y'Y
> 3. The transpose of #2 (another DenseMatrix)
> The #3 copy of the matrix can be eliminated by computing Y'Y directly from Y 
> without first computing the transpose of Y as an intermediate step.  This 
> should also be more efficient in terms of CPU usage.
> Note that the #1 copy of the matrix could also be eliminated if it's assumed 
> that the user and item IDs are sequentially assigned and ordered.  This would 
> allow the DenseMatrix to be populated directly from disk instead of reading 
> into an intermediate OpenIntObjectHashMap.
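
The idea of computing Y'Y directly from Y can be sketched as follows. This is only an illustration on plain double[][] arrays, not the code from the committed revision: the class name GramSketch and the method gram are hypothetical, and no Mahout classes are used. It relies on the identity that Y'Y equals the sum of the outer products of the rows of Y, so no transposed copy of Y is ever materialized.

```java
import java.util.Arrays;

// Hypothetical sketch: names are illustrative and not part of Mahout.
public final class GramSketch {

    private GramSketch() {}

    /**
     * Returns Y'Y (k x k) for an n x k matrix Y by accumulating the outer
     * product of each row, without building the transpose of Y.
     */
    public static double[][] gram(double[][] y) {
        int k = y.length == 0 ? 0 : y[0].length;
        double[][] yty = new double[k][k];

        // Accumulate only the upper triangle, since Y'Y is symmetric.
        for (double[] row : y) {
            for (int i = 0; i < k; i++) {
                double ri = row[i];
                for (int j = i; j < k; j++) {
                    yty[i][j] += ri * row[j];
                }
            }
        }

        // Mirror the upper triangle into the lower triangle.
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < i; j++) {
                yty[i][j] = yty[j][i];
            }
        }
        return yty;
    }

    public static void main(String[] args) {
        double[][] y = {{1, 2}, {3, 4}, {5, 6}};
        // prints [[35.0, 44.0], [44.0, 56.0]]
        System.out.println(Arrays.deepToString(gram(y)));
    }
}
```

Beyond Y itself, the only extra storage is the k x k result, where k is the number of features, and each row of Y is read exactly once.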


        
