[
https://issues.apache.org/jira/browse/MAHOUT-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004151#comment-13004151
]
Tamas Jambor commented on MAHOUT-541:
-------------------------------------
the problem with #2 is that the randomNoise variable doesn't entirely control
the noise that affects the rating (and hence the error); the noise also depends
on the rating scale and on the number of features. I suggest the following
solution, which makes my experiments more stable:
double prefInterval = dataModel.getMaxPreference() - dataModel.getMinPreference();
defaultValue = Math.sqrt((average - (prefInterval * 0.1)) / numFeatures);
double interval = (prefInterval * 0.1) / numFeatures;
leftVectors[userIndex][feature] =
    defaultValue + (random.nextDouble() - 0.5) * interval * randomNoise;
rightVectors[itemIndex][feature] =
    defaultValue + (random.nextDouble() - 0.5) * interval * randomNoise;
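
To spell out why this scaling helps, here is a quick standalone check (not part
of the patch; the numbers below are just examples). With every feature set to
defaultValue and no noise, the initial estimate is the dot product of two
constant vectors, numFeatures * defaultValue^2 = average - 0.1 * prefInterval,
i.e. just under the mean rating. The per-entry perturbation is bounded by
0.5 * interval * randomNoise, which shrinks as numFeatures grows, so the total
noise stays tied to the rating scale instead of the number of features.

public class InitScalingCheck {
  public static void main(String[] args) {
    // Example values only - replace with whatever the data model reports.
    double average = 3.6;
    double minPreference = 1.0;
    double maxPreference = 5.0;
    int numFeatures = 10;
    double randomNoise = 1.0;

    double prefInterval = maxPreference - minPreference;
    double defaultValue = Math.sqrt((average - (prefInterval * 0.1)) / numFeatures);
    double interval = (prefInterval * 0.1) / numFeatures;

    // With no noise, the initial user-item estimate is
    // numFeatures * defaultValue^2 = average - 0.1 * prefInterval.
    System.out.println(numFeatures * defaultValue * defaultValue); // ~3.2

    // Each factor entry is perturbed by at most 0.5 * interval * randomNoise.
    System.out.println(0.5 * interval * randomNoise);              // 0.02
  }
}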
> Incremental SVD Implementation
> ------------------------------
>
> Key: MAHOUT-541
> URL: https://issues.apache.org/jira/browse/MAHOUT-541
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Affects Versions: 0.4
> Reporter: Tamas Jambor
> Assignee: Sean Owen
> Fix For: 0.5
>
> Attachments: MAHOUT-541.patch, MAHOUT-541.patch, MAHOUT-541.patch,
> MAHOUT-541.patch, SVDPreference.java, TJExpectationMaximizationSVD.java,
> TJSVDRecommender.java
>
>
> I thought I'd put up this implementation of the popular SVD algorithm for
> recommender systems. It is based on the existing SVD implementation, but
> instead of computing the user and item matrices directly, it trains the
> model iteratively, which is the original approach Simon Funk proposed. The
> advantage of this implementation is that the dot product of each user-item
> pair doesn't have to be recalculated for every training cycle; it can be
> cached, which speeds up the algorithm considerably.
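
For anyone reading along, here is a minimal sketch of the caching idea described
above. This is not the attached patch; the method signature, array layout and
hyper-parameters are all illustrative. Features are trained one at a time, and
once a feature is finished its contribution is folded into a per-rating cache,
so the full user-item dot product is never recomputed inside the inner loop.

class FunkSvdSketch {
  static void train(int[] ratingUser, int[] ratingItem, double[] ratingValue,
                    double[][] userFeatures, double[][] itemFeatures,
                    int epochsPerFeature, double learningRate, double lambda) {
    int numRatings = ratingValue.length;
    int numFeatures = userFeatures[0].length;
    // Contribution of the already-trained features to each known rating.
    double[] cachedEstimate = new double[numRatings];

    for (int feature = 0; feature < numFeatures; feature++) {
      for (int epoch = 0; epoch < epochsPerFeature; epoch++) {
        for (int r = 0; r < numRatings; r++) {
          int u = ratingUser[r];
          int i = ratingItem[r];
          // Prediction = cached part + the feature currently being trained.
          double predicted = cachedEstimate[r]
              + userFeatures[u][feature] * itemFeatures[i][feature];
          double err = ratingValue[r] - predicted;
          double uf = userFeatures[u][feature];
          double vf = itemFeatures[i][feature];
          userFeatures[u][feature] += learningRate * (err * vf - lambda * uf);
          itemFeatures[i][feature] += learningRate * (err * uf - lambda * vf);
        }
      }
      // Fold the finished feature into the cache for all later features.
      for (int r = 0; r < numRatings; r++) {
        cachedEstimate[r] +=
            userFeatures[ratingUser[r]][feature] * itemFeatures[ratingItem[r]][feature];
      }
    }
  }
}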
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira