[
https://issues.apache.org/jira/browse/MAHOUT-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004151#comment-13004151
]
Tamas Jambor commented on MAHOUT-541:
-------------------------------------
the problem with #2 is that the randomNoise variable doesn't entirely control
the noise that affects the rating (and hence the error); the noise also depends
on the rating scale and on the number of features. I suggest the following
solution, which makes my experiments more stable:
double prefInterval = dataModel.getMaxPreference() - dataModel.getMinPreference();
defaultValue = Math.sqrt((average - (prefInterval * 0.1)) / numFeatures);
double interval = (prefInterval * 0.1) / numFeatures;
leftVectors[userIndex][feature] =
    defaultValue + (random.nextDouble() - 0.5) * interval * randomNoise;
rightVectors[itemIndex][feature] =
    defaultValue + (random.nextDouble() - 0.5) * interval * randomNoise;
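
To spell out why this scaling helps, here is a quick standalone check (not part
of the patch; the numbers below are just examples). With every feature set to
defaultValue and no noise, the initial estimate is the dot product of two
constant vectors, numFeatures * defaultValue^2 = average - 0.1 * prefInterval,
i.e. just under the mean rating. The per-entry perturbation is bounded by
0.5 * interval * randomNoise, which shrinks as numFeatures grows, so the total
noise stays tied to the rating scale instead of the number of features.

public class InitScalingCheck {
  public static void main(String[] args) {
    // Example values only - replace with whatever the data model reports.
    double average = 3.6;
    double minPreference = 1.0;
    double maxPreference = 5.0;
    int numFeatures = 10;
    double randomNoise = 1.0;

    double prefInterval = maxPreference - minPreference;
    double defaultValue = Math.sqrt((average - (prefInterval * 0.1)) / numFeatures);
    double interval = (prefInterval * 0.1) / numFeatures;

    // With no noise, the initial user-item estimate is
    // numFeatures * defaultValue^2 = average - 0.1 * prefInterval.
    System.out.println(numFeatures * defaultValue * defaultValue); // ~3.2

    // Each factor entry is perturbed by at most 0.5 * interval * randomNoise.
    System.out.println(0.5 * interval * randomNoise);              // 0.02
  }
}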
> Incremental SVD Implementation
> ------------------------------
>
> Key: MAHOUT-541
> URL: https://issues.apache.org/jira/browse/MAHOUT-541
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Affects Versions: 0.4
> Reporter: Tamas Jambor
> Assignee: Sean Owen
> Fix For: 0.5
>
> Attachments: MAHOUT-541.patch, MAHOUT-541.patch, MAHOUT-541.patch,
> MAHOUT-541.patch, SVDPreference.java, TJExpectationMaximizationSVD.java,
> TJSVDRecommender.java
>
>
> I thought I'd put up this implementation of the popular SVD algorithm for
> recommender systems. It is based on the existing SVD implementation, but
> instead of computing the user and item matrices directly, it trains the
> model iteratively, which is the original approach Simon Funk proposed. The
> advantage of this implementation is that the dot product of each user-item
> pair doesn't have to be recalculated for every training cycle; it can be
> cached, which speeds up the algorithm considerably.
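
For anyone reading along, here is a minimal sketch of the caching idea described
above. This is not the attached patch; the method signature, array layout and
hyper-parameters are all illustrative. Features are trained one at a time, and
once a feature is finished its contribution is folded into a per-rating cache,
so the full user-item dot product is never recomputed inside the inner loop.

class FunkSvdSketch {
  static void train(int[] ratingUser, int[] ratingItem, double[] ratingValue,
                    double[][] userFeatures, double[][] itemFeatures,
                    int epochsPerFeature, double learningRate, double lambda) {
    int numRatings = ratingValue.length;
    int numFeatures = userFeatures[0].length;
    // Contribution of the already-trained features to each known rating.
    double[] cachedEstimate = new double[numRatings];

    for (int feature = 0; feature < numFeatures; feature++) {
      for (int epoch = 0; epoch < epochsPerFeature; epoch++) {
        for (int r = 0; r < numRatings; r++) {
          int u = ratingUser[r];
          int i = ratingItem[r];
          // Prediction = cached part + the feature currently being trained.
          double predicted = cachedEstimate[r]
              + userFeatures[u][feature] * itemFeatures[i][feature];
          double err = ratingValue[r] - predicted;
          double uf = userFeatures[u][feature];
          double vf = itemFeatures[i][feature];
          userFeatures[u][feature] += learningRate * (err * vf - lambda * uf);
          itemFeatures[i][feature] += learningRate * (err * uf - lambda * vf);
        }
      }
      // Fold the finished feature into the cache for all later features.
      for (int r = 0; r < numRatings; r++) {
        cachedEstimate[r] +=
            userFeatures[ratingUser[r]][feature] * itemFeatures[ratingItem[r]][feature];
      }
    }
  }
}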
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira