[ https://issues.apache.org/jira/browse/MAHOUT-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi resolved MAHOUT-1274.
-----------------------------------

       Resolution: Won't Fix
    Fix Version/s: 1.0
         Assignee: Suneel Marthi

No activity for more than 6 months; resolving this as 'Won't Fix'.

> SGD-based Online SVD recommender
> --------------------------------
>
>                 Key: MAHOUT-1274
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1274
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: Peng Cheng
>            Assignee: Suneel Marthi
>              Labels: collaborative-filtering, features, machine_learning, svd
>             Fix For: 1.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> An online SVD recommender is otherwise similar to an offline SVD recommender
> except that, upon receiving one or several new preferences, it can add them
> to the training DataModel and update the result accordingly in real time.
> An online SVD recommender should override setPreference(...) and
> removePreference(...) in AbstractRecommender such that the factorization
> result is updated in O(1) time and without retraining.
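A minimal sketch of what such a constant-time update could look like, assuming a plain L2-regularized matrix-factorization model trained by SGD. The class OnlineFactorizationSketch, its fields, and the hyperparameter values are hypothetical and are not part of Mahout; only the method name setPreference echoes the description above. One call performs a single gradient step on one rating, so its cost depends on the number of features but not on the number of stored ratings. A user or item seen for the first time simply gets a new factor vector.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch of an online matrix-factorization model; not a Mahout class. */
final class OnlineFactorizationSketch {
    private final int numFeatures;
    private final double learningRate;   // assumed SGD step size
    private final double regularization; // assumed L2 penalty weight
    private final Random random;
    private final Map<Long, double[]> userFactors = new HashMap<>();
    private final Map<Long, double[]> itemFactors = new HashMap<>();

    OnlineFactorizationSketch(int numFeatures, double learningRate,
                              double regularization, long seed) {
        this.numFeatures = numFeatures;
        this.learningRate = learningRate;
        this.regularization = regularization;
        this.random = new Random(seed);
    }

    /** Returns the factor vector for an ID, creating a small random one on first use. */
    private double[] factorsFor(Map<Long, double[]> table, long id) {
        return table.computeIfAbsent(id, key -> {
            double[] vector = new double[numFeatures];
            for (int f = 0; f < numFeatures; f++) {
                vector[f] = 0.1 * random.nextGaussian();
            }
            return vector;
        });
    }

    private static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int f = 0; f < a.length; f++) {
            sum += a[f] * b[f];
        }
        return sum;
    }

    /** Predicted rating; 0.0 if the user or the item has never been seen. */
    double predict(long userID, long itemID) {
        double[] p = userFactors.get(userID);
        double[] q = itemFactors.get(itemID);
        return (p == null || q == null) ? 0.0 : dot(p, q);
    }

    /** One SGD step on a single rating: O(numFeatures), independent of the data size. */
    void setPreference(long userID, long itemID, double rating) {
        double[] p = factorsFor(userFactors, userID);
        double[] q = factorsFor(itemFactors, itemID);
        double error = rating - dot(p, q);
        for (int f = 0; f < numFeatures; f++) {
            double pf = p[f];
            double qf = q[f];
            // Gradient step on squared error plus L2 penalty, using the old values of both vectors.
            p[f] += learningRate * (error * qf - regularization * pf);
            q[f] += learningRate * (error * pf - regularization * qf);
        }
    }

    public static void main(String[] args) {
        OnlineFactorizationSketch model = new OnlineFactorizationSketch(4, 0.05, 0.01, 42L);
        for (int step = 0; step < 200; step++) {
            model.setPreference(1L, 10L, 4.0);
        }
        System.out.println("prediction after 200 steps: " + model.predict(1L, 10L));
    }
}
```

Repeated calls with the same rating pull the prediction toward it. Whether a single step per new rating is enough in practice is exactly the multi-pass concern raised later in the description.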
> Right now SlopeOneRecommender is the only component with this capability.
> Since SGD is intrinsically an online algorithm and its CF implementation is
> available in core-0.8 (see MAHOUT-1089, MAHOUT-1272), I presume it would be a
> good time to convert it. Such a feature could come in handy for some
> websites.
> Implementation: Adding new users or items, or increasing the rank of the
> rating matrix, just increases the size of the user and item matrices.
> Reducing the rank of the rating matrix involves just one SVD. The real
> challenge here is that SGD is NOT a one-pass algorithm: multiple passes are
> required to reach acceptable optimality, and even more if the
> hyperparameters are bad. Here are two possible workarounds:
> 1. Use a one-pass algorithm like averaged SGD. I am not sure it can ever
> work, as applying a stochastic convex-optimization algorithm to a non-convex
> problem is anarchy. It may be a long shot.
> 2. Run incomplete passes in each online update using ratings randomly sampled
> (but not uniformly sampled) from the latest DataModel. I don't know exactly
> how this should be done, but new ratings should be sampled more frequently.
> Uniform sampling will result in old ratings being used more than new ratings
> in total. If somebody has worked on this batch-to-online conversion before
> and can share their insight, that would be awesome. This seems to be the most
> viable option, if I can get the non-uniform pseudorandom generator I want,
> one that maintains a cumulative uniform distribution.
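One simple way to sample new ratings more often than old ones is to draw the age of the rating from a truncated geometric distribution. The sketch below assumes ratings are kept in arrival order, so index size - 1 is the newest. RecencyBiasedSampler and its decay parameter are hypothetical names, not Mahout classes. It only illustrates non-uniform sampling and makes no claim to produce the cumulative uniform distribution asked for above.

```java
import java.util.Random;

/** Hypothetical helper that favors recently added ratings; not a Mahout class. */
final class RecencyBiasedSampler {
    private final double decay; // in (0, 1): each step back in age is this much less likely
    private final Random random;

    RecencyBiasedSampler(double decay, long seed) {
        if (!(decay > 0.0 && decay < 1.0)) {
            throw new IllegalArgumentException("decay must be in (0, 1)");
        }
        this.decay = decay;
        this.random = new Random(seed);
    }

    /**
     * Picks an index in [0, size), where size - 1 is the newest rating.
     * The age (0 = newest) is drawn with probability proportional to decay^age,
     * truncated to ages below size, using inverse-CDF sampling.
     */
    int nextIndex(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        double u = random.nextDouble();            // uniform in [0, 1)
        double mass = 1.0 - Math.pow(decay, size); // geometric mass of ages 0 .. size - 1
        int age = (int) Math.floor(Math.log(1.0 - u * mass) / Math.log(decay));
        age = Math.min(age, size - 1);             // guard against floating-point rounding
        return size - 1 - age;
    }

    public static void main(String[] args) {
        RecencyBiasedSampler sampler = new RecencyBiasedSampler(0.9, 7L);
        int size = 20;
        int[] counts = new int[size];
        for (int i = 0; i < 10000; i++) {
            counts[sampler.nextIndex(size)]++;
        }
        for (int index = 0; index < size; index++) {
            System.out.println("rating " + index + " sampled " + counts[index] + " times");
        }
    }
}
```

In an online update, the indices returned by nextIndex could select which stored ratings to replay alongside the newly arrived one. A decay close to 1 approaches uniform sampling, while a small decay replays almost only the latest ratings.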
> I found a very old ticket (MAHOUT-572) mentioning an online SVD recommender,
> but it didn't pay off. Hopefully it's not a bad idea to submit a new ticket
> here.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
