Hi Peng,

We deprecated many algorithms that we found to be rarely used, in order
to streamline our codebase for the upcoming 1.0 release.
On 06.07.2013 at 10:25, "Peng Cheng (JIRA)" <[email protected]> wrote:

>
>     [https://issues.apache.org/jira/browse/MAHOUT-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701380#comment-13701380]
>
> Peng Cheng commented on MAHOUT-1274:
> ------------------------------------
>
> BTW, may I ask (noobishly) why you have deprecated the
> SlopeOneRecommender in the latest core-0.8 snapshot? I must have missed a
> lot in previous mahout-development emails before I joined, so apologies if
> it's a stupid question.
>
> > SGD-based Online SVD recommender
> > --------------------------------
> >
> >                 Key: MAHOUT-1274
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1274
> >             Project: Mahout
> >          Issue Type: New Feature
> >          Components: Collaborative Filtering
> >            Reporter: Peng Cheng
> >            Assignee: Sean Owen
> >              Labels: collaborative-filtering, features, machine_learning, svd
> >   Original Estimate: 336h
> >  Remaining Estimate: 336h
> >
> > An online SVD recommender is similar to an offline SVD recommender,
> > except that, upon receiving one or several new ratings, it can add
> > them to the training dataModel and update the result accordingly in
> > real time.
> > An online SVD recommender should override setPreference(...) and
> > removePreference(...) in AbstractRecommender such that the
> > factorization result is updated in O(1) time, without retraining.
> > Right now the SlopeOneRecommender is the only component with this
> > capability.
> > Since SGD is intrinsically an online algorithm and its CF
> > implementation is available in core-0.8 (see MAHOUT-1089, MAHOUT-1272),
> > I presume this is a good time to convert it. Such a feature could come
> > in handy for some websites.
> > Implementation: Adding new users or items, or increasing the rank of
> > the rating matrix, only increases the size of the user and item
> > matrices. Reducing the rank involves just one SVD. The real challenge
> > is that SGD is NOT a one-pass algorithm: multiple passes are required
> > to reach acceptable optimality, and even more if the hyperparameters
> > are bad. But here are two possible workarounds:
> > 1. Use a one-pass algorithm like averaged SGD. I am not sure it can
> > ever work, as applying a stochastic convex-optimization algorithm to a
> > non-convex problem is anarchy. It may be a long shot.
> > 2. Run incomplete passes in each online update, using ratings sampled
> > randomly (but not uniformly) from the latest dataModel. I don't know
> > exactly how this should be done, but new ratings should be sampled
> > more frequently, since uniform sampling results in old ratings being
> > used more than new ratings in total. If somebody has worked on this
> > batch-to-online conversion before and can share their insight, that
> > would be awesome. This seems to be the most viable option, if I can
> > get the non-uniform pseudorandom generator I want, one that maintains
> > a cumulative uniform distribution.
> > I found a very old ticket (MAHOUT-572) mentioning an online SVD
> > recommender, but it didn't pay off. Hopefully it's not a bad idea to
> > submit a new ticket here.
>

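For readers following the quoted ticket, here is a minimal sketch of its
second option: each online update takes one SGD step on the new rating and
then replays a few older ratings, drawn with a bias toward recent ones.
Everything in it is an assumption made for illustration. The class
OnlineSgdSketch, its fields, and the square-root sampling rule are
hypothetical and are not Mahout API or the ticket's actual design, and the
sampling rule does not claim to produce the cumulative uniform distribution
the ticket asks for. It also does not replace an earlier rating for the same
user and item, which a real setPreference(...) would have to do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch of an online SGD factorizer; not part of Mahout. */
class OnlineSgdSketch {
    private static final class Rating {
        final long user;
        final long item;
        final double value;
        Rating(long user, long item, double value) {
            this.user = user;
            this.item = item;
            this.value = value;
        }
    }

    private final int rank;
    private final double learningRate;
    private final double regularization;
    private final int replaysPerUpdate;
    private final Random random;
    private final Map<Long, double[]> userFactors = new HashMap<>();
    private final Map<Long, double[]> itemFactors = new HashMap<>();
    private final List<Rating> history = new ArrayList<>(); // oldest first

    OnlineSgdSketch(int rank, double learningRate, double regularization,
                    int replaysPerUpdate, long seed) {
        this.rank = rank;
        this.learningRate = learningRate;
        this.regularization = regularization;
        this.replaysPerUpdate = replaysPerUpdate;
        this.random = new Random(seed);
    }

    /** Stores the rating, then runs a partial pass: the new rating plus a few replays. */
    void setPreference(long user, long item, double value) {
        Rating r = new Rating(user, item, value);
        history.add(r);
        sgdStep(r);
        for (int i = 0; i < replaysPerUpdate; i++) {
            sgdStep(history.get(sampleRecencyBiasedIndex()));
        }
    }

    /** Dot product of the two factor vectors, or NaN if either side is unknown. */
    double estimate(long user, long item) {
        double[] p = userFactors.get(user);
        double[] q = itemFactors.get(item);
        return (p == null || q == null) ? Double.NaN : dot(p, q);
    }

    /** Index floor(n * sqrt(u)) with u uniform in [0, 1): newer entries are drawn more often. */
    int sampleRecencyBiasedIndex() {
        int n = history.size();
        if (n == 0) {
            throw new IllegalStateException("no ratings stored yet");
        }
        int idx = (int) (n * Math.sqrt(random.nextDouble()));
        return Math.min(idx, n - 1);
    }

    /** One regularized SGD step on squared error for a single rating. */
    private void sgdStep(Rating r) {
        double[] p = userFactors.computeIfAbsent(r.user, k -> randomVector());
        double[] q = itemFactors.computeIfAbsent(r.item, k -> randomVector());
        double err = r.value - dot(p, q);
        for (int f = 0; f < rank; f++) {
            double pf = p[f];
            double qf = q[f];
            p[f] += learningRate * (err * qf - regularization * pf);
            q[f] += learningRate * (err * pf - regularization * qf);
        }
    }

    /** Small random initial factors, so that a new user or item just adds one vector. */
    private double[] randomVector() {
        double[] v = new double[rank];
        for (int f = 0; f < rank; f++) {
            v[f] = 0.1 * random.nextGaussian();
        }
        return v;
    }

    private static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int f = 0; f < a.length; f++) {
            sum += a[f] * b[f];
        }
        return sum;
    }

    public static void main(String[] args) {
        OnlineSgdSketch model = new OnlineSgdSketch(4, 0.05, 0.01, 5, 42L);
        for (int i = 0; i < 100; i++) {
            model.setPreference(1L, 10L, 4.0);
        }
        System.out.println("estimate(1, 10) = " + model.estimate(1L, 10L));
    }
}
```

The cost of one setPreference call here is proportional to
rank * (1 + replaysPerUpdate) and does not depend on how many ratings are
stored, which is the O(1) property the ticket asks for. Whether a handful of
replayed steps per update keeps the factorization close to what full
retraining would give is exactly the open question the ticket raises.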