Hi Sebastian, how do you come up with a good lambda to use with this weighted ALS?

On Mon, Dec 20, 2010 at 3:27 PM, Sebastian Schelter (JIRA) <[email protected]> wrote:

>
> [https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
> Sebastian Schelter updated MAHOUT-542:
> --------------------------------------
>
>     Attachment: MAHOUT-542-2.patch
>
> An updated version of the patch. I fixed a small bug, added more tests and
> polished the code a little.
>
> The distributed matrix factorization works fine now on a toy example. The
> next steps will be to use real data and do some holdout tests.
>
> > MapReduce implementation of ALS-WR
> > ----------------------------------
> >
> >                 Key: MAHOUT-542
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-542
> >             Project: Mahout
> >          Issue Type: New Feature
> >          Components: Collaborative Filtering
> >    Affects Versions: 0.5
> >            Reporter: Sebastian Schelter
> >         Attachments: MAHOUT-452.patch, MAHOUT-542-2.patch
> >
> >
> > As Mahout is currently lacking a distributed collaborative filtering
> > algorithm that uses matrix factorization, I spent some time reading through
> > a couple of the Netflix papers and stumbled upon the "Large-scale Parallel
> > Collaborative Filtering for the Netflix Prize" available at
> > http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf.
> >
> > It describes a parallel algorithm that uses "Alternating-Least-Squares
> > with Weighted-λ-Regularization" to factorize the preference-matrix and gives
> > some insights on how the authors distributed the computation using Matlab.
> >
> > It seemed to me that this approach could also easily be parallelized
> > using Map/Reduce, so I sat down and created a prototype version. I'm not
> > really sure I got the mathematical details correct (they need some
> > optimization anyway), but I wanna put up my prototype implementation here
> > per Yonik's law of patches.
> >
> > Maybe someone has the time and motivation to work a little on this with
> > me. It would be great if someone could validate the approach taken (I'm
> > willing to help as the code might not be intuitive to read) and could try to
> > factorize some test data and give feedback then.
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
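
One possible way to approach the lambda question, in the spirit of the holdout tests mentioned in the quoted message, is to sweep a few candidate values and compare RMSE on held-out ratings. The following is only a minimal sketch in Python, not the code from the MAHOUT-542 patch. It assumes rank 1, so that each least-squares solve reduces to a scalar division, and the function names (`als_wr_rank1`, `rmse`, `pick_lambda`) are hypothetical. The regularization term is lambda times the number of ratings of the user or item, which is my reading of the weighted-λ-regularization described in the paper.

```python
import random


def als_wr_rank1(ratings, lam, iterations=20, seed=0):
    """Rank-1 ALS with weighted-lambda regularization (illustrative only).

    ratings: dict mapping (user, item) -> rating.
    lam: regularization strength; assumed to be > 0.
    Returns (u, m): dicts of scalar user and item factors.
    """
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for (user, item), r in ratings.items():
        by_user.setdefault(user, []).append((item, r))
        by_item.setdefault(item, []).append((user, r))

    u = {user: 1.0 for user in by_user}
    m = {item: rng.uniform(0.5, 1.5) for item in by_item}

    for _ in range(iterations):
        # Fix item factors, solve the regularized least squares for each user.
        for user, obs in by_user.items():
            num = sum(r * m[item] for item, r in obs)
            den = sum(m[item] ** 2 for item, _ in obs) + lam * len(obs)
            u[user] = num / den
        # Fix user factors, solve for each item.
        for item, obs in by_item.items():
            num = sum(r * u[user] for user, r in obs)
            den = sum(u[user] ** 2 for user, _ in obs) + lam * len(obs)
            m[item] = num / den
    return u, m


def rmse(ratings, u, m):
    """Root mean squared error; unseen users or items are predicted as 0."""
    errs = [(r - u.get(user, 0.0) * m.get(item, 0.0)) ** 2
            for (user, item), r in ratings.items()]
    return (sum(errs) / len(errs)) ** 0.5


def pick_lambda(train, holdout, candidates):
    """Train once per candidate lambda and return (best_lambda, scores)."""
    scores = {}
    for lam in candidates:
        u, m = als_wr_rank1(train, lam)
        scores[lam] = rmse(holdout, u, m)
    return min(scores, key=scores.get), scores
```

Calling `pick_lambda(train, holdout, [0.01, 0.1, 1.0])` returns the candidate with the lowest holdout RMSE together with all scores. The candidate values here are arbitrary; on real, noisy data one would likely want a wider grid and a larger rank, and the best value need not be the smallest one.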
