[jira] Commented: (MAHOUT-542) MapReduce implementation of ALS-WR

Ted Dunning (JIRA) Wed, 22 Dec 2010 15:37:26 -0800

    [ 
https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974436#action_12974436
 ]


Ted Dunning commented on MAHOUT-542:
------------------------------------

{quote}
I still don't see how to automatically learn lambda yet without running lots 
of subsequent M/R jobs...
{quote}
Can you compute factorizations for multiple values of lambda in one go and then 
evaluate all of them in one pass?

This would require that parallelALS accept a list of lambdas and produce 
multiple outputs.
It would also require that evaluateALS accept multiple models.
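
To make the first requirement concrete, here is a minimal in-memory sketch of 
one user-side ALS-WR step that reads each user's ratings once and solves for 
every lambda. It is not the parallelALS code from the patch: the names 
`solve` and `update_users` and the dict-based data layout are assumptions 
made for illustration only. The update follows the normal equations from the 
ALS-WR paper, (sum of m m^T + lambda * n_u * I) u = sum of r * m.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrix is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_users(ratings_by_user, item_features_by_lambda):
    """Hypothetical helper: one user-side ALS-WR step for every lambda.

    ratings_by_user: {user: [(item, rating), ...]}
    item_features_by_lambda: {lambda: {item: [f1, ..., fk]}}
    Returns {lambda: {user: [f1, ..., fk]}}.
    """
    out = {lam: {} for lam in item_features_by_lambda}
    for user, rated in ratings_by_user.items():  # the ratings are read once
        n_u = len(rated)
        for lam, items in item_features_by_lambda.items():
            k = len(next(iter(items.values())))
            # Start from the regularization term lambda * n_u * I.
            a = [[lam * n_u if i == j else 0.0 for j in range(k)]
                 for i in range(k)]
            b = [0.0] * k
            for item, rating in rated:
                vec = items[item]
                for i in range(k):
                    b[i] += rating * vec[i]
                    for j in range(k):
                        a[i][j] += vec[i] * vec[j]
            out[lam][user] = solve(a, b)
    return out
```

Each lambda keeps its own item features, so the per-lambda solves cannot be 
shared; what this sketch shares is the pass over the ratings.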

Testing against many models should take about the same amount of time as 
testing against a single model, presumably because the pass over the test 
data dominates the cost.  The distributed ALS might exhibit the same property.
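
A minimal sketch of the evaluation side, assuming each model is a pair of 
in-memory feature dicts keyed by lambda. The function name `evaluate_models` 
and the data layout are hypothetical and are not the evaluateALS interface. 
The held-out ratings are read once, and every model is scored against each 
rating as it goes by.

```python
import math


def evaluate_models(test_ratings, models):
    """Hypothetical helper: RMSE per lambda in one pass over held-out ratings.

    test_ratings: iterable of (user, item, rating)
    models: {lambda: (user_features, item_features)}, where each feature
            map is a dict of id -> list of floats
    """
    sq_err = {lam: 0.0 for lam in models}
    count = 0
    for user, item, rating in test_ratings:  # the test data is read once
        count += 1
        for lam, (users, items) in models.items():
            # Predicted rating is the dot product of the two feature vectors.
            pred = sum(u * m for u, m in zip(users[user], items[item]))
            sq_err[lam] += (rating - pred) ** 2
    if count == 0:
        raise ValueError("no test ratings to evaluate against")
    return {lam: math.sqrt(total / count) for lam, total in sq_err.items()}
```

The lambda with the lowest error can then be picked with 
`min(rmse, key=rmse.get)` on the returned dict.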

> MapReduce implementation of ALS-WR
> ----------------------------------
>
>                 Key: MAHOUT-542
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-542
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>    Affects Versions: 0.5
>            Reporter: Sebastian Schelter
>         Attachments: MAHOUT-452.patch, MAHOUT-542-2.patch, MAHOUT-542-3.patch
>
>
> As Mahout is currently lacking a distributed collaborative filtering 
> algorithm that uses matrix factorization, I spent some time reading through a 
> couple of the Netflix papers and stumbled upon the "Large-scale Parallel 
> Collaborative Filtering for the Netflix Prize" available at 
> http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf.
> It describes a parallel algorithm that uses "Alternating-Least-Squares with 
> Weighted-λ-Regularization" to factorize the preference-matrix and gives some 
> insights on how the authors distributed the computation using Matlab.
> It seemed to me that this approach could also easily be parallelized using 
> Map/Reduce, so I sat down and created a prototype version. I'm not really 
> sure I got the mathematical details correct (they need some optimization 
> anyway), but I wanna put up my prototype implementation here per Yonik's law 
> of patches.
> Maybe someone has the time and motivation to work a little on this with me. 
> It would be great if someone could validate the approach taken (I'm willing 
> to help as the code might not be intuitive to read) and could try to 
> factorize some test data and give feedback then.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
