Re: [jira] Updated: (MAHOUT-542) MapReduce implementation of ALS-WR

Dmitriy Lyubimov Tue, 21 Dec 2010 09:00:47 -0800

There's an evolutionary algorithm in SGD that finds those adaptively using
cross-validation, but it may be too demanding in terms of the number of
experiments. Just FYI.


On Mon, Dec 20, 2010 at 11:58 PM, Sebastian Schelter <
[email protected]> wrote:

> Hi Dmitriy,
>
> the paper states that it's easy to find a good lambda value with 3-4
> experiments. I still have to verify that assumption on a real dataset.
>
> --sebastian
>
>
> On 21.12.2010 00:57, Dmitriy Lyubimov wrote:
>
>> Hi Sebastian,
>>
>> how do you come up with a good Lambda to use with this weighted ALS?
>>
>> On Mon, Dec 20, 2010 at 3:27 PM, Sebastian Schelter (JIRA)
>> <[email protected]>wrote:
>>
>>>     [ https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>>
>>> Sebastian Schelter updated MAHOUT-542:
>>> --------------------------------------
>>>
>>>     Attachment: MAHOUT-542-2.patch
>>>
>>> An updated version of the patch. I fixed a small bug, added more tests and
>>> polished the code a little.
>>>
>>> The distributed matrix factorization works fine now on a toy example. The
>>> next steps will be to use real data and do some holdout tests.
>>>
>>>> MapReduce implementation of ALS-WR
>>>> ----------------------------------
>>>>
>>>>                 Key: MAHOUT-542
>>>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-542
>>>>             Project: Mahout
>>>>          Issue Type: New Feature
>>>>          Components: Collaborative Filtering
>>>>    Affects Versions: 0.5
>>>>            Reporter: Sebastian Schelter
>>>>         Attachments: MAHOUT-452.patch, MAHOUT-542-2.patch
>>>>
>>>>
>>>> As Mahout is currently lacking a distributed collaborative filtering
>>>> algorithm that uses matrix factorization, I spent some time reading
>>>> through a couple of the Netflix papers and stumbled upon the "Large-scale
>>>> Parallel Collaborative Filtering for the Netflix Prize" available at
>>>> <http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08%28submitted%29.pdf>.
>>>>
>>>> It describes a parallel algorithm that uses "Alternating-Least-Squares
>>>> with Weighted-λ-Regularization" to factorize the preference-matrix and
>>>> gives some insights on how the authors distributed the computation using
>>>> Matlab.
>>>>
>>>> It seemed to me that this approach could also easily be parallelized
>>>> using Map/Reduce, so I sat down and created a prototype version. I'm not
>>>> really sure I got the mathematical details correct (they need some
>>>> optimization anyway), but I wanna put up my prototype implementation here
>>>> per Yonik's law of patches.
>>>>
>>>> Maybe someone has the time and motivation to work a little on this with
>>>> me. It would be great if someone could validate the approach taken (I'm
>>>> willing to help as the code might not be intuitive to read) and could try
>>>> to factorize some test data and give feedback then.
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> -
>>> You can reply to this email to add a comment to the issue online.
>>>
>>>
>>>
>
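
For readers following the lambda question in this thread, here is a minimal
single-machine sketch in Python/NumPy. It is not the code from the MAHOUT-542
patch and not a MapReduce job. It assumes the ALS-WR update from the cited
paper, where each row's regularization is lambda times the number of ratings
in that row, and it picks lambda by comparing holdout RMSE over a handful of
candidates. The function names, the toy data, and the candidate lambda values
are all hypothetical.

```python
import numpy as np


def als_wr(ratings, mask, rank, lam, iterations=10, seed=0):
    """Approximate ratings by U @ M.T, fitting only entries where mask is True."""
    rng = np.random.default_rng(seed)
    # Assumption: small random item factors as the starting point.
    M = rng.normal(scale=0.1, size=(ratings.shape[1], rank))

    def solve_rows(fixed, R, W):
        # One regularized least-squares solve per row of R, with `fixed` held constant.
        out = np.zeros((R.shape[0], rank))
        for i in range(R.shape[0]):
            idx = np.flatnonzero(W[i])
            if idx.size == 0:
                continue  # no observed ratings in this row: keep zero factors
            F = fixed[idx]
            # Weighted-lambda term: lambda is scaled by the row's rating count.
            A = F.T @ F + lam * idx.size * np.eye(rank)
            out[i] = np.linalg.solve(A, F.T @ R[i, idx])
        return out

    for _ in range(iterations):
        U = solve_rows(M, ratings, mask)      # fix items, solve for users
        M = solve_rows(U, ratings.T, mask.T)  # fix users, solve for items
    return U, M


def rmse(ratings, mask, U, M):
    """Root mean squared error over the entries selected by mask."""
    err = (ratings - U @ M.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))


def sweep_lambda(ratings, train_mask, holdout_mask, rank, lambdas):
    """Train once per candidate lambda and report the holdout RMSE of each."""
    scores = {}
    for lam in lambdas:
        U, M = als_wr(ratings, train_mask, rank, lam)
        scores[lam] = rmse(ratings, holdout_mask, U, M)
    return scores


def make_toy_data(n_users=40, n_items=30, rank=3, seed=1):
    """Synthetic low-rank ratings with noise; returns (ratings, train, holdout)."""
    rng = np.random.default_rng(seed)
    ratings = rng.normal(size=(n_users, rank)) @ rng.normal(size=(rank, n_items))
    ratings += 0.1 * rng.normal(size=ratings.shape)
    observed = rng.random(ratings.shape) < 0.6
    holdout = observed & (rng.random(ratings.shape) < 0.2)
    return ratings, observed & ~holdout, holdout


if __name__ == "__main__":
    ratings, train, holdout = make_toy_data()
    # Four candidate values, in the spirit of the "3-4 experiments" remark.
    scores = sweep_lambda(ratings, train, holdout, rank=3,
                          lambdas=(0.01, 0.05, 0.1, 0.5))
    for lam, score in scores.items():
        print(f"lambda={lam}: holdout RMSE={score:.4f}")
    print("best lambda:", min(scores, key=scores.get))
```

Each user (or item) row is solved independently once the other side is fixed,
which is the property that makes the approach a candidate for Map/Reduce. The
sweep costs one full factorization per candidate, so a short candidate list
keeps the number of experiments small.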
