One small correction: ALS-WR is also only looking at rated examples.

2011/11/17 Sebastian Schelter <[email protected]>:
> I think Dmitriy's description of the SGD and ALS-WR approaches hits the
> nail on the head.
>
> However there is a third way to factorize the rating matrix which we
> haven't talked about yet. It's described in Yehuda Koren's
> "Collaborative Filtering for Implicit Feedback Datasets"
> http://research.yahoo.com/pub/2433 and I recently added it to
> ParallelALSFactorizationJob.
>
> This approach works on implicit feedback data (like the number of
> times a user watched a television series) and all unobserved
> interactions are by definition 0. Using a standard SVD would result in
> the problems Dmitriy described.
>
> But the paper introduces a very interesting approach: the user-item
> matrix holds 0s and 1s only (0 in a cell if there have been no
> interactions, 1 if there have been 1 or more interactions). This
> matrix is decomposed into two other matrices X and Y (user and item
> features) by minimizing the (regularized) squared error over all
> observations (which is the same as in ALS-WR). However the error is
> weighted by a confidence value that is very low if the user never
> interacted with the item (because he simply might not be aware that
> this item exists) and very high if the user interacted very often with
> the item (a good indication of preference). That should help to avoid
> the problems that Dmitriy described.
>
> --sebastian
>
>
> 2011/11/17 Dmitriy Lyubimov <[email protected]>:
>> On Thu, Nov 17, 2011 at 11:30 AM, Dmitriy Lyubimov <[email protected]> wrote:
>>> I will finish adding a Cholesky decomposition route as an option to
>>> SSVD some time early in Q1 2012.
>>>
>>
>> PPS: I already put some jobs in (they are in the trunk) for the Cholesky
>> route. I thought it would be an easy mod, but then I saw that it would
>> require a few more modifications to also support power iterations the
>> same way they are supported today. Also, I still haven't quite finished
>> thinking through what it would take to modify the U-job to produce U
>> without Q in this case; it seems this route will require 100% special
>> handling, and I wouldn't be able to reuse any of the current U job for
>> this option.
>>
>> For these reasons, I decided to wait until I have figured out all of the
>> remaining architectural issues before I proceed. That would better be
>> one longer chunk of time rather than several little chunks, which makes
>> it depend more on my schedule to find where that chunk might be.
>>
>
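
To make the difference between the two objectives discussed above concrete, here is a minimal sketch in plain Python. It is not Mahout code and not the ParallelALSFactorizationJob implementation: the function names, the toy data layout, and the default values for alpha and lam are all assumptions made for illustration. The linear confidence c = 1 + alpha * count is one common choice for the "low confidence when unobserved, high confidence when observed often" weighting; the observed-only loss uses a regularizer scaled by the number of ratings per user and item, which is how the weighted-lambda idea is usually written.

```python
# Illustrative sketch only; all names here are hypothetical, not Mahout's API.

def dot(a, b):
    """Inner product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def observed_only_loss(ratings, X, Y, lam=0.1):
    """Squared error summed over rated cells only (None means unrated).

    The regularizer weights each user/item vector by its number of
    ratings (an assumed form of weighted-lambda regularization).
    """
    n_user = [0] * len(X)
    n_item = [0] * len(Y)
    loss = 0.0
    for u, row in enumerate(ratings):
        for i, r in enumerate(row):
            if r is None:
                continue  # unrated cells contribute nothing
            n_user[u] += 1
            n_item[i] += 1
            loss += (r - dot(X[u], Y[i])) ** 2
    reg = sum(n * dot(x, x) for n, x in zip(n_user, X))
    reg += sum(n * dot(y, y) for n, y in zip(n_item, Y))
    return loss + lam * reg


def implicit_loss(counts, X, Y, alpha=40.0, lam=0.1):
    """Confidence-weighted squared error summed over ALL cells.

    counts[u][i] is the raw interaction count (0 if never observed).
    Preference is 1 if the count is positive, else 0; confidence
    grows linearly with the count (assumed form: 1 + alpha * count).
    """
    loss = 0.0
    for u, row in enumerate(counts):
        for i, count in enumerate(row):
            preference = 1.0 if count > 0 else 0.0
            confidence = 1.0 + alpha * count
            loss += confidence * (preference - dot(X[u], Y[i])) ** 2
    reg = sum(dot(x, x) for x in X) + sum(dot(y, y) for y in Y)
    return loss + lam * reg
```

The only structural difference is which cells enter the sum: the first function skips unrated cells entirely, while the second visits every cell and lets the confidence weight decide how much a zero should count.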
