Here's just one perspective --

Yes, this is roughly how methods like ALS work. The input values are
treated as 'weights', not ratings. The model does not try to
reconstruct them directly; it uses them to weight the terms of a loss
function. This makes more sense when paired with a squared-error loss
function, which is what is almost always used.
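
To make "used as a weight in a loss function" concrete, here is a
minimal sketch of one common form of weighted squared-error loss for
implicit data. The 0/1 target, the confidence 1 + alpha * weight, and
the names weighted_loss and alpha are assumptions for illustration, not
a description of any particular ALS implementation. Regularization is
left out to keep it short.

```python
def weighted_loss(weights, user_factors, item_factors, alpha=1.0):
    """Weighted squared-error loss over every user-item cell.

    weights[u][i] is the input 'weight' for user u and item i
    (0 when there was no interaction). alpha is an assumed scaling knob.
    """
    total = 0.0
    for u, row in enumerate(weights):
        for i, w in enumerate(row):
            # The thing being reconstructed is "did they interact", not w itself.
            target = 1.0 if w > 0 else 0.0
            # The weight only controls how much this cell's error counts.
            confidence = 1.0 + alpha * w
            pred = sum(a * b for a, b in zip(user_factors[u], item_factors[i]))
            total += confidence * (target - pred) ** 2
    return total


# Toy data: 2 users, 2 items, 2 latent factors (all values made up).
weights = [[3.0, 0.0], [0.0, 1.0]]
user_factors = [[1.0, 0.0], [0.0, 1.0]]
item_factors = [[1.0, 0.0], [0.0, 0.5]]
loss = weighted_loss(weights, user_factors, item_factors)
```

Note that a larger weight never changes the target, only how heavily a
wrong prediction for that cell is penalized.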

The nice thing is that weight-like data can naturally include many
different types of activities, since weights can be added
meaningfully.
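
As a rough sketch of what "added meaningfully" can look like, the
snippet below sums per-event weights into a single weight per
(user, item) pair. The event names and the numbers in EVENT_WEIGHTS are
hypothetical placeholders, not recommended values.

```python
from collections import defaultdict

# Hypothetical per-event weights; these numbers are placeholders to tune.
EVENT_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 10.0}


def aggregate(events):
    """Sum event weights into one weight per (user, item) pair."""
    totals = defaultdict(float)
    for user, item, event_type in events:
        totals[(user, item)] += EVENT_WEIGHTS[event_type]
    return dict(totals)


# A tiny made-up clickstream: (user, item, event type).
events = [
    ("u1", "i1", "view"),
    ("u1", "i1", "view"),
    ("u1", "i1", "purchase"),
    ("u2", "i1", "add_to_cart"),
]
totals = aggregate(events)
```

The resulting totals are what would be fed to the factorization as the
input weights.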

But then how do you pick the weights? I don't have a strong,
principled answer for this; maybe someone else does. You can pick
something based on other information you have: if event A happens N
times as often as B, maybe B is N times more significant and deserves
N times as much weight. You can always try various values and compare
evaluation metrics on held-out data to see what works best.
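
The frequency heuristic above could be sketched like this. It is only a
starting point to tune from, and the function name and the event counts
are made up for illustration.

```python
from collections import Counter


def inverse_frequency_weights(event_types):
    """Weight each event type inversely to how often it occurs.

    The most frequent type gets weight 1.0; a type that is N times
    rarer gets weight N.
    """
    counts = Counter(event_types)
    most_common = max(counts.values())
    return {etype: most_common / n for etype, n in counts.items()}


# Made-up event log: views are common, purchases are rare.
log = ["view"] * 100 + ["add_to_cart"] * 20 + ["purchase"] * 5
weights = inverse_frequency_weights(log)
```

From there, the weights can be scaled up or down and each variant
scored on a held-out test set.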

On Tue, Jul 23, 2013 at 2:07 PM, Jayesh <jayesh.sidhw...@gmail.com> wrote:
> Hi,
>
> Consider this as a newbie question.
>
> I have been reading about CF algorithms. Everyone seems to be taking the
> preference value as ratings, or any singular attribute. However, in a
> typical ecommerce scenario the entire clickstream data is important (with
> varying weights) to determine the affinity of the user vs item.
>
> So, my question is, in production, do we consider many such parameters to
> calculate user vs item affinity or do we just pick any one parameter.
>
> If we pick any one parameter, how do we decide which is the one that will
> reflect the affinity in the best possible way?
>
> If we consider many parameters, do we use any kind of a regression to
> formulate the affinity score (that takes into consideration all the
> features and their respective weights that impact the user's likelihood) and
> run any CF algorithm over these scores?
>
>
> Thanks.
>
> --
> Best Regards,
>
> Jayesh
