Here's just one perspective: yes, this is roughly how things like ALS work. The input values are viewed as 'weights', not ratings. They're not reconstructed directly; instead, each one is used as a weight on its term in the loss function. This makes more sense when paired with a squared-error loss function, which is what is almost always used.
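
To make "used as a weight in a loss function" a bit more concrete, here is a minimal sketch of one common implicit-feedback formulation. It is an assumption on my part about the shape of the loss, not a description of any particular ALS implementation: each input value r becomes a confidence c = 1 + alpha * r on a binary target p (1 if r > 0, else 0). The names `weighted_loss`, `alpha` and `lam` are made up for illustration.

```python
import numpy as np

def weighted_loss(R, X, Y, alpha=40.0, lam=0.1):
    """Weighted squared-error loss for an implicit-feedback factorization.

    R: (users x items) array of non-negative input values (the 'weights').
    X: (users x k) user factors; Y: (items x k) item factors.
    alpha and lam are illustrative hyperparameters, not recommended values.
    """
    P = (R > 0).astype(float)   # binary target: did the user interact at all?
    C = 1.0 + alpha * R         # confidence grows with the input value
    err = P - X @ Y.T           # the model reconstructs P, not R itself
    reg = lam * (np.sum(X ** 2) + np.sum(Y ** 2))
    return float(np.sum(C * err ** 2) + reg)
```

Note that R never appears as the thing being predicted; it only scales how much each squared error counts.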
The nice thing is that weight-like data can naturally combine many different types of activity, since weights can be added meaningfully.

But then how do you pick the weights? I don't have a strong, principled answer for this; maybe someone else does. You can pick something based on other information you have: if event A happens N times more often than B does, maybe B is N times more significant and deserves N times as much weight. You can always try various values and evaluate test metrics to see what works best.

On Tue, Jul 23, 2013 at 2:07 PM, Jayesh <jayesh.sidhw...@gmail.com> wrote:
> Hi,
>
> Consider this as a newbie question.
>
> I have been reading about CF algorithms. Everyone seems to be taking the
> preference value as ratings, or any singular attribute. However, in a
> typical ecommerce scenario the entire clickstream data is important (with
> varying weights) to determine the affinity of the user vs item.
>
> So, my question is, in production, do we consider many such parameters to
> calculate user vs item affinity or do we just pick any one parameter.
>
> If we pick any one parameter, how do we decide which is the one that will
> reflect the affinity in the best possible way?
>
> If we consider many parameters, do we use any kind of a regression to
> formulate the affinity score (that takes into consideration all the
> features and their respective weights that impact the user's likelihood) and
> run any CF algorithm over these scores?
>
>
> Thanks.
>
> --
> Best Regards,
>
> Jayesh
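
P.S. A rough sketch of the "rarer events get more weight" heuristic from my reply, combined with the fact that weights can simply be added per user/item pair. The event names, the sample data and the helper functions `inverse_frequency_weights` and `combine` are all hypothetical; treat the resulting numbers as a starting point to tune against test metrics, not as a principled answer.

```python
from collections import Counter, defaultdict

def inverse_frequency_weights(events):
    """Weight each event type by how rare it is relative to the most common type."""
    counts = Counter(event_type for _, _, event_type in events)
    most_common = max(counts.values())
    return {t: most_common / n for t, n in counts.items()}

def combine(events, weights):
    """Sum event weights per (user, item) pair into a single input value."""
    totals = defaultdict(float)
    for user, item, event_type in events:
        totals[(user, item)] += weights[event_type]
    return dict(totals)

# Hypothetical clickstream: (user, item, event_type)
events = [
    ("u1", "i1", "view"), ("u1", "i1", "view"), ("u1", "i2", "view"),
    ("u2", "i1", "view"), ("u1", "i1", "purchase"),
]
weights = inverse_frequency_weights(events)  # views are 4x as common as purchases
scores = combine(events, weights)            # one summed value per (user, item)
```

The summed values in `scores` are what you would feed to the factorization as input weights.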