On Mon, Mar 25, 2013 at 10:43 AM, Sean Owen <[email protected]> wrote:
> If your input is clicks, carts, etc., yes, you ought to get generally
> good results from something meant to consume implicit feedback, like
> ALS (for implicit feedback, yes, there are at least two main variants).
> I think you are talking about the implicit version since you mention
> 0/1.
>
> lambda is the regularization parameter. It is defined a bit
> differently in the various papers, though. Test a few values if you
> can.
> But you said "no weights in the regularization"... what do you mean?
> You don't want to disable regularization entirely.

I misspoke. I meant lambda=1.

> On Mon, Mar 25, 2013 at 2:14 PM, Koobas <[email protected]> wrote:
> > On Mon, Mar 25, 2013 at 9:52 AM, Sean Owen <[email protected]> wrote:
> >
> >> On Mon, Mar 25, 2013 at 1:41 PM, Koobas <[email protected]> wrote:
> >> >> But the assumption works nicely for click-like data. Better still when
> >> >> you can "weakly" prefer to reconstruct the 0 for missing observations
> >> >> and much more strongly prefer to reconstruct the "1" for observed
> >> >> data.
> >> >>
> >> >
> >> > This does seem intuitive.
> >> > How does the benefit manifest itself?
> >> > In lowering the RMSE of reconstructing the interaction matrix?
> >> > Are there any indicators that it results in better recommendations?
> >> > Koobas
> >>
> >> In this approach you are no longer reconstructing the interaction
> >> matrix, so there is no RMSE vs. the interaction matrix. You're
> >> reconstructing a matrix of 0s and 1s. Because entries are weighted
> >> differently, you're not even minimizing RMSE over that matrix -- the
> >> point is to take some errors more seriously than others. You're
> >> minimizing a *weighted* RMSE, yes.
> >>
> >> Yes, of course the goal is better recommendations. This broader idea
> >> is harder to measure. You can use mean average precision to measure
> >> the tendency to predict back interactions that were held out.
> >>
> >> Is it better? Depends on better than *what*.
> >> Applying algorithms that
> >> treat input like ratings doesn't work as well on click-like data. The
> >> main problem is that these will tend to pay too much attention to
> >> large values. For example, if an item was clicked 1000 times, and you
> >> are trying to actually reconstruct that "1000", then a 10% error
> >> "costs" (0.1*1000)^2 = 10000. But a 10% error in reconstructing an
> >> item that was clicked once "costs" (0.1*1)^2 = 0.01. The former is
> >> considered a million times more important error-wise than the latter,
> >> even though the intuition is that it's just 1000 times more important.
> >>
> >> Better than algorithms that ignore the weight entirely -- yes, probably,
> >> if only because you are using more information. But as in all things,
> >> "it depends".
> >
> > Let's say the following.
> > Classic market basket.
> > Implicit feedback.
> > Ones and zeros in the input matrix, no weights in the regularization,
> > lambda=1.
> > What I will get is:
> > A) a reasonable recommender,
> > B) a joke of a recommender.

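[Editor's note: the "weakly prefer the 0s, strongly prefer the 1s" scheme Sean describes is the confidence-weighted least squares of the Hu/Koren/Volinsky implicit-feedback ALS: fit P = (R > 0) over *all* cells, with confidence 1 on the zeros and 1 + alpha*r on observed interactions. A minimal NumPy sketch under those assumptions -- the alpha/lambda values and all names are illustrative, and the dense per-row solves are for clarity, not efficiency:]

```python
import numpy as np

def implicit_als(R, factors=2, alpha=40.0, lam=0.1, iterations=10, seed=0):
    """Confidence-weighted ALS on a 0/1 preference matrix.

    R: user-by-item matrix of interaction counts (e.g. clicks).
    Reconstructs P = (R > 0), weighting each cell by C = 1 + alpha*R,
    so zeros are fit weakly and observed ones strongly.
    """
    rng = np.random.default_rng(seed)
    users, items = R.shape
    P = (R > 0).astype(float)   # the 0/1 matrix actually being reconstructed
    C = 1.0 + alpha * R         # per-cell confidence weights
    X = rng.normal(scale=0.1, size=(users, factors))
    Y = rng.normal(scale=0.1, size=(items, factors))
    reg = lam * np.eye(factors)
    for _ in range(iterations):
        # Alternate: solve user factors with item factors fixed, then swap.
        for u in range(users):
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + reg, Y.T @ Cu @ P[u])
        for i in range(items):
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + reg, X.T @ Ci @ P[:, i])
    return X, Y

# Toy market-basket input: rows are users, columns are items, values are counts.
R = np.array([[5.0, 3.0, 0.0, 0.0],
              [4.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 2.0, 4.0]])
X, Y = implicit_als(R)
scores = X @ Y.T  # predicted preference; rank a user's unseen items by these
```

[Note the contrast with a ratings-style fit: the target here is 0/1, not the raw counts, so a count of 1000 raises the *weight* on that cell a thousandfold rather than raising the squared cost a millionfold.]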