OK, I kept thinking ALS-WR = weighted terms / implicit feedback, but that doesn't seem to be the case here. So scratch that part; I still think the answer is overfitting.
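For the archive, here's a minimal sketch of the two flavors, going from memory of the ALSWRFactorizer constructors in trunk (so double-check the signatures against your version): the short constructor runs the explicit-feedback solver, and the implicit-feedback solver is only used when you opt in via the boolean flag. The file path and hyperparameter values are just placeholders taken from this thread. (Two more notes follow below the quoted thread: the objective ALS-WR actually minimizes, and a sketch of the evaluation harness.)

import java.io.File;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.model.DataModel;

public class ALSFlavors {
  public static void main(String[] args) throws Exception {
    // NOTE: constructor signatures from memory; verify against your Mahout version.
    // MovieLens 100k ratings file (tab-separated: user, item, rating, timestamp).
    DataModel model = new FileDataModel(new File("ml-100k/u.data"));

    // Explicit-feedback ALS-WR: the short constructor from this thread.
    // usesImplicitFeedback defaults to false, so ratings are fit directly.
    ALSWRFactorizer explicitAls = new ALSWRFactorizer(model, 8, 0.065, 16);

    // Implicit-feedback flavor: opted into via the boolean flag; alpha
    // (an arbitrary 40 here) scales the confidence weights on observed
    // interactions.
    ALSWRFactorizer implicitAls = new ALSWRFactorizer(model, 8, 0.065, 16, true, 40.0);
  }
}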
On Thu, May 9, 2013 at 2:45 PM, Gabor Bernat <ber...@primeranks.net> wrote:
> I've used the constructor without that argument (or alpha). So I suppose
> those take the default values, which I suppose means an explicit model;
> am I right?
>
> Thanks,
>
> Bernát GÁBOR
>
> On Thu, May 9, 2013 at 3:40 PM, Sebastian Schelter <ssc.o...@googlemail.com> wrote:
>> Our ALSWRFactorizer can do both flavors of ALS (the one used for
>> explicit and the one used for implicit data). @Gabor, what do you
>> specify for the constructor argument "usesImplicitFeedback"?
>>
>> On 09.05.2013 15:33, Sean Owen wrote:
>>> RMSE would have the same potential issue. ALS-WR is going to prefer to
>>> minimize one error at the expense of letting another get much larger,
>>> whereas RMSE penalizes them all the same. It's maybe an indirect
>>> issue here at best -- there's a moderate mismatch between the metric
>>> and the nature of the algorithm.
>>>
>>> I think most of the explanation is simply overfitting, then, as this is
>>> test-set error. I still think it is weird that the lowest MAE occurs
>>> at f=1; maybe there's a good, simple reason for it that I'm missing off
>>> the top of my head.
>>>
>>> FWIW, when I tune for the best parameters on this data set, according to a
>>> mean average precision metric, I end up with an optimum more like 15
>>> features and lambda=0.05 (although, note, I'm using a different
>>> default alpha, 1, and a somewhat different definition of lambda).
>>>
>>> On Thu, May 9, 2013 at 2:11 PM, Gabor Bernat <ber...@primeranks.net> wrote:
>>>> I know, but the same is true for the RMSE.
>>>>
>>>> This is based on the MovieLens 100k dataset, using the framework's
>>>> (random) sampling to split it into a training and an evaluation set
>>>> (the RMSRecommenderEvaluator or AverageAbsoluteDifferenceRecommenderEvaluator
>>>> parameters: evaluation 1.0, training 0.75).
>>>>
>>>> Bernát GÁBOR
>>>>
>>>> On Thu, May 9, 2013 at 3:05 PM, Sean Owen <sro...@gmail.com> wrote:
>>>>> (The MAE metric may also be a complicating issue... it's measuring
>>>>> average error where all elements are equally weighted, but as the "WR"
>>>>> suggests in ALS-WR, the loss function being minimized weights
>>>>> different elements differently.)
>>>>>
>>>>> This is based on a test set, right, separate from the training set?
>>>>> If you are able, measure the MAE on your training set too. If
>>>>> overfitting is the issue, you should see low error on the training
>>>>> set, and higher error on the test set, when f is high and lambda is
>>>>> low.
>>>>>
>>>>> On Thu, May 9, 2013 at 1:49 PM, Gabor Bernat <ber...@primeranks.net> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Here it is: http://i.imgur.com/3e1eTE5.png
>>>>>> I've used 75% for training and 25% for evaluation.
>>>>>>
>>>>>> Well, a reasonable lambda gives close enough results, though not better.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Bernát GÁBOR
>>>>>>
>>>>>> On Thu, May 9, 2013 at 2:46 PM, Sean Owen <sro...@gmail.com> wrote:
>>>>>>> This sounds like overfitting. More features let you fit your training
>>>>>>> set better, but at some point, fitting too well means you fit other
>>>>>>> test data less well. Lambda resists overfitting, so setting it too
>>>>>>> low increases the overfitting problem.
>>>>>>>
>>>>>>> I assume you still get better test-set results with a reasonable lambda?
>>>>>>>
>>>>>>> On Thu, May 9, 2013 at 1:38 PM, Gabor Bernat <ber...@primeranks.net> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> So I've been testing out the ALSWRFactorizer with the MovieLens 100k
>>>>>>>> dataset, and I've run into some strange results. You can see an
>>>>>>>> example in the attached picture.
>>>>>>>>
>>>>>>>> I've used feature counts of 1, 2, 4, 8, 16 and 32, the same for the
>>>>>>>> iteration count, and summed up the results in a table. For a lambda
>>>>>>>> higher than 0.07 the more important factor seems to be the iteration
>>>>>>>> count, while increasing the feature count may improve the result,
>>>>>>>> though not by much. That is what one would expect from the algorithm,
>>>>>>>> so that's okay.
>>>>>>>>
>>>>>>>> The strange part comes for lambdas smaller than 0.075. In that case
>>>>>>>> the more important factor becomes the feature count; however, fewer
>>>>>>>> features, not more, are better. Similarly for the iteration count.
>>>>>>>> Essentially, the best score is achieved with a really small lambda
>>>>>>>> and a single feature and iteration. How is this possible? Am I
>>>>>>>> missing something?
>>>>>>>>
>>>>>>>> Bernát GÁBOR
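A footnote to make the "WR" point above concrete: the objective ALS-WR minimizes, per Zhou et al.'s weighted-lambda-regularization paper, is (in LaTeX notation)

\min_{U,M} \sum_{(i,j) \in I} \left( r_{ij} - \mathbf{u}_i^{\top} \mathbf{m}_j \right)^2
  + \lambda \left( \sum_i n_{u_i} \lVert \mathbf{u}_i \rVert^2 + \sum_j n_{m_j} \lVert \mathbf{m}_j \rVert^2 \right)

where n_{u_i} and n_{m_j} count the ratings of user i and item j, so heavily rated users and items are regularized more strongly. That is the mismatch with MAE, which weights every held-out rating equally. It also spells out the overfitting mechanism: with lambda near zero and enough features, the squared-error term can be driven toward zero on the training set while test error grows.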
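And, for completeness, a sketch of the evaluation harness described in the thread (training 0.75, evaluation 1.0), assuming the Taste evaluator API; the class name, data file path, and the particular grid point are placeholders:

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.common.RandomUtils;

public class ALSWREvaluation {
  public static void main(String[] args) throws Exception {
    RandomUtils.useTestSeed(); // repeatable random train/test split

    DataModel model = new FileDataModel(new File("ml-100k/u.data"));

    // One grid point; vary these to reproduce the table from the thread.
    final int numFeatures = 8;
    final double lambda = 0.065;
    final int numIterations = 16;

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel trainingModel) throws TasteException {
        return new SVDRecommender(trainingModel,
            new ALSWRFactorizer(trainingModel, numFeatures, lambda, numIterations));
      }
    };

    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    // training 0.75, evaluation 1.0: hold out 25% of each user's ratings,
    // and do this for every user in the model.
    double mae = evaluator.evaluate(builder, null, model, 0.75, 1.0);
    System.out.println("MAE = " + mae);
  }
}

Note this evaluator always holds data out, so it can't report training-set MAE directly; to follow Sean's suggestion, a quick approximation is to build the recommender on the full DataModel and compare estimatePreference() against the known ratings.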