I know, but the same is true for the RMSE. This is based on the MovieLens 100k dataset, using the framework's (random) sampling to split it into a training and an evaluation set (the RMSRecommenderEvaluator or AverageAbsoluteDifferenceRecommenderEvaluator's parameters - evaluation 1.0, training 0.75).
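For reference, a minimal sketch of that evaluation setup with Mahout's Taste evaluators (the ratings file path and the ALS hyperparameter values below are illustrative placeholders, not the exact values used):

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.common.RandomUtils;

public class AlsWrEvaluation {

  public static void main(String[] args) throws Exception {
    RandomUtils.useTestSeed(); // make the random training/evaluation split repeatable

    // MovieLens 100k ratings file (path is a placeholder)
    DataModel model = new FileDataModel(new File("ml-100k/u.data"));

    // RMSE; use AverageAbsoluteDifferenceRecommenderEvaluator for MAE instead
    RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel dataModel) throws TasteException {
        // numFeatures = 8, lambda = 0.065, iterations = 16 are illustrative values
        ALSWRFactorizer factorizer = new ALSWRFactorizer(dataModel, 8, 0.065, 16);
        return new SVDRecommender(dataModel, factorizer);
      }
    };

    // training 0.75 (per-user share of ratings kept for training), evaluation 1.0 (use all users)
    double rmse = evaluator.evaluate(builder, null, model, 0.75, 1.0);
    System.out.println("RMSE = " + rmse);
  }
}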
Bernát GÁBOR

On Thu, May 9, 2013 at 3:05 PM, Sean Owen <[email protected]> wrote:

> (The MAE metric may also be a complicating issue... it's measuring
> average error where all elements are equally weighted, but as the "WR"
> suggests in ALS-WR, the loss function being minimized weights
> different elements differently.)
>
> This is based on a test set right, separate from the training set?
> If you are able, measure the MAE on your training set too. If
> overfitting is the issue, you should see low error on the training
> set, and higher error on the test set, when f is high and lambda is low.
>
> On Thu, May 9, 2013 at 1:49 PM, Gabor Bernat <[email protected]> wrote:
> > Hello,
> >
> > Here it is: http://i.imgur.com/3e1eTE5.png
> > I've used 75% for training and 25% for evaluation.
> >
> > Well, a reasonable lambda gives close enough results, however not better.
> >
> > Thanks,
> >
> > Bernát GÁBOR
> >
> > On Thu, May 9, 2013 at 2:46 PM, Sean Owen <[email protected]> wrote:
> >
> >> This sounds like overfitting. More features let you fit your training
> >> set better, but at some point, fitting too well means you fit other
> >> test data less well. Lambda resists overfitting, so setting it too low
> >> increases the overfitting problem.
> >>
> >> I assume you still get better test set results with a reasonable lambda?
> >>
> >> On Thu, May 9, 2013 at 1:38 PM, Gabor Bernat <[email protected]> wrote:
> >> > Hello,
> >> >
> >> > So I've been testing the ALSWR with the MovieLens 100k dataset, and
> >> > I've run into some strange results. You can see an example of this in
> >> > the attached picture.
> >> >
> >> > I've used feature counts of 1, 2, 4, 8, 16 and 32, the same for the
> >> > iteration count, and summed up the results in a table. For a lambda
> >> > higher than 0.07 the more important factor seems to be the iteration
> >> > count, while increasing the feature count may improve the result,
> >> > however not by much. This is what one could expect from the algorithm,
> >> > so that's okay.
> >> >
> >> > The strange part comes for lambdas smaller than 0.075. In this case the
> >> > more important factor becomes the feature count, however not more but
> >> > less is better. Similarly for the iteration count. Essentially the best
> >> > score is achieved with a really small lambda and a single feature and
> >> > iteration count. How is this possible, am I missing something?
> >> >
> >> > Bernát GÁBOR
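For completeness, a rough sketch of the kind of lambda / feature count / iteration count sweep described in the first message, reusing the same evaluator setup as above (the grid values and the plain-text output are assumptions, not the original test script):

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class AlsWrGridSweep {

  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ml-100k/u.data")); // placeholder path
    RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();

    double[] lambdas = {0.05, 0.065, 0.075, 0.1, 0.25}; // assumed grid around the 0.075 boundary
    int[] counts = {1, 2, 4, 8, 16, 32};                // used for both feature and iteration counts

    for (double lambda : lambdas) {
      for (int features : counts) {
        for (int iterations : counts) {
          final double l = lambda;
          final int f = features;
          final int it = iterations;
          RecommenderBuilder builder = new RecommenderBuilder() {
            @Override
            public Recommender buildRecommender(DataModel dataModel) throws TasteException {
              return new SVDRecommender(dataModel, new ALSWRFactorizer(dataModel, f, l, it));
            }
          };
          // training 0.75, evaluation 1.0, as in the setup described earlier
          double rmse = evaluator.evaluate(builder, null, model, 0.75, 1.0);
          System.out.printf("lambda=%.3f features=%d iterations=%d rmse=%.4f%n", l, f, it, rmse);
        }
      }
    }
  }
}

Note that each evaluate() call draws a fresh random training/evaluation split, so the numbers vary from run to run unless a fixed test seed is set as in the earlier sketch.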
