I've used the constructor without that argument (or alpha), so I assume those take their default values, which I suppose means an explicit model. Am I right?
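For reference, a minimal sketch of the two ALSWRFactorizer constructors as I read them, assuming the four-argument form simply defaults to explicit-feedback mode; the data file path and the numFeatures/lambda/iteration values are placeholders only, not settings from the runs discussed here:

import java.io.File;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.model.DataModel;

public class FactorizerSetup {
  public static void main(String[] args) throws Exception {
    // Placeholder path; any userID,itemID,rating file that FileDataModel can parse.
    DataModel model = new FileDataModel(new File("ml-100k/ratings.csv"));

    // Four-argument form: numFeatures, lambda, numIterations only.
    // usesImplicitFeedback is not passed, so this should run the plain
    // explicit-feedback ALS-WR, and alpha is then irrelevant.
    ALSWRFactorizer explicitAls = new ALSWRFactorizer(model, 16, 0.065, 10);

    // Fully specified form: the last two arguments switch to the
    // implicit-feedback variant and set alpha explicitly.
    ALSWRFactorizer implicitAls = new ALSWRFactorizer(model, 16, 0.065, 10, true, 40.0);
  }
}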
Thanks,
Bernát GÁBOR


On Thu, May 9, 2013 at 3:40 PM, Sebastian Schelter <[email protected]> wrote:

> Our ALSWRFactorizer can do both flavors of ALS (the one used for
> explicit and the one used for implicit data). @Gabor, what do you
> specify for the constructor argument "usesImplicitFeedback"?
>
>
> On 09.05.2013 15:33, Sean Owen wrote:
> > RMSE would have the same potential issue. ALS-WR is going to prefer to
> > minimize one error at the expense of letting another get much larger,
> > whereas RMSE penalizes them all the same. It's maybe an indirect
> > issue here at best -- there's a moderate mismatch between the metric
> > and the nature of the algorithm.
> >
> > I think most of the explanation is simply overfitting then, as this is
> > test set error. I still think it is weird that the lowest MAE occurs
> > at f=1; maybe there's a good simple reason for that I'm missing off
> > the top of my head.
> >
> > FWIW, when I tune for the best parameters on this data set, according to a
> > mean average precision metric, I end up with an optimum more like 15
> > features and lambda=0.05 (although, note, I'm using a different
> > default alpha, 1, and a somewhat different definition of lambda).
> >
> >
> > On Thu, May 9, 2013 at 2:11 PM, Gabor Bernat <[email protected]> wrote:
> >> I know, but the same is true for the RMSE.
> >>
> >> This is based on the Movielens 100k dataset, using the framework's
> >> (random) sampling to split it into a training and an evaluation set
> >> (the RMSRecommenderEvaluator or AverageAbsoluteDifferenceRecommenderEvaluator
> >> parameters: evaluation 1.0, training 0.75).
> >>
> >> Bernát GÁBOR
> >>
> >>
> >> On Thu, May 9, 2013 at 3:05 PM, Sean Owen <[email protected]> wrote:
> >>
> >>> (The MAE metric may also be a complicating issue... it's measuring
> >>> average error where all elements are equally weighted, but as the "WR"
> >>> suggests in ALS-WR, the loss function being minimized weights
> >>> different elements differently.)
> >>>
> >>> This is based on a test set, right, separate from the training set?
> >>> If you are able, measure the MAE on your training set too. If
> >>> overfitting is the issue, you should see low error on the training
> >>> set and higher error on the test set when f is high and lambda is low.
> >>>
> >>> On Thu, May 9, 2013 at 1:49 PM, Gabor Bernat <[email protected]> wrote:
> >>>> Hello,
> >>>>
> >>>> Here it is: http://i.imgur.com/3e1eTE5.png
> >>>> I've used 75% for training and 25% for evaluation.
> >>>>
> >>>> Well, a reasonable lambda gives close enough results, but not better ones.
> >>>>
> >>>> Thanks,
> >>>>
> >>>>
> >>>> Bernát GÁBOR
> >>>>
> >>>>
> >>>> On Thu, May 9, 2013 at 2:46 PM, Sean Owen <[email protected]> wrote:
> >>>>
> >>>>> This sounds like overfitting. More features lets you fit your training
> >>>>> set better, but at some point, fitting too well means you fit other
> >>>>> test data less well. Lambda resists overfitting, so setting it too low
> >>>>> increases the overfitting problem.
> >>>>>
> >>>>> I assume you still get better test set results with a reasonable lambda?
> >>>>>
> >>>>> On Thu, May 9, 2013 at 1:38 PM, Gabor Bernat <[email protected]> wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> So I've been testing out the ALSWR with the Movielens 100k dataset, and
> >>>>>> I've run into some strange stuff. You can see an example of this in the
> >>>>>> attached picture.
> >>>>>>
> >>>>>> So I've used feature counts 1, 2, 4, 8, 16, and 32, the same for the
> >>>>>> iteration count, and summed up the results in a table. For a lambda
> >>>>>> higher than 0.07, the more important factor seems to be the iteration
> >>>>>> count, while increasing the feature count may improve the result,
> >>>>>> though not by much. And this is what one could expect from the
> >>>>>> algorithm, so that's okay.
> >>>>>>
> >>>>>> The strange stuff comes for lambdas smaller than 0.075. In this case
> >>>>>> the more important factor becomes the feature count, but now fewer
> >>>>>> features are better, not more. Similarly for the iteration count.
> >>>>>> Essentially the best score is achieved with a really small lambda and
> >>>>>> a single feature and a single iteration. How is this possible, am I
> >>>>>> missing something?
> >>>>>>
> >>>>>>
> >>>>>> Bernát GÁBOR
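For anyone who wants to reproduce the sweep discussed above, here is a minimal sketch of the evaluation loop as I understand it from the thread (MAE via AverageAbsoluteDifferenceRecommenderEvaluator, 75% of each user's ratings used for training, the rest held out); the data file path and the grid of feature counts and lambdas are illustrative placeholders, not necessarily the exact values behind the attached table:

import java.io.File;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class AlsWrSweep {
  public static void main(String[] args) throws Exception {
    // Placeholder path to the Movielens 100k ratings in a format FileDataModel accepts.
    DataModel model = new FileDataModel(new File("ml-100k/ratings.csv"));
    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

    for (final int numFeatures : new int[] {1, 2, 4, 8, 16, 32}) {
      for (final double lambda : new double[] {0.01, 0.05, 0.075, 0.1}) {
        RecommenderBuilder builder = new RecommenderBuilder() {
          @Override
          public Recommender buildRecommender(DataModel dataModel) throws TasteException {
            // Explicit-feedback ALS-WR at the current grid point, 10 iterations.
            return new SVDRecommender(dataModel,
                new ALSWRFactorizer(dataModel, numFeatures, lambda, 10));
          }
        };
        // 0.75 = use 75% of each user's ratings for training, hold out the rest;
        // 1.0 = include every user in the evaluation.
        double mae = evaluator.evaluate(builder, null, model, 0.75, 1.0);
        System.out.printf("f=%d lambda=%.3f MAE=%.4f%n", numFeatures, lambda, mae);
      }
    }
  }
}

Measuring the error on the training data as well, as Sean suggests, would show whether the high-feature/low-lambda combinations are simply overfitting: training error should be very low exactly where the held-out MAE gets worse.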
