I've used the constructor without that argument (and without alpha), so I
suppose those take their default values, which would mean an explicit model.
Am I right?
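
For reference, the construction looks roughly like this (simplified; the file
path and the numeric values are just placeholders from my sweeps):

  import java.io.File;
  import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
  import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
  import org.apache.mahout.cf.taste.model.DataModel;

  // Movielens 100k ratings file (tab-separated user, item, rating, timestamp)
  DataModel dataModel = new FileDataModel(new File("ml-100k/u.data"));

  // Short constructor: neither usesImplicitFeedback nor alpha is passed,
  // so I assume it falls back to the explicit-feedback model.
  ALSWRFactorizer factorizer =
      new ALSWRFactorizer(dataModel, 8 /* numFeatures */, 0.075 /* lambda */,
          16 /* numIterations */);

  // The implicit-feedback flavor would be the longer constructor instead:
  // new ALSWRFactorizer(dataModel, 8, 0.075, 16,
  //     true /* usesImplicitFeedback */, 40.0 /* alpha */);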

Thanks,

Bernát GÁBOR


On Thu, May 9, 2013 at 3:40 PM, Sebastian Schelter
<[email protected]> wrote:

> Our ALSWRFactorizer can do both flavors of ALS (the one used for
> explicit and the one used for implicit data). @Gabor, what do you
> specify for the constructor argument "usesImplicitFeedback" ?
>
>
> On 09.05.2013 15:33, Sean Owen wrote:
> > RMSE would have the same potential issue. ALS-WR is going to prefer to
> > minimize one error at the expense of letting another get much larger,
> > whereas RMSE penalizes them all the same.  It's maybe an indirect
> > issue here at best -- there's a moderate mismatch between the metric
> > and the nature of the algorithm.
> >
> > I think most of the explanation is simply overfitting then, as this is
> > test set error. I still think it is weird that the lowest MAE occurs
> > at f=1; maybe there's a good simple reason for that I'm missing off
> > the top of my head.
> >
> > FWIW When I tune for best parameters on this data set, according to a
> > mean average precision metric, I end up with an optimum more like 15
> > features and lambda=0.05 (although, note, I'm using a different
> > default alpha, 1, and a somewhat different definition of lambda).
> >
> >
> >
> > On Thu, May 9, 2013 at 2:11 PM, Gabor Bernat <[email protected]>
> wrote:
> >> I know, but the same is true for the RMSE.
> >>
> >> This is based on the Movielens 100k dataset, using the framework's
> >> (random) sampling to split it into a training and an evaluation set
> >> (the RMSRecommenderEvaluator or AverageAbsoluteDifferenceRecommenderEvaluator
> >> parameters: evaluation 1.0, training 0.75).
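> >>
> >> In code, the evaluation call looks roughly like this (simplified; the
> >> factorizer parameters are just placeholders):
> >>
> >>   RecommenderBuilder builder = new RecommenderBuilder() {
> >>     @Override
> >>     public Recommender buildRecommender(DataModel model) throws TasteException {
> >>       // explicit-feedback ALS-WR wrapped in an SVDRecommender
> >>       return new SVDRecommender(model,
> >>           new ALSWRFactorizer(model, 8, 0.075, 16));
> >>     }
> >>   };
> >>   RecommenderEvaluator evaluator =
> >>       new AverageAbsoluteDifferenceRecommenderEvaluator();
> >>   // training 0.75, evaluation 1.0
> >>   double mae = evaluator.evaluate(builder, null, dataModel, 0.75, 1.0);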
> >>
> >> Bernát GÁBOR
> >>
> >>
> >> On Thu, May 9, 2013 at 3:05 PM, Sean Owen <[email protected]> wrote:
> >>
> >>> (The MAE metric may also be a complicating issue... it's measuring
> >>> average error where all elements are equally weighted, but as the "WR"
> >>> suggests in ALS-WR, the loss function being minimized weights
> >>> different elements differently.)
> >>>
> >>> This is based on a test set right, separate from the training set?
> >>> If you are able, measure the MAE on your training set too. If
> >>> overfitting is the issue, you should see low error on the training
> >>> set, and higher error on the test set, when f is high and lambda is
> >>> low.
> >>>
> >>> On Thu, May 9, 2013 at 1:49 PM, Gabor Bernat <[email protected]>
> >>> wrote:
> >>>> Hello,
> >>>>
> >>>> Here it is: http://i.imgur.com/3e1eTE5.png
> >>>> I've used 75% for training and 25% for evaluation.
> >>>>
> >>>> Well, a reasonable lambda gives close enough results, though not better ones.
> >>>>
> >>>> Thanks,
> >>>>
> >>>>
> >>>> Bernát GÁBOR
> >>>>
> >>>>
> >>>> On Thu, May 9, 2013 at 2:46 PM, Sean Owen <[email protected]> wrote:
> >>>>
> >>>>> This sounds like overfitting. More features lets you fit your
> training
> >>>>> set better, but at some point, fitting too well means you fit other
> >>>>> test data less well. Lambda resists overfitting, so setting it too
> low
> >>>>> increases the overfitting problem.
> >>>>>
> >>>>> I assume you still get better test set results with a reasonable
> lambda?
> >>>>>
> >>>>> On Thu, May 9, 2013 at 1:38 PM, Gabor Bernat <[email protected]>
> >>>>> wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> So I've been testing out the ALSWR with the Movielens 100k dataset,
> >>>>>> and I've run into some strange stuff. You can see an example of this
> >>>>>> in the attached picture.
> >>>>>>
> >>>>>> So I've used feature counts 1, 2, 4, 8, 16, 32, the same for the
> >>>>>> iteration count, and summed up the results in a table. For a lambda
> >>>>>> higher than 0.07 the more important factor seems to be the iteration
> >>>>>> count, while increasing the feature count may improve the result,
> >>>>>> though not by much. That is what one could expect from the algorithm,
> >>>>>> so that's okay.
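> >>>>>>
> >>>>>> The sweep itself is just nested loops over the parameters, roughly
> >>>>>> (lambdas holds whatever lambda values are being tried):
> >>>>>>
> >>>>>>   for (int features : new int[] {1, 2, 4, 8, 16, 32}) {
> >>>>>>     for (int iterations : new int[] {1, 2, 4, 8, 16, 32}) {
> >>>>>>       for (double lambda : lambdas) {
> >>>>>>         // build an ALSWRFactorizer(dataModel, features, lambda, iterations),
> >>>>>>         // run the evaluator and record the score in the table
> >>>>>>       }
> >>>>>>     }
> >>>>>>   }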
> >>>>>>
> >>>>>> The strange stuff comes with lambdas smaller than 0.075. In this case
> >>>>>> the more important part becomes the feature count, however not more
> >>>>>> but fewer features is better. Similarly for the iteration count.
> >>>>>> Essentially the best score is achieved with a really small lambda and
> >>>>>> a single feature and iteration count. How is this possible, am I
> >>>>>> missing something?
> >>>>>>
> >>>>>>
> >>>>>> Bernát GÁBOR
> >>>>>
> >>>
>
>
