(The MAE metric may also be a complicating issue... it measures
average error where all elements are equally weighted, but as the "WR"
in ALS-WR suggests, the loss function being minimized weights
different elements differently.)
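
For reference, here is a sketch of the weighted-lambda-regularization
objective from the ALS-WR paper (Zhou et al., 2008) next to plain MAE; the
symbols are the paper's, not anything specific to the implementation under
test:

  \min_{U,M} \sum_{(i,j) \in I} (r_{ij} - u_i^T m_j)^2
      + \lambda \Big( \sum_i n_{u_i} \|u_i\|^2 + \sum_j n_{m_j} \|m_j\|^2 \Big)

where n_{u_i} and n_{m_j} are the numbers of ratings by user i and on item
j. MAE = (1/|T|) \sum_{(i,j) \in T} |r_{ij} - \hat{r}_{ij}|, by contrast,
weights every held-out rating in T equally.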

This is based on a test set, right, separate from the training set?
If you are able, measure the MAE on your training set too. If
overfitting is the issue, you should see low error on the training
set, and higher error on the test set, when f is high and lambda is
low.
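
If it helps, here is a rough sketch of how you could get both numbers. I'm
assuming you are using the Taste ALSWRFactorizer / SVDRecommender /
AverageAbsoluteDifferenceRecommenderEvaluator classes (which the "ALSWR"
name suggests); the file path and the f / lambda / iteration values are
only examples, and the "training MAE" below is computed against a model
refit on the full data set rather than on the evaluator's internal 75%
split, which is close enough for diagnosing overfitting:

import java.io.File;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class TrainVsTestMAE {

  public static void main(String[] args) throws Exception {
    // ml-100k ratings (tab-delimited user/item/rating); path is just an example
    DataModel model = new FileDataModel(new File("u.data"));

    final int numFeatures = 32;   // "f" from the thread
    final double lambda = 0.05;   // below the ~0.075 threshold mentioned
    final int numIterations = 16;

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel dataModel) throws TasteException {
        return new SVDRecommender(dataModel,
            new ALSWRFactorizer(dataModel, numFeatures, lambda, numIterations));
      }
    };

    // Held-out MAE: 75% train / 25% test, as in the original experiment
    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    double testMAE = evaluator.evaluate(builder, null, model, 0.75, 1.0);

    // Training MAE: factor the full data set and score the same known ratings
    Recommender recommender = builder.buildRecommender(model);
    double totalAbsErr = 0.0;
    int count = 0;
    LongPrimitiveIterator userIDs = model.getUserIDs();
    while (userIDs.hasNext()) {
      long userID = userIDs.nextLong();
      for (Preference pref : model.getPreferencesFromUser(userID)) {
        float estimate = recommender.estimatePreference(userID, pref.getItemID());
        if (!Float.isNaN(estimate)) {
          totalAbsErr += Math.abs(estimate - pref.getValue());
          count++;
        }
      }
    }
    double trainMAE = totalAbsErr / count;

    System.out.println("test MAE  = " + testMAE);
    System.out.println("train MAE = " + trainMAE);
  }
}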

On Thu, May 9, 2013 at 1:49 PM, Gabor Bernat <[email protected]> wrote:
> Hello,
>
> Here it is: http://i.imgur.com/3e1eTE5.png
> I've used 75% for training and 25% for evaluation.
>
> Well, a reasonable lambda gives close enough results, though not better ones.
>
> Thanks,
>
>
> Bernát GÁBOR
>
>
> On Thu, May 9, 2013 at 2:46 PM, Sean Owen <[email protected]> wrote:
>
>> This sounds like overfitting. More features let you fit your training
>> set better, but at some point, fitting too well means you fit other
>> test data less well. Lambda resists overfitting, so setting it too low
>> increases the overfitting problem.
>>
>> I assume you still get better test set results with a reasonable lambda?
>>
>> On Thu, May 9, 2013 at 1:38 PM, Gabor Bernat <[email protected]>
>> wrote:
>> > Hello,
>> >
>> > So I've been testing out ALS-WR with the MovieLens 100k dataset, and
>> > I've run into some strange stuff. An example of this you can see in the
>> > attached picture.
>> >
>> > So I've used feature counts 1, 2, 4, 8, 16, 32, the same set for the
>> > iteration count, and summed up the results in a table. For a lambda
>> > higher than 0.07 the more important factor seems to be the iteration
>> > count, while increasing the feature count may improve the result, though
>> > not by much. And this is what one could expect from the algorithm, so
>> > that's okay.
>> >
>> > The strange stuff comes for lambdas smaller than 0.075. In this case the
>> > more important part becomes the feature count, however not more but less
>> > is better. Similarly for the iteration count. Essentially the best score
>> > is achieved with a really small lambda, a single feature, and a single
>> > iteration. How is this possible, am I missing something?
>> >
>> >
>> > Bernát GÁBOR
>>
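
For completeness, the sweep described in the quoted message could be driven
with something like the following, dropped into the same main method as the
sketch above (same API assumptions, same "model" DataModel; the lambda
values are just examples bracketing the 0.07-0.075 threshold mentioned):

    int[] featureCounts = {1, 2, 4, 8, 16, 32};
    int[] iterationCounts = {1, 2, 4, 8, 16, 32};
    double[] lambdas = {0.01, 0.05, 0.075, 0.1};

    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    for (final int f : featureCounts) {
      for (final int iters : iterationCounts) {
        for (final double lambda : lambdas) {
          // One evaluate() call per cell of the table: 75% train / 25% test
          double mae = evaluator.evaluate(new RecommenderBuilder() {
            @Override
            public Recommender buildRecommender(DataModel dataModel) throws TasteException {
              return new SVDRecommender(dataModel,
                  new ALSWRFactorizer(dataModel, f, lambda, iters));
            }
          }, null, model, 0.75, 1.0);
          System.out.printf("f=%d iters=%d lambda=%.3f MAE=%.4f%n", f, iters, lambda, mae);
        }
      }
    }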
