Re: [R] AIC and goodness of prediction - was: Re: goodness of prediction using a model (lm, glm, gam, brt,

2009-09-11 Thread Kingsford Jones
Hi Corrado,

Not being familiar with your research goals or data I can't make
recommendations, but I can suggest a couple of places to look for
information:  Frank Harrell's Regression Modeling Strategies and his
Design library available on CRAN, and Hastie et al.'s The Elements of
Statistical Learning.

A couple more comments below...

On Thu, Sep 10, 2009 at 11:48 AM, Corrado ct...@york.ac.uk wrote:
 Dear Kingsford,

 I apologise for breaking the thread, but I thought there were some more people
 who would be interested.

 What you propose is what I am using at the moment: the sum of the squares of
 the residuals, plus the variance/stdev. I am not really satisfied. I have also
 tried using R2, and it works well, but some people go a bit wild-eyed when
 they see a negative R2 (which is perfectly reasonable when you use R2 as a
 measure of goodness of fit for prediction on a dataset different from the
 training set).

To get negative values I'm guessing you're using
1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)?  If so, a negative value
indicates the model is a worse predictor than using a constant.  Also
note the formula is just a linear transform of the one mentioned in my
last email.
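
For illustration (a toy example, not your data), here is that computation on a
held-out test set; in this extrapolation setup it comes out negative because
the line fit to the training data predicts worse than the test-set mean:

## Toy sketch (hypothetical data): out-of-sample R2 computed as above
set.seed(1)
train <- data.frame(T = runif(50, 0, 30))
train$Y <- 2 + 0.5 * train$T + rnorm(50)
test  <- data.frame(T = runif(20, 40, 60))   # extrapolation: outside the training range
test$Y <- 10 + 0.1 * test$T + rnorm(20)      # the relationship differs out here

fit  <- lm(Y ~ T, data = train)
pred <- predict(fit, newdata = test)

1 - sum((test$Y - pred)^2) / sum((test$Y - mean(test$Y))^2)
## strongly negative: the extrapolated line is a worse predictor than mean(test$Y)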


 I was then wondering whether it would make sense to use AIC: the K in the
 formula would still be the number of parameters of the trained model, the sum
 of squared residuals would be the (predicted - observed)^2 over the test set,
 and N would be the number of samples in the test dataset. I think it should
 work well.
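
If I follow, the formula you have in mind is presumably the least-squares form
of AIC, N * log(RSS / N) + 2 * K, which assumes Gaussian errors.  A purely
hypothetical sketch of that calculation on a test set:

## Hypothetical sketch of the proposed quantity (least-squares AIC on test data)
set.seed(2)
train <- data.frame(T = rnorm(40)); train$Y <- 1 + 2 * train$T + rnorm(40)
test  <- data.frame(T = rnorm(25)); test$Y  <- 1 + 2 * test$T  + rnorm(25)

fit <- lm(Y ~ T, data = train)
rss <- sum((predict(fit, newdata = test) - test$Y)^2)  # sum of (predicted - observed)^2
K   <- length(coef(fit)) + 1                           # coefficients plus the error variance
N   <- nrow(test)                                      # number of test samples
N * log(rss / N) + 2 * K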


Generally, when assessing predictive ability one is not concerned with
the number of parameters -- just with how good the predictions are on data
that is independent of the model selection and fitting process.  Also,
the general definition of AIC uses likelihoods, not SS residuals, and
using the SS resids you are once again back to a linear
transformation of the MSE estimate...
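
To make that concrete (again with toy data): the test-set sum of squared
residuals is just n times the test MSE, so the two rank candidate models
identically; and R's AIC() is computed from the log-likelihood of the fit,
not from out-of-sample residuals:

## Toy check: SS of test residuals = n * test MSE; AIC comes from the likelihood
set.seed(3)
dat   <- data.frame(T = runif(60))
dat$Y <- 1 + 3 * dat$T + rnorm(60)
train <- dat[1:40, ]
test  <- dat[41:60, ]

fit  <- lm(Y ~ T, data = train)
pred <- predict(fit, newdata = test)

ss_test  <- sum((test$Y - pred)^2)           # sum of squared prediction errors
mse_test <- mean((test$Y - pred)^2)          # test MSE
all.equal(ss_test, nrow(test) * mse_test)    # TRUE: one is a linear transform of the other

AIC(fit)                                                    # likelihood-based
-2 * as.numeric(logLik(fit)) + 2 * (length(coef(fit)) + 1)  # same value; +1 df for sigma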


Kingsford




 What do you / other R list members think?

 Regards

 On Thursday 03 September 2009 15:06:14 Kingsford Jones wrote:
 There are many ways to measure prediction quality, and what you choose
 depends on the data and your goals.  A common measure for a
 quantitative response is mean squared error (i.e. 1/n * sum((observed
 - predicted)^2)) which incorporates bias and variance.  Common terms
 for what you are looking for are "test error" and "generalization
 error".


 hth,
 Kingsford

 On Wed, Sep 2, 2009 at 11:56 PM, Corrado ct...@york.ac.uk wrote:
  Dear R-friends,
 
  How do you test the goodness of prediction of a model when you predict
  on a set of data DIFFERENT from the training set?

  Let me explain: you train your model M (e.g. glm, gam, regression tree,
  brt) on a set of data A with a response variable Y. You then predict the
  value of that same response variable Y on a different set of data B (e.g.
  predict.glm, predict.gam and so on). Dataset A and dataset B are
  different in the sense that they contain the same variable, for example
  temperature, measured at different sites or on a different interval
  (e.g. B is a subinterval of A for interpolation, or a different interval
  for extrapolation). If you have the measured values of Y on the new
  interval, i.e. B, how do you measure how good the prediction is, that is,
  how well the model fits Y on B (that is, how well does it predict)?
 
  In other words:
 
  Y~T,data=A for training
  Y~T,data=B for predicting
 
  I have devised a couple of methods based around 1) the standard deviation
  and 2) R^2, but I am unhappy with them.
 
  Regards
  --
  Corrado Topi
 
  Global Climate Change & Biodiversity Indicators
  Area 18, Department of Biology
  University of York, York, YO10 5YW, UK
  Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk
 



 --
 Corrado Topi

 Global Climate Change & Biodiversity Indicators
 Area 18, Department of Biology
 University of York, York, YO10 5YW, UK
 Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

