On Tue, 2013-01-15 at 14:40 +0000, Emmanuel Castella wrote:
> Dear R-users
> I am trying to understand the difference between the GCV score
> returned by summary() of a gam object in the mgcv package, and the
> output of a k-fold cross-validation of the same gam model. Thank you
> in advance for any answer or reference to a published document.
> Sincerely
> Emmanuel Castella

I don't have all the details to hand, but as no one has replied,
publicly at least, I'll have a stab. GCV is a means of approximating the
results of an explicit CV. The GCV score reported by mgcv::gam() is the
minimum GCV score found during fitting over the candidate values of the
smoothness penalty; in other words, it is the criterion that was
minimised to select the smoothing parameters of the fitted model.
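
To make that concrete, here is a minimal sketch of where the score
lives in a fitted model. The simulated data (via mgcv's gamSim()), the
choice of a single smooth of x2, and the object names are my own
assumptions for illustration, not anything from the original question.
The by-hand version uses the usual Gaussian form n * D / (n - tr(A))^2,
with the deviance D and the total effective degrees of freedom tr(A).

```r
library(mgcv)

set.seed(1)
# Simulated example data (assumption: any Gaussian response would do)
dat <- gamSim(1, n = 200, dist = "normal", scale = 2, verbose = FALSE)

# Smoothness selection by GCV
m <- gam(y ~ s(x2), data = dat, method = "GCV.Cp")

# The minimised GCV score, as stored in the fit and shown by summary()
gcv_reported <- m$gcv.ubre
summary(m)$sp.criterion

# The same quantity computed by hand from the final fit
n <- nrow(dat)
gcv_by_hand <- n * deviance(m) / (n - sum(m$edf))^2

# For a Gaussian model these two numbers should agree closely
c(reported = as.numeric(gcv_reported), by_hand = gcv_by_hand)
```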

If you did a k-fold CV to find the optimal smoothing parameters, the
degrees of freedom for the fitted smooth(s) should be similar to the
degrees of freedom used by the model when GCV was used.
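
A rough sketch of such a comparison follows. Everything specific in it
is an assumption on my part: the simulated data, the single smooth of
x2, 10 folds, and the grid of candidate smoothing parameters. It fixes
the smoothing parameter via the sp argument of gam(), scores each
candidate by out-of-fold squared error, and then compares the
effective degrees of freedom with those of the GCV fit. Treat it as an
illustration rather than a recommended workflow; the basis is rebuilt
on each training fold, so the candidate values are only approximately
comparable across folds.

```r
library(mgcv)

set.seed(1)
dat <- gamSim(1, n = 200, dist = "normal", scale = 2, verbose = FALSE)

# Reference fit: smoothing parameter chosen by GCV
m_gcv <- gam(y ~ s(x2), data = dat, method = "GCV.Cp")

# Explicit k-fold CV over a grid of fixed smoothing parameters
k <- 10
fold <- sample(rep(seq_len(k), length.out = nrow(dat)))
sp_grid <- 10^seq(-4, 2, length.out = 25)

cv_mse <- sapply(sp_grid, function(sp) {
  sq_err <- numeric(nrow(dat))
  for (i in seq_len(k)) {
    test <- fold == i
    # Fit on the training folds with the smoothing parameter held fixed
    fit <- gam(y ~ s(x2), data = dat[!test, ], sp = sp)
    pred <- as.numeric(predict(fit, newdata = dat[test, ]))
    sq_err[test] <- (dat$y[test] - pred)^2
  }
  mean(sq_err)
})

# Refit on all the data at the CV-optimal smoothing parameter
sp_cv <- sp_grid[which.min(cv_mse)]
m_cv <- gam(y ~ s(x2), data = dat, sp = sp_cv)

# Total effective degrees of freedom under the two selection methods
c(gcv = sum(m_gcv$edf), kfold = sum(m_cv$edf))
```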

The difference between the two is that GCV approximates what you'd get
from an explicit CV without having to refit the model on each of the
CV folds. Not doing all that fitting can be a huge computational
saving!
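
For the Gaussian case the shortcut can be written down directly. The
notation here is my own choice: A is the influence (hat) matrix of the
penalised fit for a given smoothing parameter, \hat{\mu}_i are the
fitted values from the single fit to all n observations, and
\mathcal{V}_o and \mathcal{V}_g are the ordinary (leave-one-out) and
generalized CV scores. Leave-one-out CV needs only that one fit, and
GCV then replaces each A_{ii} by the average tr(A)/n:

```latex
\mathcal{V}_o = \frac{1}{n} \sum_{i=1}^{n}
  \frac{(y_i - \hat{\mu}_i)^2}{(1 - A_{ii})^2},
\qquad
\mathcal{V}_g = \frac{1}{n} \sum_{i=1}^{n}
  \frac{(y_i - \hat{\mu}_i)^2}{\bigl(1 - \mathrm{tr}(A)/n\bigr)^2}
  = \frac{n \sum_{i=1}^{n} (y_i - \hat{\mu}_i)^2}
         {\bigl[n - \mathrm{tr}(A)\bigr]^2}.
```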

For the details I strongly suggest you read Simon Wood's book

  Wood, S.N. (2006) Generalized Additive Models: An
  Introduction with R. Chapman and Hall/CRC.

And having said all that, Simon and colleagues have shown that GCV can
under-smooth data, especially when the objective function is very flat
in the region of the optimal model. The conclusion that I draw from
their papers is that using `method = "REML"` or `method = "ML"`
usually provides the best-performing smoothness selection. In addition,
one can turn on extra penalties (one via `select = TRUE`, the other via
the `bs` argument to each `s()` term, e.g. `bs = "ts"` or `bs = "cs"`)
that will perform model selection (i.e. shrink terms out of the model)
for you.
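
As a sketch of what those options look like in a call, under the
assumption of simulated data from gamSim() (where x3 has no true
effect) and a four-term additive model of my own choosing:

```r
library(mgcv)

set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)

# REML smoothness selection
m_reml <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3),
              data = dat, method = "REML")

# REML plus the extra penalty on each term's null space
m_select <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3),
                data = dat, method = "REML", select = TRUE)

# REML with shrinkage thin plate regression splines, set per term
m_shrink <- gam(y ~ s(x0, bs = "ts") + s(x1, bs = "ts") +
                  s(x2, bs = "ts") + s(x3, bs = "ts"),
                data = dat, method = "REML")

# Effective degrees of freedom per smooth; a term that has been
# shrunk out of the model shows an EDF close to zero
edf_by_term <- rbind(
  reml      = summary(m_reml)$edf,
  select    = summary(m_select)$edf,
  shrinkage = summary(m_shrink)$edf
)
colnames(edf_by_term) <- c("s(x0)", "s(x1)", "s(x2)", "s(x3)")
edf_by_term
```

With plain REML a term can be penalised down to a straight line but
not removed entirely; the two extra-penalty variants allow it to be
shrunk all the way to zero.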

HTH

G 

> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

