On Tue, 2013-01-15 at 14:40 +0000, Emmanuel Castella wrote:
> Dear R-users
>
> I am trying to understand the difference between the GCV score
> returned by summary() of a gam object in the mgcv package, and the
> output of a k-fold cross-validation of the same gam model. Thank you
> in advance for any answer or reference to a published document.
>
> Sincerely
> Emmanuel Castella
I don't have all the details to hand, but as no one has replied, publicly at least, I'll have a stab.

GCV is a means of approximating the results of an explicit (leave-one-out) CV. The GCV score reported by mgcv::gam() is the minimum GCV score arrived at during fitting over the candidate values of the smoothness penalty; in other words, it is the quantity that was minimised to arrive at the fitted model, i.e. the criterion by which smoothness selection was performed. If you did a k-fold CV to find the optimal smoothing parameters, the degrees of freedom for the fitted smooth(s) should be similar to those used by the model selected by GCV.

The difference between the two is that GCV approximates what you'd get from the actual CV without having to fit all those models during the CV steps. Not doing all that fitting can be a huge computational saving!

For the details I strongly suggest you read Simon Wood's book:

Wood, S.N. (2006) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC.

Having said all that, Simon and colleagues have shown that GCV can under-smooth data, especially when the objective function is very flat in the region of the optimal model. The conclusion I draw from their papers is that `method = "REML"` or `method = "ML"` usually provides the best-performing smoothness selection. In addition, one can turn on extra penalties (one via `select = TRUE`, the other via an argument to each `s()` term, e.g. a shrinkage basis such as `bs = "ts"`) that will perform model selection (i.e. shrink terms out of the model) for you.

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
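
P.S. For anyone wanting to see the comparison concretely, here is a minimal sketch in R. It is only an illustration under stated assumptions: the data are simulated with mgcv's gamSim(), the model formula is made up for that data, and kfold_mse() is a hypothetical helper written for this example (it is not part of mgcv). For a Gaussian model the GCV score is on the scale of mean squared prediction error, so it can be set beside a k-fold CV estimate of the same quantity; the two would be expected to be close but not identical.

```r
library(mgcv)

set.seed(1)

# Simulated data (an assumption for illustration only)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
form <- y ~ s(x0) + s(x1) + s(x2) + s(x3)

# Smoothness selection by GCV; gcv.ubre holds the minimised GCV score
m_gcv <- gam(form, data = dat, method = "GCV.Cp")
gcv_score <- unname(m_gcv$gcv.ubre)

# Hypothetical helper: explicit k-fold CV estimate of mean squared
# prediction error, refitting the model (smoothing parameters included)
# on each training set
kfold_mse <- function(formula, data, k = 10, ...) {
  resp <- all.vars(formula)[1]                      # name of the response
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  sq_err <- numeric(nrow(data))
  for (i in seq_len(k)) {
    held_out <- fold == i
    fit <- gam(formula, data = data[!held_out, ], ...)
    pred <- predict(fit, newdata = data[held_out, ])
    sq_err[held_out] <- (data[[resp]][held_out] - pred)^2
  }
  mean(sq_err)
}

cv_score <- kfold_mse(form, dat, k = 10, method = "GCV.Cp")

# REML smoothness selection with the extra selection penalty turned on
m_reml <- gam(form, data = dat, method = "REML", select = TRUE)

# Compare the two error estimates and the total effective degrees of freedom
print(c(GCV = gcv_score, kfold_CV = cv_score))
print(c(edf_GCV = sum(m_gcv$edf), edf_REML = sum(m_reml$edf)))
```

Note that the k-fold loop fits the model k times, whereas the GCV score comes out of the single fit, which is the computational saving mentioned above.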