Gavin,
Thank you for your reply, I appreciate it!
After consulting the suggested paper, I tried your suggestion of
setting "select = T", which raises another question:
If the p-value is "NA", does this mean that the smoothing term is dropped
(or shrunk to zero)? Regardless of its high edf, is this predictor
(e.g. s(x1)) not relevant for explaining y?
E.g.:
       edf        Ref.df     F      p-value
s(x1)  7.521e-09  1.402e-08  0.000  NA
s(x2)  5.408e+00  6.448e+00  3.049  0.00462 **
s(x3)  6.287e-09  1.217e-08  0.000  NA
s(x4)  2.152e+00  2.754e+00  5.037  0.00248 **
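
A note on reading the table above: 7.521e-09 is scientific notation for roughly 0.0000000075, so the edf of s(x1) and s(x3) are effectively zero rather than large. With the extra selection penalty, an edf this close to zero suggests the term has been penalized out of the model, and the NA p-value most likely reflects that nothing is left to test. A minimal sketch for flagging such terms, using the values from the table; the cutoff `tol` is an arbitrary, hypothetical choice and not an mgcv default:

```r
# edf values copied from the summary table above
tab <- data.frame(
  term = c("s(x1)", "s(x2)", "s(x3)", "s(x4)"),
  edf  = c(7.521e-09, 5.408e+00, 6.287e-09, 2.152e+00)
)

# Hypothetical cutoff below which a term is treated as shrunk out of the model
tol <- 1e-3
tab$shrunk_out <- tab$edf < tol

print(tab)
```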
Best
Marco
On 27.09.2011 at 11:40, Gavin Simpson wrote:
On Tue, 2011-09-27 at 08:54 +0200, Marco Helbich wrote:
Dear list,
I am studying the influence of several environmental factors (numeric &
dummies) on species densities (= numeric) using the gam()
function with a Gaussian family in the mgcv package. As stated in
Wood (2006), there is no variable selection algorithm.
Is it an appropriate (iterative) approach to drop the least significant
predictor (e.g. p > 0.05), refit the model, compare the GCV/AIC
scores, and so forth? Should I first focus on the smoothing functions or
the fixed effects? Or is such a distinction not important at all?
Perhaps someone has more experience with GAMs and can give me a helping
hand? Thanks in advance!
You could do that, but I would be sceptical of the results.
Marra and Wood (2011, Computational Statistics and Data Analysis 55;
2372-2387) compare various approaches for feature selection in GAMs.
IIRC, they concluded that an additional penalty term in the smoothness
selection procedure gave the best results. This can be activated in
mgcv::gam() by using the `select = TRUE` argument/setting.
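
A minimal sketch of what that could look like, on simulated data rather than the species data from the question. The data frame `dat`, the variables x1 to x4, the assumed "truth" (only x2 and x4 matter) and the choice of method = "REML" are all assumptions made for illustration:

```r
library(mgcv)

set.seed(1)
n <- 300
dat <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))

# Hypothetical truth: only x2 and x4 affect y; x1 and x3 are pure noise
dat$y <- sin(2 * pi * dat$x2) + 2 * dat$x4 + rnorm(n, sd = 0.3)

# select = TRUE adds an extra penalty to each smooth so that a term
# can be shrunk all the way to zero during smoothness selection
m <- gam(y ~ s(x1) + s(x2) + s(x3) + s(x4),
         data = dat, family = gaussian(), method = "REML", select = TRUE)

# Table of edf, Ref.df, F and p-value for each smooth term
tab <- summary(m)$s.table
print(tab)
```

Terms whose edf end up very close to zero in this table are the ones the extra penalty has effectively removed from the model.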
HTH
G
Best
Marco
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology