Re: [R] [Rd] Formulas in gam function of mgcv package
> I am trying to understand the relationships between:
> y ~ s(x1) + s(x2) + s(x3) + s(x4)   and   y ~ s(x1, x2, x3, x4)
> Does the latter contain the former? What about the smoothers of all
> interaction terms?

The first says that you want a model

  E(y) = f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4)    (1)

where the f_j are smooth functions. The additive decomposition is quite a strong assumption, since it assumes that the effect of x_j does not depend on x_k unless j = k. The second model is just

  E(y) = f(x_1, x_2, x_3, x_4)    (2)

where f is a smooth function. This looks very general, but actually `s' terms assume isotropic smoothness, which is also quite a strong assumption.

Now if I simply state that f and the f_j are `smooth functions', and leave it at that, then (2) would of course contain (1), but to actually estimate the models I need to state, mathematically, what I mean by `smooth'. Once I've done that, I've pretty much determined the function spaces in which f and the f_j will lie, and in general (2) will no longer strictly contain (1). mgcv's `s' terms use a thin plate spline measure of smoothness for multivariate smooths, and this means that (1) will not be strictly nested within (2), since e.g. a 4D thin plate spline cannot generally represent exactly what the sum of four 1D splines can represent.

If you want to achieve exact nesting then using tensor product smooths, with something like

  y ~ te(x1) + te(x2) + te(x3) + te(x4)    (3)
  y ~ te(x1, x2, x3, x4)                   (4)

will do the trick (because the function space for (4) is built up from the function spaces used in (3)).

As to where all the 2- and 3-way interactions have gone in (4)... it's just like ANOVA: if you put in a 4-way interaction then the lower order interactions are not identifiable, unless you choose to add constraints to make them so. mgcv will allow you to add main effects and interactions, and will handle the constraints automatically, but if this sort of functional ANOVA is a major component of what you want to do, then it is probably worth checking out the gss package and Chong Gu's book on smoothing spline ANOVA.

best,
Simon

--
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603  www.maths.bath.ac.uk/~sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
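[Editor's note: the exact nesting of (3) in (4) that Simon describes can be sketched in R. The simulated data, the two-covariate reduction, and the test function below are invented for illustration; they are not from the thread.]

```r
## Sketch, assuming mgcv is installed: the function space of the additive
## tensor-product model is built from the same marginal bases as the full
## tensor-product smooth, so the first model is strictly nested in the second.
library(mgcv)
set.seed(1)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)

## analogue of (3): additive model from the marginal bases
m3 <- gam(y ~ te(x1) + te(x2))
## analogue of (4): full tensor-product smooth built from those same bases
m4 <- gam(y ~ te(x1, x2))
AIC(m3, m4)
```

Because the marginal bases match, comparing the two fits is a meaningful comparison of nested models, which is not guaranteed for s(x1) + s(x2) versus s(x1, x2).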
Re: [R] [Rd] Formulas in gam function of mgcv package
This will not work...

> 2) y ~ s(x1, ..., x36)

Estimating a 36-dimensional function reasonably well would require a tremendous quantity of data, but in any case the 36-dimensional TPS smoothness measure will involve such high order derivatives that it will no longer be practically useful: in fact you will not have enough data to estimate the unpenalized coefficients of the smoother (and if you did, R would run out of memory first).

In such a high dimensional situation, I think that GAMs are really only useful if you have some prior knowledge of which variables are likely to interact (and it's not too many of them). If there's no prior information saying roughly what sort of smooth additive structure might be useful, then I'm not sure that GAMs are the right way to go, and some sort of machine learning approach might be better.

Then again, the real problem with y ~ s(x1, ..., x36) is that the data just won't contain enough information to estimate s, if all you can say is that s is smooth; but this also means that it's very unlikely that you really need to estimate s(x1, ..., x36) in order to predict well. In that case, starting from y ~ s(x1) + ... + s(x36) and building the model up might result in something that does a reasonable predictive job.

On the subject of tensor product smoothing vs isotropic smoothing: isotropic smooths are really only reasonable if you think that the smooth should display approximately the same amount of wiggliness in all directions. If this is not the case then tensor product smoothing is a better bet. Centering and scaling alone is not enough to ensure that isotropy is reasonable (although in particular cases it may help, of course).

best,
Simon
> [snip: quoted text of Corrado's and Gavin's earlier messages]
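[Editor's note: Simon's point about wiggliness differing between directions can be sketched in R. The data below are simulated for illustration only; the true function is deliberately much wigglier in x1 than in x2.]

```r
## Sketch, assuming mgcv is installed: when the smooth is much wigglier in
## one direction than another, an anisotropic tensor product smooth is a
## better bet than an isotropic thin plate spline, even with scaled inputs.
library(mgcv)
set.seed(2)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
## very wiggly in x1, nearly linear in x2
y  <- sin(8 * pi * x1) + 0.2 * x2 + rnorm(n, sd = 0.2)

iso <- gam(y ~ s(x1, x2, k = 40))        # isotropic thin plate spline
tp  <- gam(y ~ te(x1, x2, k = c(10, 4))) # anisotropic tensor product
AIC(iso, tp)  # which wins depends on the data, but te() is free to use
              # different smoothing in each direction
```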
Re: [R] [Rd] Formulas in gam function of mgcv package
Dear Simon, thanks for your answer. I am running the model with both s and te smoothing, to compare. A few questions on your email:

1) Isotropic smoothness: my variables are centred and scaled. I assumed an isotropic smoother (that is, a smoother that treats all the variables in the same way) was good. What do you think? Is my understanding of isotropic smoothing wrong?

2) s(x1, ..., xn): it does not contain (1), but I thought it was true that it improves on (1) by being free to include some interaction, albeit not explicitly. Is my interpretation wrong?

3) te: I am confused! What does it mean that the function space for (4) is built up from the function spaces used in (3)? Does it mean that te(x1, ..., xn) is an expansion on the te(xi), including all the terms te(x1)*te(x2)*...*te(xj)*...*te(xn) of the different orders? Example: in the case of 4 variables, including te(x1)*te(x2), te(x2)*te(x3), te(x1)*te(x2)*te(x3), up to te(x1)*te(x2)*te(x3)*te(x4).

Sorry for being particularly daft.

Regards

On Wednesday 26 August 2009 09:56:13 you wrote:
> [snip: Simon's earlier reply quoted]

--
Corrado Topi
Global Climate Change Biodiversity Indicators
Area 18, Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk
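[Editor's note: on question 3, Simon's statement that mgcv can fit main effects plus interactions with the constraints handled automatically can be sketched with ti() terms. ti() was added to mgcv after this 2009 thread; the simulated data below are illustrative only.]

```r
## Sketch, assuming a current mgcv: ti() terms give the ANOVA-style
## decomposition Corrado is asking about - marginal main effects plus a
## "pure" interaction with the main effects excluded, so everything is
## identifiable without manual constraints.
library(mgcv)
set.seed(3)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + cos(2 * pi * x2) + x1 * x2 + rnorm(n, sd = 0.2)

## main effects + pure 2-way interaction; mgcv handles identifiability
m_anova <- gam(y ~ ti(x1) + ti(x2) + ti(x1, x2))
summary(m_anova)
```

With more covariates the same pattern extends to higher order ti() interaction terms, which is the smooth analogue of writing out an ANOVA decomposition term by term.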
Re: [R] [Rd] Formulas in gam function of mgcv package
Dear Simon, thanks again. Concerning the whole 36 variables: well, I have run a principal components analysis, and I am only using part of them (I am running a test with the PCs which cover 95% of the variance, and then 99%). :) So I will possibly end up with s(x1, ..., x8).

I wonder if using isotropic smoothers on principal components is a good idea. The variance diminishes from component to component, so theoretically the wiggliness of the smoother should also be less and less. What do you think? Am I saying something stupid?

If that is the case, and if I want to include some interaction, then I have to include the interaction terms manually, like s(x1, x2). Is that right? Sorry for the avalanche of questions, but I am trying to understand the principles underlying the working of gam in mgcv. It looks very powerful, particularly for exploring dependencies.

I have run te() instead of s(), but the predictive power seems to be less than with s() in this particular situation. At the same time, does te() include the interaction? I did not understand well your previous point on the interaction term in te(): is te(x1, ..., xn) built as an expansion from te(x1), ..., te(xn)? Then all the interaction terms should be included. Finally, is it possible to incorporate both s() and te() terms in the formula?

Machine learning: I am not too well versed in the area. Did you mean regression trees or maximum entropy models?

Best,

On Wednesday 26 August 2009 10:27:08 Simon Wood wrote:
> [snip: Simon's and Gavin's earlier messages quoted]
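[Editor's note: on the last question above, mgcv does accept s() and te() terms in the same formula. A minimal sketch with invented variables and simulated data:]

```r
## Sketch, assuming mgcv is installed: a univariate thin plate smooth and a
## tensor product interaction can coexist in one gam() formula.
library(mgcv)
set.seed(4)
n  <- 300
x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)
y  <- sin(2 * pi * x1) + x2 * x3 + rnorm(n, sd = 0.2)

## s() for a main effect, te() for a 2-way interaction, in one formula
m_mix <- gam(y ~ s(x1) + te(x2, x3))
summary(m_mix)
```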
Re: [R] [Rd] Formulas in gam function of mgcv package
On Tue, 2009-08-25 at 10:00 +0100, Corrado wrote:
> Dear Gavin / Rlings, thanks for your kind answer and sorry for posting to
> the dev mailing list. Concerning the specifics of your answer: I am working
> with 6 to 36 covariates, and they are all centred and scaled. I represented
> the problem with two variables to simplify the question. So ideally, the
> situation is: 1) y ~ s(x1) + ... + s(x36) vs. 2) y ~ s(x1, ..., x36)

I think you are pushing things a bit with such a complicated smooth. You're unlikely to be able to fit that, either due to insufficient data and/or hardware limits on your machine. I see that Simon has responded to this as well, in a far more comprehensive and informed manner than I could manage, so I'll leave it at that...

> I am trying to build a predictive model. Since the variables are centred
> and scaled, I think I need an isotropic smooth. I am also interested in
> having the interactions between the variables included, that is not a
> purely additive model.

That sounds a bit like data fishing; throw everything into the pot and see what comes out of it.

[snip]

> I have also some difficulties in understanding the values you have chosen
> for k in the first example (why 60?).

Sorry, that was a complication on my part. The main point was to show that you need to try to get the same bases used in the s(x1) and s(x2) parts of the formula. So if you had this model

  y ~ s(x1, k = 20) + s(x2, k = 20)

you need something like

  y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2)

[If you wanted the bivariate smooth to be more complicated than the default in mgcv, then you might have done:

  y ~ s(x1, x2, k = 60) ## for example

in which case you could fit that model as

  y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60)]

That was where the k = 60 came from, but in simplifying my response I forgot to remove it. Simon has since provided a more thorough response (Thanks Simon).
HTH

G

> [snip: quoted text of Gavin's original reply, given in full below]

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson           [t] +44 (0)20 7679 0522
ECRC, UCL Geography         [f] +44 (0)20 7679 0565
Pearson Building            [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London        [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT                [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
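[Editor's note: Gavin's matched-k recipe above can be sketched in R; the simulated data and test function are invented for illustration.]

```r
## Sketch, assuming mgcv is installed: fit the additive model, then the
## larger model that adds a bivariate smooth while keeping the same k in
## the univariate terms, and compare the two fits.
library(mgcv)
set.seed(5)
n  <- 500
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + exp(-8 * ((x1 - 0.5)^2 + (x2 - 0.5)^2)) +
      rnorm(n, sd = 0.2)

small <- gam(y ~ s(x1, k = 20) + s(x2, k = 20))
big   <- gam(y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60))
anova(small, big, test = "F")  # approximate comparison of the two fits
```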
Re: [R] [Rd] Formulas in gam function of mgcv package
Dear Gavin / Rlings, thanks for your kind answer and sorry for posting to the dev mailing list. Concerning the specifics of your answer: I am working with 6 to 36 covariates, and they are all centred and scaled. I represented the problem with two variables to simplify the question. So ideally, the situation is:

1) y ~ s(x1) + ... + s(x36) vs. 2) y ~ s(x1, ..., x36)

I am trying to build a predictive model. Since the variables are centred and scaled, I think I need an isotropic smooth. I am also interested in having the interactions between the variables included, that is not a purely additive model. It is not clear to me when I should give preference to tensor smooths, possibly because I have not understood well how they work. I am reading Wood (2003) as recommended, and I have also read rather extensively Simon N. Wood, Generalized Additive Models: An Introduction, 2006, but still I am stuck. Any additional suggestion or reading recommendation would be greatly appreciated. I have also some difficulties in understanding the values you have chosen for k in the first example (why 60?).

Thanks

Best,

On Monday 24 August 2009 17:33:55 Gavin Simpson wrote:
> [snip: Gavin's reply, quoted in full below]
Re: [R] [Rd] Formulas in gam function of mgcv package
[Note R-Devel is the wrong list for such questions. R-Help is where this should have been directed - redirected there now]

On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
> Dear R-experts, I have a question on the formulas used in the gam function
> of the mgcv package. I am trying to understand the relationships between:
> y ~ s(x1) + s(x2) + s(x3) + s(x4)   and   y ~ s(x1, x2, x3, x4)
> Does the latter contain the former? What about the smoothers of all
> interaction terms?

I'm not 100% certain how this scales to smooths of more than 2 variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book GAM: An Introduction with R (2006, Chapman Hall/CRC) discuss this for smooths of 2 variables.

Strictly, y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2), as the bases used to produce the smoothers may not be the same in the two models. One option to ensure nestedness is to fit the more complicated model as something like this:

  ## if the simpler model were: y ~ s(x1, k = 20) + s(x2, k = 20)
  y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60)

where the last term has the same k as used in s(x1, x2).

Note that these are isotropic smooths; are x1 and x2 measured in the same units etc.? Tensor product smooths may be more appropriate if not, and if we specify the bases when fitting models, s(x1) + s(x2) *is* strictly nested in te(x1, x2), e.g.

  y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10)

is strictly nested within

  y ~ te(x1, x2, k = 10) ## is the same as y ~ te(x1, x2, bs = "cr", k = 10)

[Note that bs = "cr" is the default basis in te() smooths, hence we don't need to specify it, and k = 10 refers to each individual smooth in the te().]

HTH

G

> I have (tried to) read the manual pages of gam, formula.gam, smooth.terms,
> linear.functional.terms but could not understand properly. Regards
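[Editor's note: the strict nesting Gavin describes with the "cr" marginal bases can be sketched in R; the simulated data are invented for illustration.]

```r
## Sketch, assuming mgcv is installed: with matching cubic regression
## spline marginals, the additive model's function space lies inside the
## tensor product's, so the models are genuinely nested.
library(mgcv)
set.seed(6)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + 0.5 * cos(2 * pi * x2) + rnorm(n, sd = 0.2)

m_add <- gam(y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10))
m_te  <- gam(y ~ te(x1, x2, k = 10))  # cr is the default marginal basis
AIC(m_add, m_te)
```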