Re: [R] [Rd] Formulas in gam function of mgcv package
> I am trying to understand the relationships between:
> y ~ s(x1) + s(x2) + s(x3) + s(x4)   and   y ~ s(x1, x2, x3, x4)
> Does the latter contain the former? What about the smoothers of all
> interaction terms?

The first says that you want a model

  E(y) = f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4)    (1)

where the f_j are smooth functions. The additive decomposition is quite a strong assumption, since it assumes that the effect of x_j does not depend on x_k unless j = k. The second model is just

  E(y) = f(x_1, x_2, x_3, x_4)    (2)

where f is a smooth function. This looks very general, but actually `s' terms assume isotropic smoothness, which is also quite a strong assumption.

Now if I simply state that f and the f_j are `smooth functions', and leave it at that, then (2) would of course contain (1), but to actually estimate the models I need to state, mathematically, what I mean by `smooth'. Once I've done that, I've pretty much determined the function spaces in which f and the f_j will lie, and in general (2) will no longer strictly contain (1). mgcv's `s' terms use a thin plate spline measure of smoothness for multivariate smooths, and this means that (1) will not be strictly nested within (2), since e.g. a 4D thin plate spline cannot generally represent exactly what the sum of four 1D splines can represent.

If you want to achieve exact nesting then using tensor product smooths, with something like

  y ~ te(x1) + te(x2) + te(x3) + te(x4)    (3)
  y ~ te(x1, x2, x3, x4)                   (4)

will do the trick (because the function space for (4) is built up from the function spaces used in (3)).

As to where all the 2- and 3-way interactions have gone in (4)... it's just like ANOVA: if you put in a 4-way interaction then the lower order interactions are not identifiable, unless you choose to add constraints to make them so. mgcv will allow you to add main effects and interactions, and will handle the constraints automatically, but if this sort of functional ANOVA is a major component of what you want to do, then it is probably worth checking out the gss package and Chong Gu's book on smoothing spline ANOVA.

best,
Simon

--
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603  www.maths.bath.ac.uk/~sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
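[Editor's note: the exact nesting of (3) in (4) that Simon describes can be sketched in R. The simulated data, the two-covariate reduction, and the test function below are invented for illustration; they are not from the thread.]

```r
## Sketch, assuming mgcv is installed: the function space of the additive
## tensor-product model is built from the same marginal bases as the full
## tensor-product smooth, so the first model is strictly nested in the second.
library(mgcv)
set.seed(1)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)

## analogue of (3): additive model from the marginal bases
m3 <- gam(y ~ te(x1) + te(x2))
## analogue of (4): full tensor-product smooth built from those same bases
m4 <- gam(y ~ te(x1, x2))
AIC(m3, m4)
```

Because the marginal bases match, comparing the two fits is a meaningful comparison of nested models, which is not guaranteed for s(x1) + s(x2) versus s(x1, x2).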
Re: [R] [Rd] Formulas in gam function of mgcv package
This will not work...

> 2) y ~ s(x1, ..., x36)

Estimating a 36-dimensional function reasonably well would require a tremendous quantity of data, but in any case the 36-dimensional TPS smoothness measure will involve such high order derivatives that it will no longer be practically useful: in fact you will not have enough data to estimate the unpenalized coefficients of the smoother (and if you did, R would run out of memory first).

In such a high dimensional situation, I think that GAMs are really only useful if you have some prior knowledge of which variables are likely to interact (and it's not too many of them). If there's no prior information saying roughly what sort of smooth additive structure might be useful, then I'm not sure that GAMs are the right way to go, and some sort of machine learning approach might be better.

Then again, the real problem with y ~ s(x1, ..., x36) is that the data just won't contain enough information to estimate s, if all you can say is that s is smooth; but this also means that it's very unlikely that you really need to estimate s(x1, ..., x36) in order to predict well. In that case, starting from y ~ s(x1) + ... + s(x36) and building the model up might result in something that does a reasonable predictive job.

On the subject of tensor product smoothing vs isotropic smoothing: isotropic smooths are really only reasonable if you think that the smooth should display approximately the same amount of wiggliness in all directions. If this is not the case then tensor product smoothing is a better bet. Centering and scaling alone is not enough to ensure that isotropy is reasonable (although in particular cases it may help, of course).

best,
Simon
> [snip: quoted text of Corrado's and Gavin's earlier messages]
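[Editor's note: Simon's point about wiggliness differing between directions can be sketched in R. The data below are simulated for illustration only; the true function is deliberately much wigglier in x1 than in x2.]

```r
## Sketch, assuming mgcv is installed: when the smooth is much wigglier in
## one direction than another, an anisotropic tensor product smooth is a
## better bet than an isotropic thin plate spline, even with scaled inputs.
library(mgcv)
set.seed(2)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
## very wiggly in x1, nearly linear in x2
y  <- sin(8 * pi * x1) + 0.2 * x2 + rnorm(n, sd = 0.2)

iso <- gam(y ~ s(x1, x2, k = 40))        # isotropic thin plate spline
tp  <- gam(y ~ te(x1, x2, k = c(10, 4))) # anisotropic tensor product
AIC(iso, tp)  # which wins depends on the data, but te() is free to use
              # different smoothing in each direction
```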
Re: [R] [Rd] Formulas in gam function of mgcv package
Dear Simon, thanks for your answer. I am running the model with both s and te smoothing, to compare. A few questions on your email:

1) Isotropic smoothness: my variables are centred and scaled. I assumed an isotropic smoother (that is, a smoother that treats all the variables in the same way) was good. What do you think? Is my understanding of isotropic smoothing wrong?

2) s(x1, ..., xn): it does not contain (1), but I thought it was true that it improves on (1) by being free to include some interaction, albeit not explicitly. Is my interpretation wrong?

3) te: I am confused! What does it mean that the function space for (4) is built up from the function spaces used in (3)? Does it mean that te(x1, ..., xn) is an expansion on the te(xi), including all the terms te(x1)*te(x2)*...*te(xj)*...*te(xn) of the different orders? Example: in the case of 4 variables, including te(x1)*te(x2), te(x2)*te(x3), te(x1)*te(x2)*te(x3), up to te(x1)*te(x2)*te(x3)*te(x4).

Sorry for being particularly daft.

Regards

On Wednesday 26 August 2009 09:56:13 you wrote:
> [snip: Simon's earlier reply quoted]

--
Corrado Topi
Global Climate Change Biodiversity Indicators
Area 18, Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk
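[Editor's note: on question 3, Simon's statement that mgcv can fit main effects plus interactions with the constraints handled automatically can be sketched with ti() terms. ti() was added to mgcv after this 2009 thread; the simulated data below are illustrative only.]

```r
## Sketch, assuming a current mgcv: ti() terms give the ANOVA-style
## decomposition Corrado is asking about - marginal main effects plus a
## "pure" interaction with the main effects excluded, so everything is
## identifiable without manual constraints.
library(mgcv)
set.seed(3)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + cos(2 * pi * x2) + x1 * x2 + rnorm(n, sd = 0.2)

## main effects + pure 2-way interaction; mgcv handles identifiability
m_anova <- gam(y ~ ti(x1) + ti(x2) + ti(x1, x2))
summary(m_anova)
```

With more covariates the same pattern extends to higher order ti() interaction terms, which is the smooth analogue of writing out an ANOVA decomposition term by term.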
Re: [R] [Rd] Formulas in gam function of mgcv package
Dear Simon, thanks again. Concerning the whole 36 variables: well, I have run a principal components analysis, and I am only using part of them (I am running a test with the PCs which cover 95% of the variance, and then 99%). :) So I will possibly end up with s(x1, ..., x8).

I wonder if using isotropic smoothers on principal components is a good idea. The variance diminishes from component to component, so theoretically the wiggliness of the smoother should also be less and less. What do you think? Am I saying something stupid?

If that is the case, and if I want to include some interaction, then I have to include the interaction terms manually, like s(x1, x2). Is that right? Sorry for the avalanche of questions, but I am trying to understand the principles underlying the working of gam in mgcv. It looks very powerful, particularly for exploring dependencies.

I have run te() instead of s(), but the predictive power seems to be less than with s() in this particular situation. At the same time, does te() include the interaction? I did not understand well your previous point on the interaction term in te(): is te(x1, ..., xn) built as an expansion from te(x1), ..., te(xn)? Then all the interaction terms should be included. Finally, is it possible to incorporate both s() and te() terms in the formula?

Machine learning: I am not too well versed in the area. Did you mean regression trees or maximum entropy models?

Best,

On Wednesday 26 August 2009 10:27:08 Simon Wood wrote:
> [snip: Simon's and Gavin's earlier messages quoted]
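[Editor's note: on the last question above, mgcv does accept s() and te() terms in the same formula. A minimal sketch with invented variables and simulated data:]

```r
## Sketch, assuming mgcv is installed: a univariate thin plate smooth and a
## tensor product interaction can coexist in one gam() formula.
library(mgcv)
set.seed(4)
n  <- 300
x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)
y  <- sin(2 * pi * x1) + x2 * x3 + rnorm(n, sd = 0.2)

## s() for a main effect, te() for a 2-way interaction, in one formula
m_mix <- gam(y ~ s(x1) + te(x2, x3))
summary(m_mix)
```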
Re: [R] [Rd] Formulas in gam function of mgcv package
On Tue, 2009-08-25 at 10:00 +0100, Corrado wrote:
> Dear Gavin / Rlings, thanks for your kind answer and sorry for posting to
> the dev mailing list. Concerning the specifics of your answer: I am working
> with 6 to 36 covariates, and they are all centred and scaled. I represented
> the problem with two variables to simplify the question. So ideally, the
> situation is: 1) y ~ s(x1) + ... + s(x36) vs. 2) y ~ s(x1, ..., x36)

I think you are pushing things a bit with such a complicated smooth. You're unlikely to be able to fit that, either due to insufficient data and/or hardware limits on your machine. I see that Simon has responded to this as well, in a far more comprehensive and informed manner than I could manage, so I'll leave it at that...

> I am trying to build a predictive model. Since the variables are centred
> and scaled, I think I need an isotropic smooth. I am also interested in
> having the interactions between the variables included, that is not a
> purely additive model.

That sounds a bit like data fishing; throw everything into the pot and see what comes out of it.

[snip]

> I have also some difficulties in understanding the values you have chosen
> for k in the first example (why 60?).

Sorry, that was a complication on my part. The main point was to show that you need to try to get the same bases used in the s(x1) and s(x2) parts of the formula. So if you had this model

  y ~ s(x1, k = 20) + s(x2, k = 20)

you need something like

  y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2)

[If you wanted the bivariate smooth to be more complicated than the default in mgcv, then you might have done:

  y ~ s(x1, x2, k = 60) ## for example

in which case you could fit that model as

  y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60)]

That was where the k = 60 came from, but in simplifying my response I forgot to remove it. Simon has since provided a more thorough response (Thanks Simon).
HTH

G

> [snip: quoted text of Gavin's original reply, given in full below]

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson           [t] +44 (0)20 7679 0522
ECRC, UCL Geography         [f] +44 (0)20 7679 0565
Pearson Building            [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London        [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT                [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
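[Editor's note: Gavin's matched-k recipe above can be sketched in R; the simulated data and test function are invented for illustration.]

```r
## Sketch, assuming mgcv is installed: fit the additive model, then the
## larger model that adds a bivariate smooth while keeping the same k in
## the univariate terms, and compare the two fits.
library(mgcv)
set.seed(5)
n  <- 500
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + exp(-8 * ((x1 - 0.5)^2 + (x2 - 0.5)^2)) +
      rnorm(n, sd = 0.2)

small <- gam(y ~ s(x1, k = 20) + s(x2, k = 20))
big   <- gam(y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60))
anova(small, big, test = "F")  # approximate comparison of the two fits
```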
Re: [R] [Rd] Formulas in gam function of mgcv package
Dear Gavin / Rlings, thanks for your kind answer and sorry for posting to the dev mailing list. Concerning the specifics of your answer: I am working with 6 to 36 covariates, and they are all centred and scaled. I represented the problem with two variables to simplify the question. So ideally, the situation is:

1) y ~ s(x1) + ... + s(x36) vs. 2) y ~ s(x1, ..., x36)

I am trying to build a predictive model. Since the variables are centred and scaled, I think I need an isotropic smooth. I am also interested in having the interactions between the variables included, that is not a purely additive model. It is not clear to me when I should give preference to tensor smooths, possibly because I have not understood well how they work. I am reading Wood (2003) as recommended, and I have also read rather extensively Simon N. Wood, Generalized Additive Models: An Introduction, 2006, but still I am stuck. Any additional suggestion or reading recommendation would be greatly appreciated. I have also some difficulties in understanding the values you have chosen for k in the first example (why 60?).

Thanks

Best,

On Monday 24 August 2009 17:33:55 Gavin Simpson wrote:
> [snip: Gavin's reply, quoted in full below]
Re: [R] [Rd] Formulas in gam function of mgcv package
[Note R-Devel is the wrong list for such questions. R-Help is where this should have been directed - redirected there now]

On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
> Dear R-experts, I have a question on the formulas used in the gam function
> of the mgcv package. I am trying to understand the relationships between:
> y ~ s(x1) + s(x2) + s(x3) + s(x4)   and   y ~ s(x1, x2, x3, x4)
> Does the latter contain the former? What about the smoothers of all
> interaction terms?

I'm not 100% certain how this scales to smooths of more than 2 variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book GAM: An Introduction with R (2006, Chapman Hall/CRC) discuss this for smooths of 2 variables.

Strictly, y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2), as the bases used to produce the smoothers may not be the same in the two models. One option to ensure nestedness is to fit the more complicated model as something like this:

  ## if the simpler model were: y ~ s(x1, k = 20) + s(x2, k = 20)
  y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60)

where the last term has the same k as used in s(x1, x2).

Note that these are isotropic smooths; are x1 and x2 measured in the same units etc.? Tensor product smooths may be more appropriate if not, and if we specify the bases when fitting models, s(x1) + s(x2) *is* strictly nested in te(x1, x2), e.g.

  y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10)

is strictly nested within

  y ~ te(x1, x2, k = 10) ## is the same as y ~ te(x1, x2, bs = "cr", k = 10)

[Note that bs = "cr" is the default basis in te() smooths, hence we don't need to specify it, and k = 10 refers to each individual smooth in the te().]

HTH

G

> I have (tried to) read the manual pages of gam, formula.gam, smooth.terms,
> linear.functional.terms but could not understand properly. Regards
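[Editor's note: the strict nesting Gavin describes with the "cr" marginal bases can be sketched in R; the simulated data are invented for illustration.]

```r
## Sketch, assuming mgcv is installed: with matching cubic regression
## spline marginals, the additive model's function space lies inside the
## tensor product's, so the models are genuinely nested.
library(mgcv)
set.seed(6)
n  <- 400
x1 <- runif(n); x2 <- runif(n)
y  <- sin(2 * pi * x1) + 0.5 * cos(2 * pi * x2) + rnorm(n, sd = 0.2)

m_add <- gam(y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10))
m_te  <- gam(y ~ te(x1, x2, k = 10))  # cr is the default marginal basis
AIC(m_add, m_te)
```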