Re: [R-sig-eco] forward selection RDA after controlling for constraints

2013-07-11 Thread Ingolf Kuehn
Dear Stephen,

I assume that your approach will account for spatial structure
(large-scale spatial trend), i.e. it removes spatial structure prior to
the analysis and hence also removes the effect of spatially structured
but potentially ecologically important variables. This approach,
however, does not necessarily remove spatial autocorrelation (which can
be thought of as a kind of distance decay). Don't worry, the two are
often confused.

Unfortunately, I do not know of any straightforward way to control for
spatial autocorrelation in multivariate analyses (unless autoregressive
models or GEE were implemented for RDA/CCA rather than only for glm),
but I think the most promising workaround is some sort of spatial
filtering (see e.g. Diniz-Filho et al. 2003, Dray et al. 2006), as
implemented e.g. in the function ME {spdep} by Pedro Peres-Neto. Bini et
al. (2009) showed that spatial filtering on the residuals comes close to
what ME produces (though Jari may warn against a regression/RDA on
residuals). Filtering accounts for large-scale spatial structures but
also for intermediate and especially small-scale structures, and by
doing so it is efficient in accounting for spatial autocorrelation
(though it may introduce some overfitting).

In your context this could mean that you first run your RDA with the
environmental predictors you are interested in, then use the residuals
as the response in an analysis with spatial filters/PCNM as predictors,
and select those filters that significantly explain residual variation.
Lastly, add the selected filters to the set of environmental predictors.
You might then need to reduce the environmental predictors again. To my
knowledge, variable selection in such cases is not yet satisfactorily
solved.
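
The workflow described above could be sketched in R roughly as follows. This is only an illustration under assumptions that are not part of the original message: it uses the mite data shipped with vegan as a stand-in for the macroinvertebrate data, picks two environmental predictors arbitrarily, builds the spatial filters with vegan's pcnm(), and uses ordiR2step() as one possible selection procedure. The caveat about analysing residuals mentioned above still applies.

```r
library(vegan)

# Stand-in data shipped with vegan (assumption: not the poster's data)
data(mite, mite.env, mite.xy)
spe <- decostand(mite, method = "hellinger")

# Step 1: RDA with the environmental predictors of interest
env_vars <- c("SubsDens", "WatrCont")      # arbitrary illustrative choice
m_env <- rda(spe ~ SubsDens + WatrCont, data = mite.env)

# Step 2: the residuals of that RDA become the new response
res <- residuals(m_env)

# Step 3: PCNM variables (spatial filters) from the site coordinates
pcs <- as.data.frame(pcnm(dist(mite.xy))$vectors)

# Step 4: forward-select the filters that explain residual variation
set.seed(42)
m0    <- rda(res ~ 1, data = pcs)
m_all <- rda(res ~ ., data = pcs)
sel   <- ordiR2step(m0, scope = formula(m_all), trace = FALSE)
picked <- attr(terms(sel), "term.labels")

# Step 5: add the selected filters to the environmental predictors
m_final <- rda(reformulate(c(env_vars, picked), response = "spe"),
               data = cbind(mite.env, pcs))
```

After this, the environmental predictors in m_final may need to be reduced again, as noted above.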

However, before accounting for spatial autocorrelation (SAC): are you
sure that your data are affected by it? Did you test the residuals for
autocorrelation? If there is no SAC, there is no need to correct for
it...
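
One possible way to check for residual spatial structure in a multivariate setting is a Mantel correlogram on the RDA residuals. The sketch below is an assumption on my part rather than something prescribed in this thread: it uses vegan's mantel.correlog() on the bundled mite data, with an arbitrary pair of environmental predictors.

```r
library(vegan)

data(mite, mite.env, mite.xy)                # stand-in data from vegan
spe <- decostand(mite, method = "hellinger")
m_env <- rda(spe ~ SubsDens + WatrCont, data = mite.env)

# Mantel correlogram: relates distances among the RDA residuals to
# membership in successive geographic distance classes
set.seed(42)
cg <- mantel.correlog(D.eco = dist(residuals(m_env)),
                      D.geo = dist(mite.xy),
                      nperm = 199)

cg$mantel.res   # Mantel r and permutation p-value per distance class
```

Significant positive Mantel r in the shortest distance classes would suggest that SAC remains in the residuals.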

HTH
Ingolf

Bini, L.M., Diniz-Filho, J.A.F., Rangel, T. et al. 2009. Coefficient 
shifts in geographical ecology: an empirical evaluation of spatial and 
non-spatial regression. Ecography 32: 193-204. doi:
10.1111/j.1600-0587.2009.05717.x
Diniz-Filho, J.A.F., Bini, L.M., Hawkins, B.A. 2003. Spatial 
autocorrelation and red herrings in geographical ecology. Global Ecology 
and Biogeography 12: 53-64.
Dray, S., Legendre, P., Peres-Neto, P.R. 2006. Spatial modelling: a 
comprehensive framework for principal coordinate analysis of neighbour 
matrices (PCNM). Ecological Modelling 196: 483-493.



[R-sig-eco] forward selection RDA after controlling for constraints

2013-07-10 Thread Stephen Sefick

Hello all,

I would like to run this by everyone and maybe get some hints as to
which R functions I could use for this. I have macroinvertebrate
assemblage data from across the SE. I would like to control for
geographic distance (lat/long), watershed area, and year before
submitting these data to an RDA with the rest of the environmental data,
using a variable selection technique.

Does it make sense to detrend the data using an mlm on
Hellinger-transformed abundances with the above env variables as
regressors, and then submit the residuals to rda with the rest of the
env variables I am interested in?


Many thanks for all of the help.
kindest regards,


--
Stephen Sefick
**
Auburn University
Biological Sciences
331 Funchess Hall
Auburn, Alabama
36849
**
sas0...@auburn.edu
http://www.auburn.edu/~sas0025
**

Let's not spend our time and resources thinking about things that are so 
little or so large that all they really do for us is puff us up and make 
us feel like gods.  We are mammals, and have not exhausted the annoying 
little problems of being mammals.


-K. Mullis

A big computer, a complex algorithm and a long time does not equal 
science.


  -Robert Gentleman

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] forward selection RDA after controlling for constraints

2013-07-10 Thread Jari Oksanen

On 10/07/2013, at 21:00 PM, Stephen Sefick wrote:

> Hello all,
>
> I would like to run this by everyone and maybe get some hints as to what R
> functions I could use for this.  Ok, so I have macroinvertebrate assemblage
> data from across the SE.  I would like to control for geographic distance
> (lat/long), Watershed area, and year before submitting these data to an RDA
> with the rest of the environmental data using a variable selection technique.
>
> Does it make sense to detrend the data using a mlm on Hellinger-transformed
> abundances with the above env variables as regressors and then submit the
> residuals to rda with the rest of the env variables I am interested in?


Stephen,

If you happen to use vegan functions for forward selection, please note that
they all (should) take a scope argument that can (should) be a list of lower
and upper scopes. Put your controlled variables (distance???, watershed area,
year) in the lower scope and these plus the other candidate variables in the
upper scope, and there you go. I say "should" because I have rarely used these
functions myself, and I'm not sure if the lower scope really is implemented in
all of them, but it *should* be: file a bug report if this fails.
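
A minimal sketch of the lower/upper scope idea, assuming vegan's ordistep() and the mite data shipped with vegan as a stand-in (the coordinates x and y play the role of the controlled variables here; none of this is taken from the poster's data):

```r
library(vegan)

data(mite, mite.env, mite.xy)                # example data shipped with vegan
spe <- decostand(mite, method = "hellinger") # Hellinger-transformed abundances
env <- cbind(mite.env, mite.xy)              # x and y stand in for lat/long

# Start from the model that holds only the controlled variables ...
m0 <- rda(spe ~ x + y, data = env)

# ... and let forward selection add candidates; x and y form the lower
# scope, so they stay in the model throughout.
set.seed(1)
sel <- ordistep(m0,
                scope = list(lower = ~ x + y,
                             upper = ~ x + y + SubsDens + WatrCont +
                                       Substrate + Shrub + Topo),
                direction = "forward", trace = FALSE)

formula(sel)   # x and y plus whatever candidates were selected
```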

I have no idea how to handle distance in RDA. Well, I have ideas, but none
of them are very good.

Using a separate mlm and modelling its residuals will not work quite
correctly, because that ignores correlations between the groups of variables.
Vegan functions do not ignore those.
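
To illustrate the difference, the sketch below compares the two-step residual approach with a partial RDA using Condition(). It is a hedged example on vegan's bundled mite data, with x and y as the controlled variables and WatrCont as an arbitrarily chosen candidate; it is not taken from the original analysis.

```r
library(vegan)

data(mite, mite.env, mite.xy)
spe <- decostand(mite, method = "hellinger")
env <- cbind(mite.env, mite.xy)

# (a) Two-step approach: detrend with a multivariate lm, then run the
#     RDA on the residuals. WatrCont itself is NOT adjusted for x and y.
y_res <- residuals(lm(as.matrix(spe) ~ x + y, data = env))
m_two_step <- rda(y_res ~ WatrCont, data = env)

# (b) Partial RDA: x and y are partialled out of both the response
#     and the constraint.
m_partial <- rda(spe ~ WatrCont + Condition(x + y), data = env)

# Variance attributed to WatrCont under each approach
c(two_step = m_two_step$CCA$tot.chi,
  partial  = m_partial$CCA$tot.chi)
```

Whenever the candidate variable is correlated with the controlled ones, the two values differ, which is the problem with the two-step approach described above.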

Cheers, Jari Oksanen
-- 
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland
jari.oksa...@oulu.fi, Ph. +358 400 408593, http://cc.oulu.fi/~jarioksa



Re: [R-sig-eco] forward selection RDA after controlling for constraints

2013-07-10 Thread stephen sefick
Jari,

Thank you for the quick reply. Maybe I should first use something like
PCNM on the lat/long data and then use the result in the rda? I really
appreciate all of your help. Are there any other/better ways to account
for spatial autocorrelation? I guess I need to show that spatial
autocorrelation exists and then, if it does, account for it? Any reading
etc. would be greatly appreciated.
kindest regards,

Stephen

P.S. I will let you know about the stepwise selection and the scope argument.

