Re: [R-sig-eco] Community composition variance partitioning?
Alexandre,

Both RDA and MRM are useful methods, but they address different questions.

The R2 value from RDA quantifies the proportion of the variance in species abundances that can be explained by environmental or spatial gradients. In other words, the response variables in the analysis are the species abundance values in the raw data matrix (the sites-by-species table). This targets the ecological question "why are species more abundant in some sites than in others?".

In contrast, the R2 value from MRM quantifies the proportion of variance in pairwise dissimilarity values that can be explained by environmental or spatial distances. In other words, the response variable is the compositional dissimilarity matrix. This targets the ecological question "why are species compositions more similar between some sites than between others?".

Both questions are related, of course, but they are not interchangeable. My personal opinion is that it's fine to run both kinds of analysis in parallel, but the results of each method should be interpreted according to its own null hypothesis, not according to the null hypothesis of the other method.

Cheers,
Hanna

Alexandre Fadigas de Souza wrote:
> What do you think of this all-inclusive approach?

--
View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Community-composition-variance-partitioning-tp7578565p7578702.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
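To make the difference between the two response variables concrete, here is a minimal sketch in R. It assumes the vegan and ecodist packages are installed and uses vegan's bundled varespec and varechem data as a stand-in for a real dataset; the choice of N, P and K as predictors is arbitrary and purely illustrative.

```r
library(vegan)    # rda(), decostand(), vegdist(), RsquareAdj()
library(ecodist)  # MRM()

data(varespec)    # sites-by-species table shipped with vegan
data(varechem)    # soil chemistry for the same sites
env <- varechem[, c("N", "P", "K")]   # arbitrary illustrative predictors

# RDA: the response is the (Hellinger-transformed) sites-by-species table,
# so R2 is the share of variance in species abundances that is explained.
spe_hel <- decostand(varespec, method = "hellinger")
rda_fit <- rda(spe_hel ~ N + P + K, data = env)
r2_rda  <- RsquareAdj(rda_fit)$r.squared

# MRM: the response is the pairwise dissimilarity matrix, so R2 is the
# share of variance in dissimilarity values that is explained.
bc      <- vegdist(varespec, method = "bray")
env_d   <- dist(scale(env))
mrm_fit <- MRM(bc ~ env_d, nperm = 999)
r2_mrm  <- mrm_fit$r.squared["R2"]

print(c(RDA = r2_rda, MRM = unname(r2_mrm)))
```

The two numbers answer different questions, so there is no reason to expect them to agree.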
Re: [R-sig-eco] Community composition variance partitioning?
Hi Steve,

Thank you for your response to my message and for the suggestion. We are also performing RDA-based variance partitioning. Reading the literature on community composition variance partitioning, my impression was that there is turmoil and that the field is divided into two main camps in disagreement: RDA-based and partial Mantel-based approaches, with or without PCNM as spatial descriptors (as opposed to polynomials of lat and long). Simulation comparisons concluded that all approaches are suboptimal and have strengths and weaknesses. This is without mentioning the Danish initiative to use mixed models as a point of comparison for these two approaches. We decided to do all three (RDA, Mantel, and mixed model approaches), so as to be able to compare results and see if congruent patterns emerge.

To be more specific, in the mixed model approach ordination axes (e.g., PCA on Hellinger-transformed species data) are used as dependent variables and explanatory environmental factors are used as independent variables. Levels of spatial cluster are included as nesting effects. Sequential model fitting shows whether space is relevant and whether the environment is relevant, in which case we also evaluate which environmental variables are relevant.

Regarding the R2 problem in the multiple regression on distance matrices, it seems that the problem was indeed that we were including variables as extra columns and not as separate matrices in the formula. With this change we obtained R2 values in the expected order of increase.

What do you think of this all-inclusive approach?

All the best,
Alexandre

Dr. Alexandre F. Souza
Professor Adjunto II
Departamento de Botanica, Ecologia e Zoologia
Universidade Federal do Rio Grande do Norte (UFRN)
http://www.docente.ufrn.br/alexsouza
Curriculo: lattes.cnpq.br/7844758818522706
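The mixed model approach described above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the data are simulated, the names (cluster, soil, light, pc1) are hypothetical, and nlme::lme with a random intercept per spatial cluster is only one possible reading of "levels of spatial cluster as nesting effects".

```r
library(nlme)  # lme(); a recommended package shipped with R

set.seed(7)
n_sites <- 40
cluster <- factor(rep(1:8, each = 5))          # hypothetical spatial clusters
soil    <- rnorm(n_sites)                      # hypothetical environmental variables
light   <- rnorm(n_sites)

# Simulated abundances for 6 species that respond to soil and to cluster.
cl_eff  <- rnorm(8, sd = 0.5)[as.integer(cluster)]
lambda  <- exp(1 + outer(soil, seq(-1, 1, length.out = 6)) + cl_eff)
# Add 1 so no site is empty (Hellinger needs positive row totals).
species <- matrix(rpois(n_sites * 6, lambda), n_sites, 6) + 1

# Hellinger transformation, then PCA; the first axis is the response.
hel <- sqrt(species / rowSums(species))
pc1 <- prcomp(hel)$x[, 1]
d   <- data.frame(pc1, soil, light, cluster)

# Sequential fits with a random intercept for cluster; ML so that models
# with different fixed effects can be compared.
m0 <- lme(pc1 ~ 1,            random = ~ 1 | cluster, data = d, method = "ML")
m1 <- lme(pc1 ~ soil,         random = ~ 1 | cluster, data = d, method = "ML")
m2 <- lme(pc1 ~ soil + light, random = ~ 1 | cluster, data = d, method = "ML")
cmp <- anova(m0, m1, m2)
print(cmp)
```

Note that each ordination axis is modelled separately here, so the result describes one axis at a time rather than the whole community matrix.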
[R-sig-eco] Community composition variance partitioning?
Dear friends,

My name is Alexandre and I am trying to analyze a dataset on the floristic composition of tropical coastal vegetation by means of variance partitioning, following the outline of Tuomisto's recent papers, especially:

Tuomisto, H., Ruokolainen, L., Ruokolainen, K., 2012. Modelling niche and neutral dynamics: on the ecological interpretation of variation partitioning results. Ecography (Cop.). 35, 961–971.

I have a question; could you please give your opinion on it?

We are carrying out a variance partitioning of the Bray-Curtis floristic distance, using as explanatory fractions soil nutrition, topography, canopy openness and geographical distances (all as Euclidean distance matrices). We are using the MRM function of the ecodist package:

    mrm <- MRM(dist(species) ~ dist(soil) + dist(topograph) + dist(light) + dist(xy), data=my.data, nperm=1

The idea is that the overall R2 of this multiple regression should be used to assess the contributions of the spatial and environmental fractions through subtraction:

"Three separate multiple regression analyses are needed to assess the relative explanatory power of geographical and environmental distances. All of these have the same response variable (the compositional dissimilarity matrix), but each analysis uses a different set of the explanatory variables. In these analyses the explanatory variables are: (I) the geographical distance matrix only, (II) the environmental difference matrices only, and (III) all the explanatory variables used in (I) or (II). Comparing the R2 values from these three analyses allows partitioning the variance of the response dissimilarity matrix to four fractions. Fraction A is explained uniquely by the environmental difference matrices and equals R2(III) - R2(I). Fraction B is explained jointly by the environmental and geographical distances and equals R2(I) + R2(II) - R2(III). Fraction C is explained uniquely by geographical distances and equals R2(III) - R2(II). Fraction D is unexplained by the available environmental and geographical dissimilarity matrices and equals 100% - R2(III) (throughout the present paper, R2 values are expressed as percentages rather than proportions)." [Tuomisto et al. 2012]

The problem is that the R2 of the overall model (containing all the explanatory variables) is smaller than most of the R2 values of the models containing each of the explanatory matrices on its own. So it does not seem possible to proceed with the proposed approach.

Sincerely,
Alexandre

Dr. Alexandre F. Souza
Professor Adjunto II
Departamento de Botanica, Ecologia e Zoologia
Universidade Federal do Rio Grande do Norte (UFRN)
http://www.docente.ufrn.br/alexsouza
Curriculo: lattes.cnpq.br/7844758818522706
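The subtraction scheme in the quoted passage is the usual variation-partitioning arithmetic: A = R2(III) - R2(I), B = R2(I) + R2(II) - R2(III), C = R2(III) - R2(II), D = 100% - R2(III). A small base-R sketch follows; the function name partition_r2 and all input values are made up for illustration and are not results from this dataset.

```r
# Partition three R2 values (in percent) into the four fractions:
#   r2_geo = R2(I), geographical distances only
#   r2_env = R2(II), environmental differences only
#   r2_all = R2(III), both sets together
partition_r2 <- function(r2_geo, r2_env, r2_all) {
  c(A_env_unique  = r2_all - r2_geo,
    B_shared      = r2_geo + r2_env - r2_all,
    C_geo_unique  = r2_all - r2_env,
    D_unexplained = 100 - r2_all)
}

# A well-behaved case: the full model explains at least as much as each subset.
ok  <- partition_r2(r2_geo = 20, r2_env = 35, r2_all = 45)

# The situation described in the message: the full model has a lower R2
# than one of the subset models, so a "unique" fraction turns negative.
bad <- partition_r2(r2_geo = 20, r2_env = 35, r2_all = 30)

print(ok)
print(bad)
```

In the second call the unique geographical fraction is negative, which is a sign that the three regressions are not properly nested rather than an ecological result.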
Re: [R-sig-eco] Community composition variance partitioning?
Hi,

That seems a bit odd: can you provide a reproducible example, off-list if necessary?

Sarah

On Wed, Dec 4, 2013 at 12:50 PM, Alexandre Fadigas de Souza alexso...@cb.ufrn.br wrote:
> The problem is that the R2 of the overall model (containing all the explanatory variables) is smaller than most of the R2 of models containing each of the explanatory matrices.

--
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org
Re: [R-sig-eco] Community composition variance partitioning?
Hi,

Not only odd, but impossible. If you have a model y ~ x1, and you *add* a new explanatory variable, you cannot get worse in raw R2. You can get worse in adjusted R2. You can also get worse if you add variables to a matrix for which you calculate distances. So dist(y) ~ dist([x1]) can have higher R2 than dist(y) ~ dist([x1,x2]) -- bioenv is based on this.

Cheers,
Jari Oksanen

On 4.12.2013, at 20.19, Sarah Goslee sarah.gos...@gmail.com wrote:
> That seems a bit odd: can you provide a reproducible example, off-list if necessary?
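Jari's point can be checked with a few lines of base R. This is a sketch on simulated data (all object names are illustrative): it compares adding a second distance matrix as a separate term with folding the second variable into a single distance matrix. Plain lm on the unfolded distance vectors is used here only to obtain R2; it gives no valid p-values for distance data.

```r
set.seed(42)
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n, sd = 3)             # unrelated noise variable on a larger scale
y  <- x1 + rnorm(n, sd = 0.5)

# Unfold each distance matrix into a vector of pairwise distances.
dy  <- as.vector(dist(y))
d1  <- as.vector(dist(x1))
d2  <- as.vector(dist(x2))
d12 <- as.vector(dist(cbind(x1, x2)))   # one matrix built from both variables

r2 <- function(fit) summary(fit)$r.squared

r2_single   <- r2(lm(dy ~ d1))          # dist(y) ~ dist(x1)
r2_separate <- r2(lm(dy ~ d1 + d2))     # dist(y) ~ dist(x1) + dist(x2)
r2_combined <- r2(lm(dy ~ d12))         # dist(y) ~ dist([x1, x2])

print(c(single = r2_single, separate = r2_separate, combined = r2_combined))

# Adding a separate term can never lower raw R2 in least squares.
stopifnot(r2_separate >= r2_single - 1e-12)
```

Only the "separate" model is nested within the "single" one, so only that comparison is guaranteed not to decrease; the "combined" model is a different single-predictor regression and is free to fit worse.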
Re: [R-sig-eco] Community composition variance partitioning?
Alexandre,

I'll leave it to Sarah to advise you on MRM (and I agree with Jari that the method you're describing is not going to work).

I'll just add that it is not clear to me why the predictors (even geographic distance) have to be treated as distances in order to partition the variance in composition. I'm assuming the environmental variables were not originally in the form of Euclidean distance matrices and that the raw measurements are available?

As for the geographic distances, if you have lat and long coordinates, why not treat both lat and long as predictors and do the necessary analyses as partial distance-based redundancy analyses using capscale? In one analysis the geographic predictors could be partialled out (with the result giving the fraction explained by the environment). In another, the environmental predictors could be partialled out (with the result giving the fraction explained by geography), and in a third both geographic and environmental predictors could be considered with no conditioning covariates (which will give the total variance explained by both combined).

Best
Steve

J. Stephen Brewer
Professor
Department of Biology
PO Box 1848
University of Mississippi
University, Mississippi 38677-1848
Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077

On 12/4/13 11:50 AM, Alexandre Fadigas de Souza alexso...@cb.ufrn.br wrote:
> We are proceeding a variance partition of the bray-curtis floristic distance using as explanatory fractions soil nutrition, topography, canopy openess and geographical distances (all as euclidean distance matrices).
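The three analyses Steve outlines could be set up along these lines with vegan's capscale. This is a sketch under stated assumptions: it uses vegan's bundled varespec and varechem data, the lat and long columns are simulated because that dataset has no coordinates, and N, P and K are an arbitrary choice of environmental predictors.

```r
library(vegan)  # capscale(), Condition(), RsquareAdj()

data(varespec)
data(varechem)

# Hypothetical coordinates, simulated only so the example runs.
set.seed(1)
xy  <- data.frame(lat  = runif(nrow(varespec)),
                  long = runif(nrow(varespec)))
dat <- cbind(varechem[, c("N", "P", "K")], xy)

# 1. Environment, with geography partialled out.
m_env <- capscale(varespec ~ N + P + K + Condition(lat + long),
                  data = dat, distance = "bray")
# 2. Geography, with environment partialled out.
m_geo <- capscale(varespec ~ lat + long + Condition(N + P + K),
                  data = dat, distance = "bray")
# 3. Both sets together, no conditioning covariates.
m_all <- capscale(varespec ~ N + P + K + lat + long,
                  data = dat, distance = "bray")

# Constrained inertia as a proportion of total inertia in each model.
r2     <- function(m) RsquareAdj(m)$r.squared
r2_env <- r2(m_env)
r2_geo <- r2(m_geo)
r2_all <- r2(m_all)
print(c(env_given_geo = r2_env, geo_given_env = r2_geo, both = r2_all))
```

Here the predictors enter as raw variables and only the response is a dissimilarity matrix. vegan also provides varpart, which performs a related RDA-based partition and reports adjusted R2 for the fractions.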