Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
Dear Gian,

I'm still not quite sure how adonis() works. I ran some tests and I don't understand how it calculates the degrees of freedom of a nested factor. This is most probably due to my lack of experience with the function (as I told you, I usually work with the PERMANOVA add-in for PRIMER developed by Marti Jane Anderson and others).

Anyway, _I think_ your table of results is missing the effect of the term Community(Host) [that is, the factor Community nested in the factor Host]. The table you sent me only shows a significant effect of the term Host (which I think means that the communities differ significantly between hosts). You still do not know whether your communities differ significantly from each other within each host.

Now it depends on the hypothesis you intend to test. It probably does not make sense to ask whether communities differ within each host, but _I think_ you still have to include the term Community(Host) in your table. That said, I would wait for the opinion of a more experienced user of adonis() and of ANOVA testing in general.

HTH

Cheers,
Dulce

Maria Dulce Subida
Instituto de Ciencias Marinas de Andalucía (ICMAN)
Consejo Superior de Investigaciones Científicas (CSIC)
Campus Universitário Río San Pedro
11510 Puerto Real - Cádiz. España.
http://www.icman.csic.es/
0034 956832612 ext. 316

Gian Maria Niccolò Benucci wrote:
> Hi Maria,
>
> Yes, I think that's right; maybe now I have used the function correctly. It seems that the area effect is also visible...
>
>     adonis(ABCDsqrt ~ Host, method = "bray", data = env.table,
>            permutations = 99, strata = env.table$Community)
>
>     Call:
>     adonis(formula = ABCDsqrt ~ Host, data = env.table, permutations = 99,
>            method = "bray", strata = env.table$Community)
>
>                 Df SumsOfSqs MeanSqs F.Model     R2 Pr(>F)
>     Host       1.0   1.64429 1.64429 5.38984 0.1242   0.01 **
>     Residuals 38.0  11.59276 0.30507         0.8758
>     Total     39.0  13.23705                 1.0000
>     ---
>     Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Don't you think?
>
> Gian
>
> 2009/12/9 Maria Dulce Subida mdsub...@icman.csic.es
>> Hi Gian,
>>
>> I've never used adonis() [I'm an R beginner], but I've been doing multivariate analysis for some time: I usually use PRIMER with the PERMANOVA add-in. nMDS followed by PERMANOVA works quite well for me in experimental designs similar to yours.
>>
>> But it seems to me that you have nestedness in your design, and this should be considered when you run adonis(). After a quick look at the adonis documentation, *I think* you should state strata = env.table$Community in your adonis() call, since your Community factor is nested within the Host factor. Otherwise you are getting wrong pseudo-F values as well as wrong p-values.
>>
>> Good luck!
>>
>> Cheers,
>> Dulce
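To make the strata discussion above more concrete, here is a minimal base-R sketch of what "permuting only within strata" means. It assumes a design like the one described in this thread (four communities of 10 samples, A and B on Corylus, C and D on Ostrya). The helper permute_within() is hypothetical and written only for illustration; it is not vegan's code and makes no claim about how adonis() implements strata internally.

```r
# Hypothetical design mirroring the thread: 4 communities x 10 samples,
# communities A and B on Corylus, C and D on Ostrya.
env <- data.frame(
  Community = factor(rep(c("A", "B", "C", "D"), each = 10)),
  Host      = factor(rep(c("Corylus", "Ostrya"), each = 20))
)

# Illustrative helper (an assumption, not part of vegan): return a permutation
# of row indices in which rows are shuffled only within their own stratum.
permute_within <- function(strata) {
  idx <- seq_along(strata)
  for (s in as.character(unique(strata))) {
    rows <- which(strata == s)
    # sample() on a single number samples from 1:x, so guard that case
    if (length(rows) > 1) idx[rows] <- sample(rows)
  }
  idx
}

set.seed(1)
perm <- permute_within(env$Community)

# Every sample stays inside its own community ...
stays_in_stratum <- all(env$Community[perm] == env$Community)

# ... and because Host is constant within each community, the Host labels
# are identical before and after the restricted shuffle.
host_unchanged <- all(env$Host[perm] == env$Host)

print(c(stays_in_stratum = stays_in_stratum, host_unchanged = host_unchanged))
```

In this sketch the Host labels never change under the restricted shuffle, so it may be worth checking what a test of Host is actually permuting when strata is set to a factor nested inside Host.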
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
On Wed, 2009-12-09 at 10:25 +0100, gabriel singer wrote:
> Gian,
>
> You may also want to use betadisper() to check whether the host effect is due to differences in location or dispersion (or both). This is equivalent to checking homogeneity of variance when running a classical ANOVA.
>
> cheers, g

Good point Gabriel, but I'd caution against using betadisper() in vegan just at the moment. A user notified us, and Jari subsequently confirmed, that the default (and currently only) method, which uses the dispersion around the group centroid (average) with a permutation test, is anti-conservative. Since then Jari has written code that lets us include the dispersion around the spatial median within betadisper(), and initial tests suggest this has the right Type I error rate in the permutation test.

I had hoped to have included this by now, but having been under the weather for the past month I have not yet finished working on it. An updated version should be on R-Forge in the next few days.

G

> Gian Maria Niccolò Benucci wrote:
>> Jari, Gavin, Chris, Gabriel and Carsten...
>>
>> Many thanks to you all for your support and kindness... and for your competence and experience, which cannot be compared to mine, at least in these matters...
>>
>> Gabriel said: *...I found this mailing list very helpful many times for my own questions, but also very informative when just following the threads on other questions...* I completely agree with that, so here I am to go deeper into my statistical problems...
>>
>> As Gavin argued, the plot
>>
>>     NMS.2$stress
>>     [1] 24.53723
>>     NMS.3$stress
>>     [1] 16.29226
>>     NMS.4$stress
>>     [1] 11.79951
>>     plot(2:4, c(24.53723, 16.29226, 11.79951), type = "b")
>>
>> didn't show significant differences... so, as he suggested, I ran stressplot() and got the Shepard plots... (just to specify, sqrtABCD is the square-root transformation of the species matrix)
>>
>>     stressplot(NMS.2)
>>     Using step-across dissimilarities:
>>     Too long or NA distances: 230 out of 780 (29.5%)
>>     Stepping across 780 dissimilarities...
>>     Non-metric fit, R2 = 0.94
>>     Linear fit, R2 = 0.719
>>
>>     stressplot(NMS.3)
>>     Using step-across dissimilarities:
>>     Too long or NA distances: 230 out of 780 (29.5%)
>>     Stepping across 780 dissimilarities...
>>     Non-metric fit, R2 = 0.973
>>     Linear fit, R2 = 0.815
>>
>>     stressplot(NMS.4)
>>     Using step-across dissimilarities:
>>     Too long or NA distances: 230 out of 780 (29.5%)
>>     Stepping across 780 dissimilarities...
>>     Non-metric fit, R2 = 0.986
>>     Linear fit, R2 = 0.875
>>
>> From these data it is clear that the fit is better for NMS.4 (k = 4); the blue points in the graph are also closer to the red line and less scattered across the plot... But maybe the R2 values of NMS.2 aren't so bad in terms of correlation, are they?
>>
>> Following what Gabriel said: *...I personally like a combination of NMDS with the permutational MANOVA approach (by Marti Anderson) implemented in the function adonis() in vegan. You can use the same dissimilarity measure (Bray-Curtis) used for the NMDS and can test the Area vs. the Host effect on parasite (was it?) composition. I think that could be a very useful complement to an NMDS-derived ordination plot and then you may also regard high-stress representations (and that's what all the low-dimensional ordination plots really ARE!) in a different light.*..
>>
>>     adonis(sqrtABCD ~ Host*Community, method = "bray", data = env.table,
>>            permutations = 99)
>>
>>     Call:
>>     adonis(formula = sqrtABCD ~ Host * Community, data = env.table,
>>            permutations = 99, method = "bray")
>>
>>                 Df SumsOfSqs MeanSqs F.Model     R2 Pr(>F)
>>     Host       1.0   1.64429 1.64429 5.47874 0.1242   0.01 **
>>     Community  2.0   0.78834 0.39417 1.31337 0.0596   0.23
>>     Residuals 36.0  10.80441 0.30012         0.8162
>>     Total     39.0  13.23705                 1.0000
>>     ---
>>     Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> ...So, let me explain a little about my datasets:
>> - the species matrix comes from root samples in which the ectomycorrhizal fungal species present were counted (the cell entries are individual tips);
>> - samples were taken in four areas (A, B, C, D). The areas are about 30 metres apart from one another;
>> - areas A and B are both from Corylus roots, while areas C and D are both from Ostrya roots.
>>
>> To be clearer, this is the environmental matrix used:
>>
>>     env.table
>>         Community Host
>>     A1  A         Corylus
>>     A2  A         Corylus
>>     A3  A         Corylus
>>     A4  A         Corylus
>>     A5  A         Corylus
>>     A6  A         Corylus
>>     A7  A         Corylus
>>     A8  A         Corylus
>>     A9  A         Corylus
>>     A10 A         Corylus
>>     B1  B         Corylus
>>     B2  B         Corylus
>>     B3  B         Corylus
>>     B4  B         Corylus
>>     B5  B         Corylus
>>     B6  B         Corylus
>>     B7  B         Corylus
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
Gian,

I applaud your continued struggle to understand the best route of analysis. If I were you, I would ignore the rude comments made by some and focus on the constructive replies from others. Good luck!

Chris Habeck
Ph.D. Candidate
Department of Zoology
University of Wisconsin, Madison
http://habeckecology.wikispaces.com/
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
On Mon, 2009-12-07 at 20:10 +0100, Gian Maria Niccolò Benucci wrote:
> Hi Gavin and hi all,
>
> I will not walk in front of a bus for sure; I am not mad, at least not yet... :) I would like to tell you that I am a Ph.D. student and, as far as I know, Ph.D. students still have to understand things by studying those who wrote before them...

That's fine, but temper that with a realisation that not everyone knows what they are doing numerically. So be critical about what you read, and learn about the methods and what they do.

<snip />

> Anyway... To continue the brainstorm, if I may... The Host effect is what I think is most interesting from the ecological point of view of my trials, also because the four communities share hosts two by two: A and B on Corylus, C and D on Ostrya... If I plot the envfit factors onto the graph, the evidence of separation seems clear...
>
> These are my metaMDS results with 2 and 3 dimensions:

Thanks for these: one way of trying to choose a dimensionality for the solution is to plot the stress as a function of k (k on the x-axis, stress on the y-axis). This is often called a screeplot, as you are looking for a dramatic change in slope. I took your stresses and plotted them against k (crudely):

    plot(2:4, c(24.54342, 16.29226, 11.68632), type = "b")

and there doesn't seem to be any noticeable change here, so not much help there.

Looking at the goodness-of-fit stats, the story they tell doesn't really change much depending on whether you use 2, 3 or 4 dimensions. So perhaps stick with 2 in that case.

Also, try:

    stressplot(mod)

where mod is the object returned by metaMDS(). The stressplot plots your original dissimilarities against the distances derived from the nMDS configuration. It also shows the monotonic regression fit and a few goodness-of-fit criteria. You could evaluate the models with different k using these plots.
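The stress-versus-k check described above can be sketched in a few lines of base R. The stress values are the ones reported in this thread; the variable names are just illustrative, and the sketch only tabulates and plots the numbers without refitting any ordination.

```r
# Stress values reported in this thread for k = 2, 3, 4
k      <- 2:4
stress <- c(24.54342, 16.29226, 11.68632)

# Drop in stress gained by each extra dimension
drop_per_dim <- -diff(stress)
names(drop_per_dim) <- paste0("k=", k[-length(k)], "->", k[-1])
print(round(drop_per_dim, 5))

# Scree-style plot: look for an elbow where the curve flattens abruptly
plot(k, stress, type = "b", xaxt = "n",
     xlab = "Number of dimensions (k)", ylab = "Stress")
axis(1, at = k)
```

With only three values of k an elbow is hard to judge either way, which is consistent with the "not much help there" reading above.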
> NMS.1
>
> Call:
> metaMDS(comm = sqrtABCD, distance = "bray", k = 2, trymax = 100, autotransform = F)

<snip />

> I have 10 sample units for each community (40 in total).
>
> You said at the end: *I do wonder if you are not hitting the curse of dimensionality here?* Could you explain what you mean by hitting the curse of dimensionality, if that is not asking too much...

Yep, sorry, that was a bit cryptic. "Curse of dimensionality" is a phrase coined by Bellman (1961) and refers to the problem of defining localness in high dimensions; neighbourhoods with a fixed number of samples become less local as the number of dimensions increases. Basically, the more dimensions you have, the easier it is for a sample to lie a long way from the rest of the data along a single dimension and thus have a large dissimilarity.

This doesn't appear to be the case here though; 4 is low dimensionality (hence my wondering whether this was or wasn't a problem), but when you'd only shown the k = 4 data, I did wonder if the low r2 was due to your points being widely spread along one of the four dimensions; i.e. was the more complex solution leading to the low r2? By the looks of things, the low r2 is probably more to do with the small, but significant, effects of your two covariates.

> ... and then: *it would be nice to look at the ordination but how you do that I don't know.* I would be glad if you looked at the graphs of my ordinations. Can I send them to you? That would be great... let me know about that. I plotted them this way:
>
>     plot(NMS.1, type = "n", display = "sp")
>     ordisymbol(NMS.1, env.table, "Host", legend = TRUE)
>
> Anyway, I have to admit that with 2 (and at least with 3) dimensions the points in the ordination plot are better separated with respect to the data matrix, so what should I do? Better separation of the points and higher stress, or the contrary?
If this were me, seeing as the interpretation/results don't change, I'd probably stick with k = 2 so you can easily draw the ordination for presentation in your PhD work or future papers.

HTH

G

> I think that is enough. Thank you so much for your help; I'll appreciate any comments! :)
>
> Gian
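The "localness" point behind the curse of dimensionality can be illustrated with a small base-R simulation. This is only a sketch on invented uniform random data, not on the data from this thread, and contrast_ratio() is a hypothetical helper name: for each point it compares the distance to its nearest neighbour with the distance to its farthest neighbour.

```r
# For n random points in d dimensions, compare each point's nearest-neighbour
# distance with its farthest-neighbour distance. A ratio close to 1 means that
# "near" and "far" are hardly distinguishable, i.e. neighbourhoods stop
# being local.
contrast_ratio <- function(d, n = 200) {
  x  <- matrix(runif(n * d), nrow = n)   # n points in the unit hypercube
  dm <- as.matrix(dist(x))               # Euclidean distances
  diag(dm) <- NA                         # ignore self-distances
  nearest  <- apply(dm, 1, min, na.rm = TRUE)
  farthest <- apply(dm, 1, max, na.rm = TRUE)
  mean(nearest / farthest)
}

set.seed(42)
dims   <- c(2, 4, 10, 100)
ratios <- sapply(dims, contrast_ratio)
print(data.frame(dimensions = dims, nearest_over_farthest = round(ratios, 3)))
```

The ratio grows with the number of dimensions, and at four dimensions it is still far from 1, which fits the remark above that k = 4 is low dimensionality.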
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
Hi Gian and others,

I think we had better stop worrying about subjective interpretations of the emotional background of what are, in every other respect, absolutely helpful discussion threads... I guess part of the challenge on this mailing list is to span the whole range of expertise with useful discussion/output/help for everyone, be it a student or an expert. I found this mailing list very helpful many times for my own questions, but also very informative when just following the threads on other questions...

Gian, in my opinion, 2 dimensions are absolutely OK, especially if they do visualize an (obvious) effect in your study. In other words, if 2 dimensions show you an effect of Host but not of Area, the effect is obviously strong enough. Then I would not worry about stress too much. However, there may still be an effect of Area, maybe visible in more dimensions, but it's obviously of minor importance.

I personally like a combination of NMDS with the permutational MANOVA approach (by Marti Anderson) implemented in the function adonis() in vegan. You can use the same dissimilarity measure (Bray-Curtis) used for the NMDS and can test the Area vs. the Host effect on parasite (was it?) composition. I think that could be a very useful complement to an NMDS-derived ordination plot, and then you may also regard high-stress representations (and that's what all the low-dimensional ordination plots really ARE!) in a different light. Complements like the PERMANOVA are in my opinion better than trying the full spectrum of ordination methods until finally some kind of pattern gets uncovered (which comes quite close to the much too often encountered data-fishing expeditions).

And though copying analysis strategies is probably not quite like throwing yourself in front of a bus, there is some benefit in using what people working in a specific field regard as their standard methods (wait for the reviews to discover this).
In any case, a responsible choice of analysis is guided by the study design and the data at hand.

cheers, gabriel

--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at
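The idea in Gabriel's message of testing a grouping factor on the same Bray-Curtis dissimilarities used for the NMDS can be sketched by hand in base R. This is a deliberately simplified, one-factor permutation test on invented toy data, written to show the mechanics of a pseudo-F computed from dissimilarities; it is not vegan's adonis() implementation, and bray_curtis() and pseudo_F() are hypothetical helper names.

```r
# Bray-Curtis dissimilarity between all pairs of rows (samples) of x
bray_curtis <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) for (j in seq_len(n)) {
    d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
  }
  d
}

# Pseudo-F for one grouping factor, from sums of squared dissimilarities
pseudo_F <- function(d, g) {
  g  <- factor(g)
  N  <- nrow(d)
  a  <- nlevels(g)
  d2 <- d^2
  ss_total  <- sum(d2[upper.tri(d2)]) / N
  ss_within <- sum(sapply(levels(g), function(l) {
    sub <- d2[g == l, g == l, drop = FALSE]
    sum(sub[upper.tri(sub)]) / nrow(sub)
  }))
  ss_among <- ss_total - ss_within
  (ss_among / (a - 1)) / (ss_within / (N - a))
}

# Invented toy data: 2 groups x 10 samples, 6 species, with a built-in shift
set.seed(7)
grp  <- rep(c("Corylus", "Ostrya"), each = 10)
comm <- matrix(rpois(20 * 6, lambda = 5), nrow = 20)
comm[grp == "Ostrya", 1:3] <- comm[grp == "Ostrya", 1:3] + 15

d      <- bray_curtis(sqrt(comm))      # square-root transform, as in the thread
F_obs  <- pseudo_F(d, grp)
F_perm <- replicate(999, pseudo_F(d, sample(grp)))  # unrestricted shuffles
p_val  <- (sum(F_perm >= F_obs) + 1) / (999 + 1)
print(c(pseudo_F = F_obs, p = p_val))
```

The shuffles here are unrestricted, so the sketch says nothing about nested designs or strata; for real analyses the vegan functions discussed in this thread are the appropriate tool.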
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
On Sun, 2009-12-06 at 17:56 +0100, Gian Maria Niccolò Benucci wrote:
> Hi all again,
>
> And thank you all for the kind responses... I do not want to torture myself, for sure... :) I have read (lots of) publications on fungal community ecology (soil fungi), which is my research field, and they all use NMDS or DCA as ordination techniques... So I am only trying to do my best using R to calculate them...

Would you walk in front of a bus if you saw lots of other people doing it? I doubt it. This kind of "me too" attitude to science is quite demoralising when reviewing manuscripts and reading the literature.

DCA was invented to solve a specific problem with CA, namely the arch artefact. I forget whether this is in Jari's public lecture notes, the vegan vignettes/tutorials or in one of his lectures on a course we taught together, but DCA replaces the arch artefact with other artefacts that make the points look like a trumpet or a diamond in ordination space. Why DCA is used as a default instead of a special case escapes me. You really shouldn't use DCA at all if you can get away with it, as it is doing some nasty things to your data.

Alternatives:
i) NMDS
ii) PCA after application of a transformation (Legendre & Gallagher 2001, Oecologia).
And there are probably others...

> What I need now is a good environmental interpretation of my work... Then I found Jari's fantastic PDF "Multivariate Analysis of Ecological Communities in R: vegan tutorial" and went to the passage about fitting factors and vectors... This is my R code:
>
>     NMS.ABCDsqrt
>
>     Call:
>     metaMDS(comm = sqrtABCD, distance = "bray", k = 4, trymax = 100, autotransform = F)
>
>     Nonmetric Multidimensional Scaling using isoMDS (MASS package)
>
>     Data:     sqrtABCD
>     Distance: bray shortest
>
>     Dimensions: 4
>     Stress:     11.68632

What was the stress with k = 2 and k = 3? As Jari has already mentioned, how are you going to interpret and visualise this 4D configuration of points? (You can't plot NMDS1 vs NMDS2, NMDS1 vs NMDS3 etc. for reasons explained to you earlier in this thread.)

>     Two convergent solutions found after 2 tries
>     Scaling: centring, PC rotation, halfchange scaling
>     Species: expanded scores based on 'sqrtABCD'
>
>     envfit(NMS.ABCDsqrt, env.table, permu = 1000) -> NMS.ABCDsqrtef
>     NMS.ABCDsqrtef
>
>     ***FACTORS:
>
>     Centroids:
>                   NMDS1   NMDS2   NMDS3   NMDS4
>     CommunityA  -0.3821  0.3822 -0.1173 -0.1232
>     CommunityB  -0.1849  0.2748  0.0076 -0.0720
>     CommunityC   0.2206 -0.4261 -0.0505  0.1197
>     CommunityD   0.3465 -0.2310  0.1603  0.0756
>     HostCorylus -0.2835  0.3285 -0.0549 -0.0976
>     HostOstrya   0.2835 -0.3285  0.0549  0.0976
>
>     Goodness of fit:
>                   r2    Pr(>r)
>     Community 0.2009 0.001998 **
>     Host      0.1818 0.000999 ***
>     ---
>     Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>     P values based on 1000 permutations.
>
>     names(NMS.ABCDsqrtef)
>     [1] "vectors" "factors"
>     NMS.ABCDsqrtef$vectors
>     NULL
>
> I have an environmental matrix (env.table) that contains only the area (A, B, C, D) and the tree species (Corylus sp. and Ostrya sp.), so I have the area and the tree species that each sample is related to... I could plot factors on the graph but not vectors, because there are no vectors given the absence of numerical data in the env. matrix... is that right? Aren't the R2 values too low?

Did you read ?envfit ? It states:

    (r^2). For factors this is defined as r^2 = 1 - ss_w/ss_t, where ss_w
    and ss_t are within-group and total sums of squares.

So this statistic is looking at how constrained within the 4D space the levels of each factor are in relation to the overall spread of the points. This looks to me like some evidence for grouping of your sites on the basis of Community and stronger evidence for Host. The effect is small but significant. I do wonder if you are not hitting the curse of dimensionality here? The interpretation will depend on the number of samples. It would be nice to look at the ordination, but how you do that I don't know.

G

> Many, many thanks for answering...
>
> Gian
>
>> Jari, I am here again ... :) So, to get a comparison of the real goodness of my metaMDS data, I tried to perform a DCA (with the same input table). Please forgive me if I did something wrong with it... This is my R code:

Why DCA? What led you to torture your data so?

>>     decorana(sqrtABCD, iweigh = 0, ira = 0) -> DCA.1
>>     DCA.1
>>
>>     Call:
>>     decorana(veg = sqrtABCD, iweigh = 0, ira = 0)
>>
>>     Detrended correspondence analysis with 26 segments.
>>     Rescaling of axes with 4 iterations.
>>
>>                       DCA1   DCA2   DCA3   DCA4
>>     Eigenvalues     0.6688 0.5387 0.4822 0.3752
>>     Decorana values 0.7912 0.5795 0.4145 0.2931
>>     Axis lengths    5.9974 3.7036 3.6121 3.3802
>>
>> In that situation the graph is still good, but the differences between the two clades are a little more confused, maybe in axis II (I mean the vertical one) in this case
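The definition quoted above from ?envfit, r^2 = 1 - ss_w/ss_t, can be reproduced by hand in base R. The sketch below uses invented two-dimensional scores and a hypothetical helper factor_r2(); it treats all samples as equally weighted and is meant only to show what the statistic measures, not to replicate envfit() itself.

```r
# r^2 = 1 - ss_w / ss_t for a factor, given ordination scores (rows = samples)
factor_r2 <- function(scores, groups) {
  scores <- as.matrix(scores)
  groups <- factor(groups)
  # total sum of squares: squared distances to the overall centroid
  ss_t <- sum(sweep(scores, 2, colMeans(scores))^2)
  # within-group sum of squares: squared distances to each group's centroid
  ss_w <- sum(sapply(levels(groups), function(l) {
    sub <- scores[groups == l, , drop = FALSE]
    sum(sweep(sub, 2, colMeans(sub))^2)
  }))
  1 - ss_w / ss_t
}

# Invented toy "ordination": 40 samples, 2 axes, modest group shift on axis 1
set.seed(3)
host   <- rep(c("Corylus", "Ostrya"), each = 20)
scores <- matrix(rnorm(40 * 2), ncol = 2)
scores[host == "Ostrya", 1] <- scores[host == "Ostrya", 1] + 1

r2 <- factor_r2(scores, host)
print(round(r2, 3))
```

Because the statistic compares within-group scatter with total scatter, a consistent shift between group centroids can give a modest r^2 when the groups overlap a lot, which is one way to read "small but significant".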
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
Hi Gavin and hi all,

I will not walk in front of a bus for sure; I am not mad, at least not yet... :) I would like to tell you that I am a Ph.D. student and, as far as I know, Ph.D. students still have to understand things by studying those who wrote before them... Isaac Newton became famous not only for his science but also for a famous phrase which, if I remember it correctly, goes like this: "If I have seen so far, it is because I stand on the shoulders of giants..." I think it needs no comment and expresses the concept by itself...

So, I am sorry; I don't like the "me too" attitude either, but you don't know what my reality is like here. I can assure you that, even though I am still a student, I am alone in my research: although I have a tutor and supervisor, as Italian rules require, I have no supervisor for statistics, because no one here can help me with that... So what can I do if I don't take models from already published literature? Anyway, I don't want to look like the victim. I have a brain that works and I am doing my best to understand and improve my knowledge, and to learn and grow, step by step and with great humility, in science and in this case in statistics...

Anyway... To continue the brainstorm, if I may... The Host effect is what I think is most interesting from the ecological point of view of my trials, also because the four communities share hosts two by two: A and B on Corylus, C and D on Ostrya... If I plot the envfit factors onto the graph, the evidence of separation seems clear...

These are my metaMDS results with 2 and 3 dimensions:

    NMS.1

    Call:
    metaMDS(comm = sqrtABCD, distance = "bray", k = 2, trymax = 100, autotransform = F)

    Nonmetric Multidimensional Scaling using isoMDS (MASS package)

    Data:     sqrtABCD
    Distance: bray shortest

    Dimensions: 2
    Stress:     24.54342
    Two convergent solutions found after 18 tries
    Scaling: centring, PC rotation, halfchange scaling
    Species: expanded scores based on 'sqrtABCD'

    NMS.ABCD.2ef

    ***FACTORS:

    Centroids:
                  NMDS1   NMDS2
    CommunityA  -0.3271  0.1984
    CommunityB  -0.1956  0.1768
    CommunityC   0.2520 -0.2847
    CommunityD   0.2706 -0.0905
    HostCorylus -0.2613  0.1876
    HostOstrya   0.2613 -0.1876

    Goodness of fit:
                  r2    Pr(>r)
    Community 0.1897 0.017982 *
    Host      0.1778 0.001998 **
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    P values based on 1000 permutations.

    NMS.1.3

    Call:
    metaMDS(comm = sqrtABCD, distance = "bray", k = 3, trymax = 100, autotransform = F)

    Nonmetric Multidimensional Scaling using isoMDS (MASS package)

    Data:     sqrtABCD
    Distance: bray shortest

    Dimensions: 3
    Stress:     16.29226
    Two convergent solutions found after 6 tries
    Scaling: centring, PC rotation, halfchange scaling
    Species: expanded scores based on 'sqrtABCD'

    NMS.ABCD.3ef

    ***FACTORS:

    Centroids:
                  NMDS1   NMDS2   NMDS3
    CommunityA   0.3881 -0.2702  0.1536
    CommunityB   0.1407 -0.2344  0.0197
    CommunityC  -0.2053  0.3566 -0.0219
    CommunityD  -0.3235  0.1480 -0.1514
    HostCorylus  0.2644 -0.2523  0.0866
    HostOstrya  -0.2644  0.2523 -0.0866

    Goodness of fit:
                  r2    Pr(>r)
    Community 0.1798 0.005994 **
    Host      0.1581 0.000999 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    P values based on 1000 permutations.

I have 10 sample units for each community (40 in total).

You said at the end: *I do wonder if you are not hitting the curse of dimensionality here?* Could you explain what you mean by hitting the curse of dimensionality, if that is not asking too much...

... and then: *it would be nice to look at the ordination but how you do that I don't know.* I would be glad if you looked at the graphs of my ordinations. Can I send them to you? That would be great... let me know about that. I plotted them this way:

    plot(NMS.1, type = "n", display = "sp")
    ordisymbol(NMS.1, env.table, "Host", legend = TRUE)

Anyway, I have to admit that with 2 (and at least with 3) dimensions the points in the ordination plot are better separated with respect to the data matrix, so what should I do? Better separation of the points and higher stress, or the contrary?

I think that is enough. Thank you so much for your help; I'll appreciate any comments! :)

Gian
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
Hi all again,

And thank you all for the kind responses... I do not want to torture myself for sure... :)

I read (a lot of) publications about fungal community ecology studies (soil fungi), my research field indeed, and they all use NMDS or DCA as ordination techniques... So, I am only trying to do my best using R to calculate them...

What I need now is a good environmental interpretation of my work... Then I found Jari's fantastic PDF, "Multivariate Analysis of Ecological Communities in R: vegan tutorial", and I went to the passage about fitting factors and vectors...

This is my R code:

NMS.ABCDsqrt

    Call:
    metaMDS(comm = sqrtABCD, distance = "bray", k = 4, trymax = 100, autotransform = F)

    Nonmetric Multidimensional Scaling using isoMDS (MASS package)

    Data:     sqrtABCD
    Distance: bray shortest

    Dimensions: 4
    Stress:     11.68632
    Two convergent solutions found after 2 tries
    Scaling: centring, PC rotation, halfchange scaling
    Species: expanded scores based on sqrtABCD

    envfit(NMS.ABCDsqrt, env.table, permu=1000) -> NMS.ABCDsqrtef

NMS.ABCDsqrtef

    ***FACTORS:

    Centroids:
                  NMDS1   NMDS2   NMDS3   NMDS4
    CommunityA  -0.3821  0.3822 -0.1173 -0.1232
    CommunityB  -0.1849  0.2748  0.0076 -0.0720
    CommunityC   0.2206 -0.4261 -0.0505  0.1197
    CommunityD   0.3465 -0.2310  0.1603  0.0756
    HostCorylus -0.2835  0.3285 -0.0549 -0.0976
    HostOstrya   0.2835 -0.3285  0.0549  0.0976

    Goodness of fit:
                  r2    Pr(r)
    Community 0.2009 0.001998 **
    Host      0.1818 0.000999 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    P values based on 1000 permutations.

    names(NMS.ABCDsqrtef)
    [1] "vectors" "factors"
    NMS.ABCDsqrtef$vectors
    NULL

I have an environmental matrix (env.table) that contains only the area (A, B, C, D) and the tree species (Corylus sp. and Ostrya sp.), so for each sample I have the area and the tree species it is related to... I can plot factors on the graph but not vectors, because there are no vectors given the absence of numerical data in the env. matrix... is that right?

Aren't the R2 values too low?
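On the vectors question, a quick base-R check may help. As far as I know, envfit() fits numeric columns as vectors (arrows) and factor columns as centroids, so a table holding only factors would be expected to give $vectors = NULL, as in the output above. The data frame below is a hypothetical stand-in for env.table, built from the description in this thread (10 samples per community; the pairing of A and B with Corylus and of C and D with Ostrya is an assumption read off the centroids). It is not the real data.

```r
# Hypothetical stand-in for env.table (NOT the real data):
# 40 samples, two categorical columns, no numeric columns.
env.table <- data.frame(
  Community = factor(rep(c("A", "B", "C", "D"), each = 10)),
  Host      = factor(rep(c("Corylus", "Ostrya"), each = 20))
)

# Which columns are numeric (candidates for vector fitting)
# and which are not (fitted as factor centroids)?
is_num <- vapply(env.table, is.numeric, logical(1))

vector_candidates <- names(env.table)[is_num]
factor_candidates <- names(env.table)[!is_num]

vector_candidates  # empty: nothing to fit as an arrow
factor_candidates  # both columns are fitted as centroids
```

If a continuous variable (soil pH, say, as a purely illustrative name) were added to the table, it would show up among the vector candidates.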
Many, many thanks for answering...

Gian

> Jari, I am here again ... :)
> So, to get a comparison of the real goodness of my metaMDS data, I tried to perform a DCA (with the same input table). Please forgive me if I did something wrong with it... This is my R code:

Why DCA? What led you to torture your data so?

> decorana(sqrtABCD, iweigh=0, ira=0) -> DCA.1
> DCA.1
>
>     Call:
>     decorana(veg = sqrtABCD, iweigh = 0, ira = 0)
>
>     Detrended correspondence analysis with 26 segments.
>     Rescaling of axes with 4 iterations.
>
>                       DCA1   DCA2   DCA3   DCA4
>     Eigenvalues     0.6688 0.5387 0.4822 0.3752
>     Decorana values 0.7912 0.5795 0.4145 0.2931
>     Axis lengths    5.9974 3.7036 3.6121 3.3802
>
> In that situation the graph is still good, but the differences between the two clades are a little more confused; maybe on axis II (I mean the vertical one) there is a better separation in this case. What do the Decorana values really mean?

?decorana

Basically, in the original DECORANA code the Eigenvalues reported were computed at the wrong stage of the detrending process. Jari realised this when interfacing the old DECORANA code with R. Jari altered the code to compute the correct Eigenvalues, but chose to also report the values you'd get from DECORANA or Canoco to stop people complaining that vegan was doing DCA incorrectly.

> And how about the segments?

What about them? Do you know how DCA works? The standard detrending breaks the first (D)CA axis into 26 sequential chunks or segments. 26 is the default, but it can be changed. Within each chunk, the mean trial site score for axis 2 for sites in that chunk is subtracted from the trial axis 2 site scores of the sites in the chunk. This detrending is what gets rid of the arch found in some CA plots and is the reason DCA was invented.

> How can I do something better?

Are you trying to separate the two clades? Do you know a priori which samples belong to which clade?
If so, one of the many classification methods in R would be more useful, as they look to best separate the a priori defined groups. The methods you have been using thus far aim to best represent the dissimilarities between samples in a low-dimensional space.

HTH

G

> Many thanks in advance,
> G.

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
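The detrending by segments described in the reply above can be sketched in a few lines of base R. This is only a toy illustration of the idea (cut axis 1 into chunks, subtract each chunk's mean axis 2 score), not vegan's decorana() code, which is more involved. The function name detrend_by_segments and the simulated arch-shaped scores are made up for the example.

```r
# Toy sketch of detrending by segments; NOT the decorana() implementation.
detrend_by_segments <- function(axis1, axis2, n_segments = 26) {
  # Cut the range of axis 1 into n_segments equal-width chunks.
  segment <- cut(axis1, breaks = n_segments, labels = FALSE)
  # Subtract each chunk's mean axis 2 score from the sites in that chunk.
  axis2 - ave(axis2, segment, FUN = mean)
}

# Simulated (hypothetical) site scores with an arch:
# axis 2 is roughly a quadratic function of axis 1.
set.seed(1)
axis1 <- seq(-2, 2, length.out = 200)
axis2 <- axis1^2 + rnorm(200, sd = 0.1)

detrended <- detrend_by_segments(axis1, axis2)

# The arch shows up as a strong correlation between axis1^2 and axis 2;
# it should be much weaker after detrending.
arch_before <- cor(axis1^2, axis2)
arch_after  <- cor(axis1^2, detrended)
```

Plotting axis1 against axis2 and then against detrended shows the arch being flattened. Changing n_segments plays the role of the "26 segments" reported in the decorana output.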