Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-10 Thread Maria Dulce Subida
Dear Gian,

I'm still not quite sure how adonis() works. I ran some tests and I
don't understand how it calculates the degrees of freedom of a nested
factor. But this is most probably due to my lack of experience with
this function (as I told you, I usually work with the PERMANOVA add-in
for PRIMER developed by Marti Jane Anderson and others).
Anyway, in your table of results _I think_ you are missing the effect
of the term Community(Host) [which means: the factor Community nested
in the factor Host]. In the table you sent me you can only see that
there is a significant effect of the term Host (which I think means
that the communities differ significantly between hosts). Nevertheless,
you still do not know whether your communities differ significantly
from each other within each host. Now it depends on the hypothesis you
intend to test. It probably does not make any sense to ask whether
communities differ within each host, but _I think_ you still have to
include the term Community(Host) in your table. That said, I would wait
for the opinion of a more experienced user of adonis() and of ANOVA
testing in general.
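
The degrees of freedom of a nested term can be checked on a toy layout with base R. The sketch below assumes a balanced design like the one in this thread (two hosts, two communities per host, ten samples each) and uses a made-up univariate response `y` purely for illustration; `Host / Community` is R's formula shorthand for Host plus Community nested within Host.

```r
# Toy layout mirroring the thread: 2 hosts, 2 communities per host, 10 samples each.
env <- data.frame(
  Community = factor(rep(c("A", "B", "C", "D"), each = 10)),
  Host      = factor(rep(c("Corylus", "Ostrya"), each = 20))
)

set.seed(1)
env$y <- rnorm(nrow(env))  # hypothetical response, only needed to fit a model

# Host / Community expands to Host + Host:Community, i.e. Community within Host.
fit <- lm(y ~ Host / Community, data = env)
tab <- anova(fit)
print(tab)
```

In this layout Host takes 1 degree of freedom and the nested term takes 2 (four communities minus two hosts), leaving 36 for the residuals. Whether adonis() treats the same formula in the same way is worth checking against its documentation.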

HTH

Cheers,

Dulce


Maria Dulce Subida

 

~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*

 

Instituto de Ciencias Marinas de Andalucía (ICMAN)

Consejo Superior de Investigaciones Científicas (CSIC)

Campus Universitário Río San Pedro

11510 Puerto Real - Cádiz. España.

 

www.icman.csic.es   0034 956832612 ext. 316

 

~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*

 



Gian Maria Niccolò Benucci wrote:
 Hi Maria,

 Yes, I think that's right; maybe now I have called the function correctly.
 It seems that the area effect is also visible...

 adonis(ABCDsqrt ~ Host, method="bray", data=env.table, permutations=99,
        strata=env.table$Community)

 Call: adonis(formula = ABCDsqrt ~ Host, data = env.table, permutations = 99,
 method = "bray", strata = env.table$Community)

             Df SumsOfSqs  MeanSqs  F.Model     R2 Pr(>F)
 Host       1.0   1.64429  1.64429  5.38984 0.1242   0.01 **
 Residuals 38.0  11.59276  0.30507          0.8758
 Total     39.0  13.23705                    1.0000
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


 Don't you think?


 Gian




 2009/12/9 Maria Dulce Subida mdsub...@icman.csic.es

   
  Hi Gian,


 I've never used adonis() [I'm an R beginner] but I've been doing
 multivariate analysis for some time: I usually use PRIMER with the PERMANOVA
 add-in. nMDS followed by PERMANOVA works quite well for me in experimental
 designs similar to yours. But it seems to me that you have nestedness in
 your design and this should be considered when you run adonis(). After a
 quick look at the ?adonis documentation, *I think* you should state
 strata = env.table$Community in your adonis() call, since your
 community factor is nested within the host factor. Otherwise you're
 getting wrong pseudo-F values as well as wrong p-values.
 Good luck!

 Cheers,

 Dulce




Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-09 Thread Gavin Simpson
On Wed, 2009-12-09 at 10:25 +0100, gabriel singer wrote:
 Gian,
 
 You may also want to use betadisper() to check whether the host effect 
 is due to differences in location or dispersion (or both). This is 
 equivalent to checking homogeneity of variance when running a classical 
 ANOVA.
 
 cheers, g

Good point Gabriel, but I'd caution against using betadisper just at the
moment in vegan. A user notified us, and Jari subsequently confirmed,
that the default (and currently only) method, which uses the dispersion
around the group centroid (average) and a permutation test, was
anti-conservative. Since then Jari has written code to allow us to
include the dispersion around the spatial median within betadisper, and
initial tests suggest this has the right Type I error rate in the
permutation test. I had hoped to have included this by now, but having
been under the weather for the past month I have not yet finished
working on it.

An updated version should be on r-forge in the next few days.
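
For readers on a current vegan release, the dispersion check Gabriel describes might look like the sketch below. It assumes the spatial-median option is available as `type = "median"` in betadisper(), and it uses simulated counts (`comm`) and a made-up `host` factor, not the data from this thread.

```r
library(vegan)

set.seed(1)
# Simulated community matrix: 40 samples x 10 species (hypothetical data).
comm <- matrix(rpois(40 * 10, lambda = 3), nrow = 40)
host <- factor(rep(c("Corylus", "Ostrya"), each = 20))

# Same dissimilarity as used for the NMDS: Bray-Curtis on square-root counts.
d <- vegdist(sqrt(comm), method = "bray")

# Dispersion of each group around its spatial median.
mod <- betadisper(d, host, type = "median")

# Permutation test for homogeneity of multivariate dispersions.
res <- permutest(mod, permutations = 99)
print(res)
```

A non-significant result here would suggest that a Host effect found by the permutational MANOVA reflects a difference in location rather than in spread.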

G

 
 
 Gian Maria Niccolò Benucci wrote:
  Jari, Gavin, Chris, Gabriel and Carsten...
 
  Many thanks to you all for your support and kindness... and for your
  competence and experience, which could never be compared to mine, at least
  in these matters...
 
  Gabriel said: *...I found this mailing list very helpful many times for my
  own questions, but also very informative when just following the threads on
  other questions...*

  I completely agree with that, so here I am to go deeper into my
  statistical problems...
 
  As Gavin argued, the plot:
 

  NMS.2$stress
  
  [1] 24.53723

  NMS.3$stress
  
  [1] 16.29226

  NMS.4$stress
  
  [1] 11.79951

  plot(2:4, c(24.53723, 16.29226, 11.79951), type = "b")
  
 
  didn't show significant differences...
 
  ...so, as he suggested, I ran stressplot() and got the Shepard plots...
  (just to specify, sqrtABCD is the square-root transformation of the species
  matrix)
 

  stressplot(NMS.2)
  
  Using step-across dissimilarities:
  Too long or NA distances: 230 out of 780 (29.5%)
  Stepping across 780 dissimilarities...
 
 
  Non-metric fit, R2=0.94
  Linear fit, R2=0.719
 

  stressplot(NMS.3)
  
  Using step-across dissimilarities:
  Too long or NA distances: 230 out of 780 (29.5%)
  Stepping across 780 dissimilarities...
 
 
  Non-metric fit, R2=0.973
  Linear fit, R2=0.815
 

  stressplot(NMS.4)
  
  Using step-across dissimilarities:
  Too long or NA distances: 230 out of 780 (29.5%)
  Stepping across 780 dissimilarities...
 
  Non-metric fit, R2=0.986
  Linear fit, R2=0.875
 
  From these data it is clear that the fit is better for NMS.4 (k=4); also, the
  blue points in the graph are closer to the red line and less scattered around
  the graph space...
 
  But maybe the R2 values of NMS.2 aren't so bad in correlation terms, are
  they?
 
  Following what Gabriel said: *...I personally like a combination of NMDS
  with the permutational MANOVA approach (by Marti Anderson) implemented in
  the function adonis() in vegan. You can use the same dissimilarity measure
  (Bray-Curtis) used for the NMDS and can test the Area vs. the Host
  effect on parasite (was it?) composition. I think that could be a very
  useful complement to an NMDS-derived ordination plot and then you may also
  regard high-stress representations (and that's what all the
  low-dimensional ordination plots really ARE!) in a different light...*
 
 

  adonis(sqrtABCD ~ Host*Community, method="bray", data=env.table,
         permutations=99)
 
  Call:
  adonis(formula = sqrtABCD ~ Host * Community, data = env.table,
  permutations = 99, method = "bray")
 
              Df SumsOfSqs  MeanSqs  F.Model     R2 Pr(>F)
  Host        1.0   1.64429  1.64429  5.47874 0.1242   0.01 **
  Community   2.0   0.78834  0.39417  1.31337 0.0596   0.23
  Residuals  36.0  10.80441  0.30012          0.8162
  Total      39.0  13.23705                    1.0000
  ---
  Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 
  ...So, I would like to explain a little about my dataset:
 
  - the species matrix comes from root samples in which the ectomycorrhizal
  fungal species present were counted (the cell entries are the individual
  tips);
  - samples were taken in four areas (A, B, C, D). The areas are about 30
  meters apart from each other;
  - areas A and B are both from Corylus roots, while areas C and D are both
  from Ostrya roots.
 
  To be clearer, this is the environmental matrix used:
 

  env.table
  
      Community    Host
  A1          A Corylus
  A2          A Corylus
  A3          A Corylus
  A4          A Corylus
  A5          A Corylus
  A6          A Corylus
  A7          A Corylus
  A8          A Corylus
  A9          A Corylus
  A10         A Corylus
  B1          B Corylus
  B2          B Corylus
  B3          B Corylus
  B4          B Corylus
  B5          B Corylus
  B6          B Corylus
  B7          B Corylus

Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-08 Thread Chris Habeck
Gian,

I applaud your continued struggle to understand the best route of analysis.
 If I were you, I would ignore the rude comments made by some and focus on
the constructive replies from others.  Good luck!

Chris Habeck
Ph.D. Candidate
Department of Zoology
University of Wisconsin, Madison
http://habeckecology.wikispaces.com/



Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-08 Thread Gavin Simpson
On Mon, 2009-12-07 at 20:10 +0100, Gian Maria Niccolò Benucci wrote:
 Hi Gavin and hi all,
 
 I will not walk in front of a bus, for sure; I am not mad, at least not
 yet... :)
 
 I would like to tell you that I am a Ph.D. student and, as far as I know,
 Ph.D. students still have to learn by studying the work of those who wrote
 before them...

That's fine, but temper that with a realisation that not everyone knows
what they are doing numerically. So be critical about what you read,
learn about the methods and what they do.

<snip />

 Anyway... to continue the brainstorm, if I may... The Host effect is what I
 think is more interesting from the ecological point of view of my trials, also
 because the 4 communities share hosts two by two: A and B, Corylus, while
 C and D, Ostrya...
 If I plot the factors from envfit onto the graph, the evidence of
 separation seems clear...
 
 These are my metaMDS results with 2 and 3 dimensions:

Thanks for these: one way of trying to choose a dimensionality for the
solution is to plot the stress as a function of k (k on the x-axis,
stress on the y) - this is often called a screeplot as you are looking
for a dramatic change in slope. I took your stresses and plotted them
against k (crudely):

plot(2:4, c(24.54342, 16.29226, 11.68632), type = "b")

and there doesn't seem to be any noticeable change here, so not much help
there.

Looking at the goodness of fit stats, the story they tell doesn't really
change much depending on whether you use 2,3, or 4 dimensions. So
perhaps stick with 2 in that case.
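
The same check can be made numerically as well as by eye. This is only a sketch using the three stress values quoted in this message (on the percent scale reported by the isoMDS-based metaMDS of the time); the variable names are illustrative.

```r
# Stress values reported in this thread for k = 2, 3, 4 (percent scale).
k <- 2:4
stress <- c(24.54342, 16.29226, 11.68632)

# Drop in stress gained by each extra dimension; a sharp elbow would show up
# as one drop being much larger than the next.
drops <- -diff(stress)
print(data.frame(from_k = k[-length(k)], to_k = k[-1], drop = drops))

# Scree-style plot of stress against the number of dimensions.
plot(k, stress, type = "b",
     xlab = "Number of dimensions (k)", ylab = "Stress")
```

With only three values of k there is little to judge an elbow from, which fits the advice to fall back on the simplest solution when the interpretation does not change.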

Also, try:

stressplot(mod)

where mod is the object returned by metaMDS. The stressplot plots your
original dissimilarities against the distances between points in the
nMDS configuration. It also shows the monotonic regression fit and a few
goodness of fit criteria. You could evaluate the models with different k
using these plots.
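
A sketch of that comparison across k, assuming a current vegan release and a simulated count matrix in place of sqrtABCD. Note that recent versions of metaMDS() use the monoMDS engine and report stress as a proportion (0 to 1) rather than the percentages shown in this thread.

```r
library(vegan)

set.seed(1)
# Hypothetical 40 x 10 count matrix, square-root transformed like sqrtABCD.
comm <- sqrt(matrix(rpois(40 * 10, lambda = 3), nrow = 40))

# Fit NMDS for k = 2, 3, 4 with settings similar to those used in the thread.
fits <- lapply(2:4, function(k) {
  metaMDS(comm, distance = "bray", k = k, trymax = 20,
          autotransform = FALSE, trace = FALSE)
})
stress <- sapply(fits, function(m) m$stress)
names(stress) <- paste0("k=", 2:4)
print(stress)

# One Shepard diagram per solution, side by side.
op <- par(mfrow = c(1, 3))
for (m in fits) stressplot(m)
par(op)
```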

 
  NMS.1
 
 Call:
 metaMDS(comm = sqrtABCD, distance = "bray", k = 2, trymax = 100,
 autotransform = F)
<snip />
 I got 10 sample units for each community (40 in total).
 
 You said at the end: *I do wonder if you are not hitting the curse of
 dimensionality here?*
 
 Can you explain to me what you mean by "hitting the curse of
 dimensionality", if that is not asking too much...

Yep, sorry, that was a bit cryptic. "Curse of dimensionality" is a phrase
coined by Bellman (1961) and refers to the problem of defining
"localness" in high dimensions; neighbourhoods with a fixed number of
samples become less local as the number of dimensions increases.
Basically, the more dimensions you have, the easier it is for a sample
to lie a long way from the rest of the data along a single dimension
and thus have a large dissimilarity.
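
A small base-R simulation can make the idea concrete. This is only an illustration with uniform random points, not the ordination data; the sample size matches the 40 samples in this thread and the dimensionalities are arbitrary choices.

```r
set.seed(42)
n <- 40                      # same number of samples as in the thread
dims <- c(2, 4, 20, 100)     # hypothetical dimensionalities to compare

# For each dimensionality, compare the smallest pairwise distance with the
# largest one; values near 1 mean that "near" and "far" blur together.
ratio <- sapply(dims, function(p) {
  x <- matrix(runif(n * p), nrow = n)
  d <- dist(x)
  min(d) / max(d)
})
names(ratio) <- paste0("p=", dims)
print(round(ratio, 2))
```

The ratio climbs as dimensions are added, so a fixed-size neighbourhood covers an ever larger share of the data's range. At 2 to 4 dimensions the effect is still modest.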

This doesn't appear to be the case here though; 4 is low dimensionality
(hence my wondering if this was or wasn't a problem), but when you'd
only shown the k=4 data, I did wonder if the low r2 was due to your
points being widely spread along one of the 4 dimensions; i.e. was the
more complex solution leading to the low r2?

By the looks of things, the low r2 is probably more to do with the small,
but significant, effects of your two covariates.

 
 ... and then: *it would be nice to look at the ordination but how you do
 that I don't know.*
 
 I would be glad if you could see the graphs of my ordinations. Can I send
 them to you? That would be great... let me know. I usually plot them this
 way:
 
  plot(NMS.1, type="n", display="sp")
  ordisymbol(NMS.1, env.table, "Host", legend=T)
 
 Anyway, I have to admit that with 2 and at least 3 dimensions the points in
 the ordination plot are better separated with respect to the data matrix, so
 what to do: a better arrangement of the points and a bigger stress, or the
 contrary?

If this were me, seeing as the interpretation/results don't change, I'd
probably stick with k=2 so you can easily draw the ordination for
presentation in your phd work or future papers.

HTH

G

 
 I think that is enough; thank you so much for your help, I'll appreciate any
 comments! :)
 
 Gian
 
 
 
 
 
 

Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-08 Thread gabriel singer

Hi Gian and others,

I think we better stop worrying about subjective interpretations of 
emotional backgrounds of what in other aspects are absolutely helpful 
discussion threads... I guess part of the challenge on this mailing list 
is to span the whole range of expertise with useful 
discussion/output/help for everyone, be it a student or an expert. I 
found this mailing list very helpful many times for my own questions, 
but also very informative when just following the threads on other 
questions...


Gian, in my opinion, 2 dimensions are absolutely ok, especially if they 
do visualize an (obvious) effect in your study. In other words, if 2 
dimensions show you an effect of Host but not of Area, the effect is 
obviously strong enough. Then I would not worry about stress too much. 
However, there may still be an effect of Area, maybe visible in more 
dimensions, but it's obviously of minor importance.


I personally like a combination of NMDS with the permutational MANOVA 
approach (by Marti Anderson) implemented in the function adonis() in 
vegan. You can use the same dissimilarity measure (Bray-Curtis) used for 
the NMDS and can test the Area vs. the Host effect on parasite (was 
it?) composition. I think that could be a very useful complement to an 
NMDS-derived ordination plot and then you may also regard high-stress 
representations (and that's what all the low-dimensional ordination 
plots really ARE!) in a different light.
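
A minimal sketch of that combination, assuming a current vegan release where adonis2() is the supported interface (the thread itself uses the older adonis()). The count matrix and the `env` data frame are simulated stand-ins for the real data, and only the Host term is tested here to keep the example small.

```r
library(vegan)

set.seed(1)
# Hypothetical 40 x 10 count matrix, square-root transformed.
comm <- sqrt(matrix(rpois(40 * 10, lambda = 3), nrow = 40))
env <- data.frame(
  Area = factor(rep(c("A", "B", "C", "D"), each = 10)),
  Host = factor(rep(c("Corylus", "Ostrya"), each = 20))
)

# Ordination and test share one dissimilarity measure (Bray-Curtis).
ord <- metaMDS(comm, distance = "bray", k = 2, trymax = 20,
               autotransform = FALSE, trace = FALSE)
fit <- adonis2(comm ~ Host, data = env, method = "bray", permutations = 99)
print(fit)
```

Because the test works on the full dissimilarity matrix rather than on the 2-D configuration, its result does not depend on how much stress the plot carries.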


Complementations like the permanova are in my opinion better than trying 
the full spectrum of ordination methods until finally some kind of 
pattern gets uncovered (comes quite close to the much too often 
encountered data-fishing expeditions). And though copying analysis 
strategies is probably not quite like throwing yourself in front of a 
bus, there is some benefit in using what people working in a specific 
field regard their standard methods (wait for the reviews to discover 
this). In any case, a responsible choice for a type of analysis is 
oriented along the study design and the data at hand.


cheers, gabriel

--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at




Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-07 Thread Gavin Simpson
On Sun, 2009-12-06 at 17:56 +0100, Gian Maria Niccolò Benucci wrote:
 Hi all again,
 
 And thank you all for the kind responses...
 
 I do not want to torture myself for sure... :) I have read (a lot of)
 publications about fungal community ecology (soil fungi), which is my
 research field, and all of them use NMDS or DCA as ordination techniques...
 So, I am only trying to do my best using R to calculate them...

Would you walk in front of a bus if you saw lots of other people doing
it? I doubt it. This kind of "me too" attitude to science is quite
demoralising when reviewing manuscripts and reading the literature.

DCA was invented to solve a specific problem with CA - namely the arch
artefact. I forget whether this is in Jari's public lecture notes, vegan
vignettes/tutorials or in one of his lectures on a course we taught
together, but DCA replaces the arch artefact with other artefacts that
make the points look like a trumpet or a diamond in ordination space.

Why DCA is used as a default instead of a special case escapes me. You
really shouldn't use DCA at all if you can get away with it as it is
doing some nasty things to your data.

Alternatives: i) NMDS; ii) PCA after application of a transformation
(Legendre & Gallagher 2001, Oecologia). And there are probably others...
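
As a sketch of option ii), the Hellinger transformation (one of those discussed by Legendre & Gallagher) can be applied in base R before an ordinary PCA. The count matrix below is simulated, and the choice of Hellinger over the other transformations in that paper is an assumption made for illustration.

```r
set.seed(1)
# Hypothetical 40 x 10 count matrix standing in for the species data.
comm <- matrix(rpois(40 * 10, lambda = 3), nrow = 40)

# Hellinger transformation: square root of row-wise relative abundances.
hell <- sqrt(comm / rowSums(comm))

# Ordinary PCA on the transformed data.
pca <- prcomp(hell)

# Proportion of total variance carried by each axis.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
print(round(prop_var[1:4], 3))
```

Unlike NMDS axes, PCA axes do come with a share of variance, which is the quantity the subject line of this thread asks about.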

 
 What I need now is a good environmental interpretation of my work...
 
 Then I found Jari's fantastic PDF, "Multivariate Analysis of
 Ecological Communities in R: vegan tutorial", and I went to the section on
 fitting factors and vectors...
 This is my R code:
 
  NMS.ABCDsqrt
 
 Call:
 metaMDS(comm = sqrtABCD, distance = "bray", k = 4, trymax = 100,
 autotransform = F)
 
 Nonmetric Multidimensional Scaling using isoMDS (MASS package)
 
 Data: sqrtABCD
 Distance: bray shortest
 
 Dimensions: 4
 Stress: 11.68632

What was the stress with k = 2 and k = 3? As Jari has already mentioned,
how are you going to interpret and visualise this 4D configuration of
points (you can't plot NMDS1 vs NMDS2, NMDS1 vs NMDS3 etc. for reasons
explained to you earlier in this thread).

 Two convergent solutions found after 2 tries
 Scaling: centring, PC rotation, halfchange scaling
 Species: expanded scores based on ‘sqrtABCD’
 
  envfit(NMS.ABCDsqrt, env.table, permu=1000) -> NMS.ABCDsqrtef
  NMS.ABCDsqrtef
 
 ***FACTORS:
 
 Centroids:
               NMDS1   NMDS2   NMDS3   NMDS4
 CommunityA  -0.3821  0.3822 -0.1173 -0.1232
 CommunityB  -0.1849  0.2748  0.0076 -0.0720
 CommunityC   0.2206 -0.4261 -0.0505  0.1197
 CommunityD   0.3465 -0.2310  0.1603  0.0756
 HostCorylus -0.2835  0.3285 -0.0549 -0.0976
 HostOstrya   0.2835 -0.3285  0.0549  0.0976
 
 Goodness of fit:
               r2   Pr(>r)
 Community 0.2009 0.001998 **
 Host      0.1818 0.000999 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 P values based on 1000 permutations.
 
  names(NMS.ABCDsqrtef)
 [1] "vectors" "factors"
  NMS.ABCDsqrtef$vectors
 NULL
 
 
 I have an environmental matrix (env.table) that contains only the area
 (A, B, C, D) and the tree species (Corylus sp. and Ostrya sp.), so I have
 the area and the tree species that each sample belongs to...
 I could plot factors on the graph but not vectors, because there are no
 vectors given the absence of numerical data in the env. matrix...
 Is that right? Aren't the R2 values too low?

Did you read ?envfit ? It states:

 (r^2). For factors this is defined as r^2 = 1 - ss_w/ss_t, where
 ss_w and ss_t are within-group and total sums of squares.

So this statistic here is looking at how constrained within the 4D space
the levels of each factor are in relation to the overall spread of the
points. This looks to me like some evidence for grouping of your sites
on basis of Community and stronger evidence for Host. The effect is
small but significant. I do wonder if you are not hitting the curse of
dimensionality here?
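
The definition quoted from ?envfit can be reproduced by hand. The function below is an illustrative base-R sketch (`factor_r2` is a made-up name, not part of vegan) that assumes unweighted scores; envfit() itself may apply weights depending on the ordination method.

```r
# r^2 for a factor fitted onto ordination scores, following the definition
# quoted above: r^2 = 1 - ss_w / ss_t.
factor_r2 <- function(scores, group) {
  scores <- as.matrix(scores)
  # Total sum of squares: squared distances to the overall centroid.
  ss_t <- sum(scale(scores, center = TRUE, scale = FALSE)^2)
  # Within-group sum of squares: squared distances to each group centroid.
  ss_w <- sum(sapply(split(seq_len(nrow(scores)), group), function(idx) {
    sum(scale(scores[idx, , drop = FALSE], center = TRUE, scale = FALSE)^2)
  }))
  1 - ss_w / ss_t
}

# Hypothetical 2-D scores: two tight, well-separated groups.
scores <- rbind(c(0, 0), c(0, 0), c(1, 1), c(1, 1))
group <- factor(c("a", "a", "b", "b"))
print(factor_r2(scores, group))
```

Groups with no internal spread give an r^2 of 1, and groups whose centroids coincide give 0, so values around 0.2 indicate groups that overlap heavily but whose centroids still differ.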

The interpretation will depend on the number of samples. It would be
nice to look at the ordination but how you do that I don't know.

G

 
 Many, many thanks for answering...
 
 Gian
 
 
 
 
 
 
 
 
 
 
  Jari,
  
   I am here again ... :)
   So, to get a comparison for the real goodness of my metaMDS results, I
   tried to perform a DCA (with the same input table).
   Please forgive me if I did something wrong with it... This is my R
  code:
 
  Why DCA? What led you to torture your data so?
 
   decorana(sqrtABCD, iweigh=0, ira=0) -> DCA.1
   DCA.1
  
   Call:
   decorana(veg = sqrtABCD, iweigh = 0, ira = 0)
  
   Detrended correspondence analysis with 26 segments.
   Rescaling of axes with 4 iterations.
  
                     DCA1   DCA2   DCA3   DCA4
   Eigenvalues     0.6688 0.5387 0.4822 0.3752
   Decorana values 0.7912 0.5795 0.4145 0.2931
   Axis lengths    5.9974 3.7036 3.6121 3.3802
  
   
  
   In that situation the graph is still good, but the differences between the
   two clades are a little more confused, maybe on axis II (I mean the
   vertical one) in this case 

Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-07 Thread Gian Maria Niccolò Benucci
Hi Gavin and hi all,

I will not walk in front of a bus, for sure; I am not mad, at least not
yet... :)

I would like to tell you that I am a Ph.D. student and, as far as I know,
Ph.D. students still have to learn by studying the work of those who wrote
before them...

Isaac Newton became famous not only for his science but also for a famous
phrase that, if I remember it correctly, goes like this: "If I have seen
further, it is by standing on the shoulders of giants"... I think it
needs no comment and expresses the concept by itself...

So, I am sorry, I also don't like the "me too" attitude, but you don't
know what my reality is here, and I can assure you that even though I am
still a student, I am alone in my research; I have a tutor and a boss, as
Italian rules require, but I don't have a boss for statistics, because no one
could help me with that...
So what can I do if I don't take models from the already published literature?

Anyway, I don't want to seem like the victim; I have a brain that works and
I am doing my best to understand and improve my knowledge and at least learn
and grow, for sure, step by step, and with great humility, in science and in
this case in statistics...

Anyway... to continue the brainstorm, if I may... The Host effect is what I
think is more interesting from the ecological point of view of my trials, also
because the 4 communities share hosts two by two: A and B, Corylus, while
C and D, Ostrya...
If I plot the factors from envfit onto the graph, the evidence of
separation seems clear...

These are my metaMDS results with 2 and 3 dimensions:

 NMS.1

Call:
metaMDS(comm = sqrtABCD, distance = "bray", k = 2, trymax = 100,
autotransform = F)

Nonmetric Multidimensional Scaling using isoMDS (MASS package)

Data: sqrtABCD
Distance: bray shortest

Dimensions: 2
Stress: 24.54342
Two convergent solutions found after 18 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on ‘sqrtABCD’

> NMS.ABCD.2ef

***FACTORS:

Centroids:
              NMDS1   NMDS2
CommunityA  -0.3271  0.1984
CommunityB  -0.1956  0.1768
CommunityC   0.2520 -0.2847
CommunityD   0.2706 -0.0905
HostCorylus -0.2613  0.1876
HostOstrya   0.2613 -0.1876

Goodness of fit:
              r2    Pr(r)
Community 0.1897 0.017982 *
Host      0.1778 0.001998 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
P values based on 1000 permutations.

> NMS.1.3

Call:
metaMDS(comm = sqrtABCD, distance = "bray", k = 3, trymax = 100,
autotransform = F)

Nonmetric Multidimensional Scaling using isoMDS (MASS package)

Data: sqrtABCD
Distance: bray shortest

Dimensions: 3
Stress: 16.29226
Two convergent solutions found after 6 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on ‘sqrtABCD’

> NMS.ABCD.3ef

***FACTORS:

Centroids:
              NMDS1   NMDS2   NMDS3
CommunityA   0.3881 -0.2702  0.1536
CommunityB   0.1407 -0.2344  0.0197
CommunityC  -0.2053  0.3566 -0.0219
CommunityD  -0.3235  0.1480 -0.1514
HostCorylus  0.2644 -0.2523  0.0866
HostOstrya  -0.2644  0.2523 -0.0866

Goodness of fit:
              r2    Pr(r)
Community 0.1798 0.005994 **
Host      0.1581 0.000999 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
P values based on 1000 permutations.


I have 10 sample units for each community (40 in total).

At the end you said: *I do wonder if you are not hitting the curse of
dimensionality here?*

Can you explain to me what you mean by "hitting the curse of
dimensionality", if that is not asking too much...

... and then: *it would be nice to look at the ordination but how you do
that I don't know.*

I would be glad if you could look at the graphs of my ordinations. Can I
send them to you? That would be great... let me know. I usually plot them
this way:

> plot(NMS.1, type="n", dis="sp")
> ordisymbol(NMS.1, env.table, "Host", legend=T)

Anyway, I have to admit that with 2, or at least with 3, dimensions the
points in the ordination plot are better separated with respect to the data
matrix, so what should I do? Accept a better fit of the points and higher
stress, or the contrary?
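
The trade-off asked about here (fewer dimensions are easier to plot, more dimensions give lower stress) can be looked at directly by fitting the NMDS for several values of k and comparing the stress. The following is only a minimal sketch under stated assumptions: the community matrix is simulated (a hypothetical stand-in for sqrtABCD), Bray-Curtis is written out by hand, and MASS::isoMDS is called directly rather than through metaMDS, so the numbers will not match the output above.

```r
library(MASS)  # isoMDS(); the engine named in the metaMDS output above

set.seed(1)
# Hypothetical stand-in for sqrtABCD: 40 sample units x 15 taxa,
# square-root transformed counts
comm <- sqrt(matrix(rpois(40 * 15, lambda = 3), nrow = 40))

# Bray-Curtis dissimilarity written out in base R (illustration only)
bray <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  as.dist(d)
}
d <- bray(comm)

# Fit NMDS for k = 2, 3, 4 and collect the final stress (in percent)
ks <- 2:4
stress <- sapply(ks, function(k) isoMDS(d, k = k, trace = FALSE)$stress)
names(stress) <- paste0("k=", ks)
print(round(stress, 2))
```

Stress usually falls as k grows, so the question becomes whether the drop from 2 to 3 (or 3 to 4) dimensions is large enough to justify a plot that is harder to read.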

I think that is enough. Thank you so much for your help; I would appreciate
any comments! :)

Gian






 And thank you all for the kind responses...
 
  I do not want to torture myself for sure... :) I read (a lot of)
  publications about fungal community ecology studies (soil fungi), my
  research field indeed, and all use NMDS or DCA as ordination
  techniques... So, I am only trying to do my best using R to calculate
  them...

 Would you walk in front of a bus if you saw lots of other people doing
 it? I doubt it. This kind of "me too" attitude to science is quite
 demoralising when reviewing manuscripts and reading the literature.

 DCA was invented to solve a specific problem with CA - namely the arch
 artefact. I forget whether this is in Jari's public lecture notes, vegan
 vignettes/tutorials or in one of his lectures on a course we taught
 together, but DCA replaces the 

Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-06 Thread Gian Maria Niccolò Benucci
Hi all again,

And thank you all for the kind responses...

I do not want to torture myself for sure... :) I read (a lot of) publications
about fungal community ecology studies (soil fungi), my research field
indeed, and all use NMDS or DCA as ordination techniques... So, I am only
trying to do my best using R to calculate them...

What I need now is a good environmental interpretation of my work...

Then I found Jari's fantastic PDF, "Multivariate Analysis of Ecological
Communities in R: vegan tutorial", and I went to the section on fitting
factors and vectors...
This is my R code:

> NMS.ABCDsqrt

Call:
metaMDS(comm = sqrtABCD, distance = "bray", k = 4, trymax = 100,
autotransform = F)

Nonmetric Multidimensional Scaling using isoMDS (MASS package)

Data: sqrtABCD
Distance: bray shortest

Dimensions: 4
Stress: 11.68632
Two convergent solutions found after 2 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on ‘sqrtABCD’

> envfit(NMS.ABCDsqrt, env.table, permu=1000) -> NMS.ABCDsqrtef
> NMS.ABCDsqrtef

***FACTORS:

Centroids:
              NMDS1   NMDS2   NMDS3   NMDS4
CommunityA  -0.3821  0.3822 -0.1173 -0.1232
CommunityB  -0.1849  0.2748  0.0076 -0.0720
CommunityC   0.2206 -0.4261 -0.0505  0.1197
CommunityD   0.3465 -0.2310  0.1603  0.0756
HostCorylus -0.2835  0.3285 -0.0549 -0.0976
HostOstrya   0.2835 -0.3285  0.0549  0.0976

Goodness of fit:
              r2    Pr(r)
Community 0.2009 0.001998 **
Host      0.1818 0.000999 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
P values based on 1000 permutations.

> names(NMS.ABCDsqrtef)
[1] "vectors" "factors"
> NMS.ABCDsqrtef$vectors
NULL


I have an environmental matrix (env.table) that contains only the area
(A, B, C, D) and the tree species (Corylus sp. and Ostrya sp.), so for each
sample I have the area and the tree species it is related to...
I could plot factors on the graph but not vectors, because there are no
vectors owing to the absence of numerical data in the env. matrix...
Is that right? Aren't the R2 values too low?

Many, many thanks for answering...
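
To see what the r2 for a factor is measuring, it can be computed by hand from ordination scores. The sketch below assumes (this is my understanding of envfit for factors, not something stated in this thread) that r2 is one minus the ratio of the within-group sum of squares to the total sum of squares of the site scores. The scores and the host factor are simulated, hypothetical stand-ins, and the sketch uses no vegan code.

```r
set.seed(42)
# Hypothetical 2-D site scores for 40 sample units (stand-in for NMDS scores)
host <- factor(rep(c("Corylus", "Ostrya"), each = 20))
scores <- cbind(NMDS1 = rnorm(40), NMDS2 = rnorm(40))
# Shift one host along axis 1 so that the two centroids differ
scores[host == "Ostrya", "NMDS1"] <- scores[host == "Ostrya", "NMDS1"] + 1

# Total sum of squares around the overall centroid
ss_total <- sum(scale(scores, center = TRUE, scale = FALSE)^2)

# Within-group sum of squares around each host centroid
ss_within <- sum(sapply(split(as.data.frame(scores), host), function(g) {
  sum(scale(as.matrix(g), center = TRUE, scale = FALSE)^2)
}))

# Assumed definition: share of the total scatter explained by the grouping
r2 <- 1 - ss_within / ss_total

# Group centroids, comparable to the "Centroids" table printed by envfit
centroids <- aggregate(scores, by = list(Host = host), FUN = mean)
print(centroids)
print(round(r2, 4))
```

Under this reading, an r2 near 0.2 would mean that the groups overlap considerably even though their centroids differ, which is not in conflict with a small permutation p-value.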

Gian










 Jari,
 
  I am here again ... :)
  So, to get a comparison of the real goodness of my metaMDS data I
  tried to perform a DCA (with the same input table).
  Then please forgive me if I did something wrong with it... This is my R
  code:

 Why DCA? What led you to torture your data so?

  > decorana(sqrtABCD, iweigh=0, ira=0) -> DCA.1
  > DCA.1
 
  Call:
  decorana(veg = sqrtABCD, iweigh = 0, ira = 0)
 
  Detrended correspondence analysis with 26 segments.
  Rescaling of axes with 4 iterations.
 
                    DCA1   DCA2   DCA3   DCA4
  Eigenvalues     0.6688 0.5387 0.4822 0.3752
  Decorana values 0.7912 0.5795 0.4145 0.2931
  Axis lengths    5.9974 3.7036 3.6121 3.3802
 
  
 
  In that situation the graph is still good but the differences between the
  two clades are a little more confused; maybe on axis II (I mean the
  vertical one) in this case there is a better separation.
  What do the Decorana values really mean?

 ?decorana

 Basically, in the original DECORANA code the Eigenvalues reported were
 computed at the wrong stage of the detrending process. Jari realised
 this when interfacing the old DECORANA code with R. Jari altered the
 code to compute the correct Eigenvalues, but chose to also report the
 values you'd get from DECORANA or Canoco to stop people complaining that
 vegan was doing DCA incorrectly.

   And how about the segments?

 What about them? Do you know how DCA works? The standard detrending
 breaks the first (D)CA axis into 26 sequential chunks or segments. 26
 is the default, but it can be changed. Within each chunk, the mean
 trial site score for axis 2 for sites in that chunk is subtracted from
 the trial axis 2 site scores of the sites in the chunk. This detrending
 is what gets rid of the arch found in some CA plots and is the reason
 DCA was invented.
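
The segment detrending described above can be sketched in a few lines of base R. This is only an illustration on simulated scores with an artificial arch. As far as I know, the real decorana code works on trial scores inside the iteration and also smooths and rescales, none of which is reproduced here. The names axis1, axis2 and n_segments are hypothetical.

```r
set.seed(7)
n <- 200
# Simulated site scores: axis 2 is an arch (quadratic) in axis 1 plus noise
axis1 <- sort(runif(n, 0, 4))
axis2 <- -(axis1 - 2)^2 + rnorm(n, sd = 0.2)

# Break axis 1 into 26 sequential segments (26 as in the default above)
n_segments <- 26
segment <- cut(axis1, breaks = n_segments, labels = FALSE)

# Within each segment, subtract the mean axis 2 score of the sites
# that fall in that segment
axis2_detrended <- axis2 - ave(axis2, segment)

# The arch shows up as a strong correlation with the quadratic term
# before detrending and is much weaker afterwards
cor_before <- cor(axis2, (axis1 - 2)^2)
cor_after <- cor(axis2_detrended, (axis1 - 2)^2)
print(round(c(before = cor_before, after = cor_after), 3))
```

Plotting axis2 and then axis2_detrended against axis1 shows the arch being flattened segment by segment.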

 
  How can I do something better?

 Are you trying to separate the two clades? Do you know a priori which
 samples belong to which clade? If so, one of the many classification
 methods in R would be more useful as they look to separate the a priori
 defined groups best. The methods you have been using thus far aim to
 represent the dissimilarities between samples best in a low dimensional
 space.

 HTH

 G

 
  Many thanks in advance,
 
  G.



___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology