Re: [R-sig-eco] 'grouping' grouping variable

2011-12-16 Thread gabriel singer

hi jakub,

I would suggest starting with standardizing your environmental variables 
with scale(), then compute Euclidean distances with e.g. vegdist() in 
{vegan} and run a cluster analysis on the distance matrix with hclust(). 
Choose a cutoff for minimum dissimilarity and group your sites 
accordingly. If you happen to have an idea about the number of groups 
you expect, then kmeans() may be an alternative.
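
A minimal sketch of the workflow suggested above, using base R only. The data frame `env` and its numbers are made up for illustration, and the linkage method, the cutoff height and the number of groups are arbitrary assumptions rather than recommendations.

```r
# Hypothetical data: 16 regions x 3 environmental variables (made-up numbers)
set.seed(1)
env <- data.frame(
  arable   = runif(16, 10, 60),
  woodland = runif(16, 5, 40),
  meadow   = runif(16, 2, 25),
  row.names = paste0("region", 1:16)
)

env_std <- scale(env)                      # centre and scale each variable
d  <- dist(env_std, method = "euclidean")  # Euclidean distances between regions
hc <- hclust(d, method = "average")        # linkage method is an assumption

groups_h <- cutree(hc, h = 2.5)            # cut at a chosen dissimilarity (2.5 is arbitrary)
groups_k <- cutree(hc, k = 3)              # or ask for a fixed number of groups

# Alternative when the number of groups is known in advance
km <- kmeans(env_std, centers = 3, nstart = 25)
table(hclust = groups_k, kmeans = km$cluster)
```

plot(hc) draws the dendrogram, which usually helps when choosing the cutoff.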


cheers, gabriel

On 12/16/11 1:14 AM, Jakub Szymkowiak wrote:

Hello,
I have a problem and I don't know how to solve it.
I have one grouping variable (16 regions in my country). Every region 
is described by several environmental variables, for example arable 
field area, woodland area or meadow area.
I want to group these regions into a small number of groups so that 
similar regions (in terms of my environmental variables) will be in 
the same group.

Any clues as to how I can solve this?

Cheers,
Jakub

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology





Re: [R-sig-eco] post hoc in Kruskal Wallis

2011-11-23 Thread gabriel singer

Jakub,

Do pairwise Wilcoxon tests (wilcox.test(), or pairwise.wilcox.test() for 
all pairs at once), then adjust P-values with p.adjust(). This would be 
the classical frequentist follow-up.
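
A minimal sketch of that follow-up in base R. The data are simulated and the Holm correction is only one of the methods p.adjust() offers; both are assumptions for illustration.

```r
set.seed(42)
# Hypothetical response measured in three groups
dat <- data.frame(
  y = c(rnorm(10, 0), rnorm(10, 0.5), rnorm(10, 3)),
  g = factor(rep(c("a", "b", "c"), each = 10))
)

kruskal.test(y ~ g, data = dat)            # overall test

# All pairwise Wilcoxon tests with adjusted P-values in one call
pw <- pairwise.wilcox.test(dat$y, dat$g, p.adjust.method = "holm")
pw$p.value

# The same by hand: raw P-values per pair, then p.adjust()
pairs <- combn(levels(dat$g), 2)
p_raw <- apply(pairs, 2, function(p) {
  wilcox.test(y ~ g, data = droplevels(subset(dat, g %in% p)))$p.value
})
names(p_raw) <- apply(pairs, 2, paste, collapse = "-")
p_adj <- p.adjust(p_raw, method = "holm")
p_adj
```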


cheers, gabriel

On 11/23/11 6:21 PM, Jakub Szymkowiak wrote:

Hi,
does anyone know how I can perform post-hoc tests (especially Least 
Significant Difference and Scheffé tests) for results from a 
Kruskal-Wallis test? In the Kruskal-Wallis test I found some significant 
differences between tested groups, but I want to know between which 
groups this difference is really significant.


Cheers,
Jakub




--
Dr. Gabriel Singer
Department of Limnology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at



Re: [R-sig-eco] interpreting adonis results

2011-11-17 Thread gabriel singer
... dangerous wording, there could in fact be a location effect of 
'location' and/or a dispersion effect of 'location'.


Gian, I suggest you add a test of a dispersion effect using the function 
betadisper(), then you know a bit more about the type of effect.
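
A minimal sketch of such a dispersion check, assuming the vegan package is installed. `comm` and `location` are simulated stand-ins for the poster's community table and site factor, and Bray-Curtis is an assumed choice of dissimilarity.

```r
if (requireNamespace("vegan", quietly = TRUE)) {
  set.seed(1)
  # Hypothetical community matrix: 30 samples x 8 species, 3 sites
  comm <- matrix(rpois(30 * 8, lambda = 5), nrow = 30)
  location <- factor(rep(c("s1", "s2", "s3"), each = 10))

  d <- vegan::vegdist(comm, method = "bray")
  disp <- vegan::betadisper(d, location)   # distances of samples to their group centroid
  disp_test <- vegan::permutest(disp, permutations = 999)
  print(disp_test)                         # tests homogeneity of dispersion among groups
}
```

A significant result here would suggest that at least part of the `location` effect is a difference in spread rather than in position.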


gabriel

On 11/16/11 11:02 PM, Gavin Simpson wrote:

On Wed, 2011-11-16 at 03:43 +0100, Gian Maria Niccolò Benucci wrote:

Hi all,

I had 84 samples collected in 7 different sites.
In each sample the different fungal species were identified and recorded.
I would like to test whether there is a real difference between the sites
and whether some kind of site effect structures the fungal communities...
So I ran an adonis test:


adonis(community.sq ~ location, data=env.table, permutations=999)

Call:
adonis(formula = community.sq ~ location, data = env.table, permutations =
999)

          Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)
location   6    12.593 2.09886  6.8867 0.34922  0.001 ***
Residuals 77    23.467 0.30477         0.65078
Total     83    36.060                 1.00000
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1



The result is R2=0.349 at P=0.001.
Can I conclude that there is a strong site effect structuring the communities
at each site?

Depends. The test is one of no effect of `location`. You have found
evidence against this hypothesis and thus could reject this hypothesis,
instead accepting the alternative hypothesis that there is an effect of
`location`. As to the strength of this effect? ~35% of the sums of
squares can be explained by `location`. Substantially more of the
variance remains unexplained. As I know nothing about your subject area,
I am unable to comment further on the strength of the relationship.

Many ecologists whose work I read would say an effect is
significant if the p-value was <= 0.05. Not that I subscribe to this way
of working, but by that criterion, you have identified a significant
`location` effect.

HTH

G


Thanks for helping,

G.





Re: [R-sig-eco] Change in rotated NMDS scores as a response variable

2011-03-11 Thread gabriel singer
hmmm... I think Gavin's approach definitely has more power, though I 
don't quite see why the original idea should not work. Orthogonality is 
not an implicit feature of an NMDS but it's also not "prevented"...
First, I think quite often NMDS still reproduces/extracts orthogonal 
features of a dataset.
Second, even if NMDS does not care for orthogonality, a "specific" 
feature of the dataset (say, the "moisture information" in herb data) 
can behave more or less linearly or at least monotonically in *any* 
direction on a 2D-plane, in which case the extraction of a rotated axis 
makes complete sense. However, even in this case an ordisurf fit will, 
as I understand it, greatly help to judge whether that's a legitimate 
and reasonable approach.
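
A minimal base R sketch of what "extracting a rotated axis" amounts to, under the assumption that the external variable behaves roughly linearly across the configuration. `scores2d` and `moisture` are simulated stand-ins; the direction is taken from a linear regression of the variable on the two axes, which is one common way to do it and not necessarily what PC-ORD or vegan do internally.

```r
set.seed(7)
n <- 40
# Hypothetical 2-D ordination scores and an external moisture gradient
scores2d <- cbind(NMDS1 = rnorm(n), NMDS2 = rnorm(n))
moisture <- 0.6 * scores2d[, 1] - 0.8 * scores2d[, 2] + rnorm(n, sd = 0.3)

centred <- scale(scores2d, scale = FALSE)

# Regression coefficients give the direction of maximum correlation
b <- coef(lm(moisture ~ centred))[-1]
theta <- atan2(b[2], b[1])

# Rigid rotation: new axis 1 points along that direction
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)
rotated <- centred %*% rot
colnames(rotated) <- c("axis1", "axis2")

cor(rotated[, "axis1"], moisture)   # correlation of rotated axis 1 with the gradient
```

Because the rotation is rigid, distances among sites are unchanged; only the reference frame moves. Whether a single direction is an adequate summary is exactly what a fitted surface would help to check.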


gabriel


On 3/10/11 1:04 PM, Gavin Simpson wrote:

On Fri, 2011-02-18 at 10:41 -0800, Erik Frenzel wrote:

Hello all,
I'm interested in adapting a technique from a recent paper

Harrison, S., E. I. Damschen and J. B. Grace 2010. Ecological
contingency in the effects of climate change on forest herbs.
Proceedings of the National Academy of Sciences (USA), 107:
19362-19367.

In which a plot's change in NMDS scores over time was used as a
response variable:

"To measure the overall resemblance of any given herb community to
communities found in warm (steep, southerly) versus cool (moderate,
northerly) topographic microclimates, we used an ordination approach
(also see 28). We ordinated the
herb data using NMS ordination in PC-ORD version 4.14 (39), excluding
species found in <5% of samples. We rotated axis 1 of the ordination
to maximize its correlation with Whittaker’s topographic moisture
gradient, so that a low axis 1 score indicated a community in a mesic
environment such as a moderate north-facing slope, and a high axis 1
score indicated a community in a warm environment such as a steep
south-facing slope. Under a warming climate, we expect the community
at any given site to show a higher axis 1 score in 2007–2009 than in
1949–1951, indicating that herb composition has shifted over time in
the same direction that composition changes over space from mesic
(cooler and moister) to xeric (warmer and drier) topographic
microclimates. For each site we calculated the difference between its
1949–1951 and 2007–2009 axis 1 ordination scores. In this case, a high
value means a community that has shifted to become more dominated by
xeric-adapted species."

Jari Oksanen has a post on the r-forge page
(https://r-forge.r-project.org/forum/message.php?msg_id=1311&group_id=68)
warning against using rotated NMDS scores in a Structural Equation
Model. Are there problems with using a "change in scores" as a
response variable in this kind of hypothesis testing?

I'm genuinely underwhelmed by this approach. i) there isn't such a thing
as nMDS axes so does it make sense to take some 1-d coordinate system
out of a 2-d coordinate system and relate it to an external variable? It
would be like trying to identify patterns in all the cities of the world
on the basis of what line of longitude they happened to lie on. Where
this sort of thing does make sense is in methods that do identify
orthogonal components from a data matrix such that axis 1 explains a
component of the variation in the data, and axis 2 another, different
(orthogonal) component of the variation.

If this were me, I would have taken the 2-d nMDS configuration and
fitted a response surface for Whittaker's topographic moisture into the
ordination (using ordisurf) and then take the fitted values of the
response surface for each site as the species-related topographic
moisture "information", which could be plotted as a function of time.

HTH

G


This was done in PC Ord.  Has anyone used "metaMDSrotate" in vegan to
do this kind of analysis in R? Does anyone have any examples or code
they'd be willing to share or point me to?

Thanks,
Erik





[R-sig-eco] cluster defined by environment followed by mrpp

2011-02-22 Thread gabriel singer

Hi list,

Conducting sort of an opinion poll among list members.
Start with two matrices, one environmental, one species, same sites. I 
wondered what people think of defining groups by a cluster analysis 
based on the environmental variables (say, hclust or similar). Then 
testing for a difference among those groups with regard to the second 
species data set (say, adonis or mrpp).
I guess the spatial people will feel hurt at least? The strategy seems 
to be quite common, though. Would be nice to hear opinions.


cheers, gabriel



Re: [R-sig-eco] is 1 hour long enough to assume independance?

2010-07-25 Thread gabriel singer

hi chris,

I think you are not quite giving us enough information to assess this 
situation. Otherwise, I'd think that any data coming from ONE dingo 
(i.e. one radiocollar) will never be independent; the 1 hour is not the 
problem. Or can you tell otherwise?


gab

On 7/21/10 3:52 AM, Chris Howden wrote:

Morning All,

I'm doing a Resource Selection Function Analysis on dingos and we are
having a bit of a debate on independence.

We're using a landscape unit of 40x40m (from a GIS) and have radio
collared data every 1 hour. So we can put a dingo in a specific 40x40 grid
every hour.

I'm concerned about the independence of the data since it's only 1 hour
apart.

As such I'm proposing we split each day up into 4 periods (dawn, dusk,
night and day)  and randomly sample 1 fix from each. I feel that this data
will be independent. There is also evidence that dingos act differently in
these 4 periods, which further increases the chance of independence.
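
The proposed thinning can be sketched in base R as follows. The fixes are simulated for a single animal, and the hour boundaries of the four periods are assumptions for illustration only. The sketch shows the mechanics of the subsampling; it does not by itself establish that the retained fixes are independent.

```r
set.seed(3)
# Hypothetical hourly fixes for one dingo over 5 days
fixes <- data.frame(
  day  = rep(1:5, each = 24),
  hour = rep(0:23, times = 5)
)
fixes$cell <- sample(1:200, nrow(fixes), replace = TRUE)  # made-up 40x40 m cell IDs

# Assumed period for each hour of the day (0..23)
period_of_hour <- c(rep("night", 5), rep("dawn", 3), rep("day", 9),
                    rep("dusk", 3), rep("night", 4))
fixes$period <- period_of_hour[fixes$hour + 1]

# Randomly pick one fix per day x period combination
groups <- split(seq_len(nrow(fixes)), list(fixes$day, fixes$period))
picked <- vapply(groups, function(idx) idx[sample.int(length(idx), 1)], integer(1))
subsample <- fixes[sort(picked), ]

table(subsample$day, subsample$period)
```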

I was wondering what people thought?

Is 1 hour far enough apart to assume independence? Is splitting the day
into 4 periods and randomly sampling far enough apart to assume
independence? Or is even that too close, and should it be further apart,
like 1 day.


Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis,
Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au

-Original Message-
From: r-sig-ecology-boun...@r-project.org
[mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Kingsford Jones
Sent: Monday, 19 July 2010 4:40 AM
To: lgj200306
Cc: r-sig-ecology@r-project.org
Subject: Re: [R-sig-eco] A question about PCNM analysis

lgj200306,

You didn't tell us, but since the problem was 'all the same' on both
machines I'm guessing both instances used a 32bit build of R under
Windows.  If so, you'll be able to access, at most, about 3.5Gb of RAM
(see RW-FAQ 2.9).  The best solution is to upgrade to a 64bit build
(IMO preferably Linux, but a 64bit Windows port is now on CRAN).  You
can also manage memory more carefully.  E.g., the error indicates
there's no contiguous block of memory to hold an object of size
190.7Mb at the time the error's thrown.  That may be because all RAM
is allocated, or because of fragmentation.  R holds everything in
memory so when working w/ large objects in a restricted setting you'll
want to write unneeded objects to disc, clean up, and reload when
needed (see ?save, ?load, ?rm, and ?gc).  More info can be found at
?Memory and by Googling: R memory management.  Also, for some cases
there are R packages that facilitate memory management: ff, bigmemory,
biglars, bigtabulate, biganalytics, biglm,...
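
A minimal sketch of the save / rm / gc / load cycle described above. The object `big` and the temporary file are hypothetical stand-ins; in practice the file would live somewhere permanent.

```r
big <- matrix(rnorm(1e6), ncol = 100)   # stand-in for a large object
print(object.size(big), units = "Mb")   # how much memory it occupies

tmp <- tempfile(fileext = ".RData")     # hypothetical file location
save(big, file = tmp)                   # write the object to disc
rm(big)                                 # drop it from the workspace
invisible(gc())                         # let R release the memory

# ... the memory-hungry step would go here ...

load(tmp)                               # bring the object back when needed
dim(big)
```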


Kingsford Jones

On Sun, Jul 18, 2010 at 4:14 AM, lgj200306  wrote:

Hi, all
   I want to do a PCNM analysis using the vegan and PCNM packages. My R code
is as follows:

   >  bci10m=data.frame(x=rep(1:100,each=50),y=rep(1:50,times=100))
   >  bci10m.d=dist(bci10m)
   >  library(PCNM)
   >  pcnms10m.analysis1=pcnm(bci10m.d)#code 1##using function of pcnm contained in vegan package
   >  pcnms10m.analysis2=PCNM(bci10m.d) #code 2##using function of PCNM contained in PCNM package
   >  bci20m=data.frame(x=rep(1:50,each=25),y=rep(1:25,times=50))
   >  bci20m.d=dist(bci20m)
   >  pcnms20m.analysis1=pcnm(bci20m.d)#code 3
   >  pcnms20m.analysis2=PCNM(bci20m.d)#code 4

   The result shows that code 1, 3 and 4 are all OK; I can get what I want
using these three commands. However, code 2 can't be carried out. The error
message shows: "cannot allocate vector of size 190.7 Mb". I asked a
professor about this question; he told me that maybe my computer's memory
was not enough and suggested that I switch off the calculation of Moran_I.
Then I reran the code on another, more powerful computer.
The problem was just the same. I don't know the reason.

   Another question: if I want to know how many PCNM eigenvectors'
Moran_I values are higher than the expected Moran_I after using code 1, how
can I achieve this in R?

   Thanks for your attention!
2010-07-18

lgj200306


Re: [R-sig-eco] change arrow colour when plotting rda

2010-05-06 Thread gabriel singer
Don't know how to solve the problem with the default plots... but you 
may want to consider plotting arrows manually using either arrows() in 
package graphics or Arrows() in package shape. The latter has a 
couple of aesthetically appealing options :-)
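
A minimal sketch of drawing the arrows by hand with arrows() from base graphics. The coordinates in `bp` and `site_scores` are made up; with a vegan rda() result they could presumably be extracted with scores(mod, display = "bp") and scores(mod, display = "sites"). The helper name `draw_black_arrows` is hypothetical.

```r
# Hypothetical helper: plot site scores and add black arrows from the origin
draw_black_arrows <- function(sites, bp) {
  plot(sites, xlab = "RDA1", ylab = "RDA2", pch = 16, col = "grey40", asp = 1)
  arrows(0, 0, bp[, 1], bp[, 2], length = 0.08, col = "black")
  text(bp * 1.1, labels = rownames(bp), col = "black")  # labels just beyond the tips
  invisible(bp)
}

set.seed(2)
site_scores <- matrix(rnorm(40), ncol = 2)                # made-up site scores
bp <- rbind(pH = c(0.8, 0.2), depth = c(-0.3, 0.7), temp = c(0.5, -0.6))

draw_black_arrows(site_scores, bp)
```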

cheers, gabriel


On 07/05/2010 00:09, Devoto Mariano wrote:
> Dear all,
> can anyone please tell me how to change the colour of the arrows of the
> environmental variables when plotting a constrained ordination done in rda?
> They are in blue, but I want black.
> I've tried any possible combination I can think of in plot(), arrows(),
> ordiplot() and plot.cca() and none of them would work.
> Thanks in advance for your reply,
> M.


Re: [R-sig-eco] NMS axis variance and legend

2010-04-15 Thread gabriel singer
dear alida,

legend() should help to get the legends, just ask for help(legend), it's 
pretty easy.

then for the variance explained: with an NMS the only measure of fit you 
get is the stress value, there isn't anything like a percentage of 
explained variance. you may want to regard the stress value as the 
percentage of distances among points that is not reproduced by the 
ordination. Though that's a bit of a sloppy way to see it.

keep struggling, it's worth it :-)
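
A minimal sketch of the legend() part, using base graphics. The site scores and the `season` factor are simulated stand-ins, and the plotting symbols are an arbitrary choice.

```r
set.seed(11)
# Hypothetical NMDS site scores and the season in which each site was sampled
site_scores <- cbind(NMDS1 = rnorm(12), NMDS2 = rnorm(12))
season <- factor(rep(c("dry", "wet", "transition"), each = 4))

season_pch <- c(16, 17, 15)                 # one symbol per factor level
symbols <- season_pch[as.integer(season)]   # symbol for every site

plot(site_scores, pch = symbols, asp = 1)
legend("topright", legend = levels(season), pch = season_pch, title = "Season")
```

Indexing the symbol vector by as.integer(season) keeps the symbols in the plot and in the legend in the same order as levels(season).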

cheers, gabriel



On 15/04/2010 22:19, Alida Mercado wrote:
> Hello,
>
> I'm doing an NMS, and have decided to try out vegan and do everything
> in R.  In this attempt, I haven't been able to figure out how to get
> the variance explained by each axis, nor the total variation
> explained by the ordination.  the other issue I have is how to get a
> legend, because I want to represent different sites in the
> ordination, but the sites are grouped by season in which they were
> sampled in order to look at differences in seasonality.  Therefore I
> would like to have a legend that represent each season by different
> symbols.  I've decided to start using R, so there are still some
> issues I have to figure out and learn the commands and functions to
> do what I need.
>
> If you have any suggestions, please let me know,
>
> Thanks in advance,
> Alida
>
>
> ~~~
> Alida Mercado Cárdenas
>
> Ph.D. Candidate, Entomology-Neotropical Environment Option
> McGill University & Smithsonian Tropical Research Institute
>
> http://weevils-n-dreams.blogspot.com/
> ~~~


Re: [R-sig-eco] adonis question

2010-03-12 Thread gabriel singer

Hi Jaime,


The interactions are just a matter of defining the formula as such, e.g. 
adonis(dist~factor1*factor2).
I suppose, a multiple comparison (with the reasoning of a post-hoc test) 
can just be done using adonis() for pairwise comparisons and then use 
p.adjust().
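
A minimal sketch of the pairwise idea, assuming the vegan package is installed. It uses adonis2(), which in recent vegan releases supersedes adonis(); the community matrix, the factor `factor1`, the Bray-Curtis dissimilarity and the Holm correction are all assumptions for illustration.

```r
if (requireNamespace("vegan", quietly = TRUE)) {
  set.seed(5)
  comm <- matrix(rpois(30 * 6, lambda = 4), nrow = 30)   # hypothetical community data
  env <- data.frame(factor1 = factor(rep(c("a", "b", "c"), each = 10)))

  # One permutational test per pair of factor levels
  pairs <- combn(levels(env$factor1), 2)
  p_raw <- apply(pairs, 2, function(lv) {
    keep <- env$factor1 %in% lv
    sub_env <- droplevels(env[keep, , drop = FALSE])
    d <- vegan::vegdist(comm[keep, ], method = "bray")
    fit <- vegan::adonis2(d ~ factor1, data = sub_env, permutations = 999)
    fit$`Pr(>F)`[1]                                      # P-value of the factor
  })
  names(p_raw) <- apply(pairs, 2, paste, collapse = " vs ")

  p_adj <- p.adjust(p_raw, method = "holm")              # correct for multiple testing
  print(cbind(p_raw, p_adj))
}
```

Note that permutation P-values cannot go below 1 / (permutations + 1), which limits how small the adjusted values can get when there are many pairs.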


Cheers, gabriel

On 12/03/2010 19:12, Jaime Pinzon wrote:

Hi



After performing a permutational ANOVA with adonis in vegan, is there a way
to do multiple comparisons for significant factors with more than 2 levels
as well for significant interactions? Any help would be very much
appreciated



Thanks,



Jaime




Re: [R-sig-eco] permutation for PCA

2009-12-18 Thread gabriel singer

maybe PCAsignificance() in package {BiodiversityR} could be of help...

cheers, gabriel

Dragos Zaharescu wrote:

Hi everyone,
I was struggling for a while with performing a permutation/crossvalidation test 
of nonlinear PCA in order to assess the significance of the contribution of the 
separate variables to the nonlinearPCA solutions. Does anyone have an idea on 
how/package to perform this. For PCA maybe?
Any hint would be much appreciated.
 
Dragos




Dragos Zaharescu
Animal Anatomy Laboratory
Faculty of Biological Sciences
Vigo University, apd. 137
36310, Vigo (Pontevedra), SPAIN
zaha_dra...@yahoo.com
zdra...@uvigo.es
http://webs.uvigo.es/zdragos/

~ You should be the change you want to see in the world ~ Ghandi






Re: [R-sig-eco] Fwd: Fwd: how to calculate "axis variance" in metaMDS, pakage vegan?

2009-12-10 Thread gabriel singer
A difference between two communities within a host could still exist and 
could make perfect sense, too, when you regard "community" as a random 
factor. Then "community" may introduce some extra variation (compared to 
the within-community variation), which is interesting and important from 
an experimental point of view, because the replication of communities 
makes sure you are not pseudoreplicating. I am not sure, however, how to 
declare the correct df for the random factor in adonis in this case... 
does anybody know better than I?




Gian Maria Niccolò Benucci wrote:

Maria,


*...Nevertheless you still do not know if your communities are significantly
different between each other, within each host. Now it depends on the
hypothesis you intend to test.*..


I think "Community" inside "Host" makes no sense... because A and B are from the
same host "Corylus", and C and D are from host "Ostrya".
So the effect between the two host tree species is real, but a difference between
two communities inside the same host (A vs B, e.g.) might not be.
That is also confirmed by the diversity indices I got (see my latest
post): they show that A and B are always different from C and D, but between
A and B (and surely also between C and D) there are no statistical
differences (ANOVA).

I think there are both "host" and "community" effects, and if I use them
separately I get:
  

adonis(sqrtABCD ~ Host, method="bray", data=env.table, permutations=99)

Call: adonis(formula = sqrtABCD ~ Host, data = env.table, permutations =
99,  method = "bray")

            Df SumsOfSqs  MeanSqs  F.Model     R2 Pr(>F)
Host        1.0   1.64429  1.64429  5.38984 0.1242   0.01 **
Residuals  38.0  11.59276  0.30507          0.8758
Total      39.0  13.23705                   1.0000
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

adonis(sqrtABCD ~ Community, method="bray", data=env.table, permutations=99)

Call: adonis(formula = sqrtABCD ~ Community, data = env.table, permutations
= 99,  method = "bray")

            Df SumsOfSqs  MeanSqs  F.Model     R2 Pr(>F)
Community   3.0   2.43264  0.81088  2.70182 0.1838   0.01 **
Residuals  36.0  10.80441  0.30012          0.8162
Total      39.0  13.23705                   1.0000
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1



Thank you so much to all who want to comment on that...

Cheers,


Gian




  





Re: [R-sig-eco] Fwd: how to calculate "axis variance" in metaMDS, pakage vegan?

2009-12-09 Thread gabriel singer
asy different from Ostrya one.

...So, I think that the "Host" effect is clear, while the effect of "Community"
may not be, because the areas are similar two by two... is that
right?

When I plot MNS.2 and look at the graph, I clearly see that the sample
points of areas A and B (Corylus) are positioned on the left side, while areas C
and D (Ostrya) are more scattered and are positioned towards the lower right
side...

So, what else to say... I'll leave you space for any comments :

Thank you all,

Gian

  


--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at



Re: [R-sig-eco] Fwd: how to calculate "axis variance" in metaMDS, pakage vegan?

2009-12-08 Thread gabriel singer

Hi Gian and others,

I think we better stop worrying about subjective interpretations of 
emotional backgrounds of what in other aspects are absolutely helpful 
discussion threads... I guess part of the challenge on this mailing list 
is to span the whole range of expertise with useful 
discussion/output/help for everyone, be it a student or an expert. I 
found this mailing list very helpful many times for my own questions, 
but also very informative when just following the threads on other 
questions...


Gian, in my opinion, 2 dimensions are absolutely ok, especially if they 
do visualize an (obvious) effect in your study. In other words, if 2 
dimensions show you an effect of "Host" but not of "Area", the effect is 
obviously strong enough. Then I would not worry about stress too much. 
However, there may still be an effect of "Area", maybe visible in more 
dimensions, but it's obviously of minor importance.


I personally like a combination of NMDS with the permutational MANOVA 
approach (by Marti Anderson) implemented in the function adonis() in 
vegan. You can use the same dissimilarity measure (Bray-Curtis) used for 
the NMDS and can test the "Area" vs. the "Host" effect on parasite (was 
it?) composition. I think that could be a very useful complement to an 
NMDS-derived ordination plot and then you may also regard high-stress 
"representations" (and that's what all the low-dimensional ordination 
plots really ARE!) in a different light.


Complementations like the permanova are in my opinion better than trying 
the full spectrum of ordination methods until finally some kind of 
pattern gets uncovered (comes quite close to the much too often 
encountered data-fishing expeditions). And though copying analysis 
strategies is probably not quite like throwing yourself in front of a 
bus, there is some benefit in using what people working in a specific 
field regard their "standard" methods (wait for the reviews to discover 
this). In any case, a responsible choice for a type of analysis is 
oriented along the study design and the data at hand.


cheers, gabriel




Gian Maria Niccolò Benucci wrote:

Hi Gavin and Hi all,

I will not go in front of a bus for sure, I am not mad, at least not
yet... :)

I would like to tell you that I am a Ph.D. student, and as far as I know,
Ph.D. students still have to understand things by studying those who wrote
before them...

Isaac Newton became famous not only for his science but also for a famous
phrase that, if I remember it correctly, goes like this: "If I have seen so
far, it is because I stand on the shoulders of Giants"... I think that it
needs no comment, and expresses the concept by itself...

So, I am sorry, I also don't like the "me too" attitude, but you don't
know what my situation is here, and I can assure you that even though I am
still a "student", I am alone in my research, and although I have a tutor and
boss under Italian rules I don't have a boss for statistics, because no one
could help me with that...
So what could I do if I don't take models from already published literature?

Anyway, I don't want to seem like the victim; I have a brain that works and
I am doing my best to understand and improve my knowledge and at least learn
and grow, step by step, and with great humility, in science and in
this case in statistics...

Anyway... To continue the brainstorm, if I may... The Host effect is what I
think is most interesting from the ecological point of view of my trials, also
because the 4 communities share the same host two by two, I mean A and B,
Corylus, while C and D, Ostrya...
If I plot the factors from envfit into the graph, the evidence of
separation seems clear...

These are my metaMDS results with 2 and 3 dimensions:

  

NMS.1



Call:
metaMDS(comm = sqrtABCD, distance = "bray", k = 2, trymax = 100,
autotransform = F)

Nonmetric Multidimensional Scaling using isoMDS (MASS package)

Data: sqrtABCD
Distance: bray shortest

Dimensions: 2
Stress: 24.54342
Two convergent solutions found after 18 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on ‘sqrtABCD’

  

NMS.ABCD.2ef



***FACTORS:

Centroids:
              NMDS1   NMDS2
CommunityA  -0.3271  0.1984
CommunityB  -0.1956  0.1768
CommunityC   0.2520 -0.2847
CommunityD   0.2706 -0.0905
HostCorylus -0.2613  0.1876
HostOstrya   0.2613 -0.1876

Goodness of fit:
              r2   Pr(>r)
Community 0.1897 0.017982 *
Host      0.1778 0.001998 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
P values based on 1000 permutations.
  

NMS.1.3



Call:
metaMDS(comm = sqrtABCD, distance = "bray", k = 

Re: [R-sig-eco] capscale() for PCoA-CDA

2009-12-04 Thread gabriel singer

Dear Jari and others,

Hi everybody,

Anybody has used capscale() in package vegan to compute a PCoA-CDA as
suggested by Anderson and Willis 2003 (Ecology 84: 511 ff) using one or
more factors as "predictors"?

Then I wonder about:

*) How to interpret interactions of factors? Why are interactions
(specified as "~factor1*factor2" in the function call) shown as
continuous predictors (using arrows) in the plot function? Wouldn't
centroids for all cells in the design be more appropriate? Aren't
factorial interactions in a CDA setting more or less meaningless?



Internally capscale() uses contrasts of variables, and they are treated as
continuous variables and shown as arrows in plots. However, if the
contrasts correspond to simple factors, they are not drawn but their
centroids are shown. For ordered factors you get both centroids and the
arrows. The interactions of contrasts cannot be shown as simple class means
and therefore they are drawn as arrows. The simple centroids are not
appropriate, but you should have centroids of all combinations of class
levels of interacting factors.

If you think that factorial interactions in *** (what is CDA?) are
meaningless, why do you want to use them?

I wouldn't say they are meaningless, because that depends on your meaning.
Often they are difficult to interpret, but that's another issue.
  

I understand the arrows for interactions now, thanks.

I used CDA in the sense of Anderson and Willis 2003 (and others) as 
Canonical Discriminant Analysis; as such it is - at least to my 
understanding - equivalent to Discriminant Function Analysis.
When CDA aka DFA is used with 2 interacting factors, it will try to best 
separate groups, and that is *any groups*, and I can't see why (and how) 
preference should be given to any grouping criterion (factor 1, factor 2 
or both)... In the end a 4-level factor should be as good as a 2*2 
factorial combination. In this sense I used the word "meaningless".

In fact, capscale() results for a 1*4 constraint (1 factor, 4 levels) 
are identical with a 2*2 constraint.
However, centroids are at different positions (!), in fact centroids of 
all combinations of class levels are at weird (wrong as I think) 
positions in the 2*2 case!?

Still, "interactions" finally make sense when interpreting the plot, 
that's quite true.
  

*) How to get classification statistics? And how to efficiently run a
"leave 1 out" classification analysis? I thought of manually writing
code that checks for the closest centroid. Would it be appropriate to
use Euclidean distance as a criterion for this since it happens in PCo
space? Probably there are more efficient functions which I do not know
of, yet,... for example a function that allows extraction of distances
of all objects to all centroids?



There is no such thing. Contributed code will be reviewed for inclusion into
vegan.
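
Since no ready-made function exists, the idea of checking for the closest centroid can be sketched in base R. This is a simplified stand-in, not vegan functionality: the data are simulated, the helper name `loo_nearest_centroid` is hypothetical, and the PCoA is computed once on all objects (via cmdscale()) with only the centroids recomputed when an object is left out. A stricter leave-one-out scheme would also redo the ordination without the held-out object.

```r
set.seed(9)
# Hypothetical data: 30 objects, 5 variables, 3 groups with shifted means
grp <- factor(rep(c("g1", "g2", "g3"), each = 10))
x <- matrix(rnorm(30 * 5), nrow = 30) + as.integer(grp)

d <- dist(x)                 # any dissimilarity could be used here
pco <- cmdscale(d, k = 2)    # principal coordinates (PCoA)

loo_nearest_centroid <- function(coords, groups) {
  groups <- factor(groups)
  pred <- character(nrow(coords))
  for (i in seq_len(nrow(coords))) {
    # Group centroids computed without object i
    cent <- apply(coords[-i, , drop = FALSE], 2,
                  function(col) tapply(col, groups[-i], mean))
    # Euclidean distance from object i to every centroid
    dist_i <- sqrt(rowSums(sweep(cent, 2, coords[i, ])^2))
    pred[i] <- names(which.min(dist_i))
  }
  factor(pred, levels = levels(groups))
}

pred <- loo_nearest_centroid(pco, grp)
confusion <- table(observed = grp, predicted = pred)
confusion
sum(diag(confusion)) / sum(confusion)   # proportion correctly allocated
```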
 
  

*) Is the application of capscale on a Euclidean distance matrix
equivalent to a classical DFA aka CDA on the original data - or am I
completely wrong with this idea?



No, it isn't equal to "DFA aka CDA". Perhaps... Depends on what are DFA and
CDA. With Euclidean distances, capscale() is equivalent to redundancy
analysis (RDA). Guessing that "DFA aka CDA" are discriminant analysis, RDA
is not equal to them. The major difference is that RDA uses no information
about scatter of points with respect to the class centroids, but it only
uses class centroids. The RDA tries to maximize the distances among class
centroids, but it doesn't try to maximize the separation of points of
different classes. The methods are very different although the results may
have some similarities.

This is connected to the previous question: because RDA (that is in the
heart of capscale()) does not try to optimize in classification, there is no
classification statistic to be optimized. That should be estimated
independently of the analysis and after the analysis, and there are no
functions for the purpose in vegan.
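
The statement that capscale() with Euclidean distances is equivalent to RDA can be checked with a short sketch. It assumes the vegan package is installed and uses its example data sets `dune` and `dune.env` with the factor `Management` as the single constraint; these choices are illustrative and not taken from the thread.

```r
library(vegan)
data(dune)
data(dune.env)

# Ordinary RDA with one factor as constraint
m_rda <- rda(dune ~ Management, data = dune.env)

# capscale() on Euclidean distances with the same constraint
m_cap <- capscale(dune ~ Management, data = dune.env,
                  distance = "euclidean")

# Constrained eigenvalues of both models, side by side
eig_rda <- as.numeric(m_rda$CCA$eig)
eig_cap <- as.numeric(m_cap$CCA$eig)
cbind(rda = eig_rda, capscale = eig_cap)

# Should be TRUE up to numerical tolerance if the two analyses coincide
all.equal(eig_rda, eig_cap)
```

With a non-Euclidean dissimilarity (for example `distance = "bray"`) the two models are no longer expected to agree, which is where capscale() becomes a distance-based RDA.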
 
  
Slightly confused now... Anderson and Willis (2003) describe PCoA on a
dissimilarity structure, followed by CDA or CCorA, and call the
procedure CAP (Canonical Analysis of Principal Coordinates). I will call
the latter two approaches PCoA-CDA and PCoA-CCorA. Now, I get that CCorA
differs from RDA mainly conceptually, so there is not much (any?)
difference between PCoA-CCorA and PCoA-RDA = capscale().

Now, is PCoA-CDA really equivalent to db-RDA (in the sense of Legendre and
Anderson 1999)? I initially thought this would be the case. They both
use a set of dummy variables to code for the factor and treat these as
continuous predictors. A second thought tells me they can't be the same.
Then maybe what's left is only the term capscale(), which is not the same
as CAP in the case of PCoA-CDA...

Seems I am getting lost in the panoply of acronyms, sorry...

*) Given only one factor as a "predictor", I guess using permutest() or
anova() on an object resulting from capscale is comple

[R-sig-eco] capscale() for PCoA-CDA

2009-12-03 Thread gabriel singer

Hi everybody,

Has anybody used capscale() in package vegan to compute a PCoA-CDA as
suggested by Anderson and Willis 2003 (Ecology 84: 511 ff) using one or
more factors as "predictors"?


Then I wonder about:

*) How to interpret interactions of factors? Why are interactions
(specified as "~factor1*factor2" in the function call) shown as
continuous predictors (using arrows) in the plot function? Wouldn't
centroids for all cells in the design be more appropriate? Aren't
factorial interactions in a CDA setting more or less meaningless?


*) How to get classification statistics? And how to efficiently run a 
"leave 1 out" classification analysis? I thought of manually writing 
code that checks for the closest centroid. Would it be appropriate to 
use Euclidean distance as a criterion for this since it happens in PCo 
space? Probably there are more efficient functions which I do not know 
of, yet,... for example a function that allows extraction of distances 
of all objects to all centroids?


*) Is the application of capscale on a Euclidean distance matrix 
equivalent to a classical DFA aka CDA on the original data - or am I 
completely wrong with this idea?


*) Given only one factor as a "predictor", I guess using permutest() or 
anova() on an object resulting from capscale is completely equivalent to 
a direct application of adonis()? Correct?


These are lots of questions at once and no code to play with, sorry... 
Thanks for any help!


Gabriel

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] how to calculate "axis variance" in metaMDS, pakage vegan?

2009-12-01 Thread gabriel singer

gian,

you may try consecutive MDS analyses with an increasing number of
dimensions (the parameter k in the isoMDS() or metaMDS() function). then
plot stress against the number of dimensions and judge it like a
scree plot in PCA. this should tell you how many dimensions to use for
the MDS and thus also the associated stress value.
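
The procedure described above can be sketched as follows, assuming the vegan package and its example data set `dune`; the range of 1 to 5 dimensions and the Bray-Curtis dissimilarity are arbitrary choices for illustration.

```r
library(vegan)
data(dune)
set.seed(42)  # metaMDS() uses random starts

# Run NMDS for k = 1..5 dimensions and collect the final stress of each run
ks <- 1:5
stress <- sapply(ks, function(k) {
  metaMDS(dune, distance = "bray", k = k, trace = FALSE)$stress
})

# "Scree plot" of stress against dimensionality: look for the elbow
plot(ks, stress, type = "b",
     xlab = "Number of dimensions (k)", ylab = "Stress")
```

The number of dimensions at which the curve flattens is a reasonable candidate for `k`; the stress at that point is the value to report.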


cheers, gabriel

Gian Maria Niccolò Benucci wrote:

Okay, many thanks indeed... So having a low stress value is fundamental,
since the lower the stress, the better the model fits the data, is that right?
How can I know whether my stress is acceptable? I mean, whether it is low
enough to conclude that the plot represents the sample data well...
Is there a threshold or something?
I would appreciate any PDF or review on ordination methods for
community ecology data... :)
Thank you very much!
Cheers,

Gian


2009/12/1 Gian Maria Niccolò Benucci 

  

Hi there,

I am trying to use the function metaMDS (vegan package) for community ecology
data, but I find no way to calculate the "expressed variance" of the first 2
axes. Is there a way to do that?
Thanks a lot in advance,

Gian







  



  


--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at



Re: [R-sig-eco] how to calculate "axis variance" in metaMDS, pakage vegan?

2009-12-01 Thread gabriel singer

hi gian,

no, there is no such way. An MDS can't express "explained variance".
However, the stress value is the overall measure of quality of fit of
your MDS to the data. There are various measures of stress, but loosely
speaking you can regard the stress as a percentage of variation NOT
explained by ALL dimensions in your MDS.
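
A small sketch of how the stress of a fitted NMDS can be inspected in vegan, assuming the package and its example data set `dune`. The quantity `1 - S^2` is the "non-metric fit" that stressplot() prints on its Shepard diagram; it describes the whole configuration and cannot be split by axis.

```r
library(vegan)
data(dune)
set.seed(1)

ord <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)

S <- ord$stress            # stress of the whole 2-D solution, as a proportion
r2_nonmetric <- 1 - S^2    # non-metric fit shown by stressplot()

# Shepard diagram: ordination distances against observed dissimilarities
stressplot(ord)

c(stress = S, nonmetric_R2 = r2_nonmetric)
```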


cheers, g

Gian Maria Niccolò Benucci wrote:

Hi there,

I am trying to use the function metaMDS (vegan package) for community ecology
data, but I find no way to calculate the "expressed variance" of the first 2
axes. Is there a way to do that?
Thanks a lot in advance,

Gian









Re: [R-sig-eco] vegan: envfit (vectorfit)

2009-09-15 Thread gabriel singer

gavin and jari,

thanks, all makes sense... I have to state that, remembering the
discussion we had some weeks ago about fitting underlying (or
environmental) variables to an MDS ordination, using vectorfit for
this purpose would indeed make sense to me, too, as long as a linear or
at least monotonic behaviour of the metric variable over ordination
space is checked (e.g. using ordisurf) before choosing the
representation as a vector (which would indeed suggest linear behaviour
over ordination space)... or are there different opinions?


cheers, g

Gavin Simpson wrote:

On Tue, 2009-09-15 at 17:02 +0200, gabriel singer wrote:
  

Hi vegan-users and programmers,

Can anybody tell me how the function vectorfit (envfit) computes arrow
lengths (as fits of a metric variable onto an ordination) exactly? I
understand the scaling bit in the end, but have trouble understanding
how the direction and strength of the gradient of the environmental
variable with the ordination is actually identified. Obviously it's not a
mere correlation between the environmental variable and ordination scores,
as is usually done for a PCA for example (the "loadings" as opposed to the
eigenvectors).



It is a least squares fit of the following form:

Y ~ scores1 + scores2

where Y is the vector or matrix of numeric variables you wish to have
vectors for, and scores1 and scores2 are the user-selected axes of the
ordination configuration. If Y is a matrix then each variable (column)
in that matrix enters as a separate regression.

Effectively, it uses the locations of the points (sites) in the selected
2D ordination space to predict the observed values of the variables for
which vectors are being fitted.

The arrow heads are the normalised coefficients for scores1 and scores2,
and hence represent the normalised change in response for a unit change
in scores1 and scores2 (the axis or site scores). As these are
normalised, the larger the coefficient (change in response for unit
change in the site scores), the stronger the relationship between the
site scores and the vector.

A key issue in the implementation is to consider the ordination space
into which you project vectors as a 2D configuration of points, and we
want to relate these "locations" to the values of a secondary set of
variables.
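
The regression described above can be reproduced by hand as a sketch. It assumes the vegan package and its example data sets `varespec` and `varechem`, with the variable `Al` picked arbitrarily. The slopes of `Y ~ scores1 + scores2` are normalised to unit length to give the arrow direction, and the R-squared of the same regression gives the strength that envfit() reports as `r`.

```r
library(vegan)
data(varespec)
data(varechem)
set.seed(1)

ord <- metaMDS(varespec, k = 2, trace = FALSE)
xy  <- scores(ord, display = "sites")   # site scores on the two NMDS axes

# Least-squares fit of one variable on the two axes: Y ~ scores1 + scores2
fit <- lm(varechem$Al ~ xy[, 1] + xy[, 2])
b   <- as.numeric(coef(fit)[-1])        # the two slopes, intercept dropped

arrow <- b / sqrt(sum(b^2))             # unit-length direction of the arrow
r2    <- summary(fit)$r.squared         # strength of the relationship

# For comparison: what envfit() reports for the same variable
ef <- envfit(ord, varechem[, "Al", drop = FALSE], permutations = 0)
rbind(manual = arrow, envfit = as.numeric(ef$vectors$arrows[1, ]))
c(manual = r2, envfit = as.numeric(ef$vectors$r))
```

If the description above is accurate, the two rows should agree up to numerical tolerance. When plotted, envfit arrows are scaled by the square root of `r`, so weakly related variables get shorter arrows.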

HTH

G

  

thanks a lot for any good ideas..

gabriel


    


--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at



[R-sig-eco] vegan: envfit (vectorfit)

2009-09-15 Thread gabriel singer

Hi vegan-users and programmers,

Can anybody tell me how the function vectorfit (envfit) computes arrow
lengths (as fits of a metric variable onto an ordination) exactly? I
understand the scaling bit in the end, but have trouble understanding
how the direction and strength of the gradient of the environmental
variable with the ordination is actually identified. Obviously it's not a
mere correlation between the environmental variable and ordination scores,
as is usually done for a PCA for example (the "loadings" as opposed to the
eigenvectors).


thanks a lot for any good ideas..

gabriel


--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at



Re: [R-sig-eco] wascores() for metaMDS?

2009-08-20 Thread gabriel singer


Dear Jari and Gavin,

thanks a lot, everything clear... with the connection to CCA I now get 
the meaning of the species scores, almost trivial after all...


gg



Gavin Simpson wrote:

On Wed, 2009-08-19 at 11:40 +0200, gabriel singer wrote:
  

Hi sig-ecology!

Here comes a probably stupid question... I am looking for smart ways to 
include information about underlying variables in MDS plots. In other 
words, after having computed an ordination with isoMDS or metaMDS from a 
community table, I would like to add something like species 
coefficients/loadings as vectors to the plot of sites. As no species 
coefficients exist in this case, the best I could come up with so far is 
simply vectors calculated from correlation coefficients of the 
individual species with the site scores (on two MDS axes).
The function metaMDS allows computing "species scores" using the
function wascores()... I have now pondered for 2 days how these scores
are calculated and what their precise meaning would be.



An individual taxon's "species score" is computed as the weighted
average of the "site scores", weights being the abundance of that taxon
in each site. It is the abundance weighted centroid of all the samples
in which the species occurs. The motivation for this is that in CA,
species scores are weighted averages of site scores that are themselves
weighted averages of species scores and so on in the Two-way algorithm
of Mark Hill - not that vegan computes the CA solution that way in cca()
- so it is an analogous approach to computing species scores for nMDS.
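
The weighted-average definition above can be written out by hand as a sketch, assuming the vegan package and its example data set `dune`. Each species score is the mean of the site scores, weighted by the abundance of that species at each site. The scores that metaMDS() stores itself may differ from these, because by default it can transform the community data and expand the weighted averages.

```r
library(vegan)
data(dune)
set.seed(1)

ord   <- metaMDS(dune, k = 2, trace = FALSE)
sites <- scores(ord, display = "sites")   # sites x 2 matrix of NMDS scores
comm  <- as.matrix(dune)                  # sites x species abundances

# Species score = sum(abundance * site score) / sum(abundance), per axis
manual <- sweep(t(comm) %*% sites, 1, colSums(comm), "/")

# The same computation via vegan (expand = FALSE is the default)
wa <- wascores(sites, comm)

# Largest absolute difference between the two results
max(abs(manual - wa))
```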

  
Would these 
species scores be appropriate to show as vectors in the MDS?



Not as vectors, as that implies directionality or increasing abundance
and there is no reason to assume that the abundance of a given taxon
will increase linearly or even monotonically in a given direction across
the nMDS plot.

Although I hesitate to call it that, the species score computed as the
weighted average of the site scores is an optimum (of nMDS site scores),
and thus abundance declines as one moves away from the point. So in this
sense, you display the species scores in the same manner as on a CA or
CCA plot, as a point, instead of the vector in PCA/RDA. However, the
decline in CA is uniform in any direction (fitted not actual abundance),
i.e. in 2-D the species score is the point at the top of a 2-D
bell-shaped surface as this is the implied response model in CA. With
nMDS there is no reason to assume this is the case.

For one or two taxa, you could just project a surface of actual
abundances using ordisurf() or you could just use the points as you
would in a CA diagram, more or less. The problem with the surface
approach is that you can only show a couple of species at most on a
single ordination plot.

ordisurf would likely be the best option for most extra data you wish to
impose on to the nMDS plot, again for the reason that the relationship
between nMDS axes and the variable of interest need not be a simple
linear or monotonic surface.
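
A sketch of the comparison suggested above, assuming the vegan package (ordisurf() also needs mgcv) and the example data sets `varespec` and `varechem`, with `Al` as an arbitrary variable. The fitted vector and the smooth surface are drawn on the same NMDS plot, so it is possible to see whether a straight arrow is a fair summary.

```r
library(vegan)
data(varespec)
data(varechem)
set.seed(1)

ord <- metaMDS(varespec, k = 2, trace = FALSE)

# Site scores as points
plot(ord, display = "sites", type = "p")

# Linear summary: a fitted vector for Al
ef <- envfit(ord ~ Al, data = varechem, permutations = 0)
plot(ef)

# Flexible summary: a smooth (GAM) surface of Al over the same plot
surf <- ordisurf(ord ~ Al, data = varechem, add = TRUE)
summary(surf)
```

If the contour lines are roughly straight, parallel and evenly spaced, and perpendicular to the arrow, the vector is a reasonable representation; strongly curved or closed contours suggest that the surface is the more honest display.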

HTH

G

  

Thanks for any answer...

Gabriel Singer




--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at



[R-sig-eco] wascores() for metaMDS?

2009-08-19 Thread gabriel singer

Hi sig-ecology!

Here comes a probably stupid question... I am looking for smart ways to 
include information about underlying variables in MDS plots. In other 
words, after having computed an ordination with isoMDS or metaMDS from a 
community table, I would like to add something like species 
coefficients/loadings as vectors to the plot of sites. As no species 
coefficients exist in this case, the best I could come up with so far is 
simply vectors calculated from correlation coefficients of the 
individual species with the site scores (on two MDS axes).
The function metaMDS allows computing "species scores" using the
function wascores()... I have now pondered for 2 days how these scores
are calculated and what their precise meaning would be. Would these
species scores be appropriate to show as vectors in the MDS?

Thanks for any answer...

Gabriel Singer
