Re: [R-sig-eco] PERMANOVA and adonis: trying to clear my confusion

2019-01-08 Thread Steve Brewer
Pedro,

Just to clarify, the approach that I’m advocating (centroidmatrix) comes 
directly from what Anderson describes in Primer + Permanova (2008) manual for 
dealing with split-plot designs (which are a type of nested design, pp. 62-64). 
My understanding of what is done in adonis/adonis 2 is that nesting is handled 
using the strata function. This randomization restriction could change the 
p-value of a test, but I don’t think it changes the fact that adonis is using 
the residual error term to test all effects in the model (somebody correct me 
if I am wrong about that). Maybe that is a reasonable approach, but that is not 
how I learned how to do nested, split-plot, and repeated measures using an 
Expected Means Squares approach, which I believe requires the use of different 
error terms to test different effects.

I mention this because in your second message you mention differences based on 
the df of the tests. The randomization restriction, by itself, is not going to 
affect the error df of the test, only the p-value. Perhaps someone out there 
does know the answer, but I have personally not found any model statement that 
will allow one to use anything but the residual error term to test effects 
using adonis.

Incidentally, nested.npmanova (BiodiversityR implemented through adonis) uses 
two different error terms to test effects, which in my opinion is a valid 
approach, but its use is restricted to those simple cases in which you have a 
whole/sub/residual nested arrangement (i.e., no crossing of factors and thus no 
interactions).

Steve


Stephen Brewer
jbre...@olemiss.edu<mailto:jbre...@olemiss.edu>
Professor
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - https://jstephenbrewer.wordpress.com
FAX - 662-915-5144 Phone - 662-202-5877





On Jan 8, 2019, at 11:54 AM, Pedro Neves 
mailto:pedro.ne...@oom.arditi.pt>> wrote:

On 08/01/19 14:01, Steve Brewer wrote:
Pedro,

In my opinion, the problem is that adonis can only use the residual error to 
test effects. It doesn’t matter what formula you use, it simply does not 
account for random effects.

You will need to break up the analysis into two separate analyses, wherein the 
tests of season and habitat are based on a matrix of centroids for each 
location and the test of location is done with the original species matrix.

Hi Steve:

Thanks for your input and for the "centroidmatrix" - I'll check it out. However 
I believe one should be able to conduct the analysis with my design using 
adonis/adonis 2. I say this based on the documentation of the functions 
(because the code is based on the original algorithm of Anderson and because 
there's one example of using a nested). I think it's just a matter of finding 
out how to restrict the permutations within Habitats (in my case).

I think the example for the nested design on the Vegan documentation is not 
very clear (also judging for the amount of times related questions pop up on 
the internet...) and I'd love to be able to learn how to conduct this analysis 
properly with R and Vegan and then contribute with more documentation that 
might help others...

So, if anyone on the list could point me some hints

All the best:

Pedro



[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] PERMANOVA and adonis: trying to clear my confusion

2019-01-08 Thread Steve Brewer
I’ve generated a function called “centroidmatrix” that might be helpful to you. 
I think this could allow you to test the season and habitat and their 
interaction effects using the correct error term.

#' centroidmatrix
#' This function allows you to generate a matrix of centroids from a 
non-Euclidean distance matrix using the betadisper function in vegan.
#' @param factorname The name of a treatment factor file containing the 
treatment factors and other sources of variation and a grouping variable for 
calculation of centroids in the columns and observations in rows
#' @param speciesname The name of species file which has species in columns and 
observations in rows in the same order as in the factorname file and of the 
same length
#' @param groupvarname The name of the grouping variable in the factorname file 
for which centroids are calculated
#' @param distancetype The type of distance or dissimilarity matrix to be 
calculated from the species file using the vegdist function in vegan Default is 
"bray"
#' @export
#' @examples
#' centroidmatrix()

centroidmatrix <- function(factorname, speciesname, groupvarname, distancetype){
groupvar <- factor(groupvarname)
braydist <- vegdist(speciesname, method = distancetype)
betadisperout <- betadisper(braydist, groupvar, type = 'centroid')
braycent <- betadispout$centroids


return(braycent)


}

Stephen Brewer
jbre...@olemiss.edu
Professor
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - https://jstephenbrewer.wordpress.com
FAX - 662-915-5144 Phone - 662-202-5877





On Jan 8, 2019, at 7:49 AM, Pedro Neves 
mailto:pedro.ne...@oom.arditi.pt>> wrote:

On 08/01/19 09:59, Pedro Neves wrote:

Assuming that my species data is Spec_data and the factors data is Env_data, my 
formulation is:

adonis(Spec_data ~ Season * Habitat/Location, strata=Habitat, data=Env_data, 
method="bray")

Is this formulation correct, according to my experimental design?

Hi again:

In trying to respond to my own question, I compared the results of the 
PERMANOVA obtained with R:vegan and using the original FORTRAN program from 
Marti Anderson and Primer 6+.

The results from R are different from the ones obtained with the other two 
programs and the difference on the degrees of freedom with R, clearly indicate 
that the formula is incorrect:


Results from R (using adonis2(formula = Spec_data ~ Season * Habitat/Location, 
data = Env_data, method = "bray", by = "terms", strata = Habitat):

   Df SumOfSqs  R2   F Pr(>F)
--
Season 1   1.4465 0.07112  50.360  0.001 ***
Habitat 1   5.9292 0.29150 206.425  0.001 ***
Season:Habitat 1   0.7575 0.03724  26.373  0.001 ***
Season:Habitat:Local   4   4.3942 0.21604  38.246  0.001 ***
Residual  272   7.8127 0.38410
Total 279  20.3401 1.0
---


Results from PERMANOVA (original FORTRAN code):

 Source dfSS   MS  F P(perm) P(MC)
 -
  Se  1 14029.6135   14029.61351.5637 0.2592  0.2358
  Ha  1 56723.5220   56723.52204.2938 0.0002  0.0188
  Lo(Ha)  2 26421.0270   13210.5135   28.1272 0.0002  0.0002
  SexHa   1  7504.58497504.58490.8364 0.3864  0.5542
  SexLo(Ha)   2 17944.45828972.2291   19.1032 0.0002  0.0002
  Residual  272127750.3725 469.6705
  Total 279250373.5780
 -

Results from PRIMER:

PERMANOVA table of results
   Unique
Source df SSMSPseudo-FP(perm)  perms P(MC)

Se  1  14465 14465  1,8194 0,315351 0,2024
Ha  1  12752 12752 0,34679 0,6594 3 0,8376
Lo(Ha)  2  73540 36770  128,01 0,0002 49820,0002
SexHa  1 8616,68616,6  1,0838 0,3414 510,4186
SexLo(Ha)  2  159017950,4  27,679 0,0002 49800,0002
Res272  78127287,23
Total2792,034E5
--

What am I missing in R formulation?

Thanks in advance:


Pedro

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology



Re: [R-sig-eco] 2-way adonis (PERMANOVA) incl interaction - how to test for main effects?

2018-11-01 Thread Steve Brewer
Gian,

I am bit confused by what your concern is. First, if the imbalance is not that 
severe, the approach you take to analyzing a two-way permanova (type I, type 
II, type III ss) is not going to matter that much. Indeed, if the design were 
balanced, they would give you identical results. Second, regardless of the lack 
of balance, for the models y ~ A + B + A:B  and y ~ B + A + A:B, the test for 
the interaction will be the same. So, I don’t understand why you would want to 
drop the main effects from the model, effectively combining them with 
interaction. That doesn’t make any sense to me. The problem is with the tests 
of the main effects.

My advice is to run both models (i.e., A first, then B first) using type I ss. 
As mentioned, both models will give you the same interaction result. If the 
interaction is all that you’re interested in, problem solved. Interpret only 
the interaction and and ignore the main effects. If the interaction is not 
significant and low, then interpret only the main effects, focusing only on the 
second main effect in each of the differently-ordered models (which are 
equivalent to Type II ss tests). And these results will tell you pretty the 
same thing as type III tests if there is little or no interaction. I would not 
worry about trying to estimate the main effects while controlling for the 
interaction (Ellen’s question), which cannot be done using type I or type II SS 
in 2-way permanova using adonis. But why would you want to? The lack of a 
balanced design results in the main effects and the interaction not being 
independent of one another.  Forcing that independence by using type III ss can 
only work by essentially "throwing away" some of the information associated 
with the main effects, possibly resulting in an overly conservative test. The 
lower the interaction, however, the less is thrown away and the less it matters.


Steve




Stephen Brewer
jbre...@olemiss.edu
Professor
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - https://jstephenbrewer.wordpress.com
FAX - 662-915-5144 Phone - 662-202-5877





On Oct 31, 2018, at 5:45 PM, Gian Maria Niccolò Benucci 
mailto:gian.benu...@gmail.com>> wrote:

Thank you Jari,

So to test if there are significant interaction I should use Stage:Growhouse
i.e. A:B. This will test the interaction and main effects that are marginal
and so removed. How matters then if I include by="margin" or not? The R2
are the same (please see below) but the p-value changes. I assume the
second way is most correct, is it?

*> adonis2(t(otu_fungi_out) ~ Stage : Growhouse, data=metadata_fungi_out,
permutations=)*
Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 

adonis2(formula = t(otu_fungi_out) ~ Stage:Growhouse, data =
metadata_fungi_out, permutations = )
   Df SumOfSqs  R2  F Pr(>F)
Stage:Growhouse  3   1.0812 0.23075 1.9998 0.0211 *
Residual20   3.6045 0.76925
Total   23   4.6857 1.0
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


*> adonis2(t(otu_fungi_out) ~ Stage : Growhouse, data=metadata_fungi_out,
by = "margin", permutations=)*
Permutation test for adonis under reduced model
Marginal effects of terms
Permutation: free
Number of permutations: 

adonis2(formula = t(otu_fungi_out) ~ Stage:Growhouse, data =
metadata_fungi_out, permutations = , by = "margin")
   Df SumOfSqs  R2  F Pr(>F)
Stage:Growhouse  3   1.0812 0.23075 1.9998  0.006 **
Residual20   3.6045 0.76925
Total   23   4.6857 1.0
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Cheers,

Gian

On Tue, 30 Oct 2018 at 05:47, Jari Oksanen 
mailto:jari.oksa...@oulu.fi>> wrote:

Hello Gian,

These formulae expand into different models. Compare

model.matrix(~ Stage:Growhouse, data=metadata_fungi_out)
model.matrix(~ Stage*Growhouse, data=metadata_fungi_out)

The first model (Stage:Growhouse) will also contain (implicitly) main
effects and all these terms are marginal and can be removed, whereas the
latter Stage*Growhouse expands to explicit main effects and interaction
effects, and only the interaction effects are marginal and can be removed.
This is also reflected in the degrees of freedom in your anova table: In
the first case Stage:Growhouse has 3 df, and in the latter only 1 df (and
the main effects ignored had 2 df).

Ciao, Giari

On 29 Oct 2018, at 19:11, Gian Maria Niccolò Benucci <
gian.benu...@gmail.com> wrote:

Hello Jari,

It is a little bit confusing. If A*B unfolds in A+B+A:B then A:B is the
real interaction component.
So, which if the code below will test the variance for the interaction
alone?

adonis2(t(otu_fungi_out) ~ *Stage : Growhouse*, data=metadata_fungi_out,
by = "margin", permutations=)
Permutation test 

Re: [R-sig-eco] repeated measures ANOVA and PERMANOVA (does it exist?)

2018-01-24 Thread Steve Brewer
David,

Assuming the distance/similarity measure is non-euclidean (e.g., Bray-Curtis), 
see attached pdf document containing a previous post on this subject. This 
approach is based on the one that Marti Anderson describes in Primer.

Good luck,
Steve


Stephen Brewer
jbre...@olemiss.edu
Professor
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - https://jstephenbrewer.wordpress.com
FAX - 662-915-5144 Phone - 662-202-5877





On Jan 23, 2018, at 3:33 PM, David Barfknecht 
> wrote:

Hello all,

I am currently working on a project that uses species occurrence data (0/1) to 
construct an NMDS ordination where some points are the same location but 
through time (10 locations X 3 survey times = 30 observations). I want to use 
ANOSIM and PERMANOVA to look at both individual locations controlling for time 
and all survey periods controlling for location. Basically, I want to look at 
location and times as separate factors.  I am curious if anyone have ever had 
to write a script for repeated measures ANOSIM and/or PERMANOVA. I am aware of 
how to do the unrepeated measures version of these in the vegan package, but 
not repeated measures versions. Any suggestions?

If not, are there other methods more appropriate to investigate this?

Thank you in advance for any advice.

Sent from Mail for Windows 10


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Covariate in MANYGLM

2016-01-21 Thread Steve Brewer
David,

Thanks for your response to Martijn's question. I too am interested in
using manyglm in analyzing multifactor experiments, and I think I've
figured out how to do this. One question though regarding the model and
sequential sums of squares. I think understand correctly that the anova
results for the last term (I.e., factor) in the model are equivalent to
partial sums of squares results. If so, you would just re-run the
analysis, putting a different factor last in the model to get partial sums
of squares results for it. However, when trying to figure which species
contribute most to the multivariate results, it seems to me that their
responses (and the associated stats, e.g., 2*log-likelihood) are
associated with response to all the terms in the model combined, right? If
it is, is it possible to rank species in terms of their responses to the
particular factor of interest, while still accounting for the other
factors (or covariate, as in Martijn's case)?

If I'm not making myself clear, I guess what I'm asking is, using
Martijn's ancova example, how do you rank species in terms of their least
squares or corrected mean responses to the treatment (I.e., corrected for
the covariate)?

Thanks,
Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-202-5877




On 1/20/16 6:19 PM, "David Warton"  wrote:

>Hi Martijn,
>Thanks for your question.  Always delighted to help people playing with
>mvabund.
>
>Let me start by complementing you on your excellent taste in software ;)
>
>manyglm has been written to behave very much like the glm function (in
>fact it works by fitting glm's to each species).  So the answers to your
>questions are quite generic and apply to glm usage also.
>
>First, how to account for different plot sizes - the best way to do this
>is actually using an offset, rather than an additional predictor.  For
>presence/absence data, the most natural way to do this is using the
>complementary log-log link function (which can be understood as modelling
>mean abundance from data truncated to presence/absence) and adding an
>offset for log(Area) such that abundance is assumed proportional to plot
>area.  If not sure about this assumption you could additionally put
>log(area) in as a predictor to see if there was any effect beyond
>proportionality.  Note we use log(Area) because the cloglog link, like
>the log link in Poisson regression, models effects on mean abundance on a
>log scale.  (Just on the off chance that are some distance-based users
>reading this e-mail - differences in sampling intensity, whether intended
>or otherwise, seem to be quite common in this type of data.  Yet I am yet
>to see a reliable way to account for such effects without specifying a
>formal statistical model to account for them, distance-based "fixes"
>frequently go awry.)
>
>Anyway, using mvabund version 3.11.6 as currently on github you can fit
>such a model as follows:
>   ft_cloglog = manyglm(birdmva~ Treatment+offset(log(Area)), data=x,
>family=binomial("cloglog"))
>Unfortunately we recently found a bug in anova.manyglm for cloglog models
>- it is overriding the cloglog choice and fitting logistic regressions
>(logit link) to perform the anova.  It doesn't do this is manyglm, just
>in the anova.  Alice (Yi) Wang is working on this as we speak and I can
>let you know when the corrected code is on github.  Please contact me
>directly if you are having trouble loading the latest version from github.
>
>As for your second question, yes results absolutely do change as you
>change the order in which terms are entered into the model.  If you are
>familiar with the concept of Type I Sums of Squares, this is more-or-less
>what the anova function does on R.  That is, it adds terms to the model
>sequentially and tests if the effect of adding each of these terms (in
>this sequence) is significant.  So for example in your model 1, you are
>testing for a (marginal) effect of moss, i.e. an effect of moss without
>anything else in the model, then you test for an effect of soil.dry given
>that you already have moss in the model.  Your model 2 does the reverse,
>testing for a marginal effect of soil.dry, then testing for an effect of
>moss given soil.dry.  So what is more appropriate for your context - well
>presumably you are interested in the effect of Treatment after accounting
>for the effect of sampling intensity (via different areas), so if you
>have a term in your model for log(Area), it should appear before
>Treatment in your manyglm call.
>
>All the best
>David
>
>
> 
>David Warton
>Professor and Australian Research Council Future Fellow
>School of Mathematics and Statistics, Evolution & Ecology Research
>Centre, Centre for Ecosystem Science
>The University of New South Wales NSW 2052 AUSTRALIA
>phone (61)(2) 9385-7031
>fax (61)(2) 

Re: [R-sig-eco] evaluating multiple responses using chi square

2015-06-15 Thread Steve Brewer
David,

Thanks for this. Very helpful. Yes, I was hoping to maybe deal with lack
of independence in a separate step by taking into account correlations
among a few select species, particularly the more common ones.

I'll be interested to see what your group comes up with.

Best,
Steve

J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-202-5877




On 6/13/15 6:37 AM, David Warton david.war...@unsw.edu.au wrote:

Hi Steve,
Yes you are right in what you say, and it looks like you have identified
the problem already - you can sum chi-square random variables to obtain a
chi-square variable whose df is the sum of the df's of component
variables, but only if they are mutually independent.

Community datasets, with abundances or presence-absences from multiple
taxa collected at the same place, are commonly referred to as
multivariate precisely because the multiple responses are typically
dependent, and hence statistics calculated for separate response
variables are also dependent.  You can still get somewhere with the
theory though - the sum of dependent chi-squares could be re-expressed as
a weighted sum of chi-squares - but the weightings couldn't be estimated
reliably unless you had lots of information in the data from replicate
observations, which we tend not to.  This is the reason we went for
resampling.  (The main alternative, which we are currently looking at, is
covariance modelling as a strategy to estimate and account for
correlation in a parsimonious way.)

All the best
David

 
David Warton
Professor and Australian Research Council Future Fellow
School of Mathematics and Statistics and the Evolution  Ecology Research
Centre
The University of New South Wales NSW 2052 AUSTRALIA
phone (61)(2) 9385-7031
fax (61)(2) 9385-7123
 
http://www.eco-stats.unsw.edu.au/ecostats15.html




--

Date: Wed, 10 Jun 2015 09:57:37 -0500
From: Steve Brewer jbre...@olemiss.edu
To: r-sig-ecology@r-project.org
Subject: [R-sig-eco] evaluating multiple responses using chi square
Message-ID: d19dba91.384e6%jbre...@olemiss.edu
Content-Type: text/plain; charset=UTF-8

Dear Listserv community,

I realize that this is more a statistical theory question, rather an R
application question, but I hope those familiar the theory underlying
manyglm and manylm could help me.

In evaluating the overall response of a community to a treatment, I'm
aware that one class of approaches involves doing univariate analyses for
each species (e.g., ANOVA, t-test, chi square, logistic modeling, etc)
and then summing the results across all species and evaluating
statistical significance with a randomization procedure.

My question is has anyone considered using a chi square test instead of
randomization to obtain the significance value? The p value for any test
statistic (F, t) can be converted to a chi square value with 1 df.
Because chi square values are additive (assuming independence), it makes
sense to me that you could simply add up the chi square values for all
species and evaluate the significance of the resulting sum assuming a df
equal to the number of tests (species). Presumably, one could use
different tests for different species, depending on whichever is most
appropriate (e.g., anova for common species that differ in abundance
between treatments or chi square or a logistic model for species that
differ in terms of frequency of occurrence between treatments). If one
were confident that the univariate assumptions held for each species'
test, other than the assumption of independence of responses among
species, I'm wondering what if anything is wrong with such an approach
for obtaining a significance value. Perhaps something similar is being
done when the log likelihood value is calculated?
If so, what are the similarities or differences?

Thanks, and I apologize if this question is too basic or has already been
answered.

Steve




J. Stephen Brewer
Professor
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/ FAX - 662-915-5144
Phone - 662-202-5877



   [[alternative HTML version deleted]]



--

Subject: Digest Footer

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

--

End of R-sig-ecology Digest, Vol 87, Issue 6

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] evaulating multiple responses using Chi-square

2015-06-11 Thread Steve Brewer
Thanks Phillip
That's helpful. I hadn't thought about the one-tailed aspect of it. The
idea came from path analysis, wherein the goodness-of-fit test is a sum of
the chi squares for the unspecified paths. I'll look into this further.

Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-202-5877




On 6/11/15 6:32 AM, Dixon, Philip M [STAT] pdi...@iastate.edu wrote:

Yes, the idea has been considered, by no less than RA Fisher.  If you
want a name, it is sometimes known as Fisher's combined probability test.
 Usually applied to one-sided tests of the same null hypothesis.   Yes,
the p-values can come from different sources.  Before using it in the
community context, I would want to think carefully about what it tells me
when applied to two-sided tests of different species.

Philip


   [[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Nested Permanova with repeated measures

2015-03-17 Thread Steve Brewer
 �.� 0.1 � � 1
 

Thank you very much

Paul Moquin
 


[R-sig-eco] Permanova with nested data
Steve Brewer jbrewer at olemiss.edu
mailto:r-sig-ecology%40r-project.org?Subject=Re%3A%20%5BR-sig-eco%5D%20Perm
anova%20with%20nested%20dataIn-Reply-To=%3CCD50D4BB.1E27D%25jbrewer%40olemi
ss.edu%3E 
Mon Feb 25 16:13:18 CET 2013

* Previous message: [R-sig-eco] lme4 for pseudoreplication avoidance
https://stat.ethz.ch/pipermail/r-sig-ecology/2013-February/003594.html
* Next message: [R-sig-eco] Validation of linear regression
https://stat.ethz.ch/pipermail/r-sig-ecology/2013-February/003597.html
* Messages sorted by: [ date ]
https://stat.ethz.ch/pipermail/r-sig-ecology/2013-February/date.html#3595
[ thread ] 
https://stat.ethz.ch/pipermail/r-sig-ecology/2013-February/thread.html#3595
  [ subject ] 
https://stat.ethz.ch/pipermail/r-sig-ecology/2013-February/subject.html#359
5  [ author ] 
https://stat.ethz.ch/pipermail/r-sig-ecology/2013-February/author.html#3595
 

Beth and others,

Given several recent queries regarding how to analyze repeated-measures
and split-plot perManova using adonis, I thought I would pass along what I
think is a reasonable solution.

I just saw the recent exchange over the use of BiodiversityR to do nested
perMANOVA. I was unaware of this function until today.

With that in mind, it is possible to do a simple repeated-measures
permanova using two different analyses, one for the within-subjects
effects and one for the between-subjects effects. The same approach
applies to a simple split-plot analysis.

For the within-subjects (or sub-plot) effects, you use adonis and the
strata function. The model formula could look something like:

Assume species responses are in Speciesdata and the treatment, time, and
plot effects are in envfactors

Adonis(Speciesdata ~ betweensubtrtmt * time + plot, data = envfactors,
strata = plot)

Where plot is nested within the betweensubtrtmnt

Strata restricts the permutation, and the residual error term will give
you the correct test for the time effect and the betweensubtrtmnt * time
interaction, but the test for the betweensubtrmnt main effect will be
wrong because plot, and not the residual error term, is the correct error
term for testing it.

To get a test for the betweensubtrtmnt main effect, load the BiodiversityR
package (I use the 1.6 version, but see the recent discussion about this)
and use the nested.npmanova function.

nested.npmanova(speciesdata ~ betweensubtrtmnt + plot, data = envfactors)

In this case, the betweensubtrtmnt is tested with plot; plot is tested
with the residual error term but that latter test is not correct in this
instance and is usually not of interest anyway.

Note that the default distance is euclidean; you'll to use method to
specify a different distance, e.g.,

nested.npmanova(speciesdata ~ betweensubtrtmnt + plot, data = envfactors;
method =bray)


The same principles apply to a simple split-plot design, except that the
whole-plot treatment is treated like a between-subjects treatment and the
sub-plot treatment is treated like the time effect.

Hope this is of some help.

Steve

J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144 tel:662-915-5144
Phone - 662-915-1077 tel:662-915-1077



[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER

2014-03-27 Thread Steve Brewer
Brandon,

Yes. Just remove the observations containing the levels you don't want to
include and run permanova on the observations containing the levels of
interest.

Good luck,

Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144 Phone - 662-915-1077

From:  Brandon Gerig bge...@nd.edu
Date:  Thu, 27 Mar 2014 08:47:58 -0400
To:  Pyro Maniac jbre...@olemiss.edu, r-sig-ecology@r-project.org
Subject:  Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER

Hi Steve,

Yes, this is precisely what I am interested in doing. It seems like
betadisper might be a good way to visualize differences/similarities in the
dispersion and examine differences among centroids for the levels within a
factor. Am I correct in thinking that if I conduct additional PERMANOVA
tests on a reduced data set, I could be evaluating differences between the
levels of a main effect?

Could anyone provide a citation for a paper that uses a similar procedure?


On Wed, Mar 26, 2014 at 3:21 PM, Steve Brewer jbre...@olemiss.edu wrote:
 Brandon,
 
 Are you asking if you can use betadisper as a substitute for post-anova
 pairwise comparisons among levels? After using betadisper to obtain
 dispersions, I believe you can plot the centroids for each level. In
 addition to telling you if the dispersions differ among levels, you could
 see how the centroids differ from one another. Is this what you want to
 know? If so, realize that it won't give you pairwise significance tests
 for differences between levels. For that, you might want to do additional
 permanovas on reduced datasets containing only the two levels you want to
 compare. You could then adjust the p-values for multiple tests after the
 fact.
 
 Hope this helps,
 
 Steve
 
 
 J. Stephen Brewer
 Professor
 Department of Biology
 PO Box 1848
  University of Mississippi
 University, Mississippi 38677-1848
  Brewer web page - http://home.olemiss.edu/~jbrewer/
 FAX - 662-915-5144 tel:662-915-5144
 Phone - 662-915-1077 tel:662-915-1077
 
 
 
 
 On 3/26/14 10:57 AM, Brandon Gerig bge...@nd.edu wrote:
 
 Thanks for the words of caution on simper.
 
 Am I completely off base in thinking that betadiver function (analgous to
 Levene's test) could be used to examine variation between levels within
 main effects?
 
 Cheers
 
 
 On Mon, Mar 24, 2014 at 5:08 PM, Brandon Gerig bge...@nd.edu wrote:
 
  I am assessing the level of similarity between PCB congener profiles in
  spawning salmon and resident stream in stream reaches with and without
  salmon to determine if salmon are a significant vector for PCBs in
  tributary foodwebs of the Great Lakes.
 
  My data set is arranged in a matrix where the columns represent the
  congener of interest and the rows represent either a salmon (migratory)
 or
  resident fish (non migratory) from different sites.  You can think of
 this
  in a manner analogous to columns representing species composition and
 rows
  representing site.
 
  Currently, I am using the function Adonis to test for dissimilarity
  between fish species, stream reaches (with and without salmon) and lake
  basin (Superior, Huron, Michigan).
  The model statement is:
 
 
 m1adonis(congener~FISH*REACH*BASIN,data=pcbcov,method=bray,permutation
 s=999)
 
  The output indicates significant main effects of FISH, REACH, and BASIN
  and significant interactions between FISH and BASIN, and BASIN and
 REACH.
 
  Is it best to then interpret this output via an NMDS ordination plot or
  use something like the betadiver function to examine variances between
 main
  effect levels or both?
 
  Also,  can anyone recommend a procedure to identify the congeners that
  contribute most to the dissimilarity between fish, reaches, and
 basins?. I
  was thinking the SIMPER procedure but am not yet sold.
 
  Any advice is appreciated!
  --
  Brandon Gerig
  PhD Student
  Department of Biological Sciences
  University of Notre Dame
 
 
 
 
 --
 Brandon Gerig
 PhD Student
 Department of Biological Sciences
 University of Notre Dame
 
[[alternative HTML version deleted]]
 
 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
 
 



-- 
Brandon Gerig
PhD Student
Department of Biological Sciences
University of Notre Dame



[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER

2014-03-27 Thread Steve Brewer
Gavin and Brandon,

Yes, I am aware that betadisper() does not actually give you a test of
differences between centroids, but the fact that it does calculate
centroids is quite valuable for interpretation, in my opinion, especially
when using non-euclidean distance matrices (e.g., Bray-Curtis) and also if
you would prefer NOT to do additional pairwise tests between levels, but
still would like to have some idea as to which pairwise differences
between levels might be most responsible for the effect. When using
bray-curtis distances, you can't get centroids by calculating averages of
abundances among the observations of interest. If you just want to use a
NMDS ordination with levels symbol-coded to make them distinct, that's
fine. Sometimes folks calculate the average axis score per group or level
of group and plot that. That's fine, too. The nice thing about obtaining
centroids calculated using betadisper() is that they are based on a
principal coordinates analysis that uses ALL the axes, not just the first
two or three axes in the ordination. It is likely that if the first two or
three axes of the NMDS explain most of the important variation, the
average scores per level for those three axes will probably tell the same
information as the centroids will.

Even though it wasn't intended for this purpose, Sharon Graham and I,
together, figured out that you could use the centroids calculated by
betadisper() to analyze split-plot and repeated-measures designs using
adonis. So, its value extends beyond what it was intended for.


Steve
 
J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/27/14 10:47 AM, Gavin Simpson ucfa...@gmail.com wrote:

Note that `betadisper()` only considers statistically dispersions
about the group centroids. It might show the centroids and return
their values, but it doesn't consider differences in those centroids.
As far is `betadisper()` is concerned, the group centroids could all
be made exactly equal and it wouldn't change the results as it is only
the spread about the centroid that is used.

HTH

G

On 27 March 2014 06:47, Brandon Gerig bge...@nd.edu wrote:
 Hi Steve,

 Yes, this is precisely what I am interested in doing. It seems like
 betadisper might be a good way to visualize differences/similarities in
the
 dispersion and examine differences among centroids for the levels
within a
 factor. Am I correct in thinking that if I conduct additional PERMANOVA
 tests on a reduced data set, I could be evaluating differences between
the
 levels of a main effect?

 Could anyone provide a citation for a paper that uses a similar
procedure?


 On Wed, Mar 26, 2014 at 3:21 PM, Steve Brewer jbre...@olemiss.edu
wrote:

 Brandon,

 Are you asking if you can use betadisper as a substitute for post-anova
 pairwise comparisons among levels? After using betadisper to obtain
 dispersions, I believe you can plot the centroids for each level. In
 addition to telling you if the dispersions differ among levels, you
could
 see how the centroids differ from one another. Is this what you want to
 know? If so, realize that it won't give you pairwise significance tests
 for differences between levels. For that, you might want to do
additional
 permanovas on reduced datasets containing only the two levels you want
to
 compare. You could then adjust the p-values for multiple tests after
the
 fact.

 Hope this helps,

 Steve


 J. Stephen Brewer
 Professor
 Department of Biology
 PO Box 1848
  University of Mississippi
 University, Mississippi 38677-1848
  Brewer web page - http://home.olemiss.edu/~jbrewer/
 FAX - 662-915-5144
 Phone - 662-915-1077




 On 3/26/14 10:57 AM, Brandon Gerig bge...@nd.edu wrote:

 Thanks for the words of caution on simper.
 
 Am I completely off base in thinking that betadiver function
(analgous to
 Levene's test) could be used to examine variation between levels
within
 main effects?
 
 Cheers
 
 
 On Mon, Mar 24, 2014 at 5:08 PM, Brandon Gerig bge...@nd.edu wrote:
 
  I am assessing the level of similarity between PCB congener
profiles in
  spawning salmon and resident stream in stream reaches with and
without
  salmon to determine if salmon are a significant vector for PCBs in
  tributary foodwebs of the Great Lakes.
 
  My data set is arranged in a matrix where the columns represent the
  congener of interest and the rows represent either a salmon
(migratory)
 or
  resident fish (non migratory) from different sites.  You can think
of
 this
  in a manner analogous to columns representing species composition
and
 rows
  representing site.
 
  Currently, I am using the function Adonis to test for dissimilarity
  between fish species, stream reaches (with and without salmon) and
lake
  basin (Superior, Huron, Michigan).
  The model statement is:
 
 
 
m1adonis(congener~FISH*REACH*BASIN,data=pcbcov,method

Re: [R-sig-eco] Vegan-Adonis-NMDS-SIMPER

2014-03-26 Thread Steve Brewer
Brandon,

Are you asking if you can use betadisper as a substitute for post-anova
pairwise comparisons among levels? After using betadisper to obtain
dispersions, I believe you can plot the centroids for each level. In
addition to telling you if the dispersions differ among levels, you could
see how the centroids differ from one another. Is this what you want to
know? If so, realize that it won't give you pairwise significance tests
for differences between levels. For that, you might want to do additional
permanovas on reduced datasets containing only the two levels you want to
compare. You could then adjust the p-values for multiple tests after the
fact.

Hope this helps,

Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/26/14 10:57 AM, Brandon Gerig bge...@nd.edu wrote:

Thanks for the words of caution on simper.

Am I completely off base in thinking that betadiver function (analgous to
Levene's test) could be used to examine variation between levels within
main effects?

Cheers


On Mon, Mar 24, 2014 at 5:08 PM, Brandon Gerig bge...@nd.edu wrote:

 I am assessing the level of similarity between PCB congener profiles in
 spawning salmon and resident stream in stream reaches with and without
 salmon to determine if salmon are a significant vector for PCBs in
 tributary foodwebs of the Great Lakes.

 My data set is arranged in a matrix where the columns represent the
 congener of interest and the rows represent either a salmon (migratory)
or
 resident fish (non migratory) from different sites.  You can think of
this
 in a manner analogous to columns representing species composition and
rows
 representing site.

 Currently, I am using the function Adonis to test for dissimilarity
 between fish species, stream reaches (with and without salmon) and lake
 basin (Superior, Huron, Michigan).
 The model statement is:

 
m1adonis(congener~FISH*REACH*BASIN,data=pcbcov,method=bray,permutation
s=999)

 The output indicates significant main effects of FISH, REACH, and BASIN
 and significant interactions between FISH and BASIN, and BASIN and
REACH.

 Is it best to then interpret this output via an NMDS ordination plot or
 use something like the betadiver function to examine variances between
main
 effect levels or both?

 Also,  can anyone recommend a procedure to identify the congeners that
 contribute most to the dissimilarity between fish, reaches, and
basins?. I
 was thinking the SIMPER procedure but am not yet sold.

 Any advice is appreciated!
 --
 Brandon Gerig
 PhD Student
 Department of Biological Sciences
 University of Notre Dame




-- 
Brandon Gerig
PhD Student
Department of Biological Sciences
University of Notre Dame

   [[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Extract residuals from adonis function in vegan package

2014-03-18 Thread Steve Brewer
Alicia

One more thought. I wonder if part of the problem is that you're
attempting to use ISA to do something it was not designed to deal with.
Sometimes that can work and result in a clever new approach, but in this
case, I don't see how it can work. As a recall, ISA is done by taking the
product of relative abundance and relative frequency. Therefore, as
mentioned by Jari, it can only be used with positive values of abundance
and frequency. Yes, it is true that you can transform the residual
abundances to make them all positive. That seems perfectly reasonable if
you then intend to use the residual abundances (and the residual
abundances only) to examine associations with environmental groups. What
is puzzling me is how one calculates a relative frequency using residual
abundances or how one calculates a residual frequency. Is it your
intention to calculate residual frequencies? It's not clear to me how that
could be done. I suppose you could leave out the relative frequency when
doing the ISA, but that leads me to the following question: Can you tell
us why indicator species analysis is a better approach for determining
indication of particular environmental conditions than are species scores
generated from a capscale ordination?  Given the flexibility of capscale
to deal with categorical predictors, it has never been clear to me what
advantages ISA has over species scores in interpreting environmental
associations (other than generating a Monte Carlo-derived significance
value). If there is not much difference, then I would do a capscale
analysis using region as a Condition factor (as suggested by Arne) and
then examine the species scores.

Steve

 
J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/18/14 11:48 AM, Alicia Valdés aliciavaldes1...@gmail.com wrote:

Thanks for so many thoughts!


Arne: I could use Condition() for that but something like
capscale(communitydistances ~ environmentalvar+Condition(region))
would not attempt what I need, because I need to fit a model only with
region, extract residuals, and then do another analysis (ISA) with these
residuals.
However, if this kind of formulation is correct, it could also be useful
to
see the effects of environmental factors while controlling for the effect
of region

Indicator species analysis could be done for each region separately. -
yes,
but I would like to try both approaches, ISA for all regions (but removing
the region effect) and ISA for each region separately, and compare the
results


Jari: OK I guess that even if they could be calculated, residuals from the
adonis() point of view are not suitable for me, as they are
dissimilarities. Sorry if I was confusing, what I need are residuals of
raw
data, as you say. So I guess the residuals() method of rda() should work.
Yes, I know that ISA needs non-negative values, I was attending to
transform these residuals to make them all positive.

Pierre: I checked manyglm() function in the mvabund package and I think it
could be useful too, you can indeed get residuals from a mutivariate GLM
with family=binomial.


Ivailo: I think you maybe misunderstood my approach, I want to perform an
ISA based on differences in species composition, but I want to focus on
the
part of these differences which is not caused by the region studied. I
want
to identify indicator species for different environmental conditions and
caracteristics of forest patches (like landscape management type, forest
age and others), but I am not interested in finding the characteristic
species for the different regions. To make it clearer, I have data from
different regions, into each of the regions there are a series of forest
patches which vary in management type, age and other factors. So the
cluster I am using in multipatt() is a classification into management
types, age groups, etc. As I said before, I am not interested in use
regions for site classification. I hope this makes the analysis easier
to
understand now!

So my main doubts now are: for this kind of purpose, 1) what is your
opinion on using a community data table in the ISA which contains not the
actual presence/absence, but residuals form a previous model with region
effect?; and 2) Which kind of residuals do you find more appropiate to use
here (rda residuals of multivariate GLM residuals)?

Cheers,

Alicia



2014-03-18 16:23 GMT+01:00 Ivailo ubuntero.9...@gmail.com:

 On Tue, Mar 18, 2014 at 5:02 PM, Alicia Valdés
 aliciavaldes1...@gmail.com wrote:
 ...
  However, what I attempt to do is to perform an indicator species
analysis
  (ISA) with these residuals. I want to see if I can find species which
are
  indicators for different environmental conditions, but first I would
like
  to remove the differences in species composition due to the study
region
 ...

 Isn't an indicator species 

Re: [R-sig-eco] Community composition variance partitioning?

2013-12-04 Thread Steve Brewer
Alexandre,

I'll leave it to Sarah to advise you on MRM (and I agree with Jari that
the method you're describing is not going to work). I'll just add that it
is not clear to me why the predictors (even geographic distance) have to
be treated as distances to partition the variance in composition. I'm
assuming the environmental variables were not originally in the form of
euclidean distance matrices and that the raw measurements are available?
As for the geographic distances, if you have lat and long coordinates, why
not treat both lat and long as predictors and do the necessary analyses as
partial distance-based redundancy analyses using capscale? In one analysis
the geographic predictors could be partialled out (with the result
explaining the fraction explained by the environment). In another, the
environmental predictors could be partialled out (with the result
explaining the fraction explained by the geographic distance) and in a
third both geographic and environmental predictors could be considered
with no conditioning covariates (which will give the total variance
explained by both combined).

Best
Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 12/4/13 11:50 AM, Alexandre Fadigas de Souza alexso...@cb.ufrn.br
wrote:

Dear friends,

   My name is Alexandre and I am trying to analyze a dataset on floristic
composition of tropical coastal vegetation by means of variance
partition, according to the outlines of a Tuomisto's recent papers,
specially

Tuomisto, H., Ruokolainen, L., Ruokolainen, K., 2012. Modelling niche and
neutral dynamics : on the ecological interpretation of variation
partitioning results. Ecography (Cop.). 35, 961­971.

   I have a doubt, could you please give your opinion on it?

   We are proceeding a variance partition of the bray-curtis floristic
distance using as explanatory fractions soil nutrition, topography,
canopy openess and geographical distances (all as euclidean distance
matrices).

We are using the MRM function of the ecodist package:

mrm - MRM(dist(species) ~ dist(soil) + dist(topograph) + dist(light) +
dist(xy), data=my.data, nperm=1

The idea is that the overall R2 of this multiple regression should be
used to assess the contributions of the spatial and environmental
fractions through subtraction :

Three separate multiple regression analyses are needed
to assess the relative explanatory power of geographical
and environmental distances. All of these have the same
response variable (the compositional dissimilarity matrix),
but each analysis uses a diff erent set of the explanatory
variables. In these analyses the explanatory variables are:
(I) the geographical distance matrix only, (II) the environmental
diff erence matrices only, and (III) all the explanatory
variables used in (I) or (II). Comparing the R 2 values
from these three analyses allows partitioning the variance
of the response dissimilarity matrix to four fractions.
Fraction A is explained uniquely by the environmental
diff erence matrices and equals R2 (III) R2 (I). Fraction B
is explained jointly by the environmental and geographical
distances and equals R2 (I) R2 (II) R2 (III). Fraction C
is explained uniquely by geographical distances and
equals R2 (III) R2 (II). Fraction D is unexplained by the
available environmental and geographical dissimilarity
matrices and equals 100% R2 (III) (throughout the present
paper, R2 values are expressed as percentages rather
than proportions). [Tuomisto et al. 2012]

The problem is that the R2 of the overall model (containing all the
explanatory variables) is smaller than most of the R2 of models
containing each of the explanatory matrices. So it seems not possible to
proceed with the approach proposed.


Sincerely,

Alexandre

Dr. Alexandre F. Souza
Professor Adjunto II Departamento de Botanica, Ecologia e Zoologia
Universidade Federal do Rio Grande do Norte (UFRN)
http://www.docente.ufrn.br/alexsouza  Curriculo:
lattes.cnpq.br/7844758818522706

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] angular statistics

2013-10-16 Thread Steve Brewer
Peter,

For my purposes (I.e., estimating exposure and drying potential in
northern hemisphere temperate forests), I simply subtract 45 degrees from
the measured aspect in degrees, convert to radians, and then take the
cosine of the adjusted angle. If I want to make exposure positive, I then
reverse the sign. In this way, southwest-facing slopes get the maximum
value (1) and northeast-facing slopes get the lowest (-1). As others have
mentioned, this approach gives equal weight to east-west and north-south
variation in exposure, which may or may not be valid for a given situation.

In your case, it sounds like you want to assume the east-facing aspects
are maximally exposed. In that case, I would just subtract 90 degrees from
your degrees measurement, convert to radians, and then take the cosine,
which I believe amounts to the same approach that Don suggested.
East-facing slopes should end up with a value of 1 and west-facing slopes
a value of -1 (due north and south will have values of 0). If you want to
give north-facing aspects less exposure than south-facing aspects (I don't
know whether you are in the northern or southern hemisphere), then you
could subtract 135 degrees from your measurements, making southeast
aspects the most exposed.

Steve
 
J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 10/15/13 11:59 AM, Peter Nelson pnel...@cfr-west.org wrote:

I want to include the exposure (measured in degrees, for example,
East-facing is 90) of various coastal sites in GLM and CCA analyses. Is
there an appropriate transformation that I can apply to these
measurements that will allow me to do this? I've found plenty of
information on comparing headings, calculating means, etc, but nothing on
how exposure might be used as a continuous independent variable.

Treating exposure as a categorical variable (East, Southwest, etc) seems
like a fallback option, but then there is just as much of a 'difference'
between SE and E sites as there is between SE and NW sites!

Thanks, Pete
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] split-split-plot design and adonis

2013-04-26 Thread Steve Brewer
Sharon,

I've had difficulty trying to figure out how to do a split-split-plot
anova on bray-curtis distances in R and thus may not be much help to you.
If you didn't have the habitat effect and associated subplots, you could
do a simple split-plot analysis using two separate analyses (two-way
nested.npmanova with plot nested within treatment for the whole-plot test
and adonis and strata for the split-plot test: species~treatment*shade +
plot; strata = plot for the split-plot effects). See my February 25 2013
message entitled Permanova with anova. It was intended for
repeated-measures but applies to split-plots as well.

This approach can't be used for split-split analysis, however, because
nested.npmanova won't handle two levels of nesting.

What we need is a way for R to calculate distances among centroids given a
particular grouping variable and then make a distance matrix from those.

For example, in your case, you need to be able to calculate distances
among centroids among each treatment by shade level combination (perhaps
call it treatshade, with each combination having a unique label), and then
create a matrix from those centroids. You could then use nested.npmanova
to test the main effect of treatment invert.hel ~ treatment + plot; you
ignore the test of plot. You could then analyze the effects of shade and
treatment:shade, while ignoring the test for treatment using invert.hel ~
treatment*shade + plot, strata = plot. Next, for the split-split plot, you
would analyze the original distance matrix (I.e., distance among
subplots). In this case, however, you would need to create a new
categorical predictor variable in which each treatment by shade
combination gets a unique label (treatshade), but this time it will be
repeated for each habitat and subplot. Make the following model -
invert.hel ~ treatshade + habitat + habitat:shade + habitat:treatment +
habitat:treatment:shade + plot, strata = plot. In this case, you can
evaluate the habitat effect, the habitat*shade interaction, the
habitat*treatment interaction, and the three-way interaction, and ignore
the treatshade effect.

Sorry I can't be of more help. Maybe somebody on the listserv knows how to
calculate distances among centroids (or for that matter calculate
Bray-curtis centroids). That would help a lot.


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 4/26/13 1:40 AM, s.graham sharon.gra...@pg.canterbury.ac.nz wrote:

Hello all, 

I've been trying to adapt the code suggested by Kay to a similar
split-split-plot experiment.  My experimental design was as follows:

6 plots, 3 of which received treatment and 3 which were control
Each plot was split into shade and no shade
within each shade/no shade there were 6 subplots, 3 of which had habitat
and 3 which did not

So in total I had 72 sub-plots (3 reps of each possible combination),
hierarchically arranged: Treatment/Shade/Habitat.  I have done mixed
effects
models on measures of community composition, such as species richness, etc
and found that Treatment, Shade, Treatment:Shade (interaction) and
Treatment:Leaves were all significant predictors of richness.  Now I would
like to test the effects of my experimental combinations
(Trt/Shade/Habitat)
and interactions on community composition.

Following Steve's comments and Kay's suggestions to Arnaud, I set up my
code
as the following: 

# To test for Treatment:Shade
adonis(invert.hel ~ Treatment*Shade*Habitat+Plot, method=bray,
strata=Plot, perm=999)

# To test for Shade:Habitat and Treatment:Habitat
Plot_Shade - interaction(Plot,Shade)
adonis(invert.hel ~ Treatment*Shade*Habitat+Plot_Shade, method=bray,
strata=Plot_Shade, perm=999)

# To test for Shade, Habitat main effects and Shade:Habitat
set.seed(123)
adonis(invert.hel ~ Shade * Habitat, perm = 999, strata = Plot)
** NOTE: I am a little confused whether this accounts for Habitat being
nested within Shade?

# To test Treatment main effect:
nobs = length(Plot)
ctrl - permControl(strata = Plot, within = Within(type = free))
(fit - adonis(invert.hel ~ Treatment, permutations = 1))

### number of perms
B - 999

### setting up frame which will be populated by random r2 values:
pop - rep(NA, B + 1)

### the first entry will be the true r2:
pop[1] - fit$aov.tab[1, 5]

for(i in 2:(B+1)){
 idx - shuffle(nobs, control = ctrl)
 fit.rand - adonis(invert.hel ~ Treatment[idx], permutations = 1)
 pop[i] - fit.rand$aov.tab[1,5]
}

### get the p-value (H0: Treatment has no effect on location of
communities):
print(pval - sum(pop = pop[1]) / (B + 1))

** Here, I get a p-value of 1, but Treatment has the highest R2 of all
variables, and others with lower R2 (Shade, Habitat) came out as
significant.  Treatment was also highly significant in my mixed effects
models.  So I'm confused by that.

When I look at the pop dataframe 

Re: [R-sig-eco] Permanova with nested data

2013-02-25 Thread Steve Brewer
Beth and others,

Given several recent queries regarding how to analyze repeated-measures
and split-plot perManova using adonis, I thought I would pass along what I
think is a reasonable solution.

I just saw the recent exchange over the use of BiodiversityR to do nested
perMANOVA. I was unaware of this function until today.

With that in mind, it is possible to do a simple repeated-measures
permanova using two different analyses, one for the within-subjects
effects and one for the between-subjects effects. The same approach
applies to a simple split-plot analysis.

For the within-subjects (or sub-plot) effects, you use adonis and the
strata function. The model formula could look something like:

Assume species responses are in Speciesdata and the treatment, time, and
plot effects are in envfactors

Adonis(Speciesdata ~ betweensubtrtmt * time + plot, data = envfactors,
strata = plot)

Where plot is nested within the betweensubtrtmnt

Strata restricts the permutation, and the residual error term will give
you the correct test for the time effect and the betweensubtrtmnt * time
interaction, but the test for the betweensubtrmnt main effect will be
wrong because plot, and not the residual error term, is the correct error
term for testing it.

To get a test for the betweensubtrtmnt main effect, load the BiodiversityR
package (I use the 1.6 version, but see the recent discussion about this)
and use the nested.npmanova function.

Nested.npmanova(speciesdata ~ betweensubtrtmnt + plot; data = envfactors)

In this case, the betweensubtrtmnt is tested with plot; plot is tested
with the residual error term but that latter test is not correct in this
instance and is usually not of interest anyway.

Note that the default distance is euclidean; you'll to use method to
specify a different distance, e.g.,

Nested.npmanova(speciesdata ~ betweensubtrtmnt + plot; data = envfactors;
method =bray)


The same principles apply to a simple split-plot design, except that the
whole-plot treatment is treated like a between-subjects treatment and the
sub-plot treatment is treated like the time effect.

Hope this is of some help.

Steve

J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 12/5/12 10:04 AM, Beth Atkinson beth.atkin...@bristol.ac.uk wrote:

Hi,

Apologies if this has been asked before, I looked looked through the
archives and couldn't find a solution.


I have plant community data from 32 woodland plots. Plots are grouped
into 
sites (8 sites in total). Half the sites (4) are on one soil type, and
half 
on another. At each site there are 4 plots, each under a different
management regime .  I want to know if the plant communities under
different management regimes and on different soil types differ.

My data is as follows:


veg - a data frame containg the community data (abundances of each
species 
at each plot)


environ-read.csv(file.choose(),header=TRUE)

 str(environ)
'data.frame':   32 obs. of  4 variables:
 $ code: Factor w/ 32 levels cf.1,cf.3,..: 1 9 17 25 2 10 18 26 3 11
...
 $ site: Factor w/ 8 levels eight,five,..: 5 5 5 5 8 8 8 8 3 3 ...
 $ type: Factor w/ 4 levels clearfell,plantation,..: 1 2 3 4 1 2 3 4
1 
2 ...
 $ nvc : Factor w/ 2 levels W10,W8: 2 2 2 2 2 2 2 2 1 1 ...

# site is the name of each site
# type refers to the management regime
# nvc refers to the soil type (actually the NVC classification)

Prior to PERMANOVA I used betadisper() to test for homogeneity of
multivariate dispersion.  No difference in dispersion was found either
between plots on different site types, or plots on different NVC types.


I carried out a PERMANOVA using adonis on this data as follows:

adon.mod1.bray-adonis(veg~ type*nvc, data=environ,strata=environ$site,
   method = bray, permutations=999)

I used strata=environ$site as management regime is nested within site.

 adon.mod1.bray

Call:
adonis(formula = veg ~ type * nvc, data = environ, permutations = 999,
method = bray, strata = environ$site)

Terms added sequentially (first to last)

  Df SumsOfSqs MeanSqs F.Model  R2 Pr(F)
type   32.2358 0.74528  3.5742 0.24662  0.001 ***
nvc11.0905 1.09045  5.2296 0.12028  0.001 ***
type:nvc   30.7353 0.24509  1.1754 0.08110  0.208
Residuals 245.0044 0.20852 0.55200
Total 319.0659 1.0


Is this acceptable. Is the strata=environ$site part correct?

If data points are only being permutted within sites, and each site only
occurs on one soil type, is there no permutation between NVC/soil types?

I've also heard that adonis does not give the correct p values when data
is 
nested. Is this correct, and a problem in the above example? If so is
there 
a more suitable analysis? Would a db-RDA as follows be ok?:


dbRDA-capscale(veg~type*nvc+Condition(site), data=environ,

Re: [R-sig-eco] adonis and temporal changes

2013-02-18 Thread Steve Brewer
Valerie,

Adonis does not define fixed or random effects, and you therefore cannot
define multiple error terms. However, if your model statement looks
something like this - isolation*year + site, strata = site - then you will
get the correct test for the isolation x year interaction and the correct
test for the year effect. The test for isolation will be wrong, because
the residual error is used to test all effects, when it is only
appropriate for testing the year effect and the year * isolation
interaction. The isolation between-subjects effect should be tested with
the site effect but is not.

The key point is here to make strata = site and to NOT specify the site-
interactions with isolation or year. In this way, site will be treated as
a block for the within-subjects effects and thus could be considered a
random effect.

Hope this helps.

 
J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 2/18/13 8:19 AM, v_coudr...@voila.fr v_coudr...@voila.fr wrote:

Thank you for these explanations. If I put strata=site, this means that
for each site my dissimilarity matrix of year 1 and year 2 will be
permuted and the observed
changes compared to these random permutation? Adding site as a fixed
factor then ensure that I am testing changes in time site by site. Am I
correct?

To my design:
I have 30 permanent sites, 10 of each category of isolation (Isolation =
factor with 3 levels: 3x10 sites = 30 sites). I conducted the samples in
three years in each
site. I have thus 1 sampling (species composition) pro site pro year. I
would like to know how the sampled communities change with time, either
on a site basis, 
or at the level of isolation (I may compare multi-site dissimilarity
among isolation levels between years).
I am not really interested in knowing what proportion of differences in
species community is due to space vs time, but I would like to really
focus on the temporal
changes. That's why I think putting site as a fixed effect should be
appropriate. But if you have any suggestion or think this is not correct,
I would be pleased to
have your opinion.
Cheers,

Valerie





On 18/02/2013, at 14:04 PM, Pierre THIRIET wrote:

 Dear Valérie,
 
 If I remember well, your design includes:
 Isolation categories: 3 levels
 Sites: nested within Isolation categories (10 levels, a total of 30
sites)
 How many replicates per site and time?
 Time:? how many years you have? Only one sampling per year? Within
sites and years, samples were random or it is always exactly the same
area you 
sample (e.g. permanent quadrats)?
 
 for adonis, consider that strata is for constraining permutations,
which is different than terms in the formulae.
 
Exactly. The 'strata' only influence the permutations and have no effect
in formula nor effect defined in the formula.

Currently the 'strata' are the only way to constrain the permutations.
However, in the R-Forge version of vegan and in vegan 2.2-0 (to be
released in April) you
can give a permutation matrix as an input to adonis. You can generate the
permutation matrix with, say, shuffleSet function of the permute package.
This allows 
generation of restricted permutations for instance for time series. Vegan
command vegandocs(permutations) will open up the vignette of the
permute package 
for your inspection, and this will give some examples of defining
restricted permutations. At some timeframe we are completely moving to
the permute package,
but you can already use its permutation matrices as input with these new
and upcoming versions of vegan from R-Forge.

Cheers, Jari
--
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland
___
C'est l'année du Serpent ! Connaissez-vous votre signe astral chinois ?
Découvrez-le ici http://astrocenter.voila.fr/voila/Presentation.aspx?
product=StEdCH2K2Af=-3000
___
C'est l'année du Serpent ! Connaissez-vous votre signe astral chinois ?
Découvrez-le ici 
http://astrocenter.voila.fr/voila/Presentation.aspx?product=StEdCH2K2Af=-
3000

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] adonis and temporal changes

2013-02-18 Thread Steve Brewer
Valerie,

If I understand your design correctly, you're doing a repeated measures
analysis, in which isolation is a between-subjects (I.e., between-sites)
effect. Year and the year x isolation interaction are within-subjects
effects. Because repeated measurements on composition are being taken on
the same site in three years, you use strata to restrict the permutation
within each site as if site were were a random block containing the
different years of measurement. Accordingly, there should be two error
terms: site(isolation) to test the isolation main effect, and the
site*year(isolation), which in this case is equivalent to the residual
error, which is the appropriate error term for testing the year effect and
the year x isolation interaction. The test for isolation is wrong because
adonis cannot use more than one error term to test effect and thus is
using the residual error to test all effects. It should use the
site(isolation) term to test the isolation effect, but it does not. Using
the residual error to test the isolation effect amounts to
pseudoreplication. It assumes that the three measurements of composition
in different years on the same site are independent observations. They are
not. Often, however, people are not interested in the between-subjects
effects (in this case, the main effect of isolation). Rather they are
interested in the interaction with time (in this case, isolation x year).

I don't see that you are justified in pooling any term with the error term
just because it is not significant. Again, the problem is
pseudoreplication. You're treating correlated observations as if they were
independent observations. Pooling the isolation x year interaction with
the residual error term artificially inflates your error df even more.

I'm afraid I don't know R well enough to explain how to analyze the
covariate. 


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 2/18/13 1:49 PM, v_coudr...@voila.fr v_coudr...@voila.fr wrote:

Dear Steve,

Thank you very much. I do not exactly understand why the test for
isolation will be wrong, would you have some some explanation?
In a linear regression, you cannot assess the effect of single variable
if the interaction (in which your variable is part) is significant. So if
I get a significant result
for the isolation*year effect I should conclude that there is an
interaction between isolation and year. If the interaction is not
significant, should I drop it to get the
correct estimate for the year effect?
I would have an additional question: I have also an environemental
gradient (continuous, one value pro site, constant over the years). Is it
possible to include it?

Best wishes
Valerie


 Message du 18/02/13 à 15h41
 De : Steve Brewer
 A : v_coudr...@voila.fr, r-sig-ecology@r-project.org
 Copie à : 
 Objet : Re: [R-sig-eco] adonis and temporal changes
 
 Valerie,
 
 Adonis does not define fixed or random effects, and you therefore cannot
 define multiple error terms. However, if your model statement looks
 something like this - isolation*year + site, strata = site - then you
will
 get the correct test for the isolation x year interaction and the
correct
 test for the year effect. The test for isolation will be wrong, because
 the residual error is used to test all effects, when it is only
 appropriate for testing the year effect and the year * isolation
 interaction. The isolation between-subjects effect should be tested with
 the site effect but is not.
 
 The key point is here to make strata = site and to NOT specify the site-
 interactions with isolation or year. In this way, site will be treated
as
 a block for the within-subjects effects and thus could be considered a
 random effect.
 
 Hope this helps.
 
 
 J. Stephen Brewer
 Professor 
 Department of Biology
 PO Box 1848
 University of Mississippi
 University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
 FAX - 662-915-5144
 Phone - 662-915-1077
 
 
 
 
 On 2/18/13 8:19 AM, v_coudr...@voila.fr  wrote:
 
 Thank you for these explanations. If I put strata=site, this means that
 for each site my dissimilarity matrix of year 1 and year 2 will be
 permuted and the observed
 changes compared to these random permutation? Adding site as a fixed
 factor then ensure that I am testing changes in time site by site. Am I
 correct?
 
 To my design:
 I have 30 permanent sites, 10 of each category of isolation (Isolation
=
 factor with 3 levels: 3x10 sites = 30 sites). I conducted the samples
in
 three years in each
 site. I have thus 1 sampling (species composition) pro site pro year. I
 would like to know how the sampled communities change with time, either
 on a site basis,
 or at the level of isolation (I may compare multi-site dissimilarity
 among isolation levels between years).
 I am not really

Re: [R-sig-eco] Adonis and Random Effects

2013-02-04 Thread Steve Brewer
Erin,

There have been a lot of similar queries (e.g., repeated measures, nested
permanova). Jari can correct me if I am wrong, but as far as I know, no
one has developed a way to define multiple error terms in adonis.


You can use adonis, however, to get the split-plot effects. If you want to
make a grassland a random effect, use the following statement

adonis(formula = community_distance_matrix ~ Treatment + Grassland +
GrasslandPlot, strata = GrasslandPlot)


The treatment effect will be correct because the residual error term
(which is equivalent to treatment x GrasslandPlot interaction nested
within Grassland) is the correct error term. The Grassland effect,
however, will not be tested correctly because it is using the residual
error term when it should be using GrasslandPLot as the error term. You
can determine what the F stat for Grassland should be, however, using the
Ms Grassland and MS GrasslandPlot from the anova table to construct the F
test. You just won't get a p-value for the test.

If you want to treat Grassland as a fixed effect, the model is similar but
defines the interaction

adonis(formula = community_distance_matrix ~ Treatment*Grassland +
GrasslandPlot, strata = GrasslandPlot)


In this case, the treatment x grassland interaction will be tested
correctly, as will the treatment effect, but not the Grassland effect.

Unfortunately, you cannot just take averages of abundances across the
treatment and control in each plot and then do a separate analysis of
Grassland and GrasslandPLot (unless you're using Euclidean distances). I
suspect you're not using Euclidean distances.

Hope this helps some.

Good luck,
Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 2/4/13 1:14 AM, Erin Nuccio enuc...@gmail.com wrote:

Hello List,

Is adonis capable of modeling random effects?  I'm analyzing the impact
of a treatment on the microbial community in a split-plot design (2
treatments per plot, 4 plots per grassland, 3 grasslands total). I would
like to quantify how much of the variance is due to the Treatment versus
the Grassland.  It seems like Grassland should be a random effect, since
there are thousands of grasslands, and I'm only looking at 3.

I have tried to use the notation that works with lme4, and it's not
working for me (see below for formula and error messages).  If adonis
can't do random effects, are there any alternatives?  Or, considering my
goal, are there any other programs I should look into?  Any suggestions
would be highly appreciated!

Thanks for your help,
Erin



Here's what I think I should run:
adonis(formula = community_distance_matrix ~ Treatment + (1|Grassland) +
(1|GrasslandPlot), strata = GrasslandPlot)

Here are my factors:
'data.frame':  24 obs. of  4 variables:
 $ Treatment: Factor w/ 2 levels T1,T2: 1 1 1 1 1 2 2 2 1 1 ...
 $ Grassland: Factor w/ 3 levels G1,G2,G3: 3 3 1 1 1 2 2 1 2 2
...
 $ Plot : Factor w/ 4 levels P1,P2,P3,P4: 1 2 2 3 4 1 3 2
1 2 ...
 $ GrasslandPlot: Factor w/ 12 levels G1:P1,G1:P2,G1:P3..: 9 10 2 3
4 5 7 2 5 6 ...

And here's the error message:
Error in `contrasts-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
In addition: Warning messages:
1: In Ops.factor(1, Grassland) : | not meaningful for factors
2: In Ops.factor(1, GrasslandPlot) : | not meaningful for factors

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] summing F stats and permutation

2012-11-29 Thread Steve Brewer
Jari,

Thanks. This is helpful. I knew that RDA obtained predicted abundances for
each species, but I didn't know that function rda could be used to
calculate F statistics for each predictor for each species. That does seem
like a promising way to approach the problem. So the code you've sent me
is showing how to extract and sums the numerator and denominator SS and
then calculate the ratio from those sums and the dfs? Or is it showing how
to obtain Fs for each species and then summing them?

Incidentally, I was initially very skeptical of the sumF approach until I
started comparing sumF results to perMANOVA results (using PC-Ord) on
several different datasets of mine and got strikingly similar results
between the two techniques. It seems to go against what a lot of
statisticians are saying about the unique value of multivariate stats. I
don't have a satisfactory explanation for why it seems to work with my
data.

Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 11/29/12 9:43 AM, Jari Oksanen jari.oksa...@oulu.fi wrote:

Steve,

This is R, so it is not about whether this can be done, but how this can
be done. Unfortunately, doing exactly this requires some fluency in R.
Doing something similar is very simple.

The description of your problem sounds very much like the description of
permutation test in redundancy analysis (RDA). The difference is that in
RDA you sum up nominators and denominators before getting the ratio, but
in your model you sum up the ratios. So in RDA test you have (num_1 +
num_2 + ... + num_p)/(den_1 + den_2 + ... + den_p), and in your
description you have num_1/den_1 + num_2/den_2 + ... + num_p/den_p. The
former test in canned for instance in the vegan package, but the latter
you must develop yourself (and the former method of summing variances
instead of their ratios feels sounder). It would not be too many lines of
code to fit your code, though. Please note that RDA works by fitting
linear models for each species independently so that you can get all
needed statistics from a fitted RDA in the vegan package (function rda).
The following function extracts F-values by species from a fitted
vegan:::rda() result object:

spF -
function (object) 
{
   inherits(object, rda) || stop(needs an rda() result object)
   df1 - object$CCA$qrank
   df2 - nrow(object$CA$Xbar) - df1 - 1
   num - colSums(predict(object, type=working, model=CCA)^2)
   den - colSums(predict(object, type=working, model=CA)^2)
   (num/df1)/(den/df2)
}

HTH, Jari Oksanen

From: r-sig-ecology-boun...@r-project.org
[r-sig-ecology-boun...@r-project.org] on behalf of Steve Brewer
[jbre...@olemiss.edu]
Sent: 29 November 2012 16:42
To: r-sig-ecology@r-project.org
Subject: [R-sig-eco] summing F stats and permutation

Dear Colleagues,

I'm wondering if anyone in this group has developed code for doing a sumF
test for examining community responses in an experiment. For those not
familiar, sumF is a simple univariate alternative to MANOVA and perMANOVA,
wherein univariate ANOVAs and their associated F statistics are calculated
for each species' abundance and then the F statistic for each effect is
summed over all species. The significance of the resulting summed F
statistic is then evaluated using random permutation. The summed F
statistic
is interpreted as an overall community response to the treatments, whereas
the F statistic for each species provides a measure of the contribution
that
species makes to treatment differences.

I could envision a variety of ways in which this could be done in R, but
I'm
not adept enough in R to figure out how to do it myself. One possibility
might involve using permute or shuffle to get the randomized data
matrices,
but it is not clear to me how one could simultaneously calculate the
Anovas
for all species and sum the resulting F statistics for each random
permutation. There is no reason why traditional F statistics would have to
be used. Pseudo-F statistics based on distances for each species'
abundance
could be calculated instead and then summed across species.

PLEASE NOTE THAT I AM ALREADY AWARE OF THE OBJECTIONS TO THIS APPROACH TO
COMMUNITY ANALYSIS. Nevertheless, I am interested in pursuing this using
R,
if possible.

Any suggestions are welcomed.

Thanks,

Steve
J. Stephen Brewer
Professor
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144 Phone - 662-915-1077



[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r

Re: [R-sig-eco] Using residuals as dependent variables

2012-06-22 Thread Steve Brewer
Chris,

Another thing to keep in mind is that when you run the regression analysis
using residuals, as opposed to putting all predictors in the multiple
regression from the beginning (oceanographic data and productivity data),
you are in effect inflating the error df for the analysis of catch
residuals against productivity. In the multiple regression approach, one
df is removed from the error df for every predictor variable in the model.
When you run it as two separate analyses, as you propose, the df removed
from the error df in the first analysis (the one with oceanographic data)
are are put back in into the error df for the second analysis of catch
residuals vs productivity. This is usually not a big deal when the first
analysis contains only one or two predictors and lots of observations. But
when the reverse is true, you're more likely to get a significant
relationship between catch residuals and productivity even when none
really exists.

As others have suggested, why not put productivity and oceanographic data
together in a single mult reg model?

Hope this helps.

Steve



J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 6/21/12 12:06 PM, Chris Mcowen chrismco...@gmail.com wrote:

Dear List,

 

I am wondering if the methodological approach I am taking is correct and
would be very grateful if people could comment and make suggestions, much
appreciated.

 

I have developed the best model ( AIC model selection) using oceanographic
data ( i.e. SST, chlorophyll, NPP...x6) to predict reported fisheries
catch
for 52 countries.

 

I then extract the residuals from the model and anything positive has a
higher catch than would be predicted given the level of productivity in
the
country, with the opposite being true.

 

What I want to do is:

 

1.   Regress a suite of ecological and socioeconomic variables against
the residuals from the oceanographic model to determine which factors
cause
some countries to be above and some below. I.E as trophic level increase
the
residuals become increasingly negative.

2.   Ideally ( and I am unsure how or if it is possible) work out for
each country which variables/s cause the poor fit of that country to the
oceanographic model.

 

Thanks in advance for any suggestions / possible methods.

 

Chris 

 

P.S - Below is the type of conclusions I am drawing

 

There are a number of reasons why some countries have higher / lower catch
than you would expect.

 

For example if the target fishery is a high trophic level species then the
link between primary productivity and catch will be lesser than if the
species was a lower trophic level ( transfer efficiency etc etc)-
resulting
in a negative residual.  Alternatively it maybe that the area is being
overfished i.e. the north sea meaning more fish are being caught in that
region than it can sustain - resulting in a high positive residual (as
predicted by the model)

 

In reality it is likely a combination of this plus other, however some
factors will be relevant to others i.e. Somalia has a really low catch
compared to its productivity likely due to piracy and poor reporting of
statistics. 

 

 

 


   [[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] split-split-plot design and adonis

2012-03-08 Thread Steve Brewer
Kay,

Seems like a clever and reasonable approach, but I have a couple of
comments/questions.

First, it seems that with this approach, you cannot evaluate the sediment
x hydrology, the sediment x depth or the sediment x hydrology x depth
interactions. I'm not sure if Arnaud is interested in these interactions,
but one generally is in most split-plots.

Second, I tried your approach with my own data (which is just a split
plot). Perhaps I did it wrong, but the resulting anova seems wrong to me.
The residual df are too high, I think because the analysis is not taking
the strata variation and its df (in Arnaud's case, the site effect) out of
the error term.

Here's a suggestion. First, simplify Arnaud's design to a split plot
(removing the depth effect. In the species file, this will entail summing
abundances across depths for each site x hydrology combination.

Use a design predictor file as originally described by Arnaud (except
without the depth factor); I.e., site1, site2, ... site9, three sediment
types - sed1, sed2, sed3, and two hydrology types - dry, wet.

To analyze the split-plot effects, I would suggest the following

Adonis(community ~ sediment*hydrology + site, strata = site)

You should get an anova table that will give you the sediment main effect,
the hydrology main effect, the sediment*hydrology interaction, the site
effect (probably not of interest), and residual error. ***Note that the
sediment main effect is garbage here because it is tested with the
residual error term, which is not the correct error term and must be
disregarded; Your approach to getting it seems reasonable to me.*** The
hydrology effect and the hydrology*treatment interaction should be
correct, however, because both are tested with the residual (I.e.,
split-plot) error term.

For the split-split plot effects, you need to make a term equivalent to
the site x hydrology term (as described by Arnaud). That will be the
strata effect for the split-split plot (but not for the split-plot
effects). In this case, you'll use a species dataset in which the
abundances are not summed across depths for each site x hydro combination.
This is identical to the original species data file. In this case, I
suggest the following analysis:

Adonis(community ~ sediment*hydrology*depth + site_hydro, strata =
site_hydro).

Here, you disregard all results except those of depth and its interactions
with sediment and/or hydrology.

Another possible way to test the sediment main effect would be to create a
new species data file in which abundances are summed across all
hydrologies and depths for each site. The adonis statement is then simply:

Adonis(community ~ sediment)

I'd appreciate any feedback.

Thanks,
Steve
 


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/5/12 4:45 AM, Kay Cichini kay.cich...@gmail.com wrote:

Hi Arnaud and list-members,

what about the following:

library(vegan)

Sediment-as.factor(rep(c(Sed1,Sed2,Sed3),each=36))
Site-as.factor(rep(c(Site1,Site2,Site3,Site4,Site5,Site6,Sit
e7,Site8,Site9),each=12))
Hydrology-as.factor(rep(rep(c(Dry,Wet),each=6),9))
Depth-as.factor(rep(rep(c(D1,D2),each=3),18))

nobs = length(Site)

# depth and hydrology are nested within sites:
table(Site, Depth)
table(Site, Hydrology)

# so the following call is ok as it restricts permutation on sites -
# that is, levels of depth and hydrology are shuffled only within and not
# across sites:
set.seed(123)
community - matrix(rnorm(nobs * 3, 10, 1), nobs, 3)
adonis(community ~ Hydrology * Depth, perm = 999, strata = Site)

# sediment is different:
table(Site, Sediment)

# to allow for this structure you would need to shuffle sites as
# a whole. the below restriction will randomly assign levels of sediment
# to the sites - like this:
ctrl = permControl(strata = Site, permute.strata = TRUE)
table(Site, perm.Sediment = Sediment[permuted.index2(nobs, control =
ctrl)])

### to test sediment you'd need a customized code:
(fit - adonis(community ~ Sediment, permutations = 1))

### number of perms
B - 999

### setting up frame which will be populated by
### random r2 values:
pop - rep(NA, B + 1)

### the first entry will be the true r2:
pop[1] - fit$aov.tab[1, 5]

for(i in 2:(B+1)){
 idx - permuted.index2(nobs, control = ctrl)
 fit.rand - adonis(community ~ Sediment[idx], permutations = 1)
 pop[i] - fit.rand$aov.tab[1,5]
}

### get the p-value (H0: Sediment has no effect on location of
communities):
print(pval - sum(pop = pop[1]) / (B + 1))

### make a histogram to see random R2-values and the true one:
hist(pop, xlab = Population R2)
abline(v = pop[1], col = 2, lty = 3)

What do you think?
Kay


2012/3/4 Arnaud Foulquier arnaud.foulqu...@gmail.com

 I have run the two analysis. The only differences are the p-values that
 appear in the anova tables. Other values (Df, 

Re: [R-sig-eco] split-split-plot design and adonis

2012-03-08 Thread Steve Brewer
I just realized a mistake in my suggested approach. See below in CAPS.
J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/8/12 1:07 PM, Steve Brewer jbre...@olemiss.edu wrote:

Kay,

Seems like a clever and reasonable approach, but I have a couple of
comments/questions.

First, it seems that with this approach, you cannot evaluate the sediment
x hydrology, the sediment x depth or the sediment x hydrology x depth
interactions. I'm not sure if Arnaud is interested in these interactions,
but one generally is in most split-plots.

Second, I tried your approach with my own data (which is just a split
plot). Perhaps I did it wrong, but the resulting anova seems wrong to me.
The residual df are too high, I think because the analysis is not taking
the strata variation and its df (in Arnaud's case, the site effect) out of
the error term.

Here's a suggestion. First, simplify Arnaud's design to a split plot
(removing the depth effect. In the species file, this will entail summing
abundances across depths for each site x hydrology combination.

UNFORTUNATELY, A SIMPLE SUMMING OF THE RAW ABUNDANCES WON'T WORK HERE
(UNLESS YOU'RE USING EUCLIDEAN DISTANCES, BUT MOST FOLKS ARE USING
BRAY-CURTIS DISTANCES FOR SPECIES COMPOSITION DATA).

Use a design predictor file as originally described by Arnaud (except
without the depth factor); I.e., site1, site2, ... site9, three sediment
types - sed1, sed2, sed3, and two hydrology types - dry, wet.

To analyze the split-plot effects, I would suggest the following

Adonis(community ~ sediment*hydrology + site, strata = site)



You should get an anova table that will give you the sediment main effect,
the hydrology main effect, the sediment*hydrology interaction, the site
effect (probably not of interest), and residual error. ***Note that the
sediment main effect is garbage here because it is tested with the
residual error term, which is not the correct error term and must be
disregarded; Your approach to getting it seems reasonable to me.*** The
hydrology effect and the hydrology*treatment interaction should be
correct, however, because both are tested with the residual (I.e.,
split-plot) error term.

THIS IS ONLY CORRECT IF IT IS INDEED A SPLIT-PLOT DESIGN AND NOT
SPLIT-SPLIT PLOT.


For the split-split plot effects, you need to make a term equivalent to
the site x hydrology term (as described by Arnaud). That will be the
strata effect for the split-split plot (but not for the split-plot
effects). In this case, you'll use a species dataset in which the
abundances are not summed across depths for each site x hydro combination.
This is identical to the original species data file. In this case, I
suggest the following analysis:

Adonis(community ~ sediment*hydrology*depth + site_hydro, strata =
site_hydro).

Here, you disregard all results except those of depth and its interactions
with sediment and/or hydrology.

THIS WILL WORK. SO, IF YOU'RE ONLY INTERESTED IN THE EFFECTS OF DEPTH AND
ITS INTERACTIONS WITH SEDIMENT AND/OR HYDROLOGY, THEN YOU COULD DO THIS.

Another possible way to test the sediment main effect would be to create a
new species data file in which abundances are summed across all
hydrologies and depths for each site. The adonis statement is then simply:

Adonis(community ~ sediment)

NOPE. THIS WON'T WORK. CANNOT SUM THE ABUNDANCES.

EITHER MUST BE DONE THE WAY KAY SUGGESTED OR ONE MUST DO A PCO AND
CALCULATE DISTANCES FROM CENTROIDS.

SORRY.

I'd appreciate any feedback.

Thanks,
Steve
 


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/5/12 4:45 AM, Kay Cichini kay.cich...@gmail.com wrote:

Hi Arnaud and list-members,

what about the following:

library(vegan)

Sediment-as.factor(rep(c(Sed1,Sed2,Sed3),each=36))
Site-as.factor(rep(c(Site1,Site2,Site3,Site4,Site5,Site6,Si
t
e7,Site8,Site9),each=12))
Hydrology-as.factor(rep(rep(c(Dry,Wet),each=6),9))
Depth-as.factor(rep(rep(c(D1,D2),each=3),18))

nobs = length(Site)

# depth and hydrology are nested within sites:
table(Site, Depth)
table(Site, Hydrology)

# so the following call is ok as it restricts permutation on sites -
# that is, levels of depth and hydrology are shuffled only within and not
# across sites:
set.seed(123)
community - matrix(rnorm(nobs * 3, 10, 1), nobs, 3)
adonis(community ~ Hydrology * Depth, perm = 999, strata = Site)

# sediment is different:
table(Site, Sediment)

# to allow for this structure you would need to shuffle sites as
# a whole. the below restriction will randomly assign levels of sediment
# to the sites - like this:
ctrl = permControl(strata = Site, permute.strata = TRUE)
table(Site, perm.Sediment = Sediment[permuted.index2(nobs

Re: [R-sig-eco] split-split-plot design and adonis

2012-03-08 Thread Steve Brewer
Edi and Kay,

Assuming one is using Bray-Curtis distances for the species composition
matrix, how are the bc distances among the larger plots (e.g., whole plots
or sites) or the split-plots (site/hydro combos) being calculated here?

I understand that one can use permutation to shuffle the sites as a whole,
but in order to get Sums of Squares for sites to test the sediment effect,
don't you need to be able to calculate distances among sites? How do you
do this if sites are subdivided into hydrologies (and hydrologies
subdivided into depths). You cannot just add up or average the raw
abundances.

Steve

 
J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/5/12 5:44 AM, Eduard Szöcs szoe8...@uni-landau.de wrote:

Also have a look at the new permute package:
http://cran.r-project.org/web/packages/permute/index.html

AFAIK permute is not fully hooked up in vegan at the moment (at least
not in adonis),
but you could hack the adonis function to use shuffle() instead of
permuted.index().

The permutation design, as proposed by Kay, would then look like this:

ctrl - permControl(strata = Site,
 within = Within(type = none),
 blocks = Blocks(type = free))
table(Site, perm.Sediment = Sediment[shuffle(nobs, control = ctrl)])


HTH,

Edi

  




On 05/03/12 11:45, Kay Cichini wrote:
 Hi Arnaud and list-members,

 what about the following:

 library(vegan)

 Sediment-as.factor(rep(c(Sed1,Sed2,Sed3),each=36))
 
Site-as.factor(rep(c(Site1,Site2,Site3,Site4,Site5,Site6,Si
te7,Site8,Site9),each=12))
 Hydrology-as.factor(rep(rep(c(Dry,Wet),each=6),9))
 Depth-as.factor(rep(rep(c(D1,D2),each=3),18))

 nobs = length(Site)

 # depth and hydrology are nested within sites:
 table(Site, Depth)
 table(Site, Hydrology)

 # so the following call is ok as it restricts permutation on sites -
 # that is, levels of depth and hydrology are shuffled only within and
not
 # across sites:
 set.seed(123)
 community- matrix(rnorm(nobs * 3, 10, 1), nobs, 3)
 adonis(community ~ Hydrology * Depth, perm = 999, strata = Site)

 # sediment is different:
 table(Site, Sediment)

 # to allow for this structure you would need to shuffle sites as
 # a whole. the below restriction will randomly assign levels of sediment
 # to the sites - like this:
 ctrl = permControl(strata = Site, permute.strata = TRUE)
 table(Site, perm.Sediment = Sediment[permuted.index2(nobs, control =
ctrl)])

 ### to test sediment you'd need a customized code:
 (fit- adonis(community ~ Sediment, permutations = 1))

 ### number of perms
 B- 999

 ### setting up frame which will be populated by
 ### random r2 values:
 pop- rep(NA, B + 1)

 ### the first entry will be the true r2:
 pop[1]- fit$aov.tab[1, 5]

 for(i in 2:(B+1)){
   idx- permuted.index2(nobs, control = ctrl)
   fit.rand- adonis(community ~ Sediment[idx], permutations = 1)
   pop[i]- fit.rand$aov.tab[1,5]
 }

 ### get the p-value (H0: Sediment has no effect on location of
communities):
 print(pval- sum(pop= pop[1]) / (B + 1))

 ### make a histogram to see random R2-values and the true one:
 hist(pop, xlab = Population R2)
 abline(v = pop[1], col = 2, lty = 3)

 What do you think?
 Kay


 2012/3/4 Arnaud Foulquierarnaud.foulqu...@gmail.com

 I have run the two analysis. The only differences are the p-values that
 appear in the anova tables. Other values (Df, SumsOfSqs, MeanSqs,
F.Model)
 are not modified because they are calculated on the original (not
 permuted) data.

 The p-value is calculated by comparing the value of F obtained with the
 original data to the distribution obtained by permutations. If i
understand
 well, the strata argument is used to constrain the permutations of the
 original data within the groups specified in the strata argument. The
 distribution of the F-statistic that is generated by permutations is
not
 the
 same according to whether the strata argument is specified or not.
This is
 why the p-values are modified when the strata argument is specified.

 However, I'm not sure that strata=Site_Hydro correctly specify the
nested
 structure of my data.





 --
 View this message in context:
 
http://r-sig-ecology.471788.n2.nabble.com/split-split-plot-design-and-ad
onis-tp7332029p7342690.html
 Sent from the r-sig-ecology mailing list archive at Nabble.com.

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

  [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] split-split-plot design and adonis

2012-03-01 Thread Steve Brewer
Arnaud,

Have you already tried to run the analysis using the Site_Hydro
interaction as the strata effect? If so, can you show us the anova table?

One thing I'm wondering about is how adonis is dealing with the apparent
nesting of sites within sediment types. From your description, site does
not appear to me to be equivalent to a block at the level of sediment
type (although it does appear so at the level of hydrology).

I've not been successful at doing either a split plot or a repeated
measures using Adonis. I don't know whether it is because I have set up
the data frame containing the factors incorrectly or whether the arguments
or models in adonis I'm using are incorrect, but the resulting anovas I
get always show the residual error term being used to test all the
effects, which is incorrect and not what I want.

Best,
Steve

 
J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-915-1077




On 3/1/12 3:31 AM, Arnaud Foulquier arnaud.foulqu...@gmail.com wrote:

Dear colleagues,

I¹m working on a data set in order to evaluate the impact of drying on
sediment microbial community structure. The objective is to determine if
the
impact of drying varies among sediment types and/or depth within the
sediment.

The experimental design is as follows: The first factor *Sediment*
corresponds to three types of sediment (coded Sed1, Sed2, Sed3). For each
type of Sediment, sampling was carried out on three sites (3 sites for
Sed1,
3 sites for Sed2, 3 sites for Sed3). *Site* is coded : Site1, Site2, ...,
Site9. The next factor is *Hydrology*: within each site, sampling is
carried
out in a dry plot and in a wet plot (coded Dry /Wet). Within each of the
previous plot, sampling is carried out at two *Depths* (D1, D2) in
triplicate.

There are a total of n = 108 samples = 3 Sediment * 3 Sites * 2 Hydrology
*
2 Depths * 3 Replicates.

I use the adonis function in the following way :

Sediment-as.factor(rep(c(Sed1,Sed2,Sed3),each=36))
Site-as.factor(rep(c(Site1,Site2,Site3,Site4,Site5,Site6,Sit
e7,Site8,Site9),each=12))
Hydrology-as.factor(rep(rep(c(Dry,Wet),each=6),9))
Depth-as.factor(rep(rep(c(D1,D2),each=3),18))

adonis(community~Sediment*Hydrology*Depth,perm=999,strata=Site)

I need to specify strata=Site to account for the stratified structure of
the
design.
However, I think it is incomplete and that I also should specify that
Hydrology is nested in Site (I use the lme function in the univariate case
as follows :
http://stats.stackexchange.com/questions/19818/split-split-plot-design-and
-lme)

I tried the following :

Site_Hydro-interaction(Site,Hydro)
adonis(community~Sediment*Hydrology*Depth,perm=999,strata=Site_Hydro)

Is it the correct way to account for the stratified structure of my data?

Any help on this question would be much appreciated.

Arnaud





--
View this message in context:
http://r-sig-ecology.471788.n2.nabble.com/split-split-plot-design-and-adon
is-tp7332029p7332029.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] repeated measures permutation MANOVA

2012-02-29 Thread Steve Brewer
Hey Folks,

I am new to R and am trying to modify my multivariate eco-statistics course
by giving my students the option of doing all of the assignments in R.

One of the exercises I've been doing in Primer, but would like to do in R,
is a repeated measures permutation MANOVA, wherein I examine how plant
species composition responds to a restoration treatment over time. The
experimental design is pretty simple: a CRD with 10 treated plots and 10
control plots (each plot is considered equivalent to a random subject)
with composition measured in two different years, similar to a simple BACI
study, except that I have two after censuses (1 year after the treatment
was initiated and 3 years after the treatment was initiated) and no before
census of the vegetation.

I thought I might be able to do this using adonis and the strata argument,
but it is not clear to me how to nest plot within treatment but not within
year so that the treatment effect is tested with between-subjects error
while year and its interaction with treatment are tested with
within-subjects error.

I'm happy to provide more details if necessary.

Thanks,
Steve


J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144 Phone - 662-915-1077



[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] Multivariate repeated measures ANOVA w 2 bs factors

2012-01-03 Thread Steve Brewer
Hey folks,

I am new to R and am trying to modify my multivariate eco-statistics course
by giving my students the option of doing all of the assignments in R.
Currently, all the assignments are done in either JMP, PC-Ord, or Primer.

One difficulty I'm having is figuring out how to do a multivariate repeated
measures ANOVA when there are two between-subjects factors. In other words,
I cannot figure out how to analyze a factorial experiment (clipping and
nutrient addition and their interaction) with three years of measurement of
shoot densities of a grass using a multivariate approach. I currently have
my students use JMP to do this and compare the results to univariate
approaches (I.e., assumed sphericity, G-G/H-F adjusted error df).

I have found examples using univariate approaches (e.g., using aov) and have
figured out how to use gls to address issues pertaining to the structure of
variance-covariance matrix. I have also found examples using multmodel and
Anova to do mv rm ANOVA in which there is a single between-subjects factor
and multiple time measurements. But I cannot find any examples of the latter
in which there are at least two between-subjects factors. Maybe the answer
is obvious, but my understanding of R is still very rudimentary.

Please understand that the purpose of this request is not necessarily to
find the best way to analyze the data, but to instruct students about the
use of a multivariate approach. If necessary, I could simplify the design,
but would prefer to keep the design I have, if possible.

Thanks,
Steve



J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144 Phone - 662-915-1077



[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology