Re: [R-sig-eco] repeated measures NMDS?

2010-11-11 Thread Kay Cecil Cichini

gavin,

sorry - of course it should be permute.strata=F, permuting within 
individual sites!

but despite of this the code should work, doesn't it?

thanks,
kay



Gavin Simpson schrieb:

On Wed, 2010-11-10 at 23:33 +0100, Kay Cecil Cichini wrote:

hi eduard,

i faced similar problems recently and came to the below solution.
i only try to address the pseudoreplication with an appropiate  
permutation scheme.

when it comes to testing the interactions, things may get more complicated.

the code is in no way approven of, but at least it maybe good enough  
for a starter.


best,
kay


Hi Kay,

I don't think you have this right.




If you have measured repeatedly, say 5 times, on the same 10
individuals, or if you have ten fields and you take 5 quadrats from
each, you need to permute *within* the individuals/fields, not permute
the individuals/fields which is what permute.strata does. permute.strata
would be useful in evaluating factors that vary at the block
(individuals/fields) level, not at the sample levels.

From what Eduard and you describe, the code you show is not the correct
permutation. But I may have misunderstood your intention.

Also, be careful with permuted.index2 - there are reasons why it hasn't
been integrated (design goals changed and we felt it would work best in
a separate package that others could draw upon without loading all of
vegan) and the code has festered a bit and may contain bugs - buyer
beware!

G


library(vegan)

### species matrix with 5 sp.
### one env.variable
### a factor denoting blocks of repeated measurments

sp-matrix(runif(24*3*5,0,100),24*3,5)
env-rnorm(24*3,10,2)
rep.mes-gl(24,3)

### NMDS:
sol-metaMDS(sp,trymax=5)

fit-envfit(sol~env)
plot(sol)
plot(fit)

### testing code for appropiate randomization,
### permuting blocks of 3 as a whole:
permuted.index2(nrow(sp),permControl(strata = rep.mes,permute.strata=T))


correctly, this should say:
### testing code for appropiate randomization,
### permuting within sites:
permuted.index2(nrow(sp),permControl(strata = rep.mes))


B=4999

### setting up frame for population of r2 values:
pop-rep(NA,B+1)
pop[1]-fit$vectors$r



and:

### loop:
for(i in 2:(B+1)){
fit.rand-envfit(sol~env[permuted.index2(nrow(sp),
  permControl(strata = rep.mes))])
   pop[i]-fit.rand$vectors$r
}

### p-value:

 print(pval-sum(poppop[1])/B+1)

here a bracket was missing:
print(pval-sum(poppop[1])/(B+1))



### compare to anti-conservative p-value from envfit(),
### not restricting permutations:

envfit(sol~env,perm=B)


Zitat von Eduard Szöcs szoe8...@uni-landau.de:


Thanks, that helped.

permuted.index2() generates these types of permutations. But  
envfit() does not use this yet.
What if I modify vectorfit() (used by envfit() ) in such a way that  
it uses permuted.index2() instead of permuted.index()?



Eduard Szöcs





Am 08.11.2010 22:01, schrieb Gavin Simpson:

On Mon, 2010-11-08 at 15:39 +0100, Eduard Szöcs wrote:

Hi listers,

I have species and environmental data for 24 sites that were sampled
thrice. If I want to analyze the data with NMDS I could run metaMDS on
the whole dataset (24 sites x 3 times = 72) and then fit environmental
data, but this would be some kind of pseudoreplication given that the
samplings are not independent and the gradients may be overestimated,
wouldn`t it?

For environmental data a factor could be included for the sampling
dates - but this would not be possible for species data.

Is there an elegant way either to aggregate data before ordination or
to conduct sth. like a repeated measures NMDS?

Thank you in advance,
Eduard Szöcs

Depends on how you want to fit the env data - the pseudo-replication
isn't relevant o the nMDS. If you are doing it via function `envfit()`,
then look at argument `'strata'` which should, in your case, be set to a
factor with 24 levels. This won't be perfect because your data are a
timeseries and, strictly, one should permute them whilst maintaining
their ordering in time, but as yet we don't have these types of
permutations hooked into vegan.

If you are doing the fitting some other way you'll need to include
site as a fixed effect factor to account for the within site
correlation.

You don't need to worry about the species data and accounting for
sampling interval. You aren't testing the nMDS axes or anything like
that, and all the species info has been reduced to dissimilarities and
thence to a set of nMDS coordinates. You need to account for the pseudo
rep at the environmental modelling level, not the species level.

HTH

G


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




___
R-sig-ecology mailing list
R-sig-ecology@r-project.org

Re: [R-sig-eco] repeated measures NMDS?

2010-11-11 Thread Kay Cecil Cichini

thanks a lot for the illustrative example.

..referring to your quote:

...This of course doesn't account for any temporal correlation within 
sites nor, if the observations on the 24 blocks were made at the same 
times, that you might want to have the same permutation within each block.

In the former there are 3^24 possible permutations (time series within
blocks), so 999 random perms is reasonable, *but* some of these random
perms (in permuted.index()) will not respect the temporal ordering and
thus you aren't really exploring the correct NULL.
With the latter constraint (same temporal perm within blocks), there are
3 random permutations, so good luck getting a reasonable p-value from
that.

..so for the time beeing we assume the former case -
and for the latter there is no way out.

yours,
kay

ps: in germany/austria there are two alternative spellings for the name 
kai/kay - beeing a male name opposed to the english kay.




Gavin Simpson schrieb:

[Apologies - I replied with this only to Kay. Hopefully she won't mind
receiving it twice!]

On Thu, 2010-11-11 at 10:32 +0100, Kay Cecil Cichini wrote:

thanks a lot for your elaborations.

of course, envfit(..,strata=rep.mes) does it.

then, at least, i consider it exercise for other cases, were you really 
might need a handmade permutation


so, as to round this off, i actually can't analyse this very design in 
such a way, with the right NULL concerned - but were to go from here?


You could hook up the code in 'permute'. It contains a new
permuted.index() function (and currently no NAMESPACE, so will overwrite
(mask) the vegan version if loaded after vegan) which will break all the
permutation code in vegan).

Here is your example, modified to use the code in permute. I post this
to illustrate how you'd use the new code. There are lots of examples
in ?permuted.index (for the permute package, not the vegan package
version), but *don't* touch the permutation t-test example code as it
uses permCheck() and it might call allPerms() and allPerms() *IS*
*WRONG* for some designs --- this is the last bit I need to fix/get
working before we can make our first release of this code.

HTH

G

Here's the example script:

## Load packages
require(vegan)
require(permute)

## Data
set.seed(123)
sp - matrix(runif(24*3*5, 0, 100), nrow = 24 * 3, ncol = 5)
env - rnorm(24*3, 10, 2)
rep.mes - gl(24, 3)

### NMDS:
sol - metaMDS(sp, trymax = 5)
fit - envfit(sol~env, permutations = 0) ## perms now won't work!

B - 999 ## number of perms

### setting up frame for population of r2 values:
pop - rep(NA, B + 1)
pop[1] - fit$vectors$r

## set-up a Control object:
ctrl - permControl(strata = rep.mes,
within = Within(type = series, mirror = FALSE))
## we turn off mirroring as time should only flow in one direction

## Number of observations
nobs - nrow(sp)

## check it works
matrix(permuted.index(nobs, control = ctrl), ncol = 3, byrow = TRUE)
## Yep - Phew!!!

### loop:
set.seed(1)
for(i in 2:(B+1)){
idx - permuted.index(nobs, control = ctrl)
fit.rand - envfit(sol ~ env[idx], permutations = 0)
pop[i] - fit.rand$vectors$r
}

### p-value:
pval - sum(pop = pop[1]) / (B + 1)
pval

I get:


pval

[1] 0.286

Now to compare with the actual permutation you'd have gotten from
env.fit, you first need:

detach(package:permute)

Then run:

set.seed(1)
fit2 - envfit(sol~env, permutations = 999, strata = rep.mes)
fit2


***VECTORS

  NMDS1   NMDS2 r2 Pr(r)
env 0.28727 0.95785 0.0315  0.321
P values based on 999 permutations, stratified within strata.

a simplistic approach could be, averaging sites, yielding n=24 for 
further testing.


yours,
kay


Gavin Simpson schrieb:

On Thu, 2010-11-11 at 09:50 +0100, Kay Cecil Cichini wrote:

gavin,

sorry - of course it should be permute.strata=F, permuting within 
individual sites!

but despite of this the code should work, doesn't it?

Yes, it should - i.e the permutation will be random within the blocks.
Whether it does or not is another matter entirely. AFAICR, this option
in permuted.index2() did work.

*But*, this is doing exactly what the original permuted.index() does if
you set argument 'strata' to be the grouping factor - in your case:

envfit(sol ~ env, strata = rep.mes, perm = 999)

for example. So there is no need to code up the analysis by hand.

This of course doesn't account for any temporal correlation within sites
nor, if the observations on the 24 blocks were made at the same times,
that you might want to have the same permutation within each block.

In the former there are 3^24 possible permutations (time series within
blocks), so 999 random perms is reasonable, *but* some of these random
perms (in permuted.index()) will not respect the temporal ordering and
thus you aren't really exploring the correct NULL.

With the latter constraint (same temporal perm within blocks), there are
3 random permutations, so good luck getting a reasonable p-value from
that.

The two restricted permutations 

Re: [R-sig-eco] repeated measures NMDS?

2010-11-11 Thread Gavin Simpson
On Thu, 2010-11-11 at 13:03 +0100, Kay Cecil Cichini wrote:
 thanks a lot for the illustrative example.
 
 ..referring to your quote:
 
 ...This of course doesn't account for any temporal correlation within 
 sites nor, if the observations on the 24 blocks were made at the same 
 times, that you might want to have the same permutation within each block.
 In the former there are 3^24 possible permutations (time series within
 blocks), so 999 random perms is reasonable, *but* some of these random
 perms (in permuted.index()) will not respect the temporal ordering and
 thus you aren't really exploring the correct NULL.
 With the latter constraint (same temporal perm within blocks), there are
 3 random permutations, so good luck getting a reasonable p-value from
 that.
 
 ..so for the time beeing we assume the former case -
 and for the latter there is no way out.

Yes - for the case of wanting the same temporal permutation within each
block there only are 3 permutations (6 if you allow time to go backwards
[mirror = TRUE]), but this includes the observed ordering, so only 2 (5)
other permutations to try. This is where permutation tests fail. If the
observed statistic is bigger than the statistic from the two random
permutations, this is an exact statistic in the sense that you've
evaluated all possible orderings consistent with the null, but all you
can is that the p-value is p  0.333.

Having said that, we can perhaps try to be reasonable and relax some of
the assumptions (how often do our data fully meet all the assumptions of
the parametric statistical approaches we use?) and be happy with a null
that respects temporal autocorrelation within block, but not across
blocks. One might then choose to accept as a significant result a
permutation p that is say p = 0.01 or even p = 0.001, rather than the
usual p = 0.05, to help guard against using the wrong Null.

 yours,
 kay
 
 ps: in germany/austria there are two alternative spellings for the name 
 kai/kay - beeing a male name opposed to the english kay.

I am truly sorry for my mistake - please accept my apologies. Totally
unintentional I assure you.

All the best,

G

 
 Gavin Simpson schrieb:
  [Apologies - I replied with this only to Kay. Hopefully she won't mind
  receiving it twice!]
  
  On Thu, 2010-11-11 at 10:32 +0100, Kay Cecil Cichini wrote:
  thanks a lot for your elaborations.
 
  of course, envfit(..,strata=rep.mes) does it.
 
  then, at least, i consider it exercise for other cases, were you really 
  might need a handmade permutation
 
  so, as to round this off, i actually can't analyse this very design in 
  such a way, with the right NULL concerned - but were to go from here?
  
  You could hook up the code in 'permute'. It contains a new
  permuted.index() function (and currently no NAMESPACE, so will overwrite
  (mask) the vegan version if loaded after vegan) which will break all the
  permutation code in vegan).
  
  Here is your example, modified to use the code in permute. I post this
  to illustrate how you'd use the new code. There are lots of examples
  in ?permuted.index (for the permute package, not the vegan package
  version), but *don't* touch the permutation t-test example code as it
  uses permCheck() and it might call allPerms() and allPerms() *IS*
  *WRONG* for some designs --- this is the last bit I need to fix/get
  working before we can make our first release of this code.
  
  HTH
  
  G
  
  Here's the example script:
  
  ## Load packages
  require(vegan)
  require(permute)
  
  ## Data
  set.seed(123)
  sp - matrix(runif(24*3*5, 0, 100), nrow = 24 * 3, ncol = 5)
  env - rnorm(24*3, 10, 2)
  rep.mes - gl(24, 3)
  
  ### NMDS:
  sol - metaMDS(sp, trymax = 5)
  fit - envfit(sol~env, permutations = 0) ## perms now won't work!
  
  B - 999 ## number of perms
  
  ### setting up frame for population of r2 values:
  pop - rep(NA, B + 1)
  pop[1] - fit$vectors$r
  
  ## set-up a Control object:
  ctrl - permControl(strata = rep.mes,
  within = Within(type = series, mirror = FALSE))
  ## we turn off mirroring as time should only flow in one direction
  
  ## Number of observations
  nobs - nrow(sp)
  
  ## check it works
  matrix(permuted.index(nobs, control = ctrl), ncol = 3, byrow = TRUE)
  ## Yep - Phew!!!
  
  ### loop:
  set.seed(1)
  for(i in 2:(B+1)){
  idx - permuted.index(nobs, control = ctrl)
  fit.rand - envfit(sol ~ env[idx], permutations = 0)
  pop[i] - fit.rand$vectors$r
  }
  
  ### p-value:
  pval - sum(pop = pop[1]) / (B + 1)
  pval
  
  I get:
  
  pval
  [1] 0.286
  
  Now to compare with the actual permutation you'd have gotten from
  env.fit, you first need:
  
  detach(package:permute)
  
  Then run:
  set.seed(1)
  fit2 - envfit(sol~env, permutations = 999, strata = rep.mes)
  fit2
  
  ***VECTORS
  
NMDS1   NMDS2 r2 Pr(r)
  env 0.28727 0.95785 0.0315  0.321
  P values based on 999 permutations, stratified within strata.
  
  a simplistic approach could be, 

[R-sig-eco] Fit log-series to species-rank abundance

2010-11-11 Thread Guillem Chust
Hello,

 

Does anybody know how to estimate abundances of the i-th ranked species
under the Fisher's log-series model?

In vegan package, the radfit function perform this but for other models
(log-normal, ...). 

Also in vegan, the fisherfit function estimates the model parameters,
however I could not obtain the fitted values of abundance for each
species.

 

Sincerely,

 

Guillem Chust

 


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] ANOSIM in vegan

2010-11-11 Thread Jari Oksanen
On 12/11/10 02:23 AM, Soumi Ray soumira...@gmail.com wrote:

 Hi,
 
 I have a dataset consisting of species collected from the same location
 during 2 time periods - i want to see if the community composition is
 similar during the two time periods. My entire dataset is presence/absence
 (0/1) data. There are around 23 species and 400 samples (during each time
 period, so a total of 800 samples). Will ANOSIM from the vegan package be an
 right test to apply? I was going through some papers online where they have
 used methods like db-RDA in similar situations. Would it be right to use it
 for qualitative data? Any suggestion would be of great help.

Soumi,

I only comment the db-RDA/anosim choice: if you can use one, you can use the
other. They are very similar and have the same limitation and assumptions.
Both are based on dissimilarity measures, and you can use the same
dissimilarities in both methods. They also handle the dissimilarities very
similarly. Overall tests for db-RDA by terms (as implemented in anova(...,
by = term) for vegan::capscale) and adonis tests give very similar
results. However, they are not identical. The difference is that for
non-Euclidean dissimilarities you will have some negative eigenvalues. These
are ignored in db-RDA (capscale), but they are taken into account in adonis.
Which method to use depends on your questions, and what else you want to do
with your data than get the test statistics.

Cheers, Jari Oksanen

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology