Dear List Members,

I am using the 'cocorresp' package and trying to compare several pairs of
communities. My goal would be to see which of several communities best
predicts a final community. Say, for instance, that I have four communities
A, B, C and D. I want to know which community best predicts D. To further
complicate things, I have more data points for some communities than
others. For instance communities A and B and D have 100 sites associated
with them, while C only was measured at 75 sites (lets imagine it wasn't
measured in the last 25 sites). So I run

A.pred <- coca(x =A, y = D, method = “pred”, reg.method = “eigen” n.axes =
10)
B.pred <- coca(x =B, y = D, method = “pred”, reg.method = “eigen” n.axes =
10)
C.pred <- coca(x =C, y = D[1:75,], method = “pred”, reg.method = “eigen”
n.axes = 10)

I then apply crossvalidation, as in (Braak & Schaffers 2004).

A.crossval <- crossval(x = A.pred, n.axes = 10)
A.cvax <- which.max(A.crossval$CVfit)
A.cvmax <- max(A.crossval$CVfit)

B.crossval <- crossval(x = B.pred, n.axes = 10)
B.cvax <- which.max(B.crossval$CVfit)
B.cvmax <- max(B.crossval$CVfit)

C.crossval <- crossval(x = C.pred, n.axes = 10)
C.cvax <- which.max(C.crossval$CVfit)
C.cvmax <- max(C.crossval$CVfit)

Thus X.cvax is the number of axes that give the best crossvalidation score
for community X predicting community D. X.cvax max is that best
crossvalidation score.  I think that if X.cvmax is positive, that suggests
that the model can predict come fraction of the structure of community D
from community X (where X is the community (A, B or C) in question) (as
suggested in Schaffers et al 2008).

Now, I want to know whether A, B or C best predicts D. Because C is only 75
data points, I expect A.cvmax and B.cvmax should be inflated relative to
C.cvmax, so I shouldn't be able to use these values to cross compare
predictive power.

One approach that I think might work is to run permutest to generate
percent fits for each axis that crossvalidation suggests I should include
in the model.

A.perm <- permutest(mo.pred, n.axes = A.cvax)
B.perm <- permutest(mo.pred, n.axes = B.cvax)
C.perm <- permutest(mo.pred, n.axes = C.cvax)

I can then ask what percentage of the community of X predicts what percent
of community Y using the pcent.fit value from the permutest objects. I
think this value is independent of sample size, but am not sure. Does
anybody else know? I could either do this for the first axis, or for the
sum all axes in the model. I'm not sure if the second option is legitimate

Option 1
A.ax1 <- A.perm$pcent.fit[1]
B.ax1 <- B.perm$pcent.fit[1]
C.ax1 <- C.perm$pcent.fit[1]

Option 2
A.all <- sum(A.perm$pcent.fit[1:A.cvax])
B.all <- sum(A.perm$pcent.fit[1:B.cvax])
C.all <- sum(B.perm$pcent.fit[1:C.cvax])

So my first question is, when comparing communities with different but
overlapping sample sizes(all 75 sample sites in C are the same sample sites
as the first 75 in A and B), if

A.all >B.all > C.all

or if A.ax1 > B.ax1 > C.ax1,

can I then say that A is the best predictor community for community D?

More specifically, is this approach a legitimate way of comparing
communities with different sample sizes in which the sample locations
overlap. If not is there a better way of making such a comparison? Perhaps
the co-correspondence method is not appropriate for making this kind of
comparison. I notice that most literature to date uses co-correspondence in
conjunction with cca to determine whether a different community or
environmental parameters best predict a given community. If
co-correspondence is not appropriate, could somebody explain why and
suggest a better method?

References:

 Braak CJF ter, Schaffers AP. (2004). Co-Correspondence Analysis: A New
Ordination Method to Relate Two Community Compositions. Ecology 85:834–846.

Schaffers AP, Raemakers IP, Sýkora KV, Braak CJF ter. (2008). Arthropod
Assemblages Are Best Predicted by Plant Species Composition. Ecology
89:782–794.

 Thanks everyone for your consideration in this matter.

Sincerely,

Jacob Cram

Graduate Student
Department of Biological Sciences
University of Southern California

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Reply via email to