Dear List Members, I am using the 'cocorresp' package and trying to compare several pairs of communities. My goal would be to see which of several communities best predicts a final community. Say, for instance, that I have four communities A, B, C and D. I want to know which community best predicts D. To further complicate things, I have more data points for some communities than others. For instance communities A and B and D have 100 sites associated with them, while C only was measured at 75 sites (lets imagine it wasn't measured in the last 25 sites). So I run
A.pred <- coca(x =A, y = D, method = pred, reg.method = eigen n.axes = 10) B.pred <- coca(x =B, y = D, method = pred, reg.method = eigen n.axes = 10) C.pred <- coca(x =C, y = D[1:75,], method = pred, reg.method = eigen n.axes = 10) I then apply crossvalidation, as in (Braak & Schaffers 2004). A.crossval <- crossval(x = A.pred, n.axes = 10) A.cvax <- which.max(A.crossval$CVfit) A.cvmax <- max(A.crossval$CVfit) B.crossval <- crossval(x = B.pred, n.axes = 10) B.cvax <- which.max(B.crossval$CVfit) B.cvmax <- max(B.crossval$CVfit) C.crossval <- crossval(x = C.pred, n.axes = 10) C.cvax <- which.max(C.crossval$CVfit) C.cvmax <- max(C.crossval$CVfit) Thus X.cvax is the number of axes that give the best crossvalidation score for community X predicting community D. X.cvax max is that best crossvalidation score. I think that if X.cvmax is positive, that suggests that the model can predict come fraction of the structure of community D from community X (where X is the community (A, B or C) in question) (as suggested in Schaffers et al 2008). Now, I want to know whether A, B or C best predicts D. Because C is only 75 data points, I expect A.cvmax and B.cvmax should be inflated relative to C.cvmax, so I shouldn't be able to use these values to cross compare predictive power. One approach that I think might work is to run permutest to generate percent fits for each axis that crossvalidation suggests I should include in the model. A.perm <- permutest(mo.pred, n.axes = A.cvax) B.perm <- permutest(mo.pred, n.axes = B.cvax) C.perm <- permutest(mo.pred, n.axes = C.cvax) I can then ask what percentage of the community of X predicts what percent of community Y using the pcent.fit value from the permutest objects. I think this value is independent of sample size, but am not sure. Does anybody else know? I could either do this for the first axis, or for the sum all axes in the model. I'm not sure if the second option is legitimate Option 1 A.ax1 <- A.perm$pcent.fit[1] B.ax1 <- B.perm$pcent.fit[1] C.ax1 <- C.perm$pcent.fit[1] Option 2 A.all <- sum(A.perm$pcent.fit[1:A.cvax]) B.all <- sum(A.perm$pcent.fit[1:B.cvax]) C.all <- sum(B.perm$pcent.fit[1:C.cvax]) So my first question is, when comparing communities with different but overlapping sample sizes(all 75 sample sites in C are the same sample sites as the first 75 in A and B), if A.all >B.all > C.all or if A.ax1 > B.ax1 > C.ax1, can I then say that A is the best predictor community for community D? More specifically, is this approach a legitimate way of comparing communities with different sample sizes in which the sample locations overlap. If not is there a better way of making such a comparison? Perhaps the co-correspondence method is not appropriate for making this kind of comparison. I notice that most literature to date uses co-correspondence in conjunction with cca to determine whether a different community or environmental parameters best predict a given community. If co-correspondence is not appropriate, could somebody explain why and suggest a better method? References: Braak CJF ter, Schaffers AP. (2004). Co-Correspondence Analysis: A New Ordination Method to Relate Two Community Compositions. Ecology 85:834846. Schaffers AP, Raemakers IP, Sýkora KV, Braak CJF ter. (2008). Arthropod Assemblages Are Best Predicted by Plant Species Composition. Ecology 89:782794. Thanks everyone for your consideration in this matter. Sincerely, Jacob Cram Graduate Student Department of Biological Sciences University of Southern California [[alternative HTML version deleted]]
_______________________________________________ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology