Dear list users,

I am trying to put together an easy R demonstration to teach the variance-covariance matrix to students. However, after consulting the internet and books, I found myself facing three difficulties in understanding the math and code behind this important matrix. As several authors of books on phylogenetic comparative methods answer questions on this list, I thought this might make a useful general discussion.
Here we go.

1) I don't know how to generate a phylogenetic VCV matrix in R. Liam kindly described some options here <http://blog.phytools.org/2013/12/three-different-ways-to-calculate-among.html>, but I cannot tell for sure what X is made of. It seems to be a data frame of variables measured across species, but then I get errors when I write:

  tree <- pbtree(n = 10, scale = 1)
  tree$tip.label <- sprintf("sp%s",seq(1:n))
  x <- fastBM(tree)
  y <- fastBM(tree)
  X=data.frame(x,y)
  rownames(X)=tree$tip.label
  ## Revell (2009)
  A<-matrix(1,nrow(X),1)%*%apply(X,2,fastAnc,tree=tree)[1,]
  V1<-t(X-A)%*%solve(vcv(tree))%*%(X-A)/(nrow(X)-1)
  ## Butler et al. (2000)
  Z<-solve(t(chol(vcv(tree))))%*%(X-A)
  V2<-t(Z)%*%Z/(nrow(X)-1)
  ## pics
  Y<-apply(X,2,pic,phy=tree)
  V3<-t(Y)%*%Y/nrow(Y)

2) The phylogenetic VCV matrix has n x n entries defined by the n species, and it represents covariances among observations made across the n species, right? Still, I do not know whether these covariances are calculated a) over the X vs. Y values for each pair of species in the matrix, across the n species, or b) directly over the vector of n residuals of Y, after correlating Y vs. X, across all pairs of species. I think it may be a) because, by definition, a variance cannot be calculated from a single value. I am not sure though, since the whole point of PGLS seems to be to control for phylogenetic signal in the residuals of a regression before actually fitting it.

3) If I create two perfectly correlated variables with independent observations and calculate a covariance or correlation matrix for them, I do not get a diagonal matrix with zeros on the off-diagonals (example here <https://www.dropbox.com/s/y8g3tkzk509pz58/vcvexamplewithrandomvariables.xlsx?dl=0>). Why, then, expect a diagonal matrix in the case of independence among the observations?

Thanks in advance, and sorry if I missed anything obvious here!

Agus

Dr. Agustín Camacho Guerrero
Universidade de São Paulo
http://www.agustincamacho.com
Laboratório de Comportamento e Fisiologia Evolutiva, Departamento de Fisiologia, Instituto de Biociências, USP
Rua do Matão, trav. 14, nº 321, Cidade Universitária, São Paulo - SP, CEP: 05508-090, Brasil

_______________________________________________
R-sig-phylo mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
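
To make question 3 above concrete, here is a minimal base-R sketch of the two matrices I may be mixing up. It is not the spreadsheet linked above: the sample sizes, the seed, and the names x, y, sims, cor_vars and cov_obs are all made up for illustration. The first matrix is 2 x 2 and describes the variables; the second is n x n and describes the observations, and I am assuming it can only be estimated from many replicate draws of the whole vector of observations.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
n <- 5        # number of independent observations (hypothetical "species")

# Two perfectly correlated variables measured on the same n observations
x <- rnorm(n)
y <- 2 * x + 1
X <- cbind(x, y)

# (i) 2 x 2 correlation matrix among the VARIABLES (columns of X):
# the off-diagonal entries are non-zero because x and y covary
cor_vars <- cor(X)

# (ii) n x n covariance matrix among the OBSERVATIONS for one variable.
# A single data set gives one value per observation, so this sketch draws
# many replicate data sets (rows) and takes covariances among the columns.
reps <- 10000
sims <- matrix(rnorm(reps * n), nrow = reps, ncol = n)
cov_obs <- cov(sims)

print(round(cor_vars, 2))
print(round(cov_obs, 1))
```

If this reading is right, the diagonal matrix expected under independence would be the second one (among observations), not the first one (among variables), but I would appreciate confirmation.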
