# Re: [R-sig-phylo] understanding variance-covariance matrix

```Hi Agus,

I just posted some course material I put together a while ago for understanding
the variances and covariances of a BM process on trees and time series (R Markdown):
https://github.com/JClavel/Examples<https://github.com/JClavel/Teaching>```
```
This is also based on the previously mentioned references, illustrated with some
simulations.
Hope it helps...

Best,

Julien

________________________________
From: R-sig-phylo <r-sig-phylo-boun...@r-project.org> on behalf of Andrew Hipp
<ah...@mortonarb.org>
Sent: Sunday, 26 August 2018 05:36
To: bome...@utk.edu
Cc: mailman, r-sig-phylo; Agus Camacho
Subject: Re: [R-sig-phylo] understanding variance-covariance matrix

I'll second Brian's self-citation. O'Meara et al. 2006 is, I think, one of
the best introductions to the phylogenetic covariance matrix, and I often
direct students to it.

Brian's point about the relationship between observed and expected
covariance is illustrated here in a brief note I wrote up for students this
spring:

https://github.com/andrew-hipp/PCM-2018/blob/master/R-tutorials/2018-PCM-covarianceMatrixRuminations.ipynb
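
The relationship between expected and observed covariance can also be sketched in a few lines of base R. This is not taken from the linked notebook; it is a minimal illustration that assumes a hypothetical three-taxon tree, ((A:1,B:1):1,C:2), a BM rate of 1, and a hand-written simulator (`simulate_tips` is a made-up helper, not a function from any package).

```r
set.seed(1)

# Hypothetical 3-taxon tree ((A:1,B:1):1,C:2), written out by hand so that
# no packages are needed. Under BM with rate sigma2, the expected covariance
# between two tips is sigma2 times the branch length they share from the root.
sigma2 <- 1
taxa <- c("A", "B", "C")
expected <- sigma2 * matrix(c(2, 1, 0,
                              1, 2, 0,
                              0, 0, 2),
                            nrow = 3, byrow = TRUE,
                            dimnames = list(taxa, taxa))

# Simulate BM up the tree: each branch adds a normal increment with
# variance sigma2 * branch length; A and B both inherit the stem increment.
simulate_tips <- function(nrep) {
  stem_AB <- rnorm(nrep, sd = sqrt(sigma2 * 1))
  cbind(A = stem_AB + rnorm(nrep, sd = sqrt(sigma2 * 1)),
        B = stem_AB + rnorm(nrep, sd = sqrt(sigma2 * 1)),
        C = rnorm(nrep, sd = sqrt(sigma2 * 2)))
}

# Rows are replicate simulations and columns are taxa, so cov() returns a
# taxon-by-taxon matrix: the observed counterpart of `expected`.
observed_small <- cov(simulate_tips(20))
observed_large <- cov(simulate_tips(20000))

# Largest absolute deviation from the expected matrix for the large run.
max_err_large <- max(abs(observed_large - expected))
print(round(observed_large, 2))
```

With only 20 replicates the observed matrix can be quite far from the expected one; with many replicates it should settle close to it. If ape is installed, `ape::vcv(ape::read.tree(text = "((A:1,B:1):1,C:2);"))` should return the same matrix as `expected`.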


Take care,
Andrew

On Sat, Aug 25, 2018 at 1:33 PM, Brian O'Meara <omeara.br...@gmail.com>
wrote:

> Hi, Agus. The variance-covariance matrix comes from the tree and the
> evolutionary model, not the data. Each entry between taxa A and B in the
> VCV is how much covariance I should expect between data for taxa A and B
> simulated up that tree using that model. I don't want to be *that guy*, but
> O'Meara et al. (2006)
> https://onlinelibrary.wiley.com/doi/10.1111/j.0014-3820.2006.tb01171.x has
> a fairly accessible explanation of this (largely b/c I was just learning
> about VCVs when working on that paper). Hansen and Martins (1996)
> https://onlinelibrary.wiley.com/doi/10.1111/j.1558-5646.1996.tb03914.x have
> a much more detailed description of how you get these covariance matrices
> from microevolutionary processes.
>
> Typically, ape::vcv() is how you get a variance-covariance matrix for a
> phylogeny, assuming Brownian motion and no measurement error. It basically takes
> the history two taxa share to create the covariance (or variance, if the
> two taxa are the same taxon). A different approach, which seems to be what
> you're doing, would be to simulate up a tree many times, and then for each
> pair of taxa (including the pair of a taxon with itself, the diagonal of
> the VCV), calculate the covariance. These approaches should get the same
> results, though the shared history on the tree approach is faster.
>
> Best,
> Brian
>
>
> _______________________________________________________________________
> Brian O'Meara, http://www.brianomeara.info, especially Calendar
> <http://brianomeara.info/calendars/omeara/>, CV
> <http://brianomeara.info/cv/>, and Feedback
> <http://brianomeara.info/teaching/feedback/>
>
> Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
> Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville
>
>
>
> On Sat, Aug 25, 2018 at 1:16 PM Agus Camacho <agus.cama...@gmail.com>
> wrote:
>
> > Dear list users,
> >
> > I am trying to make an easy R demonstration to teach the
> > variance-covariance matrix to students. However, after consulting the
> > internet and books, I found myself facing three difficulties in
> > understanding the math and code behind this important matrix. As this list
> > is answered by several authors of books on phylogenetic comparative
> > methods, I thought this might make a useful general discussion.
> >
> > Here we go,
> >
> > 1) I don't know how to generate a phyloVCV matrix in R. Liam kindly
> > described some options here
> > <http://blog.phytools.org/2013/12/three-different-ways-to-calculate-among.html>
> > but I cannot tell for sure what X is made of. It would seem to be a data
> > frame of some variables measured across species. But then, I get errors
> > when I write:
> >
> >  tree <- pbtree(n = 10, scale = 1)
> >  tree$tip.label <- sprintf("sp%s",seq(1:n))
> >  x <- fastBM(tree)
> >  y <- fastBM(tree)
> >  X=data.frame(x,y)
> >  rownames(X)=tree$tip.label
> >  ## Revell (2009)
> >  A<-matrix(1,nrow(X),1)%*%apply(X,2,fastAnc,tree=tree)[1,]
> >  V1<-t(X-A)%*%solve(vcv(tree))%*%(X-A)/(nrow(X)-1)
> >  ## Butler et al. (2000)
> >  Z<-solve(t(chol(vcv(tree))))%*%(X-A)
> >  V2<-t(Z)%*%Z/(nrow(X)-1)
> >
> >  ## pics
> >  Y<-apply(X,2,pic,phy=tree)
> >  V3<-t(Y)%*%Y/nrow(Y)
> >
> > 2) The phyloVCV matrix has n x n coordinates defined by the n species, and
> > it represents covariances among observations made across the n species,
> > right? Still, I do not know whether these covariances are calculated over
> > a) X vs Y values for each pair of species coordinates in the matrix, across
> > the n species, or b) directly over the vector of n residuals of Y, after
> > correlating Y vs X, across all pairs of species coordinates. I think it may
> > be a) because, by definition, variance cannot be calculated for a single
> > value. I am not sure though, since it seems the whole point of PGLS is to
> > control phylosignal within the residuals of a regression procedure, prior
> > to actually making it.
> >
> > 3) If I create two perfectly correlated variables with independent
> > observations and calculate a covariance or correlation matrix for them, I
> > do not get a diagonal matrix, with zeros at the off-diagonals (e.g. here
> > <https://www.dropbox.com/s/y8g3tkzk509pz58/vcvexamplewithrandomvariables.xlsx?dl=0>).
> > Why then expect a diagonal matrix for the case of independence among the
> > observations?
> >
> > Agus
> > Dr. Agustín Camacho Guerrero. Universidade de São Paulo.
> > http://www.agustincamacho.com
> > Laboratório de Comportamento e Fisiologia Evolutiva, Departamento de
> > Fisiologia,
> > Instituto de Biociências, USP. Rua do Matão, trav. 14, nº 321, Cidade
> > Universitária,
> > São Paulo - SP, CEP: 05508-090, Brasil.
> >
> > _______________________________________________
> > R-sig-phylo mailing list - R-sig-phylo@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> > Searchable archive at
> > http://www.mail-archive.com/r-sig-phylo@r-project.org/
> >
>
>
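
On the errors reported with the snippet in question 1, quoted above: it is not certain which errors Agus saw, but two things in the code as posted would stop it from running. `n` is used in `seq(1:n)` without being defined, and `X` is a data frame, which `%*%` does not accept (it needs a numeric matrix). The sketch below is one way to make the same three calculations run, assuming ape and phytools are installed; the `set.seed()` call, the `n` and `C` variables, and the row-reordering line are additions for illustration.

```r
library(ape)       # vcv(), pic()
library(phytools)  # pbtree(), fastBM(), fastAnc()

set.seed(42)
n <- 10                                   # define n before it is used
tree <- pbtree(n = n, scale = 1)
tree$tip.label <- sprintf("sp%s", seq_len(n))

# Two independently simulated BM traits; fastBM() returns vectors named by tip.
x <- fastBM(tree)
y <- fastBM(tree)
X <- cbind(x, y)                          # numeric matrix, not a data frame

C <- vcv(tree)                            # n x n: expected covariance among species
X <- X[rownames(C), ]                     # keep rows in the same order as C

## Revell (2009): deviations from the estimated root state, weighted by C
A <- matrix(1, nrow(X), 1) %*% apply(X, 2, fastAnc, tree = tree)[1, ]
V1 <- t(X - A) %*% solve(C) %*% (X - A) / (nrow(X) - 1)

## Butler et al. (2000): transform by the inverse Cholesky factor of C
Z <- solve(t(chol(C))) %*% (X - A)
V2 <- t(Z) %*% Z / (nrow(X) - 1)

## Independent contrasts
Y <- apply(X, 2, pic, phy = tree)
V3 <- t(Y) %*% Y / nrow(Y)
```

Note the two different kinds of matrix here. `C` from `vcv(tree)` is the n x n matrix among species that comes from the tree and the model, which is the matrix Brian describes. `V1`, `V2` and `V3` are 2 x 2 matrices among the traits x and y, estimated from the data after accounting for `C`, and in this sketch the three should agree with one another.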

--
Andrew Hipp, PhD
Senior Scientist in Plant Systematics and Herbarium Curator, The Morton
Arboretum
Lecturer, Committee on Evolutionary Biology, University of Chicago

The Morton Arboretum
4100 Illinois Route 53 / Lisle IL 60532-1293 / USA
+1 630 725 2094

Lab: http://systematics.mortonarb.org/lab
Herbarium data: http://vplants.org
U of Chicago, CEB: http://evbio.uchicago.edu/
Phenology of the East Woods: http://systematics.mortonarb.org/phenology


```
```_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
```