Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-27 Thread Theodore Garland
I just wanted to clarify one thing.  Agus wrote:

" Under the case of phylo dependence, all off-diagonal values are expected
to be non zero "

That is incorrect for species on opposite sides of the root of the
phylogenetic tree, as they have zero shared phylogenetic history in the
context of the set of species under analysis.  So, your expected
variance-covariance matrix for the trait (or for residuals from a
regression model) will have some blocks of zeros on the off-diagonals.
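
A minimal sketch of the point above, in base R. The four-species tree, its branch lengths, and the rate `sigma2` are hypothetical values chosen for illustration; they do not come from this thread. The matrix is built from an indicator of which branches lie on each root-to-tip path, so each entry is the branch length two tips share. For the same tree, ape::vcv() is expected to return the same matrix.

```r
# Hypothetical 4-species tree, in Newick: ((A:1,B:1):1,(C:1,D:1):1);
# Branches are named by the node or tip they lead to.
branch_length <- c(AB = 1, CD = 1, A = 1, B = 1, C = 1, D = 1)

# Z[i, b] is 1 when branch b lies on the path from the root to tip i.
Z <- matrix(0, nrow = 4, ncol = 6,
            dimnames = list(c("A", "B", "C", "D"), names(branch_length)))
Z["A", c("AB", "A")] <- 1
Z["B", c("AB", "B")] <- 1
Z["C", c("CD", "C")] <- 1
Z["D", c("CD", "D")] <- 1

# Shared root-to-tip path length for every pair of tips.
C <- Z %*% diag(branch_length) %*% t(Z)
C
#   A B C D
# A 2 1 0 0
# B 1 2 0 0
# C 0 0 2 1
# D 0 0 1 2

sigma2 <- 0.5      # assumed Brownian motion rate
V <- sigma2 * C    # expected trait covariances under that rate
```

The two zero blocks pair species from opposite sides of the root (A or B with C or D): they share no branch, so their expected covariance is zero even though the data are phylogenetically dependent.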

Cheers,
Ted


Theodore Garland, Jr., Distinguished Professor

Department of Evolution, Ecology, and Organismal Biology (EEOB)

University of California, Riverside

Riverside, CA 92521

Office Phone:  (951) 827-3524

Facsimile:  (951) 827-4286 (not confidential)

Email:  tgarl...@ucr.edu

http://www.biology.ucr.edu/people/faculty/Garland.html

http://scholar.google.com/citations?hl=en=iSSbrhwJ


Director, UCR Institute for the Development of Educational Applications
<http://idea.ucr.edu/>


Editor in Chief, *Physiological and Biochemical Zoology*
<http://www.press.uchicago.edu/ucp/journals/journal/pbz.html>




Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-27 Thread Agus Camacho
Many thanks to all for your very helpful references, commands and examples.

For the sake of future readers, a summary of the off-list discussions and
further reading.

Andrew Hipp provided a very didactic way to look at the vcv matrix as a
triangular distance matrix

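library(ape)                 # provides vcv()
library(geiger)              # provides sim.bdtree()
tr <- sim.bdtree(n = 100)    # simulate a birth-death tree with 100 tips
C <- as.dist(vcv(tr))        # off-diagonal covariances as a lower triangle
C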

Now, it made sense that for n species where a trait X has been measured,
Cov[xi,xj] is an n x n matrix indexed by species which, for any pair of
tip values Xi and Xj, represents the shared distance of these tips from
the root, multiplied by a rate of evolution, or matrix of rates. The
results of such multiplication constitute the *expected* covariances among
values at the tips.

Under the case of phylo independence, all off-diagonal values of Cov[xi,xj]
are expected to be zero (because shared history = 0, and it is a
multiplicative term in the covariance formula). On the diagonal, each trait
value's covariance with itself consists of the distance from the tip to the
root of the tree multiplied by the rate of evolution (or rates, if they are
not homogeneous). That can be represented by an identity matrix multiplied
by the squared term "sigma", which represents the rate of evolution, or by a
matrix of rates of evolution along the tree.

Under the case of phylo dependence, all off-diagonal values are expected to
be non zero (because of shared history>0 and is a multiplicative term in
the covariance formula). The rest of Cov [xi,xj] is similar to a case of
phylo independence.

Thus, during a PGLS model fit, R calculates Cov[xi,xj] for a given model of
evolution over the residuals of the OLS regression. These residuals should
contain phylogenetic dependence coming from the shared evolution of trait
values in X and Y, because the OLS regression does not account for such
dependence. The covariance matrix of the OLS residuals does not need to be
exactly an identity matrix, but it should resemble one if we are to assume
phylogenetic independence in trait values. Finally, the PGLS procedure
transforms the predictor and response variables accordingly, to remove the
phylogenetic dependence under the specified model of evolution, and then
performs an OLS.
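
The "transform, then run OLS" step described above can be sketched in base R. This is only an illustration of the standard GLS identity under assumptions: the covariance matrix `C`, the simulated `x` and `y`, and the coefficients 1 and 2 are made up, and nothing here claims to reproduce how any particular PGLS function is implemented internally.

```r
set.seed(1)

# Assumed expected covariance among 4 species: shared branch lengths on a
# hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) under Brownian motion.
C <- matrix(c(2, 1, 0, 0,
              1, 2, 0, 0,
              0, 0, 2, 1,
              0, 0, 1, 2), nrow = 4, byrow = TRUE)
L <- t(chol(C))                      # lower-triangular factor, C = L %*% t(L)

x <- rnorm(4)                        # illustrative predictor
e <- drop(L %*% rnorm(4))            # errors with covariance C
y <- 1 + 2 * x + e                   # illustrative response
X <- cbind(intercept = 1, x = x)     # design matrix

# GLS estimate written out directly: (X' C^-1 X)^-1 X' C^-1 y
Cinv <- solve(C)
beta_gls <- drop(solve(t(X) %*% Cinv %*% X, t(X) %*% Cinv %*% y))

# Same estimate by transformation: premultiplying by L^-1 gives errors with
# identity covariance, so ordinary least squares applies.
X_star <- solve(L, X)
y_star <- solve(L, y)
beta_ols <- coef(lm(y_star ~ X_star - 1))

cbind(gls = beta_gls, transformed_ols = beta_ols)
```

The two columns should agree up to numerical precision. Note that `C` here comes from the tree and the model, not from the residuals.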

Many thanks again for the references pointed out by Brian, and for Liam's
2010 paper on phylogenetic signal. Material from Andrew, Julien and Diogo
Provete was also super didactic.

Hope that helps others too,
Cheers

Dr. Agustín Camacho Guerrero. Universidade de São Paulo.
http://www.agustincamacho.com
Laboratório de Comportamento e Fisiologia Evolutiva, Departamento de
Fisiologia,
Instituto de Biociências, USP.Rua do Matão, trav. 14, nº 321, Cidade
Universitária,
São Paulo - SP, CEP: 05508-090, Brasil.




Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-26 Thread Julien Clavel
Hi Agus,

I just posted some course material I prepared a while ago to help understand
the variances and covariances of a BM process on trees and time-series
(R markdown):
https://github.com/JClavel/Examples<https://github.com/JClavel/Teaching>

This is also based on the previously mentioned references, illustrated with
some simulations.
Hope it helps...

Best,

Julien



Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-25 Thread Andrew Hipp
I'll second Brian's self-citation. O'Meara et al. (2006) is, I think, one
of the best introductions to the phylogenetic covariance matrix, and I
often direct students to it.

Brian's point about the relationship between observed and expected
covariance is illustrated here in a brief note I wrote up for students this
spring:

https://github.com/andrew-hipp/PCM-2018/blob/master/R-tutorials/2018-PCM-covarianceMatrixRuminations.ipynb

It might be helpful, or it might not. I hope so!

Take care,
Andrew


Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-25 Thread Emmanuel Paradis

Hi Agus & Brian,

To complete Brian's response, vcv() is a generic function which works 
with trees (class "phylo") and phylogenetic correlation structures 
(class "corPhyl"). The latter can be used to define an OU model, see 
?vcv, and ?corPhyl for the correlation structures available in ape.
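
A short sketch of the two kinds of input mentioned above, assuming ape is installed. The random five-tip tree and the value alpha = 1 passed to corMartins() are arbitrary choices for illustration, and recent ape versions may warn that no `form` covariate was given to the correlation structure.

```r
library(ape)

set.seed(1)
tr <- rtree(5)                      # small random tree, for illustration only

V_phylo <- vcv(tr)                  # method for class "phylo" (Brownian motion)
V_bm    <- vcv(corBrownian(1, tr))  # same model via a correlation structure
V_ou    <- vcv(corMartins(1, tr))   # OU-type structure; alpha = 1 is arbitrary

all.equal(V_phylo, V_bm)            # the two Brownian matrices should match
V_ou
```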


Best,

Emmanuel


Re: [R-sig-phylo] understanding variance-covariance matrix

2018-08-25 Thread Brian O'Meara
Hi, Agus. The variance-covariance matrix comes from the tree and the
evolutionary model, not the data. Each entry between taxa A and B in the
VCV is how much covariance I should expect between data for taxa A and B
simulated up that tree using that model. I don't want to be *that guy*, but
O'Meara et al. (2006)
https://onlinelibrary.wiley.com/doi/10./j.0014-3820.2006.tb01171.x has
a fairly accessible explanation of this (largely b/c I was just learning
about VCVs when working on that paper). Hansen and Martins (1996)
https://onlinelibrary.wiley.com/doi/10./j.1558-5646.1996.tb03914.x have
a much more detailed description of how you get these covariance matrices
from microevolutionary processes.

Typically, ape::vcv() is how you get a variance-covariance matrix for a
phylogeny, assuming Brownian motion and no measurement error. It just
basically takes
the history two taxa share to create the covariance (or variance, if the
two taxa are the same taxon). A different approach, which seems to be what
you're doing, would be to simulate up a tree many times, and then for each
pair of taxa (including the pair of a taxon with itself, the diagonal of
the VCV), calculate the covariance. These approaches should get the same
results, though the shared history on the tree approach is faster.
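
The two approaches in the paragraph above can be compared with a small base R sketch. The tree ((A:1,B:1):1,(C:1,D:1):1), the rate `sigma2`, and the number of simulations are assumptions made for illustration; the shared-history matrix is written out by hand and is what ape::vcv() would be expected to return for that tree.

```r
set.seed(42)
n_sim  <- 20000   # number of simulated data sets (arbitrary choice)
sigma2 <- 1       # assumed Brownian motion rate

# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1); every branch has length 1.
# Under Brownian motion the change along a branch is Normal(0, sigma2 * length).
step <- function() rnorm(n_sim, mean = 0, sd = sqrt(sigma2 * 1))

anc_AB <- step()                  # root -> ancestor of A and B
anc_CD <- step()                  # root -> ancestor of C and D
tips <- cbind(A = anc_AB + step(), B = anc_AB + step(),
              C = anc_CD + step(), D = anc_CD + step())

# Approach 1: covariance among tips, computed across simulations.
V_sim <- cov(tips)

# Approach 2: shared root-to-tip history multiplied by the rate.
V_tree <- sigma2 * matrix(c(2, 1, 0, 0,
                            1, 2, 0, 0,
                            0, 0, 2, 1,
                            0, 0, 1, 2), nrow = 4, byrow = TRUE,
                          dimnames = list(colnames(tips), colnames(tips)))

round(V_sim, 2)
max(abs(V_sim - V_tree))   # small, and expected to shrink as n_sim grows
```

Each row of `tips` is one simulated data set, so the covariance is taken across replicate evolutionary histories for each pair of species, not across species within a single data set.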

Best,
Brian


___
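Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>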

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville




[R-sig-phylo] understanding variance-covariance matrix

2018-08-25 Thread Agus Camacho
Dear list users,

I am trying to make an easy R demonstration to teach the
variance-covariance matrix to students. However, after consulting the
internet and books, I found myself facing three difficulties in
understanding the math and code behind this important matrix. As this list
is answered by several authors of books on phylogenetic comparative
methods, I thought this might make a useful general discussion.

Here we go,

1) I don't know how to generate a phyloVCV matrix in R (Liam kindly
described some options here
<http://blog.phytools.org/2013/12/three-different-ways-to-calculate-among.html>),
but I cannot tell for sure what X is made of. It would seem to be a data
frame of some variables measured across species. But then, I get errors
when I write:

 tree <- pbtree(n = 10, scale = 1)
 tree$tip.label <- sprintf("sp%s",seq(1:n))
 x <- fastBM(tree)
 y <- fastBM(tree)
 X=data.frame(x,y)
 rownames(X)=tree$tip.label
 ## Revell (2009)
 A<-matrix(1,nrow(X),1)%*%apply(X,2,fastAnc,tree=tree)[1,]
 V1<-t(X-A)%*%solve(vcv(tree))%*%(X-A)/(nrow(X)-1)
 ## Butler et al. (2000)
 Z<-solve(t(chol(vcv(tree))))%*%(X-A)
 V2<-t(Z)%*%Z/(nrow(X)-1)

 ## pics
 Y<-apply(X,2,pic,phy=tree)
 V3<-t(Y)%*%Y/nrow(Y)

2) The phyloVCV matrix has n x n coordinates defined by the n species, and
it represents covariances among observations made across the n species,
right? Still, I do not know whether these covariances are calculated over
a) X vs Y values for each pair of species coordinates in the matrix, across
the n species, or b) directly over the vector of n residuals of Y, after
correlating Y vs X, across all pairs of species coordinates. I think it may
be a) because, by definition, variance cannot be calculated for a single
value. I am not sure though, since it seems the whole point of PGLS is to
control for phylogenetic signal within the residuals of a regression
procedure, prior to actually performing it.

3) If I create two perfectly correlated variables with independent
observations and calculate a covariance or correlation matrix for them, I
do not get a diagonal matrix, with zeros at the off-diagonals (example here
<https://www.dropbox.com/s/y8g3tkzk509pz58/vcvexamplewithrandomvariables.xlsx?dl=0>).
Why then expect a diagonal matrix for the case of independence among the
observations?
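
On the code in question 1, the errors most likely come from two things: `n` is never defined before it is used in `seq(1:n)`, and `X` is a data frame, which `%*%` does not accept. Below is a minimal working sketch under those assumptions, with phytools and ape installed; the seed and the object `C` are additions made here for readability and are not part of the original snippet.

```r
library(ape)        # vcv(), pic()
library(phytools)   # pbtree(), fastBM(), fastAnc()

set.seed(1)
n <- 10                                   # was undefined in the original
tree <- pbtree(n = n, scale = 1)
tree$tip.label <- sprintf("sp%s", seq_len(n))

x <- fastBM(tree)                         # two traits simulated independently
y <- fastBM(tree)
X <- cbind(x, y)                          # numeric matrix, not a data frame
X <- X[tree$tip.label, ]                  # rows in the same order as vcv(tree)

C <- vcv(tree)                            # n x n covariance among species

## Revell (2009)
A <- matrix(1, nrow(X), 1) %*% apply(X, 2, fastAnc, tree = tree)[1, ]
V1 <- t(X - A) %*% solve(C) %*% (X - A) / (nrow(X) - 1)

## Butler et al. (2000)
Z <- solve(t(chol(C))) %*% (X - A)
V2 <- t(Z) %*% Z / (nrow(X) - 1)

## pics
Y <- apply(X, 2, pic, phy = tree)
V3 <- t(Y) %*% Y / nrow(Y)

V1; V2; V3
```

Here `C` is the n x n matrix among species that comes from the tree, while V1, V2 and V3 are 2 x 2 matrices among the traits x and y. Keeping those two objects apart may also help with questions 2 and 3.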

Thanks in advance and sorry if I missed anything obvious here!
Agus
Dr. Agustín Camacho Guerrero. Universidade de São Paulo.
http://www.agustincamacho.com
Laboratório de Comportamento e Fisiologia Evolutiva, Departamento de
Fisiologia,
Instituto de Biociências, USP.Rua do Matão, trav. 14, nº 321, Cidade
Universitária,
São Paulo - SP, CEP: 05508-090, Brasil.

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/