[R-sig-phylo] Phylogenetic Canonical Correlations

2012-07-13 Thread Jonathan Mitchell
Hello all,

I'm trying to perform a canonical correlations analysis that takes
phylogenetic non-independence into account properly (a la what Revall 2009
did for PCA). To this effect, I've written a script to do canonical
correlates in R based entirely on SVD instead of QR decomp. (because I
understand SVD better) that recapitulates the cancor() function (
http://nr.com/whp/notes/CanonCorrBySVD.pdf).

The way I'm currently approaching this is, I'm centering each column of X
and Y on their phylogenetic mean instead of their numeric mean, and
multiplying the centered matrices by the inverse of the vcv. Is this
appropriate/correct? Is there a better way? Simplified example code posted
below:

tree - rtree(10)
X - replicate(5, rnorm(10))
Y - replicate(6, rnorm(10))
rownames(X) - rownames(Y) - tree$tip.label

C - vcv(tree)
inC - solve(C)
Ones - rep(1, nrow(X))

PMx - t(solve(t(Ones) %*% inC %*% Ones) %*% t(Ones) %*%inC %*% X)
PMy - t(solve(t(Ones) %*% inC %*% Ones) %*% t(Ones) %*%inC %*% Y)

pcX - invC %*% (X - Ones %*% t(PMx))
pcY - invC %*% (Y - Ones %*% t(PMy))

#SVD of X, Y and Ux'Uy
Xs - svd(pcX)
Ys - svd(pcY)

uTu - t(Xs$u) %*% Ys$u
uTus - svd(uTu)

#Canonical coefficients for X
A - Xs$v %*% solve(diag(Xs$d)) %*% uTus$u

#Canonical coefficients for Y
B - Ys$v %*% solve(diag(Ys$d)) %*% uTus$v

-- Jon

_

Jonathan S. Mitchell
http://home.uchicago.edu/~mitchelljs/

PhD Student
Committee on Evolutionary Biology
The University of Chicago
1025 57th Str, Culver Hall 402
Chicago, IL 60637

Geology Department
The Field Museum of Natural History
1400 S. Lake Shore Dr.
Chicago, IL 60605

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


Re: [R-sig-phylo] Phylogenetic Canonical Correlations

2012-07-13 Thread Liam J. Revell

Hi Jon.

phytools (http://cran.r-project.org/package=phytools) has a function, 
phyl.cca, for phylogenetic canonical correlations. It returns the 
coefficients  scores (in the species space - so, with phylogenetic 
correlation), correlations, and p-values.


You can also just use pic and cancor to get the same coefficients, as 
follows:


# simulate data
tree-rtree(50)
X-replicate(5,fastBM(tree))
Y-replicate(6,fastBM(tree))
# use phyl.cca
result1-phyl.cca(tree,X,Y)
# use pic  cancor
picX-apply(X,2,pic,phy=tree)
picY-apply(Y,2,pic,phy=tree)
result2-cancor(picX,picY,xcenter=F,ycenter=F)

Note that the coefficients from result1 and result2 may not be equal, 
but scale by a constant (this will not change the correlation between 
corresponding canonical variables).


# e.g.
result1$xcoef/result2$xcoef
result1$ycoef/result2$ycoef[,1:5]
# should produce matrices in which each column of each matrix
# is a scalar constant

- Liam

--
Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://phytools.blogspot.com


On 7/13/2012 1:32 PM, Jonathan Mitchell wrote:

Hello all,

I'm trying to perform a canonical correlations analysis that takes
phylogenetic non-independence into account properly (a la what Revall 2009
did for PCA). To this effect, I've written a script to do canonical
correlates in R based entirely on SVD instead of QR decomp. (because I
understand SVD better) that recapitulates the cancor() function (
http://nr.com/whp/notes/CanonCorrBySVD.pdf).

The way I'm currently approaching this is, I'm centering each column of X
and Y on their phylogenetic mean instead of their numeric mean, and
multiplying the centered matrices by the inverse of the vcv. Is this
appropriate/correct? Is there a better way? Simplified example code posted
below:

tree - rtree(10)
X - replicate(5, rnorm(10))
Y - replicate(6, rnorm(10))
rownames(X) - rownames(Y) - tree$tip.label

C - vcv(tree)
inC - solve(C)
Ones - rep(1, nrow(X))

PMx - t(solve(t(Ones) %*% inC %*% Ones) %*% t(Ones) %*%inC %*% X)
PMy - t(solve(t(Ones) %*% inC %*% Ones) %*% t(Ones) %*%inC %*% Y)

pcX - invC %*% (X - Ones %*% t(PMx))
pcY - invC %*% (Y - Ones %*% t(PMy))

#SVD of X, Y and Ux'Uy
Xs - svd(pcX)
Ys - svd(pcY)

uTu - t(Xs$u) %*% Ys$u
uTus - svd(uTu)

#Canonical coefficients for X
A - Xs$v %*% solve(diag(Xs$d)) %*% uTus$u

#Canonical coefficients for Y
B - Ys$v %*% solve(diag(Ys$d)) %*% uTus$v

-- Jon

_

Jonathan S. Mitchell
http://home.uchicago.edu/~mitchelljs/

PhD Student
Committee on Evolutionary Biology
The University of Chicago
1025 57th Str, Culver Hall 402
Chicago, IL 60637

Geology Department
The Field Museum of Natural History
1400 S. Lake Shore Dr.
Chicago, IL 60605

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo



___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo


[R-sig-phylo] phylogenetic signal on multistate discrete traits

2012-07-13 Thread Patricia Rodríguez González
Hi all,

I am new to this forum and also new to phylogenetic analysis.
I am trying to study phylogenetic signal on traits from plant 
communities following an environmental gradient.
Part of my traits are continuous and for them I have used Blomberg's K.
However for the multistate discrete traits I was not able yet to run an 
adequate methodology.
When exploring data, I have tried fitDiscrete {geiger} to fit a 
Jukes-Cantor model to each trait  (i.e. first trait :DEN1)  but I always 
obtain the following error:

Finding the maximum likelihood solution
[050  100]
[]
Error in fitDiscrete(treeDNNnexus, DEN1) :
   ERROR: No solution found. Does your tree contain zero-length tip 
branches?

Although I don't have zero-length tip branches in my phylo tree, there 
are 3 pairs of species which display very small length tip branches.

Is there any alternative to run this function keeping or modifying the 
characteristics of my tree?
Or, if this function couldn't be applied in my case, Does anybody 
recommend an alternative methodology that could be implemented in R or 
Matlab?

Thank you in advance for your help
Patricia

---
Patricia Rodríguez González
Post Doc at Centro de Estudos Florestais
Instituto Superior de Agronomia
Universidade Técnica de Lisboa
Tapada da Ajuda
1349-017 Lisboa
PORTUGAL
http://www.isa.utl.pt/cef/ForEcoGen/


[[alternative HTML version deleted]]

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo