Hi Pascal.
The first clue is the the length of your tree is exactly the same as the
ratio of your two sets of rates. This suggests that OUCH is first
rescaling the tree to unit length. Let's see if this explains your
results, just working with your first character:
> tree<-read.nexus("testtree.nex")
> X<-read.csv("spdata.csv",header=T,row.names=1)
> X<-as.matrix(X[,1:4])
> fitContinuous(tree,X[,1])
Fitting BM model:
$Trait1
$Trait1$lnl
[1] -164.5281
$Trait1$beta
[1] 0.1788076
$Trait1$aic
[1] 333.0561
$Trait1$aicc
[1] 333.208
$Trait1$k
[1] 2
> fitContinuous(rescaleTree(tree,1),X[,1])
Fitting BM model:
$Trait1
$Trait1$lnl
[1] -164.5281
$Trait1$beta
[1] 5.885607
$Trait1$aic
[1] 333.0561
$Trait1$aicc
[1] 333.208
$Trait1$k
[1] 2
Looks like that's your explanation. Note that the first rate from
fitContinuous on the tree in its original scale is slightly different
from what you report: off by a factor of n/(n-1), in fact. This is
because ic.sigma returns the unbiased REML estimates of the rate (from
Felsenstein's contrasts); whereas the ML rates from fitContinuous and
OUCH are biased by a factor of (n-1)/n.
Note also that the likelihoods (and thus the AIC scores) are not
affected by rescaling the tree. This is because scaling the branches by
k & the rate sigma^2 by 1/k yields exactly the same log-likelihood. This
is different from the effect of rescaling the data which affects the
likelihood, but should not alter the relative likelihoods of different
models fit to the same rescaled data!
All the best, Liam
--
Liam J. Revell
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://phytools.blogspot.com
On 2/11/2012 10:20 AM, Pascal Title wrote:
Hello all
I am trying to calculate sigma2 values for a clade, using 4 PCA axes as my
characters.
I first wanted to figure out if my data better fit a BM model or an OU1
model of evolution. So I compared AICc values using the fitContinuous
function in Geiger.
I found the following:
BM OU
Score1 2.6 0
Score2 9.1 0
Score3 17.2 0
Score4 9.8 0
However, using the pmc package, I ran
pmc(tree,data,modelA='BM',modelB='OU',nboot=100)
for the first PC axis and found that the BM and OU1 distributions are
mostly overlapping, which I believe tells me that you can't really
differentiate BM and OU with the data at hand (see plot of
pmc<http://dl.dropbox.com/u/34644229/Tangarinae.pdf>
).
Therefore, I was planning on calculating sigma2 based on a BM model of
evolution.
But I noticed that I get hugely different results using different methods:
(here is my tree<http://dl.dropbox.com/u/34644229/testtree.nex>, here is
my data<http://dl.dropbox.com/u/34644229/spdata.csv>)
With OUCH method:
require(geiger)
require(ouch)
tree<-read.nexus('testtree.nex')
spdata<-read.csv('spdata.csv',stringsAsFactors=FALSE)
rownames(spdata)<-spdata$X
spdata$X<-NULL
spdata$labels<-NULL
spdata->spdata1
ot<-ape2ouch(tree)
otd<-as(ot,"data.frame")
spdata$labels<-rownames(spdata)
otd<-merge(otd,spdata,by="labels",all=TRUE)
rownames(otd)<-otd$nodes
ot<-with(otd,ouchtree(nodes=nodes,ancestors=ancestors,times=times,labels=labels))
b1<-brown(tree=ot,data=otd[colnames(spdata1)])
sigma2<-coef(b1)$sigma.sq.matrix
sigma2
[,1] [,2] [,3] [,4]
[1,] 5.885605498 1.17240962 0.7760799 -0.004528573
[2,] 1.172409617 1.52861569 0.1039507 -0.073455526
[3,] 0.776079917 0.10395074 1.0567364 -0.158017246
[4,] -0.004528573 -0.07345553 -0.1580172 1.236342488
With ic.sigma method from GEIGER:
require(geiger)
tree<-read.nexus('testtree.nex')
spdata<-read.csv('spdata.csv',stringsAsFactors=FALSE)
rownames(spdata)<-spdata$X
spdata$X<-NULL
spdata$labels<-NULL
sig<-ic.sigma(tree,spdata)
Score1 Score2 Score3 Score4
Score1 0.1809881293 0.036052743 0.023865217 -0.0001392581
Score2 0.0360527432 0.047006429 0.003196587 -0.0022588293
Score3 0.0238652170 0.003196587 0.032495680 -0.0048591850
Score4 -0.0001392581 -0.002258829 -0.004859185 0.0380187415
Interestingly, the OUCH result is greater than the GEIGER result by a
factor of 32.
I also simulated a tree and data under BM and tried both methods and they
agree, so somehow this has something to do with my data.
Both methods are estimating these parameters under brownian motion, so
where is there a breakdown?
Any advice would be greatly appreciated!
_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo