----- Forwarded message from Paolo Piras
<paolo.pi...@uniroma3.it> -----
Date: Wed, 17 Apr 2013
09:38:22 -0400
From: Paolo Piras <paolo.pi...@uniroma3.it>
Reply-To: Paolo Piras <paolo.pi...@uniroma3.it>
Subject: RE:
multi-variate PGLS with GMM
To: "morphmet@morphometrics.org"
<morphmet@morphometrics.org>
Hi all,
Assuming brownian motion, and dealing with MANY landmarks, i.e. face to the impossibility to apply parametric test of significance, a solution could be the same adopted in MorphoJ. It uses a completely randomized approach thus avoiding restrictions due to "fat data" (i.e. many vars, few cases). It is something like a "phylogenetic adonis"; adonis() can be found in vegan package.
Even if....... I cannot reproduce EXACTLY the same results of MorphoJ procedure in R using adonis and a phyl cov matrix; I think it is a problem of centering the data. Any advice for that point?
By the way I wrote a couple of functions (genpgls(), accompanied by another required function- testsix() - ), mainly inspired by that of Dean Adams, that allows a number of option in performing PGLS (univariate, multivariate, multiple or multiple-multivariate). For univariate Y it estimates the best mode of evoltion and transforms the tree accordingy before computing the covariance matrix. Any testing for amelioring the function is WELCOME.
best
paolo
#This function takes as arguments a tree,a dependent(s) table,an
independent(s) table,a vector specifying the criterion
#upon which the tree is transformed (for univariate y only) and an integer
indicating the permutations upon which the non
#phylogenetic correlation is tested for significance; it is useful when many
variables enter the model.
#For univariate y the function calls fitContinuous() in order to find
#the most supported mode of evolution (basing on AICs) among BM",
"OU","lambda", "kappa", "delta", "EB" and "white noise" models.
#This is done automatically using the function "testsix()" appended below.
This is a mere translation in function of the example given in GEIGER manual.
HERE SOMETIMES THE SCALING OF THE VARIABLE ## #INFLUENCES THE
BEHAVIOUR OF fitcontinuous(). Muliplying
or dividing by a proper constant solves the problem nearly always.
#By specifying the argument "mode", one can choose if the tree should be
transformed according to the best mode
#of evolution found for y or for regression residuals of the linear model
"y~x". This holds for univariate y. For multivariate Y one can write "no
transf".
#For multivariate Y Brownian motion is assumed and the GLOBAL multivariate
test (Wilks'lambda) is performed using PICs. Here still holds the problem of fat
data (see above)
# Individual regressions of Ys on X(s) table are performed via PGLS.
#A number of results are printed. If y is univariate, the first ones regard
the phylogenetic signal in y and in regression residuals (using both lambda and
K criteria). Then the parameters and AICs of fitted models
# are printed followed by pgls and PICs analysis
genpgls<-function(tree,y,x,mode=c("use y","use regression residuals","no
transf"),nperm){
require(phytools)
require(picante)
require(ape)
x<-as.matrix(match.phylo.data(tree,x)$data)
y<-as.matrix(match.phylo.data(tree,y)$data)
tree<-match.phylo.data(tree,x)$phy
if(dim(y)[2]<2){lmuniy<-lm(y~x)
phylosig_in_residuals<-phylosig(tree,resid(lmuniy),method="lambda",test=T)
phylosig_in_residuals2<-phylosig(tree,resid(lmuniy),method="K",test=T)
phylosig_in_y<-phylosig(tree,y,method="lambda",test=T)
phylosig_in_y2<-phylosig(tree,y,method="K",test=T)
print("PHYLOSIG IN Y USING LAMBDA")
print(phylosig_in_y)
print("PHYLOSIG IN Y USING K")
print(phylosig_in_y2)
print("PHYLOSIG IN RESIDUALS USING LAMBDA")
print(phylosig_in_residuals)
print("PHYLOSIG IN RESIDUALS USING K")
print(phylosig_in_residuals2)
}else{NULL}
if(dim(y)[2]<2&mode=="use
y"){bestmodely<-testsix(tree,y)}else{vcvy<-vcv.phylo(tree)}
if(dim(y)[2]<2&mode=="use regression
residuals"){bestmodely<-testsix(tree,resid(lmuniy))}else{vcvy<-vcv.phylo(tree)}
if(exists("bestmodely")==TRUE){
if(bestmodely$AICs[1,1]>0&bestmodely$AICs[1,7]>0){
modepositiony<-which(bestmodely$AICs-min(bestmodely$AICs)==0)
if(modepositiony==2){
treey<-lambdaTree(tree,bestmodely$parameters[modepositiony])
vcvy<-vcv.phylo(treey)
}else{NULL}
if(modepositiony==3){
treey<-deltaTree(tree,bestmodely$parameters[modepositiony])
vcvy<-vcv.phylo(treey)
}else{NULL}
if(modepositiony==4){
treey<-kappaTree(tree,bestmodely$parameters[modepositiony])
vcvy<-vcv.phylo(treey)
}else{NULL}
if(modepositiony==5){
treey<-ouTree(tree,bestmodely$parameters[modepositiony])
vcvy<-vcv.phylo(treey)
}else{NULL}
if(modepositiony==6){
treey<-eyponentialchangeTree(tree,bestmodely$parameters[modepositiony])
vcvy<-vcv.phylo(treey)
}else{NULL}
}else{vcvy<-vcv.phylo(tree)}
}else{vcvy<-vcv.phylo(tree)}
ad<-adonis(y~as.matrix(x),method="euclidean",permutations=nperm)
int<-c(rep(1,nrow(x))) #### add a column of ones for intercept that will be subjected to transformation together with x table
mDnewy<-solve(svd(vcvy)$u%*%diag(sqrt(svd(vcvy)$d))%*%t(svd(vcvy)$u))
ynew<-mDnewy %*% y; xnew<-mDnewy%*%as.matrix(cbind(as.matrix(int),x)) ## I transform both x and intercept estimate
pgls<-lm(ynew~xnew-1)
summ<-summary(pgls)##### I SPECIFY "-1" BECAUSE lm() add a column of ones
by default;
#####but we already added (and transformed) it; this pgls procedure
estimates the intercept
if(exists("treey")==TRUE){an<-summary(lm(multpic(y,treey)~multpic(x,treey)-1))}else{an<-anova(lm(multpic(y,tree)~multpic(x,tree)-1))}
print("INDIVIDUAL CORRELATIONS UNDER PGLS")
print(summ)
if(exists("bestmodely")==TRUE){
if(mode==c("use y")){print("BEST MODEL PARAMETERS FOR Y")}else{print("BEST
MODEL PARAMETERS FOR REGRESSION RESIDUALS")}
print(bestmodely)
}
else{NULL}
print("GLOBAL CORRELATION UNDER PICS")
print(an)
print("NON PHYLOGENETIC CORRELATION USING PERMUTATIONS")
print(ad)
}
testsix<-function(tree,x){
require(picante)
require(ape)
require(geiger)
x<-match.phylo.data(tree,x)$data
tree<-match.phylo.data(tree,x)$phy
brownfit<-fitContinuous(tree,x)
aic.brown<-numeric(1)
for(i in 1:1) aic.brown[i]<-brownfit[[i]]$aic
lambdaFit<-fitContinuous(tree, x, model="lambda")
d.lambda<-numeric(5)
for(i in 1:1) d.lambda[i]=2*(lambdaFit[[i]]$lnl-brownfit[[i]]$lnl)
# Calculate p values assuming chi-squared distribution with 1 d.f.
p.lambda=pchisq(d.lambda, 1, lower.tail=FALSE)
aic.lambda<-numeric(1)
for(i in 1:1) aic.lambda[i]<-lambdaFit[[i]]$aic
deltaFit<-fitContinuous(tree, x, model="delta")
d.delta<-numeric(1)
for(i in 1:1) d.delta[i]=2*(deltaFit[[i]]$lnl-brownfit[[i]]$lnl)
# Calculate p values assuming chi-squared distribution with 1 d.f.
p.delta=pchisq(d.delta, 1, lower.tail=FALSE)
aic.delta<-numeric(1)
for(i in 1:1) aic.delta[i]<-deltaFit[[i]]$aic
kappaFit<-fitContinuous(tree, x, model="kappa")
# Compare likelihoods:
d.kappa<-numeric(1)
for(i in 1:1) d.kappa[i]=2*(kappaFit[[i]]$lnl-brownfit[[i]]$lnl)
# Calculate p values assuming chi-squared distribution with 1 d.f.
p.kappa=pchisq(d.kappa, 1, lower.tail=FALSE)
aic.kappa<-numeric(1)
for(i in 1:1) aic.kappa[i]<-kappaFit[[i]]$aic
ouFit<-fitContinuous(tree, x, model="OU",bounds=list(alpha=c(0.0001, 10),
beta=c(1, 100)))
# Compare likelihoods:
d.ou<-numeric(1)
for(i in 1:1) d.ou[i]=2*(ouFit[[i]]$lnl-brownfit[[i]]$lnl)
# Calculate p values assuming chi-squared distribution with 1 d.f.
p.ou=pchisq(d.ou, 1, lower.tail=FALSE)
aic.ou<-numeric(1)
for(i in 1:1) aic.ou[i]<-ouFit[[i]]$aic
ebFit<-fitContinuous(tree, x, model="EB",bounds=list(alpha=c(0.0001, 10),
beta=c(1, 100)))
# Compare likelihoods:
d.eb<-numeric(1)
for(i in 1:1) d.eb[i]=2*(ebFit[[i]]$lnl-brownfit[[i]]$lnl)
# Calculate p values assuming chi-squared distribution with 1 d.f.
p.eb=pchisq(d.eb, 1, lower.tail=FALSE)
aic.eb<-numeric(1)
for(i in 1:1) aic.eb[i]<-ebFit[[i]]$aic
whiteFit<-fitContinuous(tree, x, model="white")
# Compare likelihoods:
d.white<-numeric(1)
for(i in 1:1) d.white[i]=2*(whiteFit[[i]]$lnl-brownfit[[i]]$lnl)
# Calculate p values assuming chi-squared distribution with 1 d.f.
p.white=pchisq(d.white, 1, lower.tail=FALSE)
aic.white<-numeric(1)
for(i in 1:1) aic.white[i]<-whiteFit[[i]]$aic
aic.all<-cbind(aic.brown, aic.lambda, aic.delta, aic.kappa, aic.ou,
aic.eb,aic.white)
foo<-function(x) x-x[which(x==min(x))]
daic<-t(apply(aic.all, 1, foo))
rownames(daic)<-colnames(x)
colnames(daic)<-c("Brownian", "Lambda", "Delta", "Kappa", "OU",
"EB","whitw")
cat("Table of delta-aic values; zero - best model\n")
print(daic, digits=2)
parameters<-cbind(NA,lambdaFit$Trait1$lambda,deltaFit$Trait1$delta,kappaFit$Trait1$lambda,ouFit$Trait1$alpha,ebFit$Trait1$a,NA)
colnames(parameters)<-c("Brown","lambda","delta","kappa","ou","eb","White")
print("PARAMETER ESTIMATES")
print(parameters)
print(aic.all)
out<-list(parameters=parameters,AICs=aic.all)
}
Inviato: mercoledì 17 aprile 2013 4.13
A: morphmet@morphometrics.org
Oggetto: Re: multi-variate PGLS with GMM
----- Forwarded message from Dean Adams <dcad...@iastate.edu>
-----
Date: Mon, 15 Apr 2013 10:11:06 -0400
From: Dean Adams <dcad...@iastate.edu>
Reply-To: Dean Adams <dcad...@iastate.edu>
Subject: Re: multi-variate PGLS with GMM
To: morphmet@morphometrics.org
Yes, one can account for phylogenetic non-independence while performing MANOVA or MANCOVA on shape variables. An early example is in Ruber and Adams 2001 (J. Evol. Biol.). That analysis was implemented in NTSYS.
One could also implement this in R. The standard approach is the use the 'gls' function, and provide it the correlation structure as specified by a phylogenetic covariance matrix. A univariate example is: summary(gls(Y~X, correlation=my.phy.cor))
Unfortunately, one cannot do the equivalent for multivariate data (summary(manova(gls ...))) because the 'manova' function in R does not utilize the correlation structure even if it provided. Thus, the phylogenetic and non-phylogenetic GLS yield the same numerical results.
To properly implement multivariate PGLS in R, one must first transform the X and Y data by the phylogeny under a Brownian motion model, using the phylogenetic transform procedure in Garland and Ives (2000). Then a standard multivariate MANCOVA may be used. I have written R code for this, which is in the supplemental material of a paper currently in press at Evolutionary Biology. The function is found below, along with the citation.
In your specific case, the input X-matrix would contain a column of ones, the ecological variables, and body size. Note that you must have more species than shape-dimensions for the algebra of this procedure to be completed. Also, one should rotate the aligned Procrustes coordinates to their principal component dimensions and eliminate the redundant dimensions due to superimposition to avoid singularities in the MANOVA procedure (7 in your case if you have a set of fixed, 3D landmarks).
Hope this helps.
Dean
-- Dr. Dean C. Adams Professor Department of Ecology, Evolution, and Organismal Biology Department of Statistics Iowa State University Ames, Iowa 50011 www.public.iastate.edu/~dcadams/ phone: 515-294-3834
Outomuro, D., D.C. Adams, and F. Johansson. 2013. Evolution of wing shape in ornamented-winged damselflies. Evolutionary Biology. (In Press).
##MULTIVARIATE PLGS FUNCTION: DC Adams. 2013
mult.pgls<-function(phy,y.mat,x.mat){
phy.mat<-vcv(phy)
x.mat<-cbind(matrix(1,length(phy$tip.label)),x.mat)
x.mat<-x.mat[rownames(phy.mat),]
y.mat<-y.mat[rownames(phy.mat),]
mDnew<-solve(svd(phy.mat)$u%*%diag(sqrt(svd(phy.mat)$d))%*%t(svd(phy.mat)$u))
ynew<-mDnew %*% y.mat
xnew<-mDnew %*% x.mat
summary=summary(manova(lm(ynew~xnew-1)))
return(list(summary=summary))
}
----- Forwarded message from Blake Dickson <b.dick...@unsw.edu.au> -----Date: Thu, 11 Apr 2013 00:58:04 -0400 From: Blake Dickson <b.dick...@unsw.edu.au> Reply-To: Blake Dickson <b.dick...@unsw.edu.au> Subject: multi-variate PGLS with GMM To: "morphmet@morphometrics.org" Hi All, I have a set of 3D procrustes co-ordinates for a bunch of taxa I would like to test against ecological characteristics. I am wondering whether it is possible to perform a multivariate PGLS - controlling for both phylogeny and body size, and if so what is the best way to go about it. Cheers, Blake Blake Dickson, School of BEES University of New South Wales, NSW, Australia b.dick...@unsw.edu.au ----- End forwarded message -----
----- End forwarded message -----
----- End forwarded message -----