Re: [R-sig-phylo] PGLS with different correlation structures, model selection, REML vs ML

2015-01-21 Thread Brian A. Gill
Thanks Emmanuel and Jörg!

On Tue, Jan 20, 2015 at 8:40 AM, Emmanuel Paradis emmanuel.para...@ird.fr
wrote:

 It seems I forgot to Cc my reply to the list...

 Emmanuel

  Message transféré 
 Sujet : Re: [R-sig-phylo] PGLS with different correlation structures,
 model selection, REML vs ML
 Date : Fri, 09 Jan 2015 12:09:13 +0100
 De : Emmanuel Paradis emmanuel.para...@ird.fr
 Pour : Brian A. Gill gillbri...@gmail.com


 Hi Brian,

 Le 08/01/2015 19:57, Brian A. Gill a écrit :

 Dear List-Serv Users.

 I am running PGLS across large blocks of trees with the corPagel and
 corMartins correlation structures with maximum likelihood estimation of
 parameters.

 In the literature, it seems like people using PGLS use either a Brownian
 motion or OU approach, and you don't see many examples of people trying
 both and then using model selection to determine which is better.


 Note: corMartins() implements an OU-based correlation strucrture.

  I am planning on running both and then summing Akakie weights across all
 trees for each correlation structure to determine which is better
 supported.

 Is this kind of model selection approach reasonable?


 Yes. AIC was originally developed with the idea to find the correct
 (auto-)correlation structure in time-series data. So it makes sense to
 use it to select an appropriate phylogenetic correlation structure.

  Should I use the AIC value determined using REML or ML?


 If you use the same regression model (ie, the fixed effects part of the
 model) then AIC_REML should be OK and, I suspect, will give you similar
 results than AIC_ML. Only the latter should be used if models with
 different regressions are compared (of course, providing the same
 response variable is used in all models).

  If I also wanted to present model averaged (weighted by AIC) estimates of
 slope/intercept for each correlation structure do I use the estimates from
 REML or ML?


 Regression coefficients (ie, the parameters of the fixed-effect) are the
 same with ML or REML (at least, this is what is expected).

 Best,

 Emmanuel

  Thanks and Happy New Year!


 Brian






 ___
 R-sig-phylo mailing list - R-sig-phylo@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
 Searchable archive at http://www.mail-archive.com/r-
 sig-ph...@r-project.org/




-- 
Brian A. Gill

VISIT MY WEBSITE:
http://gillbriana.wix.com/brian-gill

FOLLOW ME ON TWITTER:
@CSUBrianGill

Colorado State University Biology
1878 Campus Delivery
Fort Collins, CO 80523

970-215-7037

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

[R-sig-phylo] Partitioning phylogentic and environment effect on trait distribution

2015-01-21 Thread Gustaf Granath

Dear all,
Is it possible to quantify the influence of phylogeny and environment on 
a measured trait?


Say that trait y has been measured in several species located in 
different habitats. Some species have been measured in more than one 
habitat (less specialized species). Now I want to say something about 
how much variation in my trait that may be explained by phylogenetic 
structure (I have the phyl tree) and the habitat. I also have multiple 
measurements per species in the habitat(s) where is occurs (replicates) 
but lets ignore that for now and only use the means.


One way would be to calculate phylogenetic corrected trait means (as in 
Bloomberg's K).
Here it is easy to add rows/columns in the vcv.phylo(tree) for species 
that occurs in more than one habitat and it is assumed that they have 
the same postion in the tree (ie within-species correlation is = 1)
Residual sums of squares for the phylo effect can be calculated and the 
corrected trait means are used in a linear model to test the habitat 
effect.

R2 for both factors (phylo, habitat) can then be calculated.

Another method is to use gls(). But I dont know how to deal with species 
occurring in multiple habitats here (can you feed the gls with a 
manually contructed var-covariance matrix?). If I drop some species x 
habitat combinations (so species can only occur in one habitat) I can do:

bm.sp - corBrownian(phy = tree)
fit0 - gls(trait ~ 1, sp.means, method = ML)
fit1 - gls(trait ~ 1, sp.means, correlation = bm.sp, method = ML)
fit2 - gls(trait ~ Habitat, sp.means, correlation = bm.sp, method = ML)
anova(fit0, fit1, fit2) #check AIC

#pseudoR2 (not perfect, I know, but indicate somewhat the importance)
G2 - (-2*(logLik(fit0) - logLik(fit1)))[1]
n = nrow(sp.means)
r2ML.phyl - 1 - exp(-G2/n) #phyl effect

G2 - (-2*(logLik(fit0)-logLik(fit2)))[1]
r2ML.both - 1 - exp(-G2/n)
r2ML.hab - r2ML.both - r2ML.phyl #habitat effect

As mentioned, there are many assumptions here (BM etc) but I wonder if 
this sort of approach is in general valid?


Is there a standard way to deal with these kind of problems (e.g. same 
species in multiple groups)?


Is there a way to incorporate replicates? (manually add/rows columns in 
the phylogenetic varcovar-matrix?)
(this has been on the list before, MCMCglmm has been mentioned and maybe 
a more complete modeling approach is the way to go here)


Key papers?

Cheers

Gustaf

--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Partitioning phylogentic and environment effect on trait distribution

2015-01-21 Thread Jarrod Hadfield

Dear Gustaf,

How many levels of `habitat' are there, and are they cross-classified  
with respect to species (i.e. are multiple species measured in the  
same habitat)?


Assuming for now there are a reasonable number of habitats then the  
simplest model (without cross-classification) in asreml/MCMCglmm would  
be


Ainv-inverseA(my_tree)$Ainv

m1.asreml-asreml(trait~nhabitat, random =~  
habitat+giv(species)+species.ide, data=my_data,  
ginverse=list(species=sm2asreml(Ainv)))


m1.mcmc-MCMCglmm(trait~nhabitat, random =~  
habitat+species+species.ide, data=my_data, ginverse=list(species=Ainv))


nhabitat is the number of habitats a species is found in. species.ide  
is the same as species and models the between species variation not  
captured by the BM model. Note that this requires you using the  
individual data rather than the species means (which is to be  
preferred).


With cross-classification you may want to have the effects of the same  
habitat vary across species. There's many ways to do this, but for now  
lets assume i) that in general they could be positively correlated (if  
the trait increases for species 1 in habitat 1 then it is also likely  
to increase the trait in species 2) and ii) if species 1 and species 2  
are close relatives the habitat is likely to effect them in a more  
similar way:


m2.asreml-asreml(trait~nhabitat, random =~  
habitat+giv(species)+giv(species):habitat+species.ide, data=my_data,  
ginverse=list(species=sm2asreml(Ainv)))


m2.mcmc-MCMCglmm(trait~nhabitat, random =~  
habitat+species+species:habitat+species.ide, data=my_data,  
ginverse=list(species=Ainv))


There's many more possibilities of course, depending on the biology of  
the problem.


Cheers,

Jarrod



Quoting Gustaf Granath gustaf.gran...@gmail.com on Wed, 21 Jan 2015  
15:59:23 +0100:



Dear all,
Is it possible to quantify the influence of phylogeny and  
environment on a measured trait?


Say that trait y has been measured in several species located in  
different habitats. Some species have been measured in more than one  
habitat (less specialized species). Now I want to say something  
about how much variation in my trait that may be explained by  
phylogenetic structure (I have the phyl tree) and the habitat. I  
also have multiple measurements per species in the habitat(s) where  
is occurs (replicates) but lets ignore that for now and only use the  
means.


One way would be to calculate phylogenetic corrected trait means (as  
in Bloomberg's K).
Here it is easy to add rows/columns in the vcv.phylo(tree) for  
species that occurs in more than one habitat and it is assumed that  
they have the same postion in the tree (ie within-species  
correlation is = 1)
Residual sums of squares for the phylo effect can be calculated and  
the corrected trait means are used in a linear model to test the  
habitat effect.

R2 for both factors (phylo, habitat) can then be calculated.

Another method is to use gls(). But I dont know how to deal with  
species occurring in multiple habitats here (can you feed the gls  
with a manually contructed var-covariance matrix?). If I drop some  
species x habitat combinations (so species can only occur in one  
habitat) I can do:

bm.sp - corBrownian(phy = tree)
fit0 - gls(trait ~ 1, sp.means, method = ML)
fit1 - gls(trait ~ 1, sp.means, correlation = bm.sp, method = ML)
fit2 - gls(trait ~ Habitat, sp.means, correlation = bm.sp, method = ML)
anova(fit0, fit1, fit2) #check AIC

#pseudoR2 (not perfect, I know, but indicate somewhat the importance)
G2 - (-2*(logLik(fit0) - logLik(fit1)))[1]
n = nrow(sp.means)
r2ML.phyl - 1 - exp(-G2/n) #phyl effect

G2 - (-2*(logLik(fit0)-logLik(fit2)))[1]
r2ML.both - 1 - exp(-G2/n)
r2ML.hab - r2ML.both - r2ML.phyl #habitat effect

As mentioned, there are many assumptions here (BM etc) but I wonder  
if this sort of approach is in general valid?


Is there a standard way to deal with these kind of problems (e.g.  
same species in multiple groups)?


Is there a way to incorporate replicates? (manually add/rows columns  
in the phylogenetic varcovar-matrix?)
(this has been on the list before, MCMCglmm has been mentioned and  
maybe a more complete modeling approach is the way to go here)


Key papers?

Cheers

Gustaf

--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/






--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/