Re: [R-sig-phylo] Partitioning phylogentic and environment effect on trait distribution

Jarrod Hadfield Thu, 22 Jan 2015 03:24:08 -0800

Hi Gustaf,

In the model with just species the residual variation is measurementerror/plasticity error, but could also include deviations from theassumed BM process. If you add species.ide that term capturesdeviations from the assumed BM process.

Va/(Va+Ve) is the proportion of variance explained by phylogeny*after* accounting for the fixed effects. To include the varianceexplained by the fixed effects in the denominator is a bit difficult,because you have to assume something about the distribution ofpredictors. For example do you want the different habitat types to beequally represented, or at the frequency with which they were sampled?


Cheers,

Jarrod

Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Thu, 22 Jan 201511:11:02 +0100:

Jarrod,
Thanks for your input. Dropping species.ide result in a model thatis estimated without any problems.prior = list(R = list(V = 1, nu = 0), G = list(G1 = list(V = 1, nu =1, alpha.mu = 0, alpha.V = 1000)))
m1.mcmc <- MCMCglmm(y ~ habitat, random =~ species,
data = traits, ginverse = list(species = Ainv),nitt = 700000, thin=100, prior = prior)It seems to be rather robust (not sensitive to priors) and Im notsure I understand why you would need 100s of species.
But how is this model related to the simpler approach I described inmy first post?The error term (Ve) is the within species error(measurement/plasticity error) and the species variance (Va) is thephylogenetic variance in my data. Lambda (Pagel's) is then Va/(Va+Ve).
(m1.mcmc$VCV[,"species"] / (m1.mcmc$VCV[,"species"] + m1.mcmc$VCV[,"units"]))
# BUT is this only correct under a model without habitat?
If habitat is included in the model the between species variationnot explained by phylogeny or habitat is included in Ve because Idropped the speceis.ide, or am I missing something here?
If I run a model with and without habitat I can extract the amountof variation explained by habitat (differences in Ve = Vhab) and geta ballpark number on the relative contribution to the pattern weobserve:
Va / (Va+Vhab+Ve) #phylo
Vhab / (Va+Vhab+Ve) #habitat
Ve / (Va+Vhab+Ve) #measurement/plasticity/local adaption .... andother processes
Did I get that right or am I lost?

Gustaf


On 2015-01-22 04:54, Jarrod Hadfield wrote:
Hi Gustaf,
1/ You can ignore nhabitat: for some reason I thought you thoughtthat there might be systematic differences between specialists andnon-specialists.
2/ species and species.ide are identical, but species effects arestructured by the phylogeny due to the ginverse specification.species.ide are treated as iid.
3/ With few species and few habitats, and very few species observedin multiple habitats, I think you will just have to go for thesimple model you describe (perhaps even dropping species.ide):
m1.mcmc<-MCMCglmm(trait~as.factor(habitat)+x, random =~species+species.ide, data=my_data, ginverse=list(species=Ainv))
Expect the variance components to be poorly estimated and priorsensitive. 100's of species are required to get even moderatelyprecise estimates of the phylogenetic variance.
Cheers,

Jarrod
Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Thu, 22 Jan2015 01:15:23 +0100:
Jarrod,
I tried this but Im not sure it works well for my problem and/orIm doing something wrong.
First, Im not sure what nhabitat is. A vector specifying thenumber of habitats that the species is found in? I.e.my_data$nhabitat may look like:1,1,1,1,2,2,2,2, 2,2,2,2 .... if the first two species occurs inone and two habitats, respectively.
Second, species.ide. If it is the same as species, why not just: +species + species ?
Should it be a vector identical to species but just names species.ide?
So to my data, there are only three habitats (3 levels), thus Imnot sure it works well modeling it as a random factor (and as youindicate we need a "reasonable number of habitats"). There are 2,2 and 11 species in each habitat, and only a few are found inmultiple habitats. So Im really looking at simpler model becausethere are not enough data for complicated structures. With so fewspecies things are pushed to the limit but my guess is that themain problem here is that habitat cannot be fitted as a randomfactor. Possible to model it as a fixed effect?
However, there are other predictors of interest as well,continuous variables. How would that be modeled then?
Like this (predictor=x)?
m1.mcmc<-MCMCglmm(trait~ x, random =~ species+species.ide,data=my_data, ginverse=list(species=Ainv))
I would also like to thank Tony for recommending excellent papersand I didnt know about the pez package which looks promising.
Cheers

Gustaf


On 2015-01-21 16:26, Jarrod Hadfield wrote:
Dear Gustaf,
How many levels of `habitat' are there, and are theycross-classified with respect to species (i.e. are multiplespecies measured in the same habitat)?
Assuming for now there are a reasonable number of habitats thenthe simplest model (without cross-classification) inasreml/MCMCglmm would be
Ainv<-inverseA(my_tree)$Ainv
m1.asreml<-asreml(trait~nhabitat, random =~habitat+giv(species)+species.ide, data=my_data,ginverse=list(species=sm2asreml(Ainv)))
m1.mcmc<-MCMCglmm(trait~nhabitat, random =~habitat+species+species.ide, data=my_data,ginverse=list(species=Ainv))
nhabitat is the number of habitats a species is found in.species.ide is the same as species and models the between speciesvariation not captured by the BM model. Note that this requiresyou using the individual data rather than the species means(which is to be preferred).
With cross-classification you may want to have the effects of thesame habitat vary across species. There's many ways to do this,but for now lets assume i) that in general they could bepositively correlated (if the trait increases for species 1 inhabitat 1 then it is also likely to increase the trait in species2) and ii) if species 1 and species 2 are close relatives thehabitat is likely to effect them in a more similar way:
m2.asreml<-asreml(trait~nhabitat, random =~habitat+giv(species)+giv(species):habitat+species.ide,data=my_data, ginverse=list(species=sm2asreml(Ainv)))
m2.mcmc<-MCMCglmm(trait~nhabitat, random =~habitat+species+species:habitat+species.ide, data=my_data,ginverse=list(species=Ainv))
There's many more possibilities of course, depending on thebiology of the problem.
Cheers,

Jarrod
Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Wed, 21 Jan2015 15:59:23 +0100:
Dear all,
Is it possible to quantify the influence of phylogeny andenvironment on a measured trait?
Say that trait y has been measured in several species located indifferent habitats. Some species have been measured in more thanone habitat (less specialized species). Now I want to saysomething about how much variation in my trait that may beexplained by phylogenetic structure (I have the phyl tree) andthe habitat. I also have multiple measurements per species inthe habitat(s) where is occurs (replicates) but lets ignore thatfor now and only use the means.
One way would be to calculate phylogenetic corrected trait means(as in Bloomberg's K).Here it is easy to add rows/columns in the vcv.phylo(tree) forspecies that occurs in more than one habitat and it is assumedthat they have the same postion in the tree (ie within-speciescorrelation is = 1)Residual sums of squares for the phylo effect can be calculatedand the corrected trait means are used in a linear model to testthe habitat effect.
R2 for both factors (phylo, habitat) can then be calculated.
Another method is to use gls(). But I dont know how to deal withspecies occurring in multiple habitats here (can you feed thegls with a manually contructed var-covariance matrix?). If Idrop some species x habitat combinations (so species can onlyoccur in one habitat) I can do:
bm.sp <- corBrownian(phy = tree)
fit0 <- gls(trait ~ 1, sp.means, method = "ML")
fit1 <- gls(trait ~ 1, sp.means, correlation = bm.sp, method = "ML")
fit2 <- gls(trait ~ Habitat, sp.means, correlation = bm.sp,method = "ML")
anova(fit0, fit1, fit2) #check AIC

#pseudoR2 (not perfect, I know, but indicate somewhat the "importance")
G2 <- (-2*(logLik(fit0) - logLik(fit1)))[1]
n = nrow(sp.means)
r2ML.phyl <- 1 - exp(-G2/n) #phyl effect

G2 <- (-2*(logLik(fit0)-logLik(fit2)))[1]
r2ML.both <- 1 - exp(-G2/n)
r2ML.hab <- r2ML.both - r2ML.phyl #habitat effect
As mentioned, there are many assumptions here (BM etc) but Iwonder if this sort of approach is in general valid?
Is there a "standard way" to deal with these kind of problems(e.g. same species in multiple groups)?
Is there a way to incorporate replicates? (manually add/rowscolumns in the phylogenetic varcovar-matrix?)(this has been on the list before, MCMCglmm has been mentionedand maybe a more complete modeling approach is the way to go here)
Key papers?

Cheers

Gustaf

--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive athttp://www.mail-archive.com/r-sig-phylo@r-project.org/
--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment
--
Gustaf Granath (PhD)
Post doc
McMaster University
School of Geography & Earth Sciences

Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment

@GGranath
Web: ggranath.blogspot.com
Email: gustaf.gran...@gmail.com




--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Partitioning phylogentic and environment effect on trait distribution

Reply via email to