Hi Gustaf,

In the model with just species the residual variation is measurement error/plasticity error, but could also include deviations from the assumed BM process. If you add species.ide that term captures deviations from the assumed BM process.

Va/(Va+Ve) is the proportion of variance explained by phylogeny *after* accounting for the fixed effects. To include the variance explained by the fixed effects in the denominator is a bit difficult, because you have to assume something about the distribution of predictors. For example do you want the different habitat types to be equally represented, or at the frequency with which they were sampled?

Cheers,

Jarrod






Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Thu, 22 Jan 2015 11:11:02 +0100:

Jarrod,
Thanks for your input. Dropping species.ide result in a model that is estimated without any problems. prior = list(R = list(V = 1, nu = 0), G = list(G1 = list(V = 1, nu = 1, alpha.mu = 0, alpha.V = 1000)))
m1.mcmc <- MCMCglmm(y ~ habitat, random =~ species,
data = traits, ginverse = list(species = Ainv), nitt = 700000, thin=100, prior = prior) It seems to be rather robust (not sensitive to priors) and Im not sure I understand why you would need 100s of species.

But how is this model related to the simpler approach I described in my first post? The error term (Ve) is the within species error (measurement/plasticity error) and the species variance (Va) is the phylogenetic variance in my data. Lambda (Pagel's) is then Va/(Va+Ve).
(m1.mcmc$VCV[,"species"] / (m1.mcmc$VCV[,"species"] + m1.mcmc$VCV[,"units"]))
# BUT is this only correct under a model without habitat?

If habitat is included in the model the between species variation not explained by phylogeny or habitat is included in Ve because I dropped the speceis.ide, or am I missing something here?

If I run a model with and without habitat I can extract the amount of variation explained by habitat (differences in Ve = Vhab) and get a ballpark number on the relative contribution to the pattern we observe:
Va / (Va+Vhab+Ve) #phylo
Vhab / (Va+Vhab+Ve) #habitat
Ve / (Va+Vhab+Ve) #measurement/plasticity/local adaption .... and other processes

Did I get that right or am I lost?

Gustaf


On 2015-01-22 04:54, Jarrod Hadfield wrote:
Hi Gustaf,

1/ You can ignore nhabitat: for some reason I thought you thought that there might be systematic differences between specialists and non-specialists.

2/ species and species.ide are identical, but species effects are structured by the phylogeny due to the ginverse specification. species.ide are treated as iid.

3/ With few species and few habitats, and very few species observed in multiple habitats, I think you will just have to go for the simple model you describe (perhaps even dropping species.ide):

m1.mcmc<-MCMCglmm(trait~as.factor(habitat)+x, random =~ species+species.ide, data=my_data, ginverse=list(species=Ainv))

Expect the variance components to be poorly estimated and prior sensitive. 100's of species are required to get even moderately precise estimates of the phylogenetic variance.

Cheers,

Jarrod



Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Thu, 22 Jan 2015 01:15:23 +0100:

Jarrod,
I tried this but Im not sure it works well for my problem and/or Im doing something wrong.

First, Im not sure what nhabitat is. A vector specifying the number of habitats that the species is found in? I.e. my_data$nhabitat may look like: 1,1,1,1,2,2,2,2, 2,2,2,2 .... if the first two species occurs in one and two habitats, respectively.

Second, species.ide. If it is the same as species, why not just: + species + species ?
Should it be a vector identical to species but just names species.ide?

So to my data, there are only three habitats (3 levels), thus Im not sure it works well modeling it as a random factor (and as you indicate we need a "reasonable number of habitats"). There are 2, 2 and 11 species in each habitat, and only a few are found in multiple habitats. So Im really looking at simpler model because there are not enough data for complicated structures. With so few species things are pushed to the limit but my guess is that the main problem here is that habitat cannot be fitted as a random factor. Possible to model it as a fixed effect?

However, there are other predictors of interest as well, continuous variables. How would that be modeled then?
Like this (predictor=x)?
m1.mcmc<-MCMCglmm(trait~ x, random =~ species+species.ide, data=my_data, ginverse=list(species=Ainv))

I would also like to thank Tony for recommending excellent papers and I didnt know about the pez package which looks promising.

Cheers

Gustaf


On 2015-01-21 16:26, Jarrod Hadfield wrote:
Dear Gustaf,

How many levels of `habitat' are there, and are they cross-classified with respect to species (i.e. are multiple species measured in the same habitat)?

Assuming for now there are a reasonable number of habitats then the simplest model (without cross-classification) in asreml/MCMCglmm would be

Ainv<-inverseA(my_tree)$Ainv

m1.asreml<-asreml(trait~nhabitat, random =~ habitat+giv(species)+species.ide, data=my_data, ginverse=list(species=sm2asreml(Ainv)))

m1.mcmc<-MCMCglmm(trait~nhabitat, random =~ habitat+species+species.ide, data=my_data, ginverse=list(species=Ainv))

nhabitat is the number of habitats a species is found in. species.ide is the same as species and models the between species variation not captured by the BM model. Note that this requires you using the individual data rather than the species means (which is to be preferred).

With cross-classification you may want to have the effects of the same habitat vary across species. There's many ways to do this, but for now lets assume i) that in general they could be positively correlated (if the trait increases for species 1 in habitat 1 then it is also likely to increase the trait in species 2) and ii) if species 1 and species 2 are close relatives the habitat is likely to effect them in a more similar way:

m2.asreml<-asreml(trait~nhabitat, random =~ habitat+giv(species)+giv(species):habitat+species.ide, data=my_data, ginverse=list(species=sm2asreml(Ainv)))

m2.mcmc<-MCMCglmm(trait~nhabitat, random =~ habitat+species+species:habitat+species.ide, data=my_data, ginverse=list(species=Ainv))

There's many more possibilities of course, depending on the biology of the problem.

Cheers,

Jarrod



Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Wed, 21 Jan 2015 15:59:23 +0100:

Dear all,
Is it possible to quantify the influence of phylogeny and environment on a measured trait?

Say that trait y has been measured in several species located in different habitats. Some species have been measured in more than one habitat (less specialized species). Now I want to say something about how much variation in my trait that may be explained by phylogenetic structure (I have the phyl tree) and the habitat. I also have multiple measurements per species in the habitat(s) where is occurs (replicates) but lets ignore that for now and only use the means.

One way would be to calculate phylogenetic corrected trait means (as in Bloomberg's K). Here it is easy to add rows/columns in the vcv.phylo(tree) for species that occurs in more than one habitat and it is assumed that they have the same postion in the tree (ie within-species correlation is = 1) Residual sums of squares for the phylo effect can be calculated and the corrected trait means are used in a linear model to test the habitat effect.
R2 for both factors (phylo, habitat) can then be calculated.

Another method is to use gls(). But I dont know how to deal with species occurring in multiple habitats here (can you feed the gls with a manually contructed var-covariance matrix?). If I drop some species x habitat combinations (so species can only occur in one habitat) I can do:
bm.sp <- corBrownian(phy = tree)
fit0 <- gls(trait ~ 1, sp.means, method = "ML")
fit1 <- gls(trait ~ 1, sp.means, correlation = bm.sp, method = "ML")
fit2 <- gls(trait ~ Habitat, sp.means, correlation = bm.sp, method = "ML")
anova(fit0, fit1, fit2) #check AIC

#pseudoR2 (not perfect, I know, but indicate somewhat the "importance")
G2 <- (-2*(logLik(fit0) - logLik(fit1)))[1]
n = nrow(sp.means)
r2ML.phyl <- 1 - exp(-G2/n) #phyl effect

G2 <- (-2*(logLik(fit0)-logLik(fit2)))[1]
r2ML.both <- 1 - exp(-G2/n)
r2ML.hab <- r2ML.both - r2ML.phyl #habitat effect

As mentioned, there are many assumptions here (BM etc) but I wonder if this sort of approach is in general valid?

Is there a "standard way" to deal with these kind of problems (e.g. same species in multiple groups)?

Is there a way to incorporate replicates? (manually add/rows columns in the phylogenetic varcovar-matrix?) (this has been on the list before, MCMCglmm has been mentioned and maybe a more complete modeling approach is the way to go here)

Key papers?

Cheers

Gustaf

--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/






--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment






--
Gustaf Granath (PhD)
Post doc
McMaster University
School of Geography & Earth Sciences

Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment

@GGranath
Web: ggranath.blogspot.com
Email: gustaf.gran...@gmail.com






--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Reply via email to