Hi Gustaf,
In the model with just species the residual variation is measurement
error/plasticity error, but could also include deviations from the
assumed BM process. If you add species.ide that term captures
deviations from the assumed BM process.
Va/(Va+Ve) is the proportion of variance explained by phylogeny
*after* accounting for the fixed effects. To include the variance
explained by the fixed effects in the denominator is a bit difficult,
because you have to assume something about the distribution of
predictors. For example do you want the different habitat types to be
equally represented, or at the frequency with which they were sampled?
Cheers,
Jarrod
Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Thu, 22 Jan 2015
11:11:02 +0100:
Jarrod,
Thanks for your input. Dropping species.ide result in a model that
is estimated without any problems.
prior = list(R = list(V = 1, nu = 0), G = list(G1 = list(V = 1, nu =
1, alpha.mu = 0, alpha.V = 1000)))
m1.mcmc <- MCMCglmm(y ~ habitat, random =~ species,
data = traits, ginverse = list(species = Ainv),
nitt = 700000, thin=100, prior = prior)
It seems to be rather robust (not sensitive to priors) and Im not
sure I understand why you would need 100s of species.
But how is this model related to the simpler approach I described in
my first post?
The error term (Ve) is the within species error
(measurement/plasticity error) and the species variance (Va) is the
phylogenetic variance in my data. Lambda (Pagel's) is then Va/(Va+Ve).
(m1.mcmc$VCV[,"species"] / (m1.mcmc$VCV[,"species"] + m1.mcmc$VCV[,"units"]))
# BUT is this only correct under a model without habitat?
If habitat is included in the model the between species variation
not explained by phylogeny or habitat is included in Ve because I
dropped the speceis.ide, or am I missing something here?
If I run a model with and without habitat I can extract the amount
of variation explained by habitat (differences in Ve = Vhab) and get
a ballpark number on the relative contribution to the pattern we
observe:
Va / (Va+Vhab+Ve) #phylo
Vhab / (Va+Vhab+Ve) #habitat
Ve / (Va+Vhab+Ve) #measurement/plasticity/local adaption .... and
other processes
Did I get that right or am I lost?
Gustaf
On 2015-01-22 04:54, Jarrod Hadfield wrote:
Hi Gustaf,
1/ You can ignore nhabitat: for some reason I thought you thought
that there might be systematic differences between specialists and
non-specialists.
2/ species and species.ide are identical, but species effects are
structured by the phylogeny due to the ginverse specification.
species.ide are treated as iid.
3/ With few species and few habitats, and very few species observed
in multiple habitats, I think you will just have to go for the
simple model you describe (perhaps even dropping species.ide):
m1.mcmc<-MCMCglmm(trait~as.factor(habitat)+x, random =~
species+species.ide, data=my_data, ginverse=list(species=Ainv))
Expect the variance components to be poorly estimated and prior
sensitive. 100's of species are required to get even moderately
precise estimates of the phylogenetic variance.
Cheers,
Jarrod
Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Thu, 22 Jan
2015 01:15:23 +0100:
Jarrod,
I tried this but Im not sure it works well for my problem and/or
Im doing something wrong.
First, Im not sure what nhabitat is. A vector specifying the
number of habitats that the species is found in? I.e.
my_data$nhabitat may look like:
1,1,1,1,2,2,2,2, 2,2,2,2 .... if the first two species occurs in
one and two habitats, respectively.
Second, species.ide. If it is the same as species, why not just: +
species + species ?
Should it be a vector identical to species but just names species.ide?
So to my data, there are only three habitats (3 levels), thus Im
not sure it works well modeling it as a random factor (and as you
indicate we need a "reasonable number of habitats"). There are 2,
2 and 11 species in each habitat, and only a few are found in
multiple habitats. So Im really looking at simpler model because
there are not enough data for complicated structures. With so few
species things are pushed to the limit but my guess is that the
main problem here is that habitat cannot be fitted as a random
factor. Possible to model it as a fixed effect?
However, there are other predictors of interest as well,
continuous variables. How would that be modeled then?
Like this (predictor=x)?
m1.mcmc<-MCMCglmm(trait~ x, random =~ species+species.ide,
data=my_data, ginverse=list(species=Ainv))
I would also like to thank Tony for recommending excellent papers
and I didnt know about the pez package which looks promising.
Cheers
Gustaf
On 2015-01-21 16:26, Jarrod Hadfield wrote:
Dear Gustaf,
How many levels of `habitat' are there, and are they
cross-classified with respect to species (i.e. are multiple
species measured in the same habitat)?
Assuming for now there are a reasonable number of habitats then
the simplest model (without cross-classification) in
asreml/MCMCglmm would be
Ainv<-inverseA(my_tree)$Ainv
m1.asreml<-asreml(trait~nhabitat, random =~
habitat+giv(species)+species.ide, data=my_data,
ginverse=list(species=sm2asreml(Ainv)))
m1.mcmc<-MCMCglmm(trait~nhabitat, random =~
habitat+species+species.ide, data=my_data,
ginverse=list(species=Ainv))
nhabitat is the number of habitats a species is found in.
species.ide is the same as species and models the between species
variation not captured by the BM model. Note that this requires
you using the individual data rather than the species means
(which is to be preferred).
With cross-classification you may want to have the effects of the
same habitat vary across species. There's many ways to do this,
but for now lets assume i) that in general they could be
positively correlated (if the trait increases for species 1 in
habitat 1 then it is also likely to increase the trait in species
2) and ii) if species 1 and species 2 are close relatives the
habitat is likely to effect them in a more similar way:
m2.asreml<-asreml(trait~nhabitat, random =~
habitat+giv(species)+giv(species):habitat+species.ide,
data=my_data, ginverse=list(species=sm2asreml(Ainv)))
m2.mcmc<-MCMCglmm(trait~nhabitat, random =~
habitat+species+species:habitat+species.ide, data=my_data,
ginverse=list(species=Ainv))
There's many more possibilities of course, depending on the
biology of the problem.
Cheers,
Jarrod
Quoting Gustaf Granath <gustaf.gran...@gmail.com> on Wed, 21 Jan
2015 15:59:23 +0100:
Dear all,
Is it possible to quantify the influence of phylogeny and
environment on a measured trait?
Say that trait y has been measured in several species located in
different habitats. Some species have been measured in more than
one habitat (less specialized species). Now I want to say
something about how much variation in my trait that may be
explained by phylogenetic structure (I have the phyl tree) and
the habitat. I also have multiple measurements per species in
the habitat(s) where is occurs (replicates) but lets ignore that
for now and only use the means.
One way would be to calculate phylogenetic corrected trait means
(as in Bloomberg's K).
Here it is easy to add rows/columns in the vcv.phylo(tree) for
species that occurs in more than one habitat and it is assumed
that they have the same postion in the tree (ie within-species
correlation is = 1)
Residual sums of squares for the phylo effect can be calculated
and the corrected trait means are used in a linear model to test
the habitat effect.
R2 for both factors (phylo, habitat) can then be calculated.
Another method is to use gls(). But I dont know how to deal with
species occurring in multiple habitats here (can you feed the
gls with a manually contructed var-covariance matrix?). If I
drop some species x habitat combinations (so species can only
occur in one habitat) I can do:
bm.sp <- corBrownian(phy = tree)
fit0 <- gls(trait ~ 1, sp.means, method = "ML")
fit1 <- gls(trait ~ 1, sp.means, correlation = bm.sp, method = "ML")
fit2 <- gls(trait ~ Habitat, sp.means, correlation = bm.sp,
method = "ML")
anova(fit0, fit1, fit2) #check AIC
#pseudoR2 (not perfect, I know, but indicate somewhat the "importance")
G2 <- (-2*(logLik(fit0) - logLik(fit1)))[1]
n = nrow(sp.means)
r2ML.phyl <- 1 - exp(-G2/n) #phyl effect
G2 <- (-2*(logLik(fit0)-logLik(fit2)))[1]
r2ML.both <- 1 - exp(-G2/n)
r2ML.hab <- r2ML.both - r2ML.phyl #habitat effect
As mentioned, there are many assumptions here (BM etc) but I
wonder if this sort of approach is in general valid?
Is there a "standard way" to deal with these kind of problems
(e.g. same species in multiple groups)?
Is there a way to incorporate replicates? (manually add/rows
columns in the phylogenetic varcovar-matrix?)
(this has been on the list before, MCMCglmm has been mentioned
and maybe a more complete modeling approach is the way to go here)
Key papers?
Cheers
Gustaf
--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/
--
Gustaf Granath (PhD)
Post doc
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment
--
Gustaf Granath (PhD)
Post doc
McMaster University
School of Geography & Earth Sciences
Swedish University of Agricultural Sciences
Department of Aquatic Sciences and Assessment
@GGranath
Web: ggranath.blogspot.com
Email: gustaf.gran...@gmail.com
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/