Hi Sam,

The terminology G and R structure is used widely, for example in ASreml & SAS and probably others. The G-structure is the covariance matrix of the random effects and the R-structure is the covariance matrix of the residuals. In your model you have one random term (animal) and one residual term (the default - units).

You haven't posted your model specification only the summary, so I will presume that the animal term is linked to a phylogeny. There is a single random effect for each of the n species and they have correlation structure A. A is an nxn matrix representing the proportion of evolution time shared between each species. The covariance structure is Va*A where Va is a scalar variance estimated from the data. The ... in the prior:

G=list(G1=list(...))

should contain your specification for the prior on Va.

units is a factor with a unique level for each row of the data.frame. The default ~units therefore has the standard correlation structure I for the residuals. I is an identity matrix (ones on the diagonal zero's elsewhere) and represents the fact that we believe the residuals are independent and homoscedastic. If there is only one observation per species I is nxn also. The covariance structure is Ve*I where Ve is a scalar variance estimated from the data. The ... in the prior:

R=list(...)

should contain your specification for the prior on Ve.

Together we have the prior

prior=list(R=R,G=G)

You have this bit of code (which I've modified a bit):

l = rbv(tree, Va, nodes="TIPS") + rnorm(100, 0, sqrt(Ve))

which generates random data with covariance structure Va*A+Ve*I. Va=1 and Ve=1 in your example and Va/(Va+Ve)=0.5 is sometimes known as the phylogenetic heritability (Lynch 1991) or more commonly Pagel's lambda (Pagel 1999). Note that I've removed x from the code because it doesn't really make sense to have it there. More likely you want something like:

l = rbv(tree, Va, nodes="TIPS") + as.matrix(x)%*%beta + rnorm(100, 0, sqrt(Ve))

where beta is a vector of regression coefficients (beta=c(1, 1)) determining the effect of being arboreal and being nocturnal respectively on energy use. The data have the same covariance structure as before conditional on the vector of means (x%*%beta).

However, the response in your analysis is not l but Energy. This was generated with as:

 50 + rnorm(100, 0, sqrt(Ve))

where Ve=20^2=400.

The summary shows that the posterior distribution returned by MCMCglmm has support for the true values (Va=0, Ve=400, beta1=0, beta2=0). For example Va is associated with the animal term in your model and so has a posterior mean of 2.186 and 95% credible interval of 0 - 6.493.

If you rerun the analysis with l rather than Energy

data = data.frame(l, x, animal=tree$tip.label)

prior=list(R=list(V=1,nu=0), G=list(G1=list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))

# here I've used a flat *improper* prior for the Ve and a parameter expanded prior for Va. The Va prior is a scaled F-distribution with 1 df for the numerator and denominator. Parameter expanded priors have some nice properties, but the main advantage of parameter expansion is algorithmic - it increases the efficiency of the MCMC algorithm when the posterior for Va contains values close to zero.

m2<-MCMCglmm(l~arboreal+nocturnal, random=~animal, data=data, pedigree=tree, prior=prior)

The summary shows that the posterior distribution returned by MCMCglmm has support for the true values (Va=1, Ve=1, beta1=1, beta2=1).


summary(m2)

 Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000

 DIC: 285.753

 G-structure:  ~animal

       post.mean l-95% CI u-95% CI eff.samp
animal      1.49   0.1696    3.445    442.9

 R-structure:  ~units

      post.mean l-95% CI u-95% CI eff.samp
units    0.8819   0.6187    1.208     1000

 Location effects: l ~ arboreal + nocturnal

            post.mean l-95% CI u-95% CI eff.samp  pMCMC
(Intercept)   -1.0719  -2.3445   0.0800    139.1  0.082 .
arboreal       1.3163   0.9545   1.6891   1000.0 <0.001 ***
nocturnal      0.9639   0.5477   1.3512   1108.4 <0.001 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Have you consulted the CourseNotes? I think not.

Cheers,

Jarrod


Quoting Sam Passmore <passmore....@gmail.com> on Sat, 15 Dec 2012 11:23:16 +1300:

Dear R-Sig-Phylo subscribers,



I am currently trying to perform some analysis using the MCMCglmm package,
however I am having trouble understanding some of the aspects of the
package and wondered if I could get your help?



I have tried to develop an example of my problem that I hope will make my
problems clear, but they are primarily associated around the meanings of
G-structure and R-structure. If anyone could give a broad definition of
what these parts of the analysis do and how they affect the results, that
would be most useful.



I have a list of species in a data frame with traits for each taxon. Thisis
made up of Yes/No variables and a continuous response variable we think is
related to these Yes/No variables. I also have a species phylogeny and
speciesÂ’ locations.

For the purpose of the example, letÂ’s say the response is how much energy a
species uses each day .



We want to model energy use as a function of whether a species is arboreal
(yes/no) and nocturnal (yes/no) and we want to control for (and measure the
effect of) phylogeny (later I would like to control for spatial
autocorrelation too).



tree = rcoal(100);

x = data.frame(arboreal = sample(c(0, 1), 100, replace = TRUE), nocturnal =
sample(c(0, 1), 100, replace = TRUE))



Now this is a bit of an aside, but I thought I would clear it up



l = rbv(tree, 1, nodes="TIPS") + x + rnorm(100)

is a piece of code I have seen commonly in examples, but it seems to me
that this is just to add some variation to the phylogeny. Is this the case?



I have a continuous response, which we can say is energy use

Energy = rnorm(100, 50, 20)

data = data.frame(Energy, x, animal=tree$tip.label)



Now, this is where I have driven off the main road. The implementation of
the priors are throwing me off my stride, primarily due to a lack of
attention during my Bayesian classes.

R & G-structures are a mystery to me. My guess would be they are something
to do with priors on residuals and Random effects. But I am unsure why the
are not like the priors I am used to using, which have a mean and a
variance.



I have read the following link numerous times, but it is not shedding any
light on it for me:



http://stats.stackexchange.com/questions/32994/lme4-vs-mcmcglmm-what-are-r-structure-g-structure-in-a-glmm





I used the following model and default priors, which lead to the output
below



Could anyone explain to me what the G-structure and R-structure sections in
the summary output are saying?

Preferably with reference to my example.



summary(m1)



 Iterations = 3001:12991

 Thinning interval  = 10

 Sample size  = 1000



 DIC: 868.0991



 G-structure:  ~animal



       post.mean  l-95% CI u-95% CI eff.samp

animal     2.186 3.495e-08    6.493    42.85



 R-structure:  ~units



      post.mean l-95% CI u-95% CI eff.samp

units       332    239.8    421.4     1000



 Location effects: Energy ~ arboreal + nocturnal



            post.mean l-95% CI u-95% CI eff.samp  pMCMC

(Intercept)   48.1297  42.1408  54.3912     1000 <0.001 ***

arboreal         3.4769  -4.2014   9.8992     1000  0.342

nocturnal       -0.9693  -8.5014   5.6701     1000  0.806

---

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Thanks,

Sam

        [[alternative HTML version deleted]]





--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Reply via email to