Hi,

I'm not sure why this is occurring (and I've cc'ed the r-sig-me list as they might know more). It appears to me that there is some problem with the parameterisation of the random effects, and how that interacts with the correlation structure. The usual random=~1|date parameterisation uses a log-Cholesky factorisation, which is different to the pdIdent parameterisation, but as you say in this case should give the same answers. It looks like there may be something going on deep in the lme internals. Maybe someone on r-sig-me can help.

Sorry I can't be of more help. It's an interesting problem and I would like to see it resolved too.

Best,

Simon.

On 01/03/12 00:47, Rudolf Philippe Rohr wrote:
Hello.

I'm writing about a possible issue when incorporating a phylogenetic correlation structure (corPagel) in a linear mixed effect model (lme).

There are two techniques for incorporating a random effect in a linear model:

t1 (it is the traditional way): m1 <- lme( y ~ x, random = ~1|date)

t2: u = gl(1,1,length(y))
m2 <- lme(y ~ x, random = list(u = pdIdent(form=~factor(date)-1)))

(http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg11749.html)

> str(d)
'data.frame': 446 obs. of 3 variables:
$ y : num 2.197 0.693 0.693 0 0.693 ...
$ x : Factor w/ 2 levels "N","Y": 2 1 1 1 1 2 2 2 2 2 ...
$ date: Factor w/ 7 levels "12","16","21",..: 5 6 7 4 7 6 2 7 3 6 ...

These two techniques have to give the same output, and it is exactly what happens (same parameter estimations, same log-like, same p-values, same values for the random factor, …)

> summary(m1)
Linear mixed-effects model fit by REML
Data: d
AIC BIC logLik
1366.617 1383 -679.3083

Random effects:
Formula: ~1 | date
(Intercept) Residual
StdDev: 0.4283086 1.086683

Fixed effects: y ~ x
Value Std.Error DF t-value p-value
(Intercept) 1.4430143 0.1856287 438 7.773658 0.0000
xY 0.0700834 0.1119626 438 0.625953 0.5317
Correlation:
(Intr)
xY -0.397

> summary(m2)
Linear mixed-effects model fit by REML
Data: d
AIC BIC logLik
1366.617 1383 -679.3083

Random effects:
Formula: ~factor(date) - 1 | u
Structure: Multiple of an Identity
factor(date)12 factor(date)16 factor(date)21 factor(date)26 factor(date)30 factor(date)35 factor(date)38 StdDev: 0.4283086 0.4283086 0.4283086 0.4283086 0.4283086 0.4283086 0.4283086
Residual
StdDev: 1.086683

Fixed effects: y ~ x
Value Std.Error DF t-value p-value
(Intercept) 1.4430143 0.1856287 444 7.773658 0.0000
xY 0.0700834 0.1119626 444 0.625953 0.5317
Correlation:
(Intr)
xY -0.397


Things get strange when incorporating a phylogenetic correlation structure with corPagel (and also with corGrafen):

t1: m3 <- lme( y ~ x, random = ~1|date, correlation = corPagel(0.5,tree,fixed=FALSE))

t2: u = gl(1,1,length(y))
m4 <- lme(y ~ x, random = list(u = pdIdent(form=~factor(date)-1)), correlation = corPagel(0.5,tree,fixed=FALSE)))

Again, these two methods should give the same output, however here:

t1 always gives the following error message:

Error in corFactor.corStruct(object) :
NA/NaN/Inf in foreign function call (arg 1)

t2: the optimization process converges, and gives reasonable output.

> summary(m4)
Linear mixed-effects model fit by REML
Data: d
AIC BIC logLik
1350.275 1370.754 -670.1376

Random effects:
Formula: ~factor(date) - 1 | u
Structure: Multiple of an Identity
factor(date)12 factor(date)16 factor(date)21 factor(date)26 factor(date)30 factor(date)35 factor(date)38 StdDev: 0.3966233 0.3966233 0.3966233 0.3966233 0.3966233 0.3966233 0.3966233
Residual
StdDev: 1.141081

Correlation Structure: corPagel
Formula: ~1 | u
Parameter estimate(s):
lambda
0.2846964
Fixed effects: y ~ x
Value Std.Error DF t-value p-value
(Intercept) 1.1950145 0.3493305 444 3.420871 0.0007
xY -0.0312983 0.1139128 444 -0.274757 0.7836
Correlation:
(Intr)
xY -0.18

First question: why is it that t1 does not work, while t2 does?


To go a step forward with this problem, we can set the lambda value in t1 with the value obtained from t2.

t1bis: m3bis <-lme( y ~ x, random = ~1|date, correlation = corPagel(0.2846964,tree,fixed=TRUE))

This time, t1bis converges. However the output is completely different from t2.

> summary(m3bis)
Linear mixed-effects model fit by REML
Data: d
AIC BIC logLik
14907.8 14924.19 -7449.902

Random effects:
Formula: ~1 | date
(Intercept) Residual
StdDev: 809763.6 4632624

Correlation Structure: corPagel
Formula: ~1 | date
Parameter estimate(s):
lambda
0.2846964
Fixed effects: y ~ x
Value Std.Error DF t-value p-value
(Intercept) -329242.7 324363.4 438 -1.01504 0.3106
xY 3.7 0.0 438 127.43249 0.0000
Correlation:
(Intr)
xY 0.014

Second question: how is that possible? The two outputs should be the same.


To try to understand what is going on, we can compute the profile log-likelihood curve, as a function of lambda, for both techniques.

lambda <- seq(0,1,0.01)

loglike.t1 <- rep(NA,length(lambda))
loglike.t2 <- rep(NA,length(lambda))

for (i in 1:length(lambda)){

m1 <- lme(y ~ x, random = ~1|date, correlation = corPagel(lambda[i],tree,fixed=TRUE))
loglike.t1[i] <- m1$logLik

u = gl(1,1,length(d$y))
m2 <- lme(y ~ x,random = list(u = pdIdent(form=~factor(date)-1)), correlation = corPagel(lambda[i],tree,fixed=TRUE))
loglike.t2[i] <- m2$logLik

}

The two curves are completely different:

With t2, we obtain a reasonable curve, with a maximum at the previously estimated lambda value.

With t1, the curve looks really strange: there is a discontinuity at the origin, i.e., for lambda=0 the log-like value is higher than for lambda>0, and for lambda>0 the log-like is only increasing with lambda.

Thus a third question: why are these two profile log-likelihood curves different?

The final question is: in which technique can we believe?

I’m using the version 2.14.1 of R, 3.1-96 of nlme, and 3.0-1 of ape.

Best,
Rudolf


--
Simon Blomberg, BSc (Hons), PhD, MAppStat, AStat.
Lecturer and Consultant Statistician
School of Biological Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia
T: +61 7 3365 2506
email: S.Blomberg1_at_uq.edu.au
http://www.uq.edu.au/~uqsblomb/

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

Statistics is the grammar of science - Karl Pearson.

_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Reply via email to