Thanks Douglas for the detailed explanation; it was very helpful (I'll
definitely give lme4 a try as well).
For the record: I also found it helpful to play with model.matrix (e.g. p.
14ff and p.28 of the book by Jose Pinheiro and Douglas) to see how the
formulas for fixed and random effects are related to the Laird&Ware'82
formulation for mixed-effects models.

Now I have one "follow-up" question, about model selection:

>From your advice, I figure that the suggested procedure is to build a number
of models (with differences in fixed and/or random effects) and then
comparing them to select the most adequate one.
When using anova(model1, model2) to compare two models that differ in both
fixed and random effects, I get a warning message: "Fitted objects with
different fixed effects. REML comparisons are not meaningful. in:
anova.lme(model1,
model2)".

So apparently this means that anova only works for comparing models with
same fixed-effects component but different random effects, right?
So what is then the correct way to compare models with different fixed
and/or random effects?
For example, would it be correct to simply choose the model with the lowest
AIC/BIC? (this seems to be a better choice than selecting by logLikelihood
since AIC/BIC penalize higher #parameters, whereas likelihood does not).
Or is there another, better way?

Many thanks so far,
  Lukas


On 10/22/06, Douglas Bates <[EMAIL PROTECTED]> wrote:
>
> On 10/21/06, Lukas Rode <[EMAIL PROTECTED]> wrote:
> > Dear list,
> >
> > I'm trying to fit a multilevel (mixed-effects) model using the lme
> function
> > (package nlme) in R 2.4.0. As a mixed-effects newbie I'm neither sure
> about
> > the modeling nor the correct R syntax.
>
> You may also want to consider using the lmer function from package
> lme4.  It's a later version of lme for R and is generally faster than
> lme.  I'll show the syntax for both.
>
> > My data is structured as follows: For each subject, a quantity Y is
> measured
> > at a number (>= 2) of time points. Moreover, at time point 0
> ("baseline"), a
> > quantity X is measured for each subject (I am interested to see whether
> X,
> > or log(X), is a predictor of Y). I saw in some examples that
> time-invariant
> > predictors should be repeated for all rows of a subject, which is why I
> > copied the baseline value of X also to time points > 0.
>
> That's correct.  For both lme and lmer the data need to be in the
> 'long' form, also called the 'person-period' form.  You can reshape
> the data by hand, as you have done, or you can use the 'reshape'
> function or the more general capabilities in Hadley Wickham's reshape
> package.
>
> > The resulting data frame looks like this:
>
> > Grouped Data: Y ~ TIME | Subject
> > Y         TIME        Subject         X.Baseline
> > 9           0.0             1                1084
> > 7           3.7             1                1084
> > 11         0.0             2                7150
> > 8           9.2             2                7150
>
> > Intra-subject trajectories of Y very close to linear. I'd like to check
> > whether slope (and maybe also offset) of this line are (in part)
> predicted
> > by X.baseline.
>
> I would say that this question would be addressed as part of the
> fixed-effects specification.  You would be comparing a model with main
> effects for TIME and X.Baseline and their interaction to various
> submodels.  In the S language formula notation these fixed effects
> specifications are
>
> Y ~ TIME * X.Baseline  # main effects for TIME and X.Baseline and
> their interaction
> Y ~ TIME + X.Baseline + TIME:X.Baseline # equivalent to first model
> Y ~ TIME + X.Baseline # main effects but no interaction.  That is
> X.Baseline does not affect slope
> Y ~ TIME + TIME:X.Baseline # X.Baseline affects slope (wrt TIME) but not
> offset
> Y ~ TIME # X.Baseline has no effect on Y
>
> > Thus, I think the multilevel model specification should be as follows (i
> =
> > subject, j=measurement):
> > y_ij = \beta_i  + b_i * TIME_ij + \epsilon_ij,
> > with
> > b_i = \zeta_i0 + \zeta_i1 * X.Baseline
> > Is this correct?
>
> Well that model doesn't have a main effect for X.Baseline so you can't
> test for an effect for X.Baseline on the slope.
>
> > Now, I am completely unsure how to "translate" this into the syntax
> needed
> > by lme.
> > Is there any standard procedure on how to get from e.g. the
> Laird&Ware'82
> > matrix model notation to the lme input?
>
> I think that is more of a question of how to use the model formula
> language than a question about lme.  In lme the right hand side of the
> formula argument generates Laird&Ware's matrix X and the random
> specification generates the matrix Z.
>
> > And, in my case, is the correct model as follows, or am I wrong?
> > my.model <- lme( Y ~ TIME, my.data.frame, random=~X.Baseline | Subject )
> > [in case this is correct, it is by accident, since I made it up from
> some
> > examples I found without really understanding the translation from the
> model
> > formulation]
>
> Actually this model specification should throw an error but it doesn't.
>
> Estimates for the random effects in this model are not well defined
> because X.Baseline is constant within Subject.  Thus within Subject
> the Intercept term and the X.Baseline term are confounded.  I would
> begin with the model
>
> lme(Y ~ TIME * X.Baseline, my.data.frame, random = ~ 1|Subject)
>
> which is equivalent to
>
> lmer(Y ~ TIME * X.Baseline + (1|Subject), my.data.frame)
>
> Then examine the estimates for the fixed-effects coefficients and
> their standard errors to see what reductions of this model may be
> appropriate.
>
> > I'd greatly appreciate some advice or help.
>
> I hope this helps.
>

        [[alternative HTML version deleted]]

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to