Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-13 Thread Rolf Turner


Thanks very much for your long, detailed, patient, and lucid
response to my cri de coeur.  That helps a *great* deal.

I'm not sure that I have a solid understanding of the issues yet
--- I never am! --- but I think I'm getting there.  I'll need to
chew over the posting a bit more and try some examples before
I feel comfortable.

One point that I'd like to spell out very explicitly, to make sure
that I'm starting from the right square:

The model that you start with in the Machines/Workers example is

fm1 - lmer(score ~ Machine + (1|Worker) + (1|Machine:Worker),  
Machines)



My understanding is that this is the ``only'' model that could be fitted
by the Package-Which-Must-Not-Be-Named.  I.e. this package *could  
not* fit

the second, more complex model:


fm2 - lmer(score ~ Machine + (Machine|Worker), Machines)


(At least not directly.)  Can you (or someone) confirm that I've got  
that

right?

It seems to me to be the key to why I've had such trouble in the past
in grappling with mixed models in R.  I.e. I've been thinking like
the Package-Which-Must-Not-Be-Named --- that the simple, everything- 
independent

model was the only possible model.

Thanks again.

cheers,

Rolf

##
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-13 Thread Kingsford Jones
Hi Rolf,

On Tue, May 13, 2008 at 1:59 PM, Rolf Turner [EMAIL PROTECTED] wrote:

 in response to Doug Bates' useful tutorial...

  Thanks very much for your long, detailed, patient, and lucid
  response to my cri de coeur.  That helps a *great* deal.


Hear Hear!

snip
  One point that I'd like to spell out very explicitly, to make sure
  that I'm starting from the right square:

  The model that you start with in the Machines/Workers example is



 
   fm1 - lmer(score ~ Machine + (1|Worker) + (1|Machine:Worker), Machines)
  
 


  My understanding is that this is the ``only'' model that could be fitted
  by the Package-Which-Must-Not-Be-Named.  I.e. this package *could not* fit
  the second, more complex model:



 
   fm2 - lmer(score ~ Machine + (Machine|Worker), Machines)
  
 

  (At least not directly.)  Can you (or someone) confirm that I've got that
  right?

Compare:

## R
 m4
Linear mixed model fit by REML
Formula: score ~ Machine + (0 + Machine | Worker)
   Data: Machines
   AIC   BIC logLik deviance REMLdev
 228.3 248.2 -104.2216.6   208.3
Random effects:
 Groups   Name Variance Std.Dev. Corr
 Worker   MachineA 16.64098 4.07934
  MachineB 74.39558 8.62529  0.803
  MachineC 19.26646 4.38936  0.623 0.771
 Residual   0.92463 0.96158
Number of obs: 54, groups: Worker, 6

## The-Package

proc mixed data = machines;
class worker machine;
model score = machine  / solution;
random machine / subject = worker type = un;
run;

 Covariance Parameter Estimates

Cov Parm SubjectEstimate

UN(1,1)  Worker  16.6405
UN(2,1)  Worker  28.2447
UN(2,2)  Worker  74.3956
UN(3,1)  Worker  11.1465
UN(3,2)  Worker  29.1841
UN(3,3)  Worker  19.2675
Residual  0.9246


The two outputs report essentially the same thing.
Note that e.g, UN(2,1) = 28.2477 approx= .803*4.07934*8.62529
(And, as usual, the fixed effects estimates match too once the
contrasts and 'types' of SS for an ANOVA table are set up)

UN is short for 'unstructured' - a term Doug has pointed out is not
particularly fitting because the covariance matrix is symmetric
positive definite.


  It seems to me to be the key to why I've had such trouble in the past
  in grappling with mixed models in R.  I.e. I've been thinking like
  the Package-Which-Must-Not-Be-Named --- that the simple,
 everything-independent
  model was the only possible model.


Although this may well not apply to you, another area of confusion
arises not so much from differences between stats packages but by
differences between methods. I'm not an expert in the estimation
methods but, as I understand it, classic texts describe fitting mixed
models in terms of ANOVA in the OLS framework, calculating method of
moments estimators for the variances of the random effects by equating
observed and expected mean squares (I believe using aov and lm with an
'Error' term would fall into this category, and proc anova and proc
glm would also).  Starting in the 90's these methods started falling
out of fashion in favor of ML/REML/GLS methods (likelihood based),
which offer more flexibility in structuring both the error and random
effects covariance matrices, will not produce negative variance
estimates, and have other nice properties that someone more 'mathy'
than me could explain.  Tools like lme, lmer, proc mixed and proc
glimmix fall into this category.

hoping this helps,

Kingsford Jones







  Thanks again.

 cheers,

 Rolf



  ##
  Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Ken Beath

On 12/05/2008, at 4:52 AM, Federico Calboli wrote:


On 10 May 2008, at 07:36, Kingsford Jones wrote:

Federico,

I think you'll be more likely to receive the type of response you're
looking for if you formulate your question more clearly.  The
inclusion of commented, minimal, self-contained, reproducible code
(as is requested at the bottom of every email sent by r-help) is an
effective way to clarify the issues.  Also, when asking a question
about fitting a model it's helpful to describe the specific research
questions you want the model to answer.


snip

I apprecciate that my description of the *full* model is not 100%  
clear, but my main beef was another.


The main point of my question is, having a 3 way anova (or ancova,  
if you prefer), with *no* nesting, 2 fixed effects and 1 random  
effect, why is it so boneheaded difficult to specify a bog standard  
fully crossed model? I'm not talking about some rarified esoteric  
model here, we're talking about stuff tought in a first year Biology  
Stats course here[1].


Now, to avoid any chances of being misunderstood in my use of the  
words 'fully crossed model', what I mean is a simple


y ~ effect1 * effect2 * effect3

with effect3 being random (all all the jazz that comes from this  
fact). I fully apprecciate that the only reasonable F-tests would be  
for effect1, effect2 and effect1:effect2, but there is no way I can  
use lme to specify such simple thing without getting the *wrong*  
denDF. I need light on this topic and I'd say it's a general enough  
question not to need much more handholding than this.




There is only one random effect, so where does the crossing come  
from ? The fixed effects vary across blocks, but they are fixed so are  
just covariates. For this type of data the usual model in lme4 is  
y~fixed1+fixed2+1|group and for lme split into fixed and random parts.



Having said that, I did look at the mixed-effects mailing list  
before posting here, and it looks like it was *not* the right place  
to post anyway:


'This mailing list is primarily for useRs and programmeRs interested  
in *development* and beta-testing of the lme4 package.'


although the R-Me is now CC'd in this.

I fully apprecciate that R is developed for love, not money, and if  
I knew how to write an user friendly frontend for nlme and lme4 (and  
I knew how to actually get the model I want) I'd be pretty happy to  
do so and submit it as a library. In any case, I feel my complaint  
is pefectly valid, because specifying such basic model should  
ideally not such a chore, and I think the powers that be might  
actually find some use from user feedback.




The problems seems to be that you want lme to work in the same way as  
an ANOVA table and it doesn't. The secret with lme and lme4 is to  
think about the structure of the data and describe with an equation.  
Then each term in the equation corresponds to part of the model  
definition in R.



Once I have sorted how to specify such trivial model I'll face the  
horror of the nesting, in any case I attach a toy dataset I created  
especially to test how to specify the correct model (silly me).




I'm a bit lost with your data file, it has 4 covariates, which is more  
than enough for 2 fixed effects, assuming block is the grouping and y  
the outcome.


Ken


Best,

Federico Calboli

[1] So much bog standard that the Zar, IV ed, gives a nice table of  
how to compute the F-tests correctly, taking into account that one  
of the 3  effects is randon (I'll send the exact page and table  
number tomorrow, I don't have the book at home).


testdat.txt
--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



___
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Federico Calboli

On 12 May 2008, at 10:05, Ken Beath wrote:
There is only one random effect, so where does the crossing come  
from ? The fixed effects vary across blocks, but they are fixed so  
are just covariates. For this type of data the usual model in lme4  
is y~fixed1+fixed2+1|group and for lme split into fixed and random  
parts.


First off, whoa, an helpful reply! thanks for that, I hope I won't  
sound sarcastic or aggressive because I do not mean to be either.


Regarding your comment, the experiment was replicated three times, in  
3 different months. I would argue that for the fixed effects to be  
meaningful, they must have an effect over an above the effect:month  
interaction (because each fixed effect, and their interaction, might  
vary between each replicate). I would then argue I need to calculate


1) fixed.effect1:random.effect
2) fixed.effect2:random.effect
3) fixed.effect1:fixed.effect2:random.effect

to test if fixed.effect1 is meaningful (and use 1) as the error); if  
fixed.effect2 has is meaningful (and use 2) as the error);  
fixed.effect1:fixed.effect2 is meaningful (and use 3) as the error).


I'm happy to be correct if I am wrong here.

The problems seems to be that you want lme to work in the same way  
as an ANOVA table and it doesn't. The secret with lme and lme4 is  
to think about the structure of the data and describe with an  
equation. Then each term in the equation corresponds to part of  
the model definition in R.


I'll try to do that.



Once I have sorted how to specify such trivial model I'll face the  
horror of the nesting, in any case I attach a toy dataset I  
created especially to test how to specify the correct model (silly  
me).




I'm a bit lost with your data file, it has 4 covariates, which is  
more than enough for 2 fixed effects, assuming block is the  
grouping and y the outcome.


In the data file, 'selection' and 'males' are fixed effects, and  
'month' is the effect I am using for the model we are discussing  
here. The y was generatde with runif() just to have something, I'm  
not expecting any intersting result, just to understand how to fit  
the right model.


In the dataset 'line' is nested within 'selection' and 'block' is  
nested within 'month'. That's the nesting I will have to take into  
account once I get the more straightforward (sic!) model we're  
discussing right.


Best,

Federico


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Nick Isaac
I *think* the syntax for the model Federico wants is this:

lmer(y~selection*males+ (selection|month) + (males|month))

My lme syntax is a bit rusty, so I'm not confident how to recode with nested
random effects, as in PB p24.


Two quick points:
1. I think Federico has caused some confusion on account of the way he used
the term 'crossing'2. Whilst interesting reading, some of the content of
this thread does not conform to the list's rules and regs.

Best wishes, Nick

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Doran, Harold
   But that avoids the question as to *why* it isn't very well
   set up for crossed random effects?  What's the problem?
   What are the issues?  The model is indeed bog-standard.
   It would seem not unreasonable to expect that it could be
   fitted in a straightforward manner, and it is irritating to
   find that it cannot be.  If SAS and Minitab can do it at
   the touch of a button, why can't R do it?

I haven't followed this thread carefully, so apologies if I'm too off
base. But, in response to Rolf's questions/issues. First, SAS cannot
handle models with crossed random effects (at least well at all). SAS is
horribly incapable of handling even the simplest of models (especially
generalized linear mixed models). I can cite numerous (recent) examples
of SAS coming to a complete halt (proc nlmixed) for an analyses we were
recently working on. R (and Ubuntu) was the only solution to our
problem.

Now, lme is not optimized for crossed random effects, but lmer is. That
is why lmer is supported and lme is not really supported much. lmer is
optimized for models with nested random effects and crossed random
effects. 

When working with models with nested random effects, and software
optimized for those problems (e.g., HLM, SAS, mlWin) the
variance/covariance matrix forms a special, and simple structure that
can be easily worked with. This is not the case for models with crossed
random effects.

Software packages designed for nested random effects can be tricked into
handling models with crossed random effects, but this kludge is slow and
really inefficient.

If you want complete transparency into the why and how, here is a
citation for your review.

Best
Harold

@article{Doran:Bates:Bliese:Dowling:2007:JSSOBK:v20i02,
  author =  Harold  Doran and Douglas  Bates and Paul  Bliese and
Maritza   Dowling,
  title =   Estimating the Multilevel Rasch Model: With the lme4
Package,
  journal = Journal of Statistical Software,
  volume =  20,
  number =  2,
  pages =   1--18,
  day = 22,
  month =   2,
  year =2007,
  CODEN =   JSSOBK,
  ISSN =1548-7660,
  bibdate = 2007-02-22,
  URL = http://www.jstatsoft.org/v20/i02;,
  accepted =2007-02-22,
  acknowledgement = ,
  keywords =,
  submitted =   2006-10-01,
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Federico Calboli

On 12 May 2008, at 12:21, Nick Isaac wrote:



I *think* the syntax for the model Federico wants is this:

lmer(y~selection*males+ (selection|month) + (males|month))


I'll try and check against some back of the envelope calculations -- 
as I said, the model is, per se, nothing really new, and my data is  
fully balanced.


My lme syntax is a bit rusty, so I'm not confident how to recode  
with nested random effects, as in PB p24.



Two quick points:
1. I think Federico has caused some confusion on account of the way  
he used the term 'crossing'


Sorry about that, I'll try to avoid causing such confusion in the  
future.


Cheers,

/F

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Federico Calboli

On 12 May 2008, at 14:37, Doran, Harold wrote:


I haven't followed this thread carefully, so apologies if I'm too off
base. But, in response to Rolf's questions/issues. First, SAS cannot
handle models with crossed random effects (at least well at all).  
SAS is

horribly incapable of handling even the simplest of models (especially
generalized linear mixed models). I can cite numerous (recent)  
examples
of SAS coming to a complete halt (proc nlmixed) for an analyses we  
were

recently working on. R (and Ubuntu) was the only solution to our
problem


First off, let's keep SAS out of this. I never used it, never wanted  
to use it and did not mention anywhere I wanted to get SAS-like  
results! Although, seeing how easily it creeps up, I can sympathise  
with those who have strog feelings about it! [for those with strong  
feelings about me, this is meant to be something joke-like]


Now, lme is not optimized for crossed random effects, but lmer is.  
That

is why lmer is supported and lme is not really supported much. lmer is
optimized for models with nested random effects and crossed random
effects.

When working with models with nested random effects, and software
optimized for those problems (e.g., HLM, SAS, mlWin) the
variance/covariance matrix forms a special, and simple structure that
can be easily worked with. This is not the case for models with  
crossed

random effects.

Software packages designed for nested random effects can be tricked  
into
handling models with crossed random effects, but this kludge is  
slow and

really inefficient.

If you want complete transparency into the why and how, here is a
citation for your review.


Thank you very much. I'll read the paper and hopefully get the  
answers I was looking for.


Best,

Federico




Best
Harold

@article{Doran:Bates:Bliese:Dowling:2007:JSSOBK:v20i02,
  author =  Harold  Doran and Douglas  Bates and Paul  Bliese and
Maritza   Dowling,
  title =   Estimating the Multilevel Rasch Model: With the lme4
Package,
  journal = Journal of Statistical Software,
  volume =  20,
  number =  2,
  pages =   1--18,
  day = 22,
  month =   2,
  year =2007,
  CODEN =   JSSOBK,
  ISSN =1548-7660,
  bibdate = 2007-02-22,
  URL = http://www.jstatsoft.org/v20/i02;,
  accepted =2007-02-22,
  acknowledgement = ,
  keywords =,
  submitted =   2006-10-01,
}

___
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Douglas Bates
I'm entering this discussion late so I may be discussing issues that
have already been addressed.

As I understand it, Federico, you began by describing a model for data
in which two factors have a fixed set of levels and one factor has an
extensible, or random, set of levels and you wanted to fit a model
that you described as

y ~ effect1 * effect2 * effect3

The problem is that this specification is not complete.  An
interaction of factors with fixed levels and a factor with random
levels can mean, in the lmer specification,

lmer(y ~ effect1 * effect2 + (1| effect3) + (1|effect1:effect2:effect3), ...)

or

lmer(y ~ effect1 * effect2 + (effect1*effect2 | effect3), ...)

or other variations.  When you specify a random effect or an random
interaction term you must, either explicitly or implicitly, specify
the form of the variance-covariance matrix associated with those
random effects.

The advantage that other software may provide for you is that it
chooses the model for you but that, of course, means that you only
have the one choice.

If you can describe how many variance components you think should be
estimated in your model and what they would represent then I think it
will be easier to describe how to fit the model.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Federico Calboli

On 12 May 2008, at 17:09, Douglas Bates wrote:


I'm entering this discussion late so I may be discussing issues that
have already been addressed.

As I understand it, Federico, you began by describing a model for data
in which two factors have a fixed set of levels and one factor has an
extensible, or random, set of levels and you wanted to fit a model
that you described as

y ~ effect1 * effect2 * effect3

The problem is that this specification is not complete.


My apologies for that, I thought that the above formula was the  
shorthand for what I would call the 'full' model, i.e. the single  
factors and the 2 and 3 ways interactions.

An
interaction of factors with fixed levels and a factor with random
levels can mean, in the lmer specification,

lmer(y ~ effect1 * effect2 + (1| effect3) + (1| 
effect1:effect2:effect3), ...)


or

lmer(y ~ effect1 * effect2 + (effect1*effect2 | effect3), ...)

or other variations.  When you specify a random effect or an random
interaction term you must, either explicitly or implicitly, specify
the form of the variance-covariance matrix associated with those
random effects.


I'll play around with this and see what I can get.


The advantage that other software may provide for you is that it
chooses the model for you but that, of course, means that you only
have the one choice.


I'm more than happy to stick to R, and to put more legwork into my  
models


If you can describe how many variance components you think should be
estimated in your model and what they would represent then I think it
will be easier to describe how to fit the model.


I'll work on that. Incidentally, what/where is the most comprehensive  
and up to date documentation for lme4? the pdfs coming with the  
package? I suspect knowing which are the right docs will help a lot  
in keeping me within the boundaries of civility and prevent me from  
annoying anyone (which is not something I sent forth to do on purpose).


Best regards,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Douglas Bates
On Mon, May 12, 2008 at 11:22 AM, Federico Calboli
[EMAIL PROTECTED] wrote:
 On 12 May 2008, at 17:09, Douglas Bates wrote:

 I'm entering this discussion late so I may be discussing issues that
 have already been addressed.

 As I understand it, Federico, you began by describing a model for data
 in which two factors have a fixed set of levels and one factor has an
 extensible, or random, set of levels and you wanted to fit a model
 that you described as

 y ~ effect1 * effect2 * effect3

 The problem is that this specification is not complete.

 My apologies for that, I thought that the above formula was the shorthand
 for what I would call the 'full' model, i.e. the single factors and the 2
 and 3 ways interactions.

As I indicated, the trick is that the interaction of a fixed factor
and a random factor can be defined in more than one way.

It sounds as if what you want is

lmer(y ~ factor1 * factor2 + (1|factor3) + (1|factor1:factor3) +
(1|factor2:factor3) + (1|factor1:factor2:factor3), ...)

but I'm not sure.

 An interaction of factors with fixed levels and a factor with random
 levels can mean, in the lmer specification,

 lmer(y ~ effect1 * effect2 + (1| effect3) + (1|effect1:effect2:effect3),
 ...)

 or

 lmer(y ~ effect1 * effect2 + (effect1*effect2 | effect3), ...)

 or other variations.  When you specify a random effect or an random
 interaction term you must, either explicitly or implicitly, specify
 the form of the variance-covariance matrix associated with those
 random effects.

 I'll play around with this and see what I can get.

 The advantage that other software may provide for you is that it
 chooses the model for you but that, of course, means that you only
 have the one choice.

 I'm more than happy to stick to R, and to put more legwork into my models

 If you can describe how many variance components you think should be
 estimated in your model and what they would represent then I think it
 will be easier to describe how to fit the model.

 I'll work on that. Incidentally, what/where is the most comprehensive and up
 to date documentation for lme4? the pdfs coming with the package? I suspect
 knowing which are the right docs will help a lot in keeping me within the
 boundaries of civility and prevent me from annoying anyone (which is not
 something I sent forth to do on purpose).

Documentation for lme4 is pretty sketchy at present.  I hope to remedy
that during our summer break.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-12 Thread Rolf Turner


On 13/05/2008, at 4:09 AM, Douglas Bates wrote:


I'm entering this discussion late so I may be discussing issues that
have already been addressed.

As I understand it, Federico, you began by describing a model for data
in which two factors have a fixed set of levels and one factor has an
extensible, or random, set of levels and you wanted to fit a model
that you described as

y ~ effect1 * effect2 * effect3

The problem is that this specification is not complete.


At *last* (as Owl said to Rabbit) we're getting somewhere!!!

I always knew that there was some basic fundamental point
about this business that I (and I believe many others) were
simply missing.  But I could not for the life of me get anyone
to explain to me what that point was.  Or to put it another
way, I was never able to frame a question that would illuminate
just what it was that I wasn't getting.

I now may be at a stage where I can start asking the right
questions.


An interaction of factors with fixed levels and a factor with random
levels can mean, in the lmer specification,

lmer(y ~ effect1 * effect2 + (1| effect3) + (1| 
effect1:effect2:effect3), ...)


or

lmer(y ~ effect1 * effect2 + (effect1*effect2 | effect3), ...)

or other variations.  When you specify a random effect or an random
interaction term you must, either explicitly or implicitly, specify
the form of the variance-covariance matrix associated with those
random effects.

The advantage that other software may provide for you is that it
chooses the model for you but that, of course, means that you only
have the one choice.


Now may I start asking what I hope are questions that will lift
the fog a bit?

Let us for specificity consider a three-way model with two
fixed effects and one random effect from the good old Rothamstead style
agricultural experiment context:  Suppose we have a number of
species/breeds of wheat (say) and a number of fertilizers.
These are fixed effects.  And we have a number of fields (blocks)
--- a random effect.  Each breed-fertilizer combination is
applied a number of times in each field.  We ***assume*** that
that the field or block effect is homogeneous throughout.  This
may or may not be a ``good'' assumption, but it's not completely
ridiculous and would often be made in practice.  And probably
*was* made at Rothamstead.  The response would be something like
yield in bushels per acre.

The way that I would write the ``full'' model for this setting,
in mathematical style is:

Y_ijkl = mu + alpha_i + beta_j + (alpha.beta)_ij + C_k + (alpha.C)_ik
+ (beta.C)_jk + (alpha.beta.C)_ijk + E_ijkl

	The alpha_i and beta_j are parameters corresponding to breed and  
fertilizer
	respectively; the C_k are random effects corresponding to fields or  
blocks.

Any effect ``involving'' C is also random.

	The assumptions made by the Package-Which-Must-Not-Be-Named are (I  
think)

that

C_k ~ N(0,sigma_C^2)
(alpha.C)_ik ~ N(0,sigma_aC^2)
(beta.C)jk ~ N(0,sigma_bC^2)
(alpha.beta.C)_ijk ~ N(0,sigma_abC^2)
E_ijkl ~ N(0,sigma^2)

and these random variables are *all independent*.

A ... perhaps I'm on the way to answering my own question.  Is
it this assumption of ``all independent'' which is questionable?  It
seemed innocent enough when I first learned about this stuff, lo these
many years ago.  But  maybe not!

To start with:  What would be the lmer syntax to fit the foregoing
(possibly naive) model?  I am sorry, but I really cannot get my head
around the syntax of lmer model specification, and I've tried.  I
really have.  Hard.  I know I must be starting from the wrong place,
but I haven't a clue as to what the *right* place to start from is.
And if I'm in that boat, I will wager Euros to pretzels that there
are others in it.  I know that I'm not the brightest bulb in the
chandelier, but I'm not the dullest either.

Having got there:  Presuming that I'm more-or-less on the right track
	in my foregoing conjecture that it's the over-simple dependence  
structure
	that is the problem with what's delivered by the Package-Which-Must- 
Not-Be-Named,
	how might one go about being less simple-minded?  I.e. what might be  
some
	more realistic dependence structures, and how would one specify  
these in lmer?

And how would one assess whether the assumed dependence structure gives
a reasonable fit to the data?


If you can describe how many variance components you think should be
estimated in your model and what they would represent then I think it
will be easier to describe how to fit the model.


How