Re: [R] lme X lmer results

2006-01-01 Thread John Maindonald
 From a quick look at the paper in the SAS proceedings, the simulations
seem limited to nested designs.  The major problems are with repeated
measures designs where the error structure is not compound symmetric,
which lme4 does not at present handle (unless I have missed something).
Such imbalance as was investigated was not a serious issue, at least for
the Kenward and Roger degree of freedom calculations.

The paper ends by commenting that research should continue.  What
may be even more important is to educate users to think carefully about
any df that they are presented with, and to be especially sceptical when
designs are not approximately balanced nested designs and/or there are
repeated measures error structures that are not compound symmetric.

It is also necessary to consider how well the analysis reflects matters
on which there may be existing good evidence. Suppose in Ronaldo's
case that he'd previously run a number of experiments with very similar
plots and observational units, and with comparable treatments and
outcome measures. If the plot 1 SD estimate (i.e., at the level of
experimental units) had never been larger than 0.01, with the SD for
observational units always in a range of 2 to 20, I'd take this as
licence to ignore the variance at the plot 1 level.  It would be nice
to be able to build in such prior information more formally, probably
via a modified version of mcmcsamp().

[Some people are never satisfied. You've written a great piece of
software, and users reward you by complaining that they want even
more!]

John Maindonald.


On 1 Jan 2006, at 10:00 PM, [EMAIL PROTECTED] wrote:

 From: Dave Atkins [EMAIL PROTECTED]
 Date: 1 January 2006 1:40:45 AM

 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] lme X lmer results



 Message: 18
 Date: Fri, 30 Dec 2005 12:51:59 -0600
 From: Douglas Bates [EMAIL PROTECTED]
 Subject: Re: [R] lme X lmer results
 To: John Maindonald [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Message-ID:
   [EMAIL PROTECTED]
 Content-Type: text/plain; charset=ISO-8859-1

 On 12/29/05, John Maindonald [EMAIL PROTECTED] wrote:

  Surely there is a correct denominator degrees of freedom if the design
  is balanced, as Ronaldo's design seems to be. Assuming that he has
  specified the design correctly to lme() and that lme() is getting the df
  right, the difference is between 2 df and 878 df.  If the t-statistic
  for the second level of Xvar had been 3.0 rather than 1.1, the difference
  would be between a p-value equal to 0.095 and 0.003.  In a design
  where there are 10 observations on each experimental unit, and all
  comparisons are at the level of experimental units or above, df for
  all comparisons will be inflated by a factor of at least 9.

 Doug Bates commented:

 I don't want to be obtuse and argumentative but I still am not
 convinced that there is a correct denominator degrees of freedom for
 _this_ F statistic.  I may be wrong about this but I think you are
 referring to an F statistic based on a denominator from a different
 error stratum, which is not what is being quoted.  (Those are not
 given because they don't generalize to unbalanced designs.)

 This is why I would like to see someone undertake a simulation study
 to compare various approaches to inference for the fixed effects terms
 in a mixed model, using realistic (i.e. unbalanced) examples.

 Doug--

 Here is a paper that focused on the various alternatives to
 denominator degrees of freedom in SAS and does report some
 simulation results:

 http://www2.sas.com/proceedings/sugi26/p262-26.pdf

 Not sure whether it argues convincingly one way or the other in the
 present discussion.

 cheers, Dave

 -- 
 Dave Atkins, PhD
 [EMAIL PROTECTED]


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] lme X lmer results

2005-12-31 Thread Dave Atkins

Message: 18
Date: Fri, 30 Dec 2005 12:51:59 -0600
From: Douglas Bates [EMAIL PROTECTED]
Subject: Re: [R] lme X lmer results
To: John Maindonald [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Message-ID:
[EMAIL PROTECTED]
Content-Type: text/plain; charset=ISO-8859-1

On 12/29/05, John Maindonald [EMAIL PROTECTED] wrote:

  Surely there is a correct denominator degrees of freedom if the design
  is balanced, as Ronaldo's design seems to be. Assuming that he has
  specified the design correctly to lme() and that lme() is getting the df
  right, the difference is between 2 df and 878 df.  If the t-statistic
  for the
  second level of Xvar had been 3.0 rather than 1.1, the difference
  would be between a p-value equal to 0.095 and 0.003.  In a design
  where there are 10 observations on each experimental unit, and all
  comparisons are at the level of experimental units or above, df for
  all comparisons will be inflated by a factor of at least 9.

Doug Bates commented:

I don't want to be obtuse and argumentative but I still am not
convinced that there is a correct denominator degrees of freedom for
_this_ F statistic.  I may be wrong about this but I think you are
referring to an F statistic based on a denominator from a different
error stratum, which is not what is being quoted.  (Those are not
given because they don't generalize to unbalanced designs.)

This is why I would like to see someone undertake a simulation study
to compare various approaches to inference for the fixed effects terms
in a mixed model, using realistic (i.e. unbalanced) examples.

Doug--

Here is a paper that focused on the various alternatives to denominator degrees 
of freedom in SAS and does report some simulation results:

http://www2.sas.com/proceedings/sugi26/p262-26.pdf

Not sure whether it argues convincingly one way or the other in the present 
discussion.

cheers, Dave

-- 
Dave Atkins, PhD
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] lme X lmer results

2005-12-31 Thread Peter Dalgaard
Dave Atkins [EMAIL PROTECTED] writes:

 Message: 18
 Date: Fri, 30 Dec 2005 12:51:59 -0600
 From: Douglas Bates [EMAIL PROTECTED]
 Subject: Re: [R] lme X lmer results
 To: John Maindonald [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Message-ID:
   [EMAIL PROTECTED]
 Content-Type: text/plain; charset=ISO-8859-1
 
 On 12/29/05, John Maindonald [EMAIL PROTECTED] wrote:
 
   Surely there is a correct denominator degrees of freedom if the design
   is balanced, as Ronaldo's design seems to be. Assuming that he has
   specified the design correctly to lme() and that lme() is getting the df
   right, the difference is between 2 df and 878 df.  If the t-statistic
   for the
   second level of Xvar had been 3.0 rather than 1.1, the difference
    would be between a p-value equal to 0.095 and 0.003.  In a design
   where there are 10 observations on each experimental unit, and all
   comparisons are at the level of experimental units or above, df for
   all comparisons will be inflated by a factor of at least 9.
 
 Doug Bates commented:
 
 I don't want to be obtuse and argumentative but I still am not
 convinced that there is a correct denominator degrees of freedom for
 _this_ F statistic.  I may be wrong about this but I think you are
 referring to an F statistic based on a denominator from a different
 error stratum, which is not what is being quoted.  (Those are not
 given because they don't generalize to unbalanced designs.)
 
 This is why I would like to see someone undertake a simulation study
 to compare various approaches to inference for the fixed effects terms
 in a mixed model, using realistic (i.e. unbalanced) examples.
 
 Doug--
 
 Here is a paper that focused on the various alternatives to denominator
 degrees of freedom in SAS and does report some simulation results:
 
 http://www2.sas.com/proceedings/sugi26/p262-26.pdf
 
 Not sure whether it argues convincingly one way or the other in the present 
 discussion.
 
 cheers, Dave

It's certainly informative. I was not quite aware that what SAS calls
Satterthwaite could perform so poorly relative to Kenward-Roger
(which to my mind is much closer in structure to a generalized
Satterthwaite approximation).

In either case I think it might be worth emphasizing that the issue is
not really whether you are testing at alpha=.050 or at alpha=.093 --
the corrections are likely to be so highly dependent on higher-order
moments of the normal distribution as to be wildly off in practice. The
important thing is that you get a low denominator DF value when you
are far from Asymptopia, and any good statistician should know better
than to trust such tests.

The nasty problem with the containment type methods is that they can
get things completely wrong and, e.g., give DFs on the order of the
number of observations where in truth (insofar as it exists) it should
be closer to the number of subjects in the study.

I don't think anyone, including Doug, would be opposed to including a
Kenward-Roger style DF calculation in lmer. It just has to be worked
out how to convert the calculations to work with the sparse-matrix,
penalized least squares techniques that it uses, and Doug himself has
his mind elsewhere. 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] lme X lmer results

2005-12-31 Thread John Maindonald
 contribute
equally to sd1^2, and sd1^2 usually tracks quite closely to a
chi-squared distribution.

John Maindonald.


On 31 Dec 2005, at 5:51 AM, Douglas Bates wrote:

 On 12/29/05, John Maindonald [EMAIL PROTECTED] wrote:
  Surely there is a correct denominator degrees of freedom if the design
  is balanced, as Ronaldo's design seems to be. Assuming that he has
  specified the design correctly to lme() and that lme() is getting the df
  right, the difference is between 2 df and 878 df.  If the t-statistic
  for the second level of Xvar had been 3.0 rather than 1.1, the difference
  would be between a p-value equal to 0.095 and 0.003.  In a design
  where there are 10 observations on each experimental unit, and all
  comparisons are at the level of experimental units or above, df for
  all comparisons will be inflated by a factor of at least 9.

 I don't want to be obtuse and argumentative but I still am not
 convinced that there is a correct denominator degrees of freedom for
 _this_ F statistic.  I may be wrong about this but I think you are
 referring to an F statistic based on a denominator from a different
 error stratum, which is not what is being quoted.  (Those are not
 given because they don't generalize to unbalanced designs.)

 This is why I would like to see someone undertake a simulation study
 to compare various approaches to inference for the fixed effects terms
 in a mixed model, using realistic (i.e. unbalanced) examples.

 It seems peculiar to me that the F statistics are being created from
 the ratios of mean squares for different terms to the _same_ mean
 square (actually a penalized sum of squares divided by the degrees of
 freedom) and the adjustment suggested to take into account the
 presence of the random effects is to change the denominator degrees of
 freedom.  I think the rationale for this is an attempt to generalize
 another approach (the use of error strata) even though it is not being
 used here.

 Rather than giving df that for the comparison(s) of interest may be
 highly inflated, I'd prefer to give no degrees of freedom at all, to
 encourage users to work out df for themselves if at all possible.
 If they are not able to do this, then mcmcsamp() is a good alternative,
 and may be the way to go in any case.  This has the further advantage
 of allowing assessments in cases where the relevant distribution is
 hard to get at. I'd think a warning is in order that the df are upper
 bounds, and may be grossly inflated.

 As I said, I am willing to change this if it is shown to be grossly
 inaccurate but please show me.

 Incidentally, does mcmcsamp() do its calculations pretty well
 independently of the lmer results?

 mcmcsamp starts from the parameter estimates when creating the chain
 but that is the extent to which it depends on the lmer results.

 John Maindonald.

 On 29 Dec 2005, at 10:00 PM, [EMAIL PROTECTED] wrote:

 From: Douglas Bates [EMAIL PROTECTED]
 Date: 29 December 2005 5:59:07 AM
 To: Ronaldo Reis-Jr. [EMAIL PROTECTED]
 Cc: R-Help r-help@stat.math.ethz.ch
 Subject: Re: [R] lme X lmer results


 On 12/26/05, Ronaldo Reis-Jr. [EMAIL PROTECTED] wrote:
 Hi,

 this is not a new doubt, but it is a doubt for which I can't find a good
 response.

 Look at this output:

 m.lme <- lme(Yvar~Xvar,random=~1|Plot1/Plot2/Plot3)

 anova(m.lme)
 numDF denDF  F-value p-value
 (Intercept)     1   860 210.2457  <.0001
 Xvar            1     2   1.2352  0.3821
 summary(m.lme)
 Linear mixed-effects model fit by REML
  Data: NULL
   AIC  BIClogLik
   5416.59 5445.256 -2702.295

 Random effects:
  Formula: ~1 | Plot1
 (Intercept)
 StdDev: 0.000745924

  Formula: ~1 | Plot2 %in% Plot1
 (Intercept)
 StdDev: 0.000158718

  Formula: ~1 | Plot3 %in% Plot2 %in% Plot1
 (Intercept) Residual
 StdDev: 0.000196583 5.216954

 Fixed effects: Yvar ~ Xvar
Value Std.Error  DF  t-value p-value
 (Intercept)    2.3545454 0.2487091 860 9.467066  0.0000
 XvarFactor2    0.3909091 0.3517278   2 1.111397  0.3821

 Number of Observations: 880
 Number of Groups:
  Plot1   Plot2 %in% Plot1
  4  8
Plot3 %in% Plot2 %in% Plot1
 20

 This is the correct result, the correct denDF for Xvar.

 I did the same using lmer.

 m.lmer <- lmer(Yvar~Xvar+(1|Plot1)+(1|Plot1:Plot2)+(1|Plot3))
 anova(m.lmer)
 Analysis of Variance Table
    Df Sum Sq Mean Sq  Denom F value Pr(>F)
 Xvar  1  33.62   33.62 878.00  1.2352 0.2667
 summary(m.lmer)
 Linear mixed-effects model fit by REML
 Formula: Yvar ~ Xvar + (1 | Plot1) + (1 | Plot1:Plot2) + (1 |  
 Plot3)
  AIC BIClogLik MLdeviance REMLdeviance
  5416.59 5445.27 -2702.295   5402.698  5404.59
 Random effects:
  GroupsNameVariance   Std.Dev.
  Plot3 (Intercept) 1.3608e-08 0.00011665
  Plot1:Plot2   (Intercept) 1.3608e-08 0.00011665
  Plot1 (Intercept) 1.3608e-08 0.00011665

Re: [R] lme X lmer results

2005-12-30 Thread Douglas Bates
On 12/29/05, John Maindonald [EMAIL PROTECTED] wrote:
 Surely there is a correct denominator degrees of freedom if the design
 is balanced, as Ronaldo's design seems to be. Assuming that he has
 specified the design correctly to lme() and that lme() is getting the df
 right, the difference is between 2 df and 878 df.  If the t-statistic
 for the
 second level of Xvar had been 3.0 rather than 1.1, the difference
 would be between a p-value equal to 0.095 and 0.003.  In a design
 where there are 10 observations on each experimental unit, and all
 comparisons are at the level of experimental units or above, df for
 all comparisons will be inflated by a factor of at least 9.

I don't want to be obtuse and argumentative but I still am not
convinced that there is a correct denominator degrees of freedom for
_this_ F statistic.  I may be wrong about this but I think you are
referring to an F statistic based on a denominator from a different
error stratum, which is not what is being quoted.  (Those are not
given because they don't generalize to unbalanced designs.)

This is why I would like to see someone undertake a simulation study
to compare various approaches to inference for the fixed effects terms
in a mixed model, using realistic (i.e. unbalanced) examples.

It seems peculiar to me that the F statistics are being created from
the ratios of mean squares for different terms to the _same_ mean
square (actually a penalized sum of squares divided by the degrees of
freedom) and the adjustment suggested to take into account the
presence of the random effects is to change the denominator degrees of
freedom.  I think the rationale for this is an attempt to generalize
another approach (the use of error strata) even though it is not being
used here.

 Rather than giving df that for the comparison(s) of interest may be
 highly inflated, I'd prefer to give no degrees of freedom at all, to
 encourage users to work out df for themselves if at all possible.
 If they are not able to do this, then mcmcsamp() is a good alternative,
 and may be the way to go in any case.  This has the further advantage
 of allowing assessments in cases where the relevant distribution is
 hard to get at. I'd think a warning is in order that the df are upper
 bounds, and may be grossly inflated.

As I said, I am willing to change this if it is shown to be grossly
inaccurate but please show me.

 Incidentally, does mcmcsamp() do its calculations pretty well
 independently of the lmer results?

mcmcsamp starts from the parameter estimates when creating the chain
but that is the extent to which it depends on the lmer results.

 John Maindonald.

 On 29 Dec 2005, at 10:00 PM, [EMAIL PROTECTED] wrote:

  From: Douglas Bates [EMAIL PROTECTED]
  Date: 29 December 2005 5:59:07 AM
  To: Ronaldo Reis-Jr. [EMAIL PROTECTED]
  Cc: R-Help r-help@stat.math.ethz.ch
  Subject: Re: [R] lme X lmer results
 
 
  On 12/26/05, Ronaldo Reis-Jr. [EMAIL PROTECTED] wrote:
  Hi,
 
  this is not a new doubt, but it is a doubt for which I can't find a good
  response.

  Look at this output:
 
  m.lme <- lme(Yvar~Xvar,random=~1|Plot1/Plot2/Plot3)
 
  anova(m.lme)
  numDF denDF  F-value p-value
  (Intercept)     1   860 210.2457  <.0001
  Xvar            1     2   1.2352  0.3821
  summary(m.lme)
  Linear mixed-effects model fit by REML
   Data: NULL
AIC  BIClogLik
5416.59 5445.256 -2702.295
 
  Random effects:
   Formula: ~1 | Plot1
  (Intercept)
  StdDev: 0.000745924
 
   Formula: ~1 | Plot2 %in% Plot1
  (Intercept)
  StdDev: 0.000158718
 
   Formula: ~1 | Plot3 %in% Plot2 %in% Plot1
  (Intercept) Residual
  StdDev: 0.000196583 5.216954
 
  Fixed effects: Yvar ~ Xvar
 Value Std.Error  DF  t-value p-value
  (Intercept)    2.3545454 0.2487091 860 9.467066  0.0000
  XvarFactor2    0.3909091 0.3517278   2 1.111397  0.3821
 
  Number of Observations: 880
  Number of Groups:
   Plot1   Plot2 %in% Plot1
   4  8
 Plot3 %in% Plot2 %in% Plot1
  20
 
  This is the correct result, the correct denDF for Xvar.

  I did the same using lmer.
 
  m.lmer <- lmer(Yvar~Xvar+(1|Plot1)+(1|Plot1:Plot2)+(1|Plot3))
  anova(m.lmer)
  Analysis of Variance Table
     Df Sum Sq Mean Sq  Denom F value Pr(>F)
  Xvar  1  33.62   33.62 878.00  1.2352 0.2667
  summary(m.lmer)
  Linear mixed-effects model fit by REML
  Formula: Yvar ~ Xvar + (1 | Plot1) + (1 | Plot1:Plot2) + (1 | Plot3)
   AIC BIClogLik MLdeviance REMLdeviance
   5416.59 5445.27 -2702.295   5402.698  5404.59
  Random effects:
   GroupsNameVariance   Std.Dev.
   Plot3 (Intercept) 1.3608e-08 0.00011665
   Plot1:Plot2   (Intercept) 1.3608e-08 0.00011665
   Plot1 (Intercept) 1.3608e-08 0.00011665
   Residual  2.7217e+01 5.21695390
  # of obs: 880, groups: Plot3, 20; Plot1:Plot2, 8; Plot1, 4

[R] lme X lmer results

2005-12-29 Thread John Maindonald
Surely there is a correct denominator degrees of freedom if the design
is balanced, as Ronaldo's design seems to be. Assuming that he has
specified the design correctly to lme() and that lme() is getting the df
right, the difference is between 2 df and 878 df.  If the t-statistic
for the second level of Xvar had been 3.0 rather than 1.1, the difference
would be between a p-value equal to 0.095 and 0.003.  In a design
where there are 10 observations on each experimental unit, and all
comparisons are at the level of experimental units or above, df for
all comparisons will be inflated by a factor of at least 9.
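
The df contrast can be checked directly in base R with pt(); using the
t-value 1.111397 from Ronaldo's lme output reproduces both quoted
p-values, and a hypothetical t of 3.0 shows how the gap widens:

```r
## Same t statistic, two candidate denominator df:
t.obs <- 1.111397              # XvarFactor2 t-value from Ronaldo's output
2 * pt(-abs(t.obs), df = 2)    # 2 df (lme):    ~0.3821
2 * pt(-abs(t.obs), df = 878)  # 878 df (lmer): ~0.2667
## With a t of 3.0 the choice of df is decisive:
2 * pt(-3.0, df = 2)           # ~0.095
2 * pt(-3.0, df = 878)         # ~0.003
```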

Rather than giving df that for the comparison(s) of interest may be
highly inflated, I'd prefer to give no degrees of freedom at all, to
encourage users to work out df for themselves if at all possible.
If they are not able to do this, then mcmcsamp() is a good alternative,
and may be the way to go in any case.  This has the further advantage
of allowing assessments in cases where the relevant distribution is
hard to get at. I'd think a warning is in order that the df are upper
bounds, and may be grossly inflated.

Incidentally, does mcmcsamp() do its calculations pretty well
independently of the lmer results?

John Maindonald.

On 29 Dec 2005, at 10:00 PM, [EMAIL PROTECTED] wrote:

 From: Douglas Bates [EMAIL PROTECTED]
 Date: 29 December 2005 5:59:07 AM
 To: Ronaldo Reis-Jr. [EMAIL PROTECTED]
 Cc: R-Help r-help@stat.math.ethz.ch
 Subject: Re: [R] lme X lmer results


 On 12/26/05, Ronaldo Reis-Jr. [EMAIL PROTECTED] wrote:
 Hi,

 this is not a new doubt, but it is a doubt for which I can't find a good
 response.

 Look at this output:

 m.lme <- lme(Yvar~Xvar,random=~1|Plot1/Plot2/Plot3)

 anova(m.lme)
 numDF denDF  F-value p-value
 (Intercept)     1   860 210.2457  <.0001
 Xvar            1     2   1.2352  0.3821
 summary(m.lme)
 Linear mixed-effects model fit by REML
  Data: NULL
   AIC  BIClogLik
   5416.59 5445.256 -2702.295

 Random effects:
  Formula: ~1 | Plot1
 (Intercept)
 StdDev: 0.000745924

  Formula: ~1 | Plot2 %in% Plot1
 (Intercept)
 StdDev: 0.000158718

  Formula: ~1 | Plot3 %in% Plot2 %in% Plot1
 (Intercept) Residual
 StdDev: 0.000196583 5.216954

 Fixed effects: Yvar ~ Xvar
Value Std.Error  DF  t-value p-value
 (Intercept)    2.3545454 0.2487091 860 9.467066  0.0000
 XvarFactor2    0.3909091 0.3517278   2 1.111397  0.3821

 Number of Observations: 880
 Number of Groups:
  Plot1   Plot2 %in% Plot1
  4  8
Plot3 %in% Plot2 %in% Plot1
 20

 This is the correct result, the correct denDF for Xvar.

 I did the same using lmer.

 m.lmer <- lmer(Yvar~Xvar+(1|Plot1)+(1|Plot1:Plot2)+(1|Plot3))
 anova(m.lmer)
 Analysis of Variance Table
    Df Sum Sq Mean Sq  Denom F value Pr(>F)
 Xvar  1  33.62   33.62 878.00  1.2352 0.2667
 summary(m.lmer)
 Linear mixed-effects model fit by REML
 Formula: Yvar ~ Xvar + (1 | Plot1) + (1 | Plot1:Plot2) + (1 | Plot3)
  AIC BIClogLik MLdeviance REMLdeviance
  5416.59 5445.27 -2702.295   5402.698  5404.59
 Random effects:
  GroupsNameVariance   Std.Dev.
  Plot3 (Intercept) 1.3608e-08 0.00011665
  Plot1:Plot2   (Intercept) 1.3608e-08 0.00011665
  Plot1 (Intercept) 1.3608e-08 0.00011665
  Residual  2.7217e+01 5.21695390
 # of obs: 880, groups: Plot3, 20; Plot1:Plot2, 8; Plot1, 4

 Fixed effects:
 Estimate Std. Error  DF t value Pr(>|t|)
 (Intercept)  2.35455    0.24871 878  9.4671   <2e-16 ***
 XvarFactor2  0.39091    0.35173 878  1.1114   0.2667
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Look at the wrong P value; I know that it is wrong because of the DF
 used. But in this case, the result is not correct: there is no difference
 between the result using random effects with lmer and a simple analysis
 with lm.

 You are assuming that there is a correct value of the denominator
 degrees of freedom.  I don't believe there is.  The statistic that is
 quoted there doesn't have exactly an F distribution so there is no
 correct degrees of freedom.

 One thing you can do with lmer is to form a Markov Chain Monte Carlo
 sample from the posterior distribution of the parameters so you can
 check to see whether the value of zero is in the middle of the
 distribution of XvarFactor2 or not.

 It would be possible for me to recreate in lmer the rules used in lme
 for calculating denominator degrees of freedom associated with terms
 of the random effects.  However, the class of models fit by lmer is
 larger than the class of models fit by lme (at least as far as the
 structure of the random-effects terms goes).  In particular lmer
 allows for random effects associated with crossed or partially crossed
 grouping factors and the rules for denominator degrees of freedom in
 lme only apply cleanly to nested grouping factors.

Re: [R] lme X lmer results

2005-12-28 Thread Douglas Bates
On 12/26/05, Ronaldo Reis-Jr. [EMAIL PROTECTED] wrote:
 Hi,

 this is not a new doubt, but it is a doubt for which I can't find a good response.

 Look at this output:

  m.lme <- lme(Yvar~Xvar,random=~1|Plot1/Plot2/Plot3)

  anova(m.lme)
 numDF denDF  F-value p-value
 (Intercept)     1   860 210.2457  <.0001
 Xvar            1     2   1.2352  0.3821
  summary(m.lme)
 Linear mixed-effects model fit by REML
  Data: NULL
   AIC  BIClogLik
   5416.59 5445.256 -2702.295

 Random effects:
  Formula: ~1 | Plot1
 (Intercept)
 StdDev: 0.000745924

  Formula: ~1 | Plot2 %in% Plot1
 (Intercept)
 StdDev: 0.000158718

  Formula: ~1 | Plot3 %in% Plot2 %in% Plot1
 (Intercept) Residual
 StdDev: 0.000196583 5.216954

 Fixed effects: Yvar ~ Xvar
Value Std.Error  DF  t-value p-value
 (Intercept)    2.3545454 0.2487091 860 9.467066  0.0000
 XvarFactor2    0.3909091 0.3517278   2 1.111397  0.3821

 Number of Observations: 880
 Number of Groups:
  Plot1   Plot2 %in% Plot1
  4  8
Plot3 %in% Plot2 %in% Plot1
 20

 This is the correct result, the correct denDF for Xvar.

 I did the same using lmer.

  m.lmer <- lmer(Yvar~Xvar+(1|Plot1)+(1|Plot1:Plot2)+(1|Plot3))
  anova(m.lmer)
 Analysis of Variance Table
    Df Sum Sq Mean Sq  Denom F value Pr(>F)
 Xvar  1  33.62   33.62 878.00  1.2352 0.2667
  summary(m.lmer)
 Linear mixed-effects model fit by REML
 Formula: Yvar ~ Xvar + (1 | Plot1) + (1 | Plot1:Plot2) + (1 | Plot3)
  AIC BIClogLik MLdeviance REMLdeviance
  5416.59 5445.27 -2702.295   5402.698  5404.59
 Random effects:
  GroupsNameVariance   Std.Dev.
  Plot3 (Intercept) 1.3608e-08 0.00011665
  Plot1:Plot2   (Intercept) 1.3608e-08 0.00011665
  Plot1 (Intercept) 1.3608e-08 0.00011665
  Residual  2.7217e+01 5.21695390
 # of obs: 880, groups: Plot3, 20; Plot1:Plot2, 8; Plot1, 4

 Fixed effects:
 Estimate Std. Error  DF t value Pr(>|t|)
 (Intercept)  2.35455    0.24871 878  9.4671   <2e-16 ***
 XvarFactor2  0.39091    0.35173 878  1.1114   0.2667
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Look at the wrong P value; I know that it is wrong because of the DF used. But in
 this case, the result is not correct: there is no difference between the result
 using random effects with lmer and a simple analysis with lm.

You are assuming that there is a correct value of the denominator
degrees of freedom.  I don't believe there is.  The statistic that is
quoted there doesn't have exactly an F distribution so there is no
correct degrees of freedom.

One thing you can do with lmer is to form a Markov Chain Monte Carlo
sample from the posterior distribution of the parameters so you can
check to see whether the value of zero is in the middle of the
distribution of XvarFactor2 or not.

It would be possible for me to recreate in lmer the rules used in lme
for calculating denominator degrees of freedom associated with terms
of the random effects.  However, the class of models fit by lmer is
larger than the class of models fit by lme (at least as far as the
structure of the random-effects terms goes).  In particular lmer
allows for random effects associated with crossed or partially crossed
grouping factors and the rules for denominator degrees of freedom in
lme only apply cleanly to nested grouping factors.  I would prefer to
have a set of rules that would apply to the general case.

Right now I would prefer to devote my time to other aspects of lmer -
in particular I am still working on code for generalized linear mixed
models using a supernodal Cholesky factorization.  I am willing to put
this aside and code up the rules for denominator degrees of freedom
with nested grouping factors BUT first I want someone to show me an
example demonstrating that there really is a problem.  The example
must show that the p-value calculated in the anova table or the
parameter estimates table for lmer is seriously wrong compared to an
empirical p-value - obtained from simulation under the null
distribution or through MCMC sampling or something like that.  Saying
that Software XYZ says there are n denominator d.f. and lmer says
there are m does NOT count as an example.  I will readily concede
that the denominator degrees of freedom reported by lmer are wrong but
so are the degrees of freedom reported by Software XYZ because there
is no right answer (in general - in a few simple balanced designs
there may be a right answer).
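
As an illustration of the kind of simulation Doug asks for, here is a
minimal base-R sketch (hypothetical data, not Ronaldo's; the design and
variance values are made up for illustration). Treatment varies only
between 4 plots, yet lm() tests it against the 878 residual df; under a
true null with non-negligible plot-level variance, the naive p-value
rejects at the 5% level far more often than 5% of the time:

```r
## 4 plots x 220 observations = 880 rows; treatment is applied to
## whole plots, so the effective replication is 4 plots, not 880 obs.
set.seed(1)
nplot <- 4; nobs <- 220
plot.id <- rep(seq_len(nplot), each = nobs)
trt <- factor(rep(c("A", "B"), each = 2 * nobs))   # plot-level treatment
reject <- replicate(500, {
  y <- rnorm(nplot, sd = 1)[plot.id] +  # plot effects (no treatment effect)
       rnorm(nplot * nobs, sd = 1)      # observation-level noise
  summary(lm(y ~ trt))$coefficients[2, 4] < 0.05   # naive 878-df p-value
})
mean(reject)   # empirical type I error rate, well above the nominal 0.05
```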


  m.lm <- lm(Yvar~Xvar)
 
  anova(m.lm)
 Analysis of Variance Table

 Response: Nadultos
 Df  Sum Sq Mean Sq F value Pr(>F)
 Xvar 133.633.6  1.2352 0.2667
 Residuals  878 23896.227.2
 
  summary(m.lm)

 Call:
 lm(formula = Yvar ~ Xvar)

 Residuals:
 Min  1Q  Median  3Q Max
 -2.7455 -2.3545 -1.7455  0.2545 69.6455

Re: [R] lme X lmer results

2005-12-28 Thread Douglas Bates
On 12/28/05, Douglas Bates [EMAIL PROTECTED] miswrote:
 It would be possible for me to recreate in lmer the rules used in lme
 for calculating denominator degrees of freedom associated with terms
 of the random effects.

I should have written "fixed effects", not "random effects", at the end
of that sentence.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] lme X lmer results

2005-12-26 Thread Ronaldo Reis-Jr.
Hi,

this is not a new doubt, but it is a doubt for which I can't find a good
response.

Look at this output:

 m.lme <- lme(Yvar~Xvar,random=~1|Plot1/Plot2/Plot3)

 anova(m.lme)
numDF denDF  F-value p-value
(Intercept)     1   860 210.2457  <.0001
Xvar            1     2   1.2352  0.3821
 summary(m.lme)
Linear mixed-effects model fit by REML
 Data: NULL 
  AIC  BIClogLik
  5416.59 5445.256 -2702.295

Random effects:
 Formula: ~1 | Plot1
(Intercept)
StdDev: 0.000745924

 Formula: ~1 | Plot2 %in% Plot1
(Intercept)
StdDev: 0.000158718

 Formula: ~1 | Plot3 %in% Plot2 %in% Plot1
(Intercept) Residual
StdDev: 0.000196583 5.216954

Fixed effects: Yvar ~ Xvar
   Value Std.Error  DF  t-value p-value
(Intercept)    2.3545454 0.2487091 860 9.467066  0.0000
XvarFactor2    0.3909091 0.3517278   2 1.111397  0.3821

Number of Observations: 880
Number of Groups: 
 Plot1   Plot2 %in% Plot1 
 4  8 
   Plot3 %in% Plot2 %in% Plot1 
20 

This is the correct result, the correct denDF for Xvar.

I did the same using lmer.

 m.lmer <- lmer(Yvar~Xvar+(1|Plot1)+(1|Plot1:Plot2)+(1|Plot3))
 anova(m.lmer)
Analysis of Variance Table
   Df Sum Sq Mean Sq  Denom F value Pr(>F)
Xvar  1  33.62   33.62 878.00  1.2352 0.2667
 summary(m.lmer)
Linear mixed-effects model fit by REML
Formula: Yvar ~ Xvar + (1 | Plot1) + (1 | Plot1:Plot2) + (1 | Plot3) 
 AIC BIClogLik MLdeviance REMLdeviance
 5416.59 5445.27 -2702.295   5402.698  5404.59
Random effects:
 GroupsNameVariance   Std.Dev.  
 Plot3 (Intercept) 1.3608e-08 0.00011665
 Plot1:Plot2   (Intercept) 1.3608e-08 0.00011665
 Plot1 (Intercept) 1.3608e-08 0.00011665
 Residual  2.7217e+01 5.21695390
# of obs: 880, groups: Plot3, 20; Plot1:Plot2, 8; Plot1, 4

Fixed effects:
Estimate Std. Error  DF t value Pr(>|t|)
(Intercept)  2.35455    0.24871 878  9.4671   <2e-16 ***
XvarFactor2  0.39091    0.35173 878  1.1114   0.2667
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Look at the wrong P value; I know that it is wrong because of the DF used. But in 
this case, the result is not correct: there is no difference between the result 
using random effects with lmer and a simple analysis with lm.

 m.lm <- lm(Yvar~Xvar)
 
 anova(m.lm)
Analysis of Variance Table

Response: Nadultos
Df  Sum Sq Mean Sq F value Pr(>F)
Xvar 133.633.6  1.2352 0.2667
Residuals  878 23896.227.2   
 
 summary(m.lm)

Call:
lm(formula = Yvar ~ Xvar)

Residuals:
Min  1Q  Median  3Q Max 
-2.7455 -2.3545 -1.7455  0.2545 69.6455 

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.3545     0.2487   9.467   <2e-16 ***
XvarFactor2  0.3909     0.3517   1.111    0.267
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 5.217 on 878 degrees of freedom
Multiple R-Squared: 0.001405,   Adjusted R-squared: 0.0002675 
F-statistic: 1.235 on 1 and 878 DF,  p-value: 0.2667 

I read the R News article about the use of the full DF in lmer, but I don't 
understand this use with a Gaussian error; I understand it with GLM data.

I need more explanations, please.

Thanks
Ronaldo
--
|   // | \\   [***]
|   ( õ   õ )  [Ronaldo Reis Júnior]
|  V  [UFV/DBA-Entomologia]
|/ \   [36570-000 Viçosa - MG  ]
|  /(.''`.)\  [Fone: 31-3899-4007 ]
|  /(: :'  :)\ [EMAIL PROTECTED]]
|/ (`. `'` ) \[ICQ#: 5692561 | LinuxUser#: 205366 ]
|( `-  )   [***]
|  _/   \_Powered by GNU/Debian Woody/Sarge

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html