Re: [R] Chi square value of anova(binomialglmnull, binomglmmod, test="Chisq")

Marc Schwartz Wed, 06 Jun 2012 08:47:46 -0700

On Jun 6, 2012, at 9:36 AM, peter dalgaard wrote:

> 
> On Jun 6, 2012, at 10:59 , lincoln wrote:
> 
>> 
>> David Winsemius wrote
>>> 
>>> This is making me think you really have multiple observations on the
>>> same individuals (and that persons make transitions from one state to
>>> another as a result of the passage of time). That needs a more complex
>>> analysis than "simple" logistic regression. You might consider posting
>>> a more complete description of the study on the SIG Mixed Effects
>>> mailing list.
>>> 
>>> -- 
>>> David.
>>> 
>> 
>> No, I haven't. Individuals are birds marked with a unique alphanumeric code
>> that gives me information on their gender (sometimes I have this
>> information, sometimes I don't) and their birth date (and therefore also
>> their age). There are no multiple observations of the same individual.
>> 
>> Anyway, I believe my main question has not been answered: when using
>> anova with test "Chisq" between two models, is the difference in deviance
>> between the two models interpretable as the Chi Square value and the
>> difference in df interpretable as the df of the Chi square test?
>> 
>> For instance, given:
>> 
>>> anova(mod4,update(mod4,~.-cohort),test="Chisq")
>> Analysis of Deviance Table
>> 
>> Model 1: site ~ cohort
>> Model 2: site ~ 1
>>   Resid. Df Resid. Dev Df Deviance P(>|Chi|)    
>> 1       993     1283.7                          
>> 2      1002     1368.2 -9  -84.554 2.002e-14 ***
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
>> 
>> Is 84.554 taken as the Chi square value, 9 as the df of the test and the
>> p-value depending on these two values?
> 
> That's the general mechanism, yes. (Whether the chi-square distribution holds 
> after variable selection is a more difficult issue. Frank Harrell might chime 
> in and remind us that there are books on that subject.)
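
To make that mechanism concrete, here is a minimal sketch that recomputes the p-value from the two numbers in the quoted table. It assumes nothing beyond the printed (rounded) deviance difference and df, so the result should be close to, but not necessarily identical in every digit to, the value that anova() printed:

```r
# Numbers copied from the analysis of deviance table quoted above
chisq <- 84.554   # absolute difference in residual deviance between the two models
df    <- 9        # difference in residual df (1002 - 993)

# Upper tail area of the chi-square distribution with 'df' degrees of freedom
p <- pchisq(chisq, df = df, lower.tail = FALSE)
p
```

The sign of the Df and Deviance columns only reflects the order in which the two models were passed to anova(); the test uses the absolute values.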




Frank might be busy with useR preparations for next week...

Quoting from Frank's book "Regression Modeling Strategies", page 58, in the 
context of variable selection, stepwise methods and stopping rules:

"The residual $\chi^2$ can be tested for significance (if one is able to forget 
that because of variable selection this statistic does not have a $\chi^2$ 
distribution), or the stopping rule can be based on Akaike's information 
criterion (AIC), here residual $\chi^2$ - 2 x d.f. Of course, use of more 
insight from knowledge of the subject matter will generally improve the 
modeling process substantially. It must be remembered that no currently 
available stopping rule was developed for data driven variable selection. 
Stopping rules such as AIC or Mallows' $C_p$ are intended for comparing only 
two \emph{prespecified} models."


The whole of chapter 4 discusses these issues in more detail and, as Peter
notes, there are other books and papers that focus on the underlying issue of
variable selection. As Frank is oft-quoted as saying:

"Variable selection is hazardous both to inference and to prediction. There is 
no free lunch; we are torturing data to confess its own sins."


Going back to Lincoln's prior post in the thread, presuming that there is 
sufficient data to use the original pre-specified model and also that the 
original full model itself was not derived from prior variable selection or 
univariate pre-screening:

  mod1 <- glm(site ~ sex + birth + cohort + sex:birth,
              data = datasex, family = binomial)

I would recommend reviewing the sequential likelihood ratio tests for that
model, which add each term in turn starting from the null model:

  anova(mod1, test = "Chisq")

and determine whether or not 'cohort' was significant at some level there, 
rather than in the final reduced model. You might also want to consider using 
some of the tools in Frank's rms package on CRAN to further evaluate/validate 
that model.
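
As a rough sketch of what that looks like in practice, the code below fits the full model to simulated data and then tests 'cohort' given the other terms, both via anova() and by hand. The data frame here is a made-up stand-in for 'datasex' (the variable types, number of cohorts and effect sizes are all assumptions, since the real data were not posted), and 'mod0' is a name introduced only for this illustration:

```r
# Simulated stand-in for 'datasex'; every value here is invented for illustration
set.seed(1)
n <- 500
datasex <- data.frame(
  sex    = factor(sample(c("F", "M"), n, replace = TRUE)),
  birth  = rnorm(n),
  cohort = factor(sample(2000:2004, n, replace = TRUE))
)
datasex$site <- rbinom(n, 1, plogis(-0.5 + 0.4 * (datasex$cohort == "2003")))

# Pre-specified full model, as in the post
mod1 <- glm(site ~ sex + birth + cohort + sex:birth,
            data = datasex, family = binomial)

# Sequential analysis of deviance: terms are added in the order they are fitted
anova(mod1, test = "Chisq")

# Likelihood ratio test for 'cohort' given all the other terms
mod0 <- update(mod1, . ~ . - cohort)
tab  <- anova(mod0, mod1, test = "Chisq")
tab

# The same test by hand: difference in deviance on the difference in residual df
chisq <- deviance(mod0) - deviance(mod1)
df    <- df.residual(mod0) - df.residual(mod1)
p     <- pchisq(chisq, df = df, lower.tail = FALSE)
```

The hand-computed chisq, df and p should match the second row of 'tab', which is the same mechanism Lincoln asked about. Note that in the sequential table the test for each term depends on the order of the terms in the formula.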

Regards,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
