Re: [R] Goodness of fit of binary logistic model

Frank Harrell Sat, 06 Aug 2011 07:28:49 -0700

Exactly right Peter.  Thanks.

There should be some way for me to detect such situations so that the test
does not report an impressive-looking P-value.  Ideas welcomed!


This is a great example of why users should post a toy example in the first
posting: we can immediately see that this model MUST fit the data, so
that any evidence for lack of fit has to be misleading.
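
A minimal sketch of why the model must fit, assuming base R's glm() in
place of lrm() (a substitution made here so that no extra package is needed;
the variable names are illustrative). With a single binary predictor the
model has two parameters for two cells, so the fitted probabilities equal the
observed proportions, and the sum of squared errors coincides with its
expectation under the fitted model:

```r
# Simulate data along the lines of the quoted example.
set.seed(123)
x <- rbinom(1000, 1, 0.5)
p_true <- plogis(-1.32 + 1.36 * x)
y <- as.integer(runif(1000) < p_true)

# Logistic regression with one binary predictor: two parameters, two cells.
fit <- glm(y ~ factor(x), family = binomial)
p_hat <- fitted(fit)

# Fitted probabilities reproduce the observed proportion of y = 1 per group.
obs_prop <- tapply(y, x, mean)
fit_prop <- tapply(p_hat, x, mean)
print(rbind(observed = obs_prop, fitted = fit_prop))

# Unweighted sum of squared errors and its expectation under the fitted model.
sse <- sum((y - p_hat)^2)
expected_sse <- sum(p_hat * (1 - p_hat))
print(sse - expected_sse)  # zero up to numerical error
```

Because the difference in the last line is zero apart from numerical error,
a test statistic built on it has nothing meaningful to standardize.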

Frank

Peter Dalgaard-2 wrote:
> 
> On Aug 5, 2011, at 23:16 , Paul Smith wrote:
> 
>> Thanks, Frank. The following piece of code generates data that
>> exhibit the problem I reported:
>> 
>> -----------------------------------------
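>> set.seed(123)
>> # True coefficients of the simulated logistic model
>> intercept <- -1.32
>> beta <- 1.36
>> # Single binary predictor
>> xtest <- rbinom(1000, 1, 0.5)
>> linpred <- intercept + xtest * beta
>> prob <- exp(linpred) / (1 + exp(linpred))
>> # Draw the binary response with P(y = 1) = prob
>> runis <- runif(1000, 0, 1)
>> ytest <- ifelse(runis < prob, 1, 0)
>> xtest <- as.factor(xtest)
>> ytest <- as.factor(ytest)
>> library(rms)
>> # x = TRUE, y = TRUE store the design matrix and response for residuals()
>> model <- lrm(ytest ~ xtest, x = TRUE, y = TRUE)
>> model
>> # Unweighted sum-of-squares goodness-of-fit test
>> residuals(model, type = "gof")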
>> -----------------------------------------
> 
> Basically, what you have is zero divided by zero, except that floating
> point inaccuracy turns it into the ratio of two small numbers. So the Z
> statistic is effectively rubbish.
> This comes about because the SSE minus its expectation has effectively
> zero variance, which makes it rather useless for testing whether the model
> fits.
> 
> Since the model is basically a full model for a 2x2 table, it is not
> surprising to me that "goodness of fit" tests behave poorly. In fact, I
> would conjecture that no sensible g.o.f. test exists for that case.
> 
>> 
>> Paul
>> 
>> 
>> On Fri, Aug 5, 2011 at 7:58 PM, Frank Harrell
>> <f.harr...@vanderbilt.edu> wrote:
>>> Please provide the data or, better, the R code for simulating the data
>>> that shows the problem.  Then we can look further into this.
>>> Frank
>>> 
>>> -----
>>> Frank Harrell
>>> Department of Biostatistics, Vanderbilt University
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/Goodness-of-fit-of-binary-logistic-model-tp3721242p3721997.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>> 
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd....@cbs.dk  Priv: pda...@gmail.com
> "Døden skal tape!" --- Nordahl Grieg
> 
> 


-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Goodness-of-fit-of-binary-logistic-model-tp3721242p3723388.html
Sent from the R help mailing list archive at Nabble.com.
