On 12/01/2011 08:00 PM, Ben quant wrote:
> The data I am using is the last file called l_yx.RData at this link (the
> second file contains the plots from earlier):
> http://scientia.crescat.net/static/ben/
The logistic regression model you are fitting assumes a linear
relationship between x and the log odds of y; that does not seem to be
the case for your data. To illustrate:
x <- l_yx[,"x"]
y <- l_yx[,"y"]
ind1 <- x <= .002
ind2 <- (x > .002 & x <= .0065)
ind3 <- (x > .0065 & x <= .13)
ind4 <- (x > .13)
> summary(glm(y[ind1]~x[ind1],family=binomial))
...
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.79174    0.02633 -106.03   <2e-16 ***
x[ind1]     354.98852   22.78190   15.58   <2e-16 ***
> summary(glm(y[ind2]~x[ind2],family=binomial))
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  -2.15805    0.02966 -72.766   <2e-16 ***
x[ind2]     -59.92934    6.51650  -9.197   <2e-16 ***
> summary(glm(y[ind3]~x[ind3],family=binomial))
...
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.367206   0.007781 -304.22   <2e-16 ***
x[ind3]     18.104314   0.346562   52.24   <2e-16 ***
> summary(glm(y[ind4]~x[ind4],family=binomial))
...
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.31511    0.08549 -15.383   <2e-16 ***
x[ind4]      0.06261    0.08784   0.713    0.476
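The four region-wise fits above can also be produced in one pass with cut() and split(). The sketch below runs on simulated data, since l_yx.RData is not reproduced here; the sample sizes, intercepts and slopes are made-up values chosen only to resemble the pattern above, and the names region, b0, b1, fits and slopes are mine, not part of the original analysis.

```r
# Simulated stand-in for l_yx; all numbers below are made up for illustration.
set.seed(1)
x <- c(runif(3000, 0, 0.002), runif(3000, 0.002, 0.0065),
       runif(13900, 0.0065, 0.13), runif(100, 0.13, 15))

# Same cut points as ind1..ind4. cut() uses right-closed intervals by default,
# so region1 is x <= .002, region2 is .002 < x <= .0065, and so on.
region <- cut(x, breaks = c(-Inf, 0.002, 0.0065, 0.13, Inf),
              labels = paste0("region", 1:4))

# Hypothetical piecewise-linear log odds: one intercept and slope per region.
b0 <- c(-2.8, -2.2, -2.4, -1.3)
b1 <- c(355, -60, 18, 0.06)
r  <- as.integer(region)
y  <- rbinom(length(x), 1, plogis(b0[r] + b1[r] * x))

# One logistic regression per region.
d    <- data.frame(x = x, y = y)
fits <- lapply(split(d, region),
               function(di) glm(y ~ x, family = binomial, data = di))

# Slope estimate and standard error for each region, one row per region.
slopes <- t(sapply(fits, function(f)
  coef(summary(f))["x", c("Estimate", "Std. Error")]))
print(round(slopes, 3))
```

With the real data, x and y would come from l_yx as above and the simulation lines would be dropped.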
To summarize, the relationship between x and the log odds of y appears
to vary dramatically in both magnitude and direction depending on which
interval of x's range we're looking at. Trying to summarize this
complicated pattern with a single line is leading to the fitted
probabilities near 0 and 1 you are observing (note that only 0.1% of the
data is in region 4 above, although region 4 accounts for 99.1% of the
range of x).
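
One possible way to relax the linearity assumption, offered only as a sketch and not something fitted to your data, is to replace the single line with a natural cubic spline in x. The example uses simulated data with a made-up non-linear log odds and a heavily right-skewed x; the choice of df = 5 is arbitrary, and whether a spline (or a transformation of x) is appropriate for l_yx would need to be checked.

```r
library(splines)  # part of the standard R distribution

# Simulated data, made up for illustration: most x values lie in a narrow
# band, a few are spread over a wide range, and the log odds are not linear.
set.seed(2)
x   <- c(runif(9900, 0, 0.13), runif(100, 0.13, 15))
eta <- ifelse(x <= 0.13, -2.4 + 18 * x, -1.3 + 0.06 * x)
y   <- rbinom(length(x), 1, plogis(eta))

# Single straight line on the log-odds scale.
fit_lin <- glm(y ~ x, family = binomial)

# Natural cubic spline in x; df = 5 is an arbitrary illustrative choice.
fit_ns <- glm(y ~ ns(x, df = 5), family = binomial)

# Lower AIC indicates a better trade-off between fit and complexity.
print(AIC(fit_lin, fit_ns))

# Largest fitted probability under each model.
print(c(linear = max(fitted(fit_lin)), spline = max(fitted(fit_ns))))
```

Plotting fitted(fit_ns) against x, next to the region-wise fits, is a quick check on whether the curve follows the pattern in each interval.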
--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky