For the record, I was overlooking the obvious point that the statistical 
significance of the coefficient indicates only that the confidence interval for 
the *difference* in predicted probabilities will not overlap zero. It does not 
indicate that the two confidence intervals I was originally calculating will 
not overlap each other.

In case it's useful to others, the code below generates a confidence interval 
for the difference, on the response (probability) scale.

- Malcolm

# CI by SE:
sapply(c(1.96,-1.96), function(a) plogis(sum(coef(mod1)) - 
a*summary(mod1)$coefficients[2,"Std. Error"]) - plogis(coef(mod1)[1]))

# CI by simulation:
quantile(plogis(sims%*%c(1,1)) - plogis(sims%*%c(1,0)), c(0, 0.025, 0.5, 0.975, 
1))



On 3 May 2012, at 19:10, Bert Gunter wrote:

> Your post suggests statistical confusion on several levels. But as
> this concerns statistics, not R, consult your local statistician or
> post on a statistical help list (e.g. stats.stackexchange.com).
> 
> Incidentally, the short answer to your question about the overlap is:
> why shouldn't they? You will hopefully receive a more informative
> longer answer from the above suggested (or similar) resources.
> 
> -- Bert
> 
> 
> On Thu, May 3, 2012 at 9:37 AM, Malcolm Fairbrother
> <m.fairbrot...@bristol.ac.uk> wrote:
>> Dear list,
>> 
>> I'm a bit perplexed why the 95% confidence bands for the predicted 
>> probabilities for units where x=0 and x=1 overlap in the following instance.
>> 
>> I've simulated binary data to which I've then fitted a simple logistic 
>> regression model, with one covariate, and the coefficient on x is 
>> statistically significant at the 0.05 level. I've then used two different 
>> methods to generate 95% confidence bands for predicted probabilities, for 
>> each of two possible values of x. Given the result of the model, I expected 
>> the bands not to overlap… but they do.
>> 
>> Can anyone please explain where I've gone wrong with the following code, OR 
>> why it makes sense for the bands to overlap, despite the statistically 
>> significant beta coefficient? There may be a good statistical reason for 
>> this, but I'm not aware of it.
>> 
>> Many thanks,
>> Malcolm Fairbrother
>> 
>> 
>> n <- 120
>> set.seed(030512)
>> x <- rbinom(n, 1, 0.5)
>> dat <- within(data.frame(x), ybe <- rbinom(n, 1, plogis(-0.5 + x)))
>> mod1 <- glm(ybe ~ x, dat, family=binomial)
>> summary(mod1) # coefficient on x is statistically significant at the 0.05 
>> level… almost at the 0.01 level
>> 
>> pred <- predict(mod1, newdata=data.frame(x=c(0,1)), se.fit=T)
>> with(pred, cbind(low = plogis(fit - 1.96*se.fit), est = plogis(fit), up = 
>> plogis(fit + 1.96*se.fit)))  # confidence bands based on SEs
>> 
>> # simulation-based confidence bands:
>> sims <- t(replicate(200, coef(glm(simulate(mod1)$sim_1 ~ x, data=dat, 
>> family=binomial))))
>> pred0 <- plogis(quantile(sims%*%c(1,0), c(0.025, 0.5, 0.975)))
>> pred1 <- plogis(quantile(sims%*%c(1,1), c(0.025, 0.5, 0.975)))
>> rbind(pred0, pred1)
>> 
>> # the upper bound of the prediction for x=0 is greater than the lower bound 
>> of the prediction for x=1, using both methods

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to