[R] confidence interval for the logit - predict.glm

Christine Sonvilla Wed, 28 May 2008 04:21:24 -0700

Hello all,

I've come across an online posting 
 
http://www.biostat.wustl.edu/archives/html/s-news/2001-10/msg00119.html


that described how to get confidence intervals for predicted values from 
predict.glm. These instructions were meant for S-Plus. Yet, it generally seems 
to work with R too, but I am encountering some problems. I am explaining my 
procedure in the following and would be most grateful if anyone could reply to 
me on that.

I have run glm models to predict fish species from a number of environmental 
predictors, using a stepwise approach (I am aware that there is some 
controversy going on about this). I have got a data set that is called 
"testfishes.txt", "predy.txt" contains the predictors for the testfishes only 
and "predallx.txt" contains all predictors for all planning units. 

I used the following commands: "stepmodfi" is the output of the stepwise 
approach

predictionlogit <- predict(stepmodfi, predallx, type="link", se.fit=TRUE)

upperlogit <- predictionlogit$fit + 2*predictionlogit$se.fit
lowerlogit <- predictionlogit$fit - 2*predictionlogit$se.fit

upperresponse[,i] <- exp(upperlogit)/(1+exp(upperlogit))
lowerresponse[,i] <- exp(lowerlogit)/(1+exp(lowerlogit))

predictionresponse <- predict(stepmodfi, predallx, type="response", 
se.fit=FALSE)
fishoccur[,i] <- predictionresponse


type="link" should be the equivalent to the S-Plus version of type="1", which 
was indicated in the online posting explaining this procedure.

So I first tried to get predictions on the logit scale and there is already the 
point where I am a bit bewildered. Predictionlogit would only result in ONE 
value for every single planning unit contained in predallx.txt, when actually I 
would assume that there would be "planning unit times fishes" predicted values 
- at least this is what I get from predictionresponse. 
Predictionresponse generates as many predicted values as there are planning 
units times fishes (1680 planning units x 205 fish species, whihc are predicted 
over all these planning units), which I put into a constructed matrix named 
"fishoccur".
In fact this is what I did in the very beginning, generating predicted values 
with the command predictionresponse <- predict(stepmodfi, predallx, 
type="response", se.fit=FALSE).
Then I wanted to construct confidence intervals around these values.
Therefore I first generated the logit predicted values, but that already is the 
point I am unsure about.

The problem in fact is, that quite some of my upper values of the confidence 
interval result in NAs. The lower interval seems fine. Yet, I am not even sure 
if my approach is correct, regarding this single value that is being produced 
by the prediction on the logit scale.

I hope that my descriptions were clear enough and would greatly appreciate if 
anyone could suggest how to tackle this, whether the approach itself is fine 
(which I believe indeed) and how to get proper lower and upper confidence 
values.

Many thanks in advance!

Christine

-- 
Super-Aktion nur in der GMX Spieleflat: 10 Tage für 1 Euro.
Über 180 Spiele downloaden: http://flat.games.gmx.de

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] confidence interval for the logit - predict.glm

Reply via email to