Re: [R] gam - Y axis probability scale with confidence/error lines
I find plogis() is easier to remember all.equal(binomial()$linkinv(seq(-2, 2, length = 101)), plogis(seq(-2, 2, length = 101))) ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie Kwaliteitszorg / team Biometrics Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium + 32 2 525 02 51 + 32 54 43 61 85 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens Ben quant Verzonden: woensdag 14 maart 2012 19:48 Aan: Patrick Breheny CC: r-help@r-project.org Onderwerp: Re: [R] gam - Y axis probability scale with confidence/error lines Thank you. The binomial()$linkinv() is good to know. Ben On Wed, Mar 14, 2012 at 12:23 PM, Patrick Breheny patrick.breh...@uky.eduwrote: Actually, I responded a bit too quickly last time, without really reading through your example carefully. You're fitting a logistic regression model and plotting the results on the probability scale. The better way to do what you propose is to obtain the confidence interval on the scale of the linear predictor and then transform to the probability scale, as in: x - seq(0,1,by=.01) y - rbinom(length(x),size=1,p=x) require(gam) fit - gam(y~s(x),family=binomial) pred - predict(fit,se.fit=TRUE) yy - binomial()$linkinv(pred$fit) l - binomial()$linkinv(pred$fit-1.**96*pred$se.fit) u - binomial()$linkinv(pred$fit+1.**96*pred$se.fit) plot(x,yy,type=l) lines(x,l,lty=2) lines(x,u,lty=2) -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky On 03/14/2012 01:49 PM, Ben quant wrote: That was embarrassingly easy. Thanks again Patrick! Just correcting a little typo to his reply. this is probably what he meant: pred = predict(fit,data.frame(x=xx),**type=response,se.fit=TRUE) upper = pred$fit + 1.96 * pred$se.fit lower = pred$fit - 1.96 * pred$se.fit # For people who are interested this is how you plot it line by line: plot(xx,pred$fit,type=l,**xlab=fd$getFactorName(),ylab=**ylab,ylim= c(min(down),max(up))) lines(xx,upper,type=l,lty='**dashed') lines(xx,lower,type=l,lty='**dashed') In my opinion this is only important if the desired y axis is different than what plot(fit) gives you for a gam fit (i.e fit - gam(...stuff...)) and you want to plot the confidence intervals. thanks again! Ben On Wed, Mar 14, 2012 at 10:39 AM, Patrick Breheny patrick.breh...@uky.edu mailto:patrick.breheny@uky.**edupatrick.breh...@uky.edu wrote: The predict() function has an option 'se.fit' that returns what you are asking for. If you set this equal to TRUE in your code: pred - predict(fit,data.frame(x=xx),_**_type=response,se.fit=TRUE) will return a list with two elements, 'fit' and 'se.fit'. The pointwise confidence intervals will then be pred$fit + 1.96*se.fit pred$fit - 1.96*se.fit for 95% confidence intervals (replace 1.96 with the appropriate quantile of the normal distribution for other confidence levels). You can then do whatever stuff you want to do with them, including plot them. --Patrick On 03/14/2012 10:48 AM, Ben quant wrote: Hello, How do I plot a gam fit object on probability (Y axis) vs raw values (X axis) axis and include the confidence plot lines? Details... I'm using the gam function like this: l_yx[,2] = log(l_yx[,2] + .0004) fit- gam(y~s(x),data=as.data.frame(**__l_yx),family=binomial) And I want to plot it so that probability is on the Y axis and values are on the X axis (i.e. I don't want log likelihood on the Y axis or the log of my values on my X axis): xx- seq(min(l_yx[,2]),max(l_yx[,2]**__),len=101) plot(xx,predict(fit,data.__**frame(x=xx),type=response),_** _type=l,xaxt=n,xlab=**Churn__,ylab=P(Top Performer)) at- c(.001,.01,.1,1,10) #-- I'd also like to generalize this rather than hard code the numbers axis(1,at=log(at+ .0004),label=at) So far, using the code above, everything looks the way I want. But that does not give me anything information on variability/confidence/__**certainty. How do I get the dash plots from this: plot(fit) ...on the same scales as above? Related question: how do get the dashed values out of the fit
[R] gam - Y axis probability scale with confidence/error lines
Hello, How do I plot a gam fit object on probability (Y axis) vs raw values (X axis) axis and include the confidence plot lines? Details... I'm using the gam function like this: l_yx[,2] = log(l_yx[,2] + .0004) fit - gam(y~s(x),data=as.data.frame(l_yx),family=binomial) And I want to plot it so that probability is on the Y axis and values are on the X axis (i.e. I don't want log likelihood on the Y axis or the log of my values on my X axis): xx - seq(min(l_yx[,2]),max(l_yx[,2]),len=101) plot(xx,predict(fit,data.frame(x=xx),type=response),type=l,xaxt=n,xlab=Churn,ylab=P(Top Performer)) at - c(.001,.01,.1,1,10) # -- I'd also like to generalize this rather than hard code the numbers axis(1,at=log(at+ .0004),label=at) So far, using the code above, everything looks the way I want. But that does not give me anything information on variability/confidence/certainty. How do I get the dash plots from this: plot(fit) ...on the same scales as above? Related question: how do get the dashed values out of the fit object so I can do 'stuff' with it? Thanks, Ben PS - thank you Patrick for your help previously. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gam - Y axis probability scale with confidence/error lines
The predict() function has an option 'se.fit' that returns what you are asking for. If you set this equal to TRUE in your code: pred - predict(fit,data.frame(x=xx),type=response,se.fit=TRUE) will return a list with two elements, 'fit' and 'se.fit'. The pointwise confidence intervals will then be pred$fit + 1.96*se.fit pred$fit - 1.96*se.fit for 95% confidence intervals (replace 1.96 with the appropriate quantile of the normal distribution for other confidence levels). You can then do whatever stuff you want to do with them, including plot them. --Patrick On 03/14/2012 10:48 AM, Ben quant wrote: Hello, How do I plot a gam fit object on probability (Y axis) vs raw values (X axis) axis and include the confidence plot lines? Details... I'm using the gam function like this: l_yx[,2] = log(l_yx[,2] + .0004) fit- gam(y~s(x),data=as.data.frame(l_yx),family=binomial) And I want to plot it so that probability is on the Y axis and values are on the X axis (i.e. I don't want log likelihood on the Y axis or the log of my values on my X axis): xx- seq(min(l_yx[,2]),max(l_yx[,2]),len=101) plot(xx,predict(fit,data.frame(x=xx),type=response),type=l,xaxt=n,xlab=Churn,ylab=P(Top Performer)) at- c(.001,.01,.1,1,10) #-- I'd also like to generalize this rather than hard code the numbers axis(1,at=log(at+ .0004),label=at) So far, using the code above, everything looks the way I want. But that does not give me anything information on variability/confidence/certainty. How do I get the dash plots from this: plot(fit) ...on the same scales as above? Related question: how do get the dashed values out of the fit object so I can do 'stuff' with it? Thanks, Ben PS - thank you Patrick for your help previously. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gam - Y axis probability scale with confidence/error lines
That was embarrassingly easy. Thanks again Patrick! Just correcting a little typo to his reply. this is probably what he meant: pred = predict(fit,data.frame(x=xx),type=response,se.fit=TRUE) upper = pred$fit + 1.96 * pred$se.fit lower = pred$fit - 1.96 * pred$se.fit # For people who are interested this is how you plot it line by line: plot(xx,pred$fit,type=l,xlab=fd$getFactorName(),ylab=ylab,ylim= c(min(down),max(up))) lines(xx,upper,type=l,lty='dashed') lines(xx,lower,type=l,lty='dashed') In my opinion this is only important if the desired y axis is different than what plot(fit) gives you for a gam fit (i.e fit - gam(...stuff...)) and you want to plot the confidence intervals. thanks again! Ben On Wed, Mar 14, 2012 at 10:39 AM, Patrick Breheny patrick.breh...@uky.eduwrote: The predict() function has an option 'se.fit' that returns what you are asking for. If you set this equal to TRUE in your code: pred - predict(fit,data.frame(x=xx),**type=response,se.fit=TRUE) will return a list with two elements, 'fit' and 'se.fit'. The pointwise confidence intervals will then be pred$fit + 1.96*se.fit pred$fit - 1.96*se.fit for 95% confidence intervals (replace 1.96 with the appropriate quantile of the normal distribution for other confidence levels). You can then do whatever stuff you want to do with them, including plot them. --Patrick On 03/14/2012 10:48 AM, Ben quant wrote: Hello, How do I plot a gam fit object on probability (Y axis) vs raw values (X axis) axis and include the confidence plot lines? Details... I'm using the gam function like this: l_yx[,2] = log(l_yx[,2] + .0004) fit- gam(y~s(x),data=as.data.frame(**l_yx),family=binomial) And I want to plot it so that probability is on the Y axis and values are on the X axis (i.e. I don't want log likelihood on the Y axis or the log of my values on my X axis): xx- seq(min(l_yx[,2]),max(l_yx[,2]**),len=101) plot(xx,predict(fit,data.**frame(x=xx),type=response),** type=l,xaxt=n,xlab=Churn**,ylab=P(Top Performer)) at- c(.001,.01,.1,1,10) #-- I'd also like to generalize this rather than hard code the numbers axis(1,at=log(at+ .0004),label=at) So far, using the code above, everything looks the way I want. But that does not give me anything information on variability/confidence/** certainty. How do I get the dash plots from this: plot(fit) ...on the same scales as above? Related question: how do get the dashed values out of the fit object so I can do 'stuff' with it? Thanks, Ben PS - thank you Patrick for your help previously. [[alternative HTML version deleted]] __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gam - Y axis probability scale with confidence/error lines
Actually, I responded a bit too quickly last time, without really reading through your example carefully. You're fitting a logistic regression model and plotting the results on the probability scale. The better way to do what you propose is to obtain the confidence interval on the scale of the linear predictor and then transform to the probability scale, as in: x - seq(0,1,by=.01) y - rbinom(length(x),size=1,p=x) require(gam) fit - gam(y~s(x),family=binomial) pred - predict(fit,se.fit=TRUE) yy - binomial()$linkinv(pred$fit) l - binomial()$linkinv(pred$fit-1.96*pred$se.fit) u - binomial()$linkinv(pred$fit+1.96*pred$se.fit) plot(x,yy,type=l) lines(x,l,lty=2) lines(x,u,lty=2) -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky On 03/14/2012 01:49 PM, Ben quant wrote: That was embarrassingly easy. Thanks again Patrick! Just correcting a little typo to his reply. this is probably what he meant: pred = predict(fit,data.frame(x=xx),type=response,se.fit=TRUE) upper = pred$fit + 1.96 * pred$se.fit lower = pred$fit - 1.96 * pred$se.fit # For people who are interested this is how you plot it line by line: plot(xx,pred$fit,type=l,xlab=fd$getFactorName(),ylab=ylab,ylim= c(min(down),max(up))) lines(xx,upper,type=l,lty='dashed') lines(xx,lower,type=l,lty='dashed') In my opinion this is only important if the desired y axis is different than what plot(fit) gives you for a gam fit (i.e fit - gam(...stuff...)) and you want to plot the confidence intervals. thanks again! Ben On Wed, Mar 14, 2012 at 10:39 AM, Patrick Breheny patrick.breh...@uky.edu mailto:patrick.breh...@uky.edu wrote: The predict() function has an option 'se.fit' that returns what you are asking for. If you set this equal to TRUE in your code: pred - predict(fit,data.frame(x=xx),__type=response,se.fit=TRUE) will return a list with two elements, 'fit' and 'se.fit'. The pointwise confidence intervals will then be pred$fit + 1.96*se.fit pred$fit - 1.96*se.fit for 95% confidence intervals (replace 1.96 with the appropriate quantile of the normal distribution for other confidence levels). You can then do whatever stuff you want to do with them, including plot them. --Patrick On 03/14/2012 10:48 AM, Ben quant wrote: Hello, How do I plot a gam fit object on probability (Y axis) vs raw values (X axis) axis and include the confidence plot lines? Details... I'm using the gam function like this: l_yx[,2] = log(l_yx[,2] + .0004) fit- gam(y~s(x),data=as.data.frame(__l_yx),family=binomial) And I want to plot it so that probability is on the Y axis and values are on the X axis (i.e. I don't want log likelihood on the Y axis or the log of my values on my X axis): xx- seq(min(l_yx[,2]),max(l_yx[,2]__),len=101) plot(xx,predict(fit,data.__frame(x=xx),type=response),__type=l,xaxt=n,xlab=Churn__,ylab=P(Top Performer)) at- c(.001,.01,.1,1,10) #-- I'd also like to generalize this rather than hard code the numbers axis(1,at=log(at+ .0004),label=at) So far, using the code above, everything looks the way I want. But that does not give me anything information on variability/confidence/__certainty. How do I get the dash plots from this: plot(fit) ...on the same scales as above? Related question: how do get the dashed values out of the fit object so I can do 'stuff' with it? Thanks, Ben PS - thank you Patrick for your help previously. [[alternative HTML version deleted]] R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/__listinfo/r-help https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/__posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gam - Y axis probability scale with confidence/error lines
Thank you. The binomial()$linkinv() is good to know. Ben On Wed, Mar 14, 2012 at 12:23 PM, Patrick Breheny patrick.breh...@uky.eduwrote: Actually, I responded a bit too quickly last time, without really reading through your example carefully. You're fitting a logistic regression model and plotting the results on the probability scale. The better way to do what you propose is to obtain the confidence interval on the scale of the linear predictor and then transform to the probability scale, as in: x - seq(0,1,by=.01) y - rbinom(length(x),size=1,p=x) require(gam) fit - gam(y~s(x),family=binomial) pred - predict(fit,se.fit=TRUE) yy - binomial()$linkinv(pred$fit) l - binomial()$linkinv(pred$fit-1.**96*pred$se.fit) u - binomial()$linkinv(pred$fit+1.**96*pred$se.fit) plot(x,yy,type=l) lines(x,l,lty=2) lines(x,u,lty=2) -- Patrick Breheny Assistant Professor Department of Biostatistics Department of Statistics University of Kentucky On 03/14/2012 01:49 PM, Ben quant wrote: That was embarrassingly easy. Thanks again Patrick! Just correcting a little typo to his reply. this is probably what he meant: pred = predict(fit,data.frame(x=xx),**type=response,se.fit=TRUE) upper = pred$fit + 1.96 * pred$se.fit lower = pred$fit - 1.96 * pred$se.fit # For people who are interested this is how you plot it line by line: plot(xx,pred$fit,type=l,**xlab=fd$getFactorName(),ylab=**ylab,ylim= c(min(down),max(up))) lines(xx,upper,type=l,lty='**dashed') lines(xx,lower,type=l,lty='**dashed') In my opinion this is only important if the desired y axis is different than what plot(fit) gives you for a gam fit (i.e fit - gam(...stuff...)) and you want to plot the confidence intervals. thanks again! Ben On Wed, Mar 14, 2012 at 10:39 AM, Patrick Breheny patrick.breh...@uky.edu mailto:patrick.breheny@uky.**edupatrick.breh...@uky.edu wrote: The predict() function has an option 'se.fit' that returns what you are asking for. If you set this equal to TRUE in your code: pred - predict(fit,data.frame(x=xx),_**_type=response,se.fit=TRUE) will return a list with two elements, 'fit' and 'se.fit'. The pointwise confidence intervals will then be pred$fit + 1.96*se.fit pred$fit - 1.96*se.fit for 95% confidence intervals (replace 1.96 with the appropriate quantile of the normal distribution for other confidence levels). You can then do whatever stuff you want to do with them, including plot them. --Patrick On 03/14/2012 10:48 AM, Ben quant wrote: Hello, How do I plot a gam fit object on probability (Y axis) vs raw values (X axis) axis and include the confidence plot lines? Details... I'm using the gam function like this: l_yx[,2] = log(l_yx[,2] + .0004) fit- gam(y~s(x),data=as.data.frame(**__l_yx),family=binomial) And I want to plot it so that probability is on the Y axis and values are on the X axis (i.e. I don't want log likelihood on the Y axis or the log of my values on my X axis): xx- seq(min(l_yx[,2]),max(l_yx[,2]**__),len=101) plot(xx,predict(fit,data.__**frame(x=xx),type=response),_** _type=l,xaxt=n,xlab=**Churn__,ylab=P(Top Performer)) at- c(.001,.01,.1,1,10) #-- I'd also like to generalize this rather than hard code the numbers axis(1,at=log(at+ .0004),label=at) So far, using the code above, everything looks the way I want. But that does not give me anything information on variability/confidence/__**certainty. How do I get the dash plots from this: plot(fit) ...on the same scales as above? Related question: how do get the dashed values out of the fit object so I can do 'stuff' with it? Thanks, Ben PS - thank you Patrick for your help previously. [[alternative HTML version deleted]] __**__ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/_**_listinfo/r-helphttps://stat.ethz.ch/mailman/__listinfo/r-help https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/__**posting-guide.htmlhttp://www.R-project.org/__posting-guide.html http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal,