Re: [R] Confidence interval from resampling
On 24/06/11 01:44, Adriana Bejarano wrote: Dear R gurus, I have the following code, but I still not know how to estimate and extract confidence intervals (95%CI) from resampling. SNIP Some sound advice --- still sound after all these years I believe --- in respect of boot strap confidence intervals, is to be found in Theoretical comparison of bootstrap confidence intervals, Peter Hall, Annals of Statistics vol. 16 no. 3 1988, pp. 927 -- 953. cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confidence interval from resampling
Dear R gurus, I have the following code, but I still not know how to estimate and extract confidence intervals (95%CI) from resampling. Thanks! ~Adriana #data penta-c(770,729,640,486,450,410,400,340,306,283,278,260,253,242,240,229,201,198,190,186,180,170,168,151,150,148,147,125,117,110,107,104,85,83,80,74,70,66,54,46,45,43,40,38,10) x-log(penta+1) plot(ecdf(x), ylab=Probability, xlab=Concentration (Ln+1)) x.wei-fitdistr(x,weibull) x.wei shapescale 6.7291685 5.3769965 (0.7807718) (0.1254696) xwei.shape - x.wei$estimate[[1]] xwei.scale - x.wei$estimate[[2]] x.wei-fitdistr(x,weibull) x.wei xwei.shape - x.wei$estimate[[1]] xwei.scale - x.wei$estimate[[2]] curve(pweibull(x, shape=xwei.shape, scale = xwei.scale,lower.tail=TRUE, log.p=FALSE), add=TRUE,col='green',lwd=3) #draw random numbers from a weibull distribution 100 times with shape=xwei.shape, scale = xwei.scale draw - lapply(1:100, function(.x){ out-rweibull(x, shape=xwei.shape, scale = xwei.scale) }) newx- data.frame(draw) colnames(newx)-paste(x, 1:100, sep = ) newmat-data.matrix(newx) # matrix of coefficients rownum=2 colnum=100 ResultMat-matrix(NA, ncol=colnum, nrow=rownum) rownum2=45 colnum2=100 ResultMat2-matrix(NA, ncol=colnum2, nrow=rownum2) #loop through each column in the source matrix for (i in 1:100) { sel_col-newmat[col(newmat)==i] {ResultMat[,i]-coef(fitdistr(sel_col,weibull))} xwei.shape- ResultMat[1,i] xwei.scale- ResultMat[2,i] curve(pweibull(x, shape=xwei.shape, scale=xwei.scale, lower.tail=TRUE, log.p = FALSE), add=TRUE, col='grey',lwd=0.5) ResultMat2[,i]-pweibull(x, shape=xwei.shape, scale = xwei.scale,lower.tail=TRUE, log.p=FALSE) } ## convert dataframe to numeric MatOut- as.matrix(ResultMat2) mode(MatOut) - numeric # initiate variables mm-ml-mu-rep(0,length(MatOut[,1])) # mean and upper/lower quantiles for(i in 1:length(MatOut[,1])){ mm[i]- mean(MatOut[i,]) ml[i]- quantile(MatOut[i,], 0.025, na.rm=TRUE) mu[i]- quantile(MatOut[i,], 0.975, na.rm=TRUE) } #lines(x, mm, col=black) lines(x, ml, col=blue, lwd=2) lines(x, mu, col=blue, lwd=2) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Confidence interval from resampling
On Jun 23, 2011, at 9:44 AM, Adriana Bejarano wrote: Dear R gurus, I have the following code, but I still not know how to estimate and extract confidence intervals (95%CI) from resampling. If you have a distribution of values, say resamp.stat, of a statistic from a properly performed resampling operation you can extract and display easily the 5th and 95th percentiles. CI.stat - quantile(resamp.stat, c(0.05, 0.95) ) CI.stat Note: I do not think that 100 replications would generally be sufficient for final work, although its probably acceptable for testing. Your code as posted initially threw a bunch of errors since you did not include library(MASS), but fixing that fairly obvious problem shows that you can draw a nice plot. However, it remains unclear what statistic of what distribution you desire to assess. Mean, median, ... of what? I do not think the error that arose on my machine from the wrapped text here: #draw random numbers from a weibull distribution 100 times with ... shape=xwei.shape, scale = xwei.scale - error . was causing any problem, but there were a bunch of warnings that ought to be investigated: Right after the loop I see ten of these: Warning messages: 1: In dweibull(x, shape, scale, log) : NaNs produced -- David Thanks! ~Adriana #data penta- c (770,729,640,486,450,410,400,340,306,283,278,260,253,242,240,229,201,198,190,186,180,170,168,151,150,148,147,125,117,110,107,104,85,83,80,74,70,66,54,46,45,43,40,38,10 ) x-log(penta+1) plot(ecdf(x), ylab=Probability, xlab=Concentration (Ln+1)) x.wei-fitdistr(x,weibull) x.wei shapescale 6.7291685 5.3769965 (0.7807718) (0.1254696) xwei.shape - x.wei$estimate[[1]] xwei.scale - x.wei$estimate[[2]] x.wei-fitdistr(x,weibull) x.wei xwei.shape - x.wei$estimate[[1]] xwei.scale - x.wei$estimate[[2]] curve(pweibull(x, shape=xwei.shape, scale = xwei.scale,lower.tail=TRUE, log.p=FALSE), add=TRUE,col='green',lwd=3) #draw random numbers from a weibull distribution 100 times with shape=xwei.shape, scale = xwei.scale draw - lapply(1:100, function(.x){ out-rweibull(x, shape=xwei.shape, scale = xwei.scale) }) newx- data.frame(draw) colnames(newx)-paste(x, 1:100, sep = ) newmat-data.matrix(newx) # matrix of coefficients rownum=2 colnum=100 ResultMat-matrix(NA, ncol=colnum, nrow=rownum) rownum2=45 colnum2=100 ResultMat2-matrix(NA, ncol=colnum2, nrow=rownum2) #loop through each column in the source matrix for (i in 1:100) { sel_col-newmat[col(newmat)==i] {ResultMat[,i]-coef(fitdistr(sel_col,weibull))} xwei.shape- ResultMat[1,i] xwei.scale- ResultMat[2,i] curve(pweibull(x, shape=xwei.shape, scale=xwei.scale, lower.tail=TRUE, log.p = FALSE), add=TRUE, col='grey',lwd=0.5) ResultMat2[,i]-pweibull(x, shape=xwei.shape, scale = xwei.scale,lower.tail=TRUE, log.p=FALSE) } ## convert dataframe to numeric MatOut- as.matrix(ResultMat2) mode(MatOut) - numeric # initiate variables mm-ml-mu-rep(0,length(MatOut[,1])) # mean and upper/lower quantiles for(i in 1:length(MatOut[,1])){ mm[i]- mean(MatOut[i,]) ml[i]- quantile(MatOut[i,], 0.025, na.rm=TRUE) mu[i]- quantile(MatOut[i,], 0.975, na.rm=TRUE) } #lines(x, mm, col=black) lines(x, ml, col=blue, lwd=2) lines(x, mu, col=blue, lwd=2) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Confidence interval from resampling
Depending on how critical the problem is, you might also want to look at the literature on bootstrap CI's, perhaps starting from the references in boot.ci in the boot package. The simple quantiles are not necessarily the most appropriate. For example I seem to recall that BCa intervals were the intervals recommended for 'general use' by Efron and Tibshirani (Introduction to the Bootstrap (1993) Chapman and Hall) with ABC intervals also getting an honourable mention. 1993 is a long time ago, though... S Ellison -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius Sent: 23 June 2011 15:25 To: Adriana Bejarano Cc: r-help@r-project.org Subject: Re: [R] Confidence interval from resampling On Jun 23, 2011, at 9:44 AM, Adriana Bejarano wrote: Dear R gurus, I have the following code, but I still not know how to estimate and extract confidence intervals (95%CI) from resampling. If you have a distribution of values, say resamp.stat, of a statistic from a properly performed resampling operation you can extract and display easily the 5th and 95th percentiles. CI.stat - quantile(resamp.stat, c(0.05, 0.95) ) CI.stat Note: I do not think that 100 replications would generally be sufficient for final work, although its probably acceptable for testing. Your code as posted initially threw a bunch of errors since you did not include library(MASS), but fixing that fairly obvious problem shows that you can draw a nice plot. However, it remains unclear what statistic of what distribution you desire to assess. Mean, median, ... of what? I do not think the error that arose on my machine from the wrapped text here: #draw random numbers from a weibull distribution 100 times with ... shape=xwei.shape, scale = xwei.scale - error . was causing any problem, but there were a bunch of warnings that ought to be investigated: Right after the loop I see ten of these: Warning messages: 1: In dweibull(x, shape, scale, log) : NaNs produced -- David Thanks! ~Adriana #data penta- c (770,729,640,486,450,410,400,340,306,283,278,260,253,242,240,229,201,1 98,190,186,180,170,168,151,150,148,147,125,117,110,107,104,85,83,80,74 ,70,66,54,46,45,43,40,38,10 ) x-log(penta+1) plot(ecdf(x), ylab=Probability, xlab=Concentration (Ln+1)) x.wei-fitdistr(x,weibull) x.wei shapescale 6.7291685 5.3769965 (0.7807718) (0.1254696) xwei.shape - x.wei$estimate[[1]] xwei.scale - x.wei$estimate[[2]] x.wei-fitdistr(x,weibull) x.wei xwei.shape - x.wei$estimate[[1]] xwei.scale - x.wei$estimate[[2]] curve(pweibull(x, shape=xwei.shape, scale = xwei.scale,lower.tail=TRUE, log.p=FALSE), add=TRUE,col='green',lwd=3) #draw random numbers from a weibull distribution 100 times with shape=xwei.shape, scale = xwei.scale draw - lapply(1:100, function(.x){ out-rweibull(x, shape=xwei.shape, scale = xwei.scale) }) newx- data.frame(draw) colnames(newx)-paste(x, 1:100, sep = ) newmat-data.matrix(newx) # matrix of coefficients rownum=2 colnum=100 ResultMat-matrix(NA, ncol=colnum, nrow=rownum) rownum2=45 colnum2=100 ResultMat2-matrix(NA, ncol=colnum2, nrow=rownum2) #loop through each column in the source matrix for (i in 1:100) { sel_col-newmat[col(newmat)==i] {ResultMat[,i]-coef(fitdistr(sel_col,weibull))} xwei.shape- ResultMat[1,i] xwei.scale- ResultMat[2,i] curve(pweibull(x, shape=xwei.shape, scale=xwei.scale, lower.tail=TRUE, log.p = FALSE), add=TRUE, col='grey',lwd=0.5) ResultMat2[,i]-pweibull(x, shape=xwei.shape, scale = xwei.scale,lower.tail=TRUE, log.p=FALSE) } ## convert dataframe to numeric MatOut- as.matrix(ResultMat2) mode(MatOut) - numeric # initiate variables mm-ml-mu-rep(0,length(MatOut[,1])) # mean and upper/lower quantiles for(i in 1:length(MatOut[,1])){ mm[i]- mean(MatOut[i,]) ml[i]- quantile(MatOut[i,], 0.025, na.rm=TRUE) mu[i]- quantile(MatOut[i,], 0.975, na.rm=TRUE) } #lines(x, mm, col=black) lines(x, ml, col=blue, lwd=2) lines(x, mu, col=blue, lwd=2) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. *** This email and any