Re: [scikit-learn] Confidence and Prediction Intervals of Support Vector Regression

Sebastian Raschka Wed, 01 Mar 2017 19:15:07 -0800

Glad to hear that it was at least a little bit helpful :)
(Haha, Efron and Tibshirani even have a whole ~500-page book on the bootstrap, if
you have the time and patience … :)
https://www.crcpress.com/An-Introduction-to-the-Bootstrap/Efron-Tibshirani/p/book/9780412042317)


> On Mar 1, 2017, at 10:07 PM, Raga Markely <[email protected]> wrote:
> 
> No worries, Sebastian :) .. thank you very much for your help.. I learned a 
> lot of new things from your site today.. it led me to some relevant chapters 
> in "The Elements of Statistical Learning", which then led me to chapter 8 
> page 264 about non-parametric & parametric bootstrap.. 
> 
> I think I will just go with the non-parametric bootstrap for my problem.. 
> similar to the bootstrap steps I mentioned earlier..
> 
> Thank you!
> Raga
> 
> On Wed, Mar 1, 2017 at 9:44 PM, Sebastian Raschka <[email protected]> 
> wrote:
> Hi, Raga,
> 
> > 1. Just to make sure I understand correctly, using the .632+ bootstrap 
> > method, the ACC_lower and ACC_upper are the lower and higher percentile of 
> > the ACC_h,i distribution?
> 
> Phew, I am actually not sure anymore … I think they are percentiles of the 
> ACC_boot distribution, as in the “classic” bootstrap, except that each 
> ACC_boot is computed as a weighted combination of ACC_h,i and ACC_r,i.
> 
> >  2. For regression algorithms, is there a recommended equation for the 
> > no-information rate gamma?
> 
> 
> Sorry, can’t be of much help here; I am not sure what the equivalent of the 
> no-information rate for regression would be ...
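One candidate for a regression no-information rate is the general-loss definition in "The Elements of Statistical Learning" (section 7.11): the average loss over all pairings of an observed y_i with a prediction for x_j, which breaks the link between inputs and targets. The sketch below assumes squared-error loss and works with errors rather than accuracies, so err_oob and err_train roughly play the roles of ACC_h,i and ACC_r,i. The function names are hypothetical, and clipping the overfitting rate R to [0, 1] is a safeguard added here, not something stated in the thread.

```python
import numpy as np


def no_information_rate(y, y_pred):
    """Mean squared error over all N*N pairings of targets and predictions."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Broadcasting builds the full N x N matrix of (y_i - y_pred_j).
    diffs = y[:, None] - y_pred[None, :]
    return float(np.mean(diffs ** 2))


def err_632_plus(err_train, err_oob, gamma):
    """Combine training error and out-of-bag error with the .632+ weight."""
    # Safeguard (assumption): never let the OOB error exceed gamma.
    err_oob = min(err_oob, gamma)
    if err_oob > err_train and gamma > err_train:
        # Relative overfitting rate, between 0 and 1.
        R = (err_oob - err_train) / (gamma - err_train)
    else:
        R = 0.0
    # The weight runs from 0.632 (no overfitting) to 1 (maximal overfitting).
    w = 0.632 / (1.0 - 0.368 * R)
    return (1.0 - w) * err_train + w * err_oob


gamma_demo = no_information_rate([0.0, 1.0], [0.0, 1.0])
estimate_demo = err_632_plus(err_train=0.1, err_oob=0.3, gamma=0.5)
```

With R = 0 the estimate reduces to the plain .632 weighting, and with R = 1 it falls back to the out-of-bag error alone.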
> 
> 
> 
> > On Mar 1, 2017, at 5:39 PM, Raga Markely <[email protected]> wrote:
> >
> > Thanks a lot, Sebastian! Very nicely written.
> >
> > I have a few follow-up questions:
> > 1. Just to make sure I understand correctly, using the .632+ bootstrap 
> > method, the ACC_lower and ACC_upper are the lower and higher percentile of 
> > the ACC_h,i distribution?
> > 2. For regression algorithms, is there a recommended equation for the 
> > no-information rate gamma?
> > 3. I need to plot the confidence interval and prediction interval for my 
> > Support Vector Regression prediction (just to clarify these intervals, 
> > please see an analogy from linear model on slide 14: 
> > http://www2.stat.duke.edu/~tjl13/s101/slides/unit6lec3H.pdf) - can I derive 
> > the intervals from .632+ bootstrap method or is there a different way of 
> > getting these intervals?
> >
> > Thank you!
> > Raga
> >
> >
> > On Wed, Mar 1, 2017 at 3:13 PM, Sebastian Raschka <[email protected]> 
> > wrote:
> > Hi, Raga,
> > I have a short section on this here 
> > (https://sebastianraschka.com/blog/2016/model-evaluation-selection-part2.html#the-bootstrap-method-and-empirical-confidence-intervals)
> >  if it helps.
> >
> > Best,
> > Sebastian
> >
> > > On Mar 1, 2017, at 3:07 PM, Raga Markely <[email protected]> wrote:
> > >
> > > Hi everyone,
> > >
> > > I wonder if you could provide me with some suggestions on how to 
> > > determine the confidence and prediction intervals of SVR? If you have 
> > > suggestions for any machine learning algorithms in general, that would be 
> > > fine too (doesn't have to be specific for SVR).
> > >
> > > So far, I have found:
> > > 1. Bootstrap: 
> > > http://stats.stackexchange.com/questions/183230/bootstrapping-confidence-interval-from-a-regression-prediction
> > > 2. 
> > > http://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0048723&type=printable
> > > 3. ftp://ftp.esat.kuleuven.ac.be/sista/suykens/reports/10_156_v0.pdf
> > >
> > > But I don't understand the details in #2 and #3 well enough to write 
> > > step-by-step code. If I use the bootstrap method, can I get the 
> > > confidence interval as follows?
> > > a. Draw bootstrap sample of size n
> > > b. Fit the SVR model (with hyperparameters chosen during model selection 
> > > with grid search cv) to this bootstrap sample
> > > c. Use this model to predict the output variable y* from input variable X*
> > > d. Repeat steps a-c, for instance, 100 times
> > > e. Order the 100 values of y* and determine, for instance, the 10th 
> > > and 90th percentiles (if we are looking for an 80% confidence 
> > > interval)
> > > f. Repeat a-e for different values of X* to plot the prediction with 
> > > confidence interval
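Steps a-f above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the synthetic sine data, the SVR hyperparameters (C, epsilon), and the helper name bootstrap_confidence_band are made up for illustration and do not come from the thread. Predicting on a whole grid X_star at once covers step f, because the percentiles are taken separately for each grid point.

```python
import numpy as np
from sklearn.svm import SVR


def bootstrap_confidence_band(X, y, X_star, n_boot=100, lower=10, upper=90,
                              seed=0, **svr_params):
    """Percentile bootstrap band for SVR predictions at the points X_star."""
    rng = np.random.RandomState(seed)
    n = X.shape[0]
    preds = np.empty((n_boot, X_star.shape[0]))
    for b in range(n_boot):
        # Step a: draw a bootstrap sample of size n (with replacement).
        idx = rng.randint(0, n, size=n)
        # Step b: refit SVR with the fixed, previously tuned hyperparameters.
        model = SVR(**svr_params).fit(X[idx], y[idx])
        # Step c: predict y* for every X* on the grid.
        preds[b] = model.predict(X_star)
    # Step e: per-point percentiles across the n_boot refits (step d).
    lo, hi = np.percentile(preds, [lower, upper], axis=0)
    return lo, hi


# Synthetic data, assumed purely for demonstration.
rng = np.random.RandomState(42)
X = np.sort(rng.uniform(0, 6, size=80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=80)
X_star = np.linspace(0, 6, 25).reshape(-1, 1)

ci_lo, ci_hi = bootstrap_confidence_band(X, y, X_star, C=10.0, epsilon=0.1)
```

The arrays ci_lo and ci_hi can then be drawn around the fitted curve, for example with matplotlib's fill_between.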
> > >
> > > But, I don't know how to get the prediction interval from here.
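One common heuristic for a prediction interval, which is not proposed anywhere in this thread, is to add a resampled residual to each bootstrap prediction so that the spread reflects observation noise as well as model uncertainty. The sketch below assumes roughly homoscedastic noise and uses out-of-bag residuals; the helper name, the synthetic data, and the hyperparameters are all hypothetical.

```python
import numpy as np
from sklearn.svm import SVR


def bootstrap_prediction_interval(X, y, X_star, n_boot=100, lower=10,
                                  upper=90, seed=0, **svr_params):
    """Percentile interval for new observations at X_star (heuristic)."""
    rng = np.random.RandomState(seed)
    n = X.shape[0]
    sims = []
    for b in range(n_boot):
        idx = rng.randint(0, n, size=n)
        # Out-of-bag points were not used to fit this bootstrap model.
        oob = np.setdiff1d(np.arange(n), idx)
        if oob.size == 0:
            continue  # extremely unlikely, but then no residuals exist
        model = SVR(**svr_params).fit(X[idx], y[idx])
        resid = y[oob] - model.predict(X[oob])
        # Simulated observation: bootstrap prediction plus one OOB residual.
        noise = rng.choice(resid, size=X_star.shape[0], replace=True)
        sims.append(model.predict(X_star) + noise)
    sims = np.asarray(sims)
    lo, hi = np.percentile(sims, [lower, upper], axis=0)
    return lo, hi


# Synthetic data, assumed purely for demonstration.
rng = np.random.RandomState(42)
X = np.sort(rng.uniform(0, 6, size=80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=80)
X_star = np.linspace(0, 6, 25).reshape(-1, 1)

pi_lo, pi_hi = bootstrap_prediction_interval(X, y, X_star, C=10.0, epsilon=0.1)
```

If the noise level varies strongly with X, a single pooled set of residuals is a poor fit and the interval would need a local or quantile-based treatment instead.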
> > >
> > > Thank you very much,
> > > Raga
> > > _______________________________________________
> > > scikit-learn mailing list
> > > [email protected]
> > > https://mail.python.org/mailman/listinfo/scikit-learn
