I had a similar situation, and the solution I came up with was to calculate
the standard deviation of the predictions of the individual trees.
When I trained my regressor on the lower half of my data and then used it
to predict the upper half, the model generally returned estimates with
much greater variability across trees.
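
In case it is useful, here is a minimal sketch of that idea. The synthetic
data from make_regression and the helper name per_tree_std are my own
placeholders, not anything from your setup; the only scikit-learn piece it
relies on is the fitted forest's estimators_ list.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic stand-in data (an assumption; substitute your own X and y).
X, y = make_regression(n_samples=400, n_features=5, noise=5.0, random_state=0)

# Split by target value: lower half is "known", upper half is "unknown".
order = np.argsort(y)
lower, upper = order[:200], order[200:]

# Keep part of the lower half out of training as an in-range comparison set.
rng = np.random.RandomState(0)
lower = rng.permutation(lower)
fit_idx, in_range_idx = lower[:150], lower[150:]

model = ExtraTreesRegressor(n_estimators=100, random_state=0)
model.fit(X[fit_idx], y[fit_idx])

def per_tree_std(forest, X):
    """Return (mean, std) over the individual trees' predictions, per sample."""
    all_preds = np.stack([tree.predict(X) for tree in forest.estimators_])
    return all_preds.mean(axis=0), all_preds.std(axis=0)

mean_in, std_in = per_tree_std(model, X[in_range_idx])
mean_out, std_out = per_tree_std(model, X[upper])

print("median spread, held-out rows from the lower half:", np.median(std_in))
print("median spread, rows from the upper half:", np.median(std_out))
```

The spread is only a rough indicator of disagreement between trees, not a
calibrated confidence interval, so any threshold on it would have to be
checked against your own held-out data.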
I also tried training a second model (same variables) to predict the first
model's error and had very little success. I think this makes sense: if
the variables were sufficient to predict the errors, there would be
fewer/lower errors to begin with. Make sense?
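
For reference, the second-model experiment can be sketched roughly like
this. The data is synthetic, and using cross_val_predict to get
out-of-sample errors is my assumption about a reasonable setup, not
necessarily how either of us ran it.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in data (an assumption; substitute your own X and y).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)

# First model: out-of-fold predictions, so the errors are not training errors.
first = ExtraTreesRegressor(n_estimators=50, random_state=0)
oof_pred = cross_val_predict(first, X, y, cv=5)
abs_err = np.abs(y - oof_pred)

# Second model: same variables, target is the first model's absolute error.
second = ExtraTreesRegressor(n_estimators=50, random_state=1)
err_pred = cross_val_predict(second, X, abs_err, cv=5)

# How well does the second model track the actual error?
corr = np.corrcoef(abs_err, err_pred)[0, 1]
print("correlation between actual and predicted error:", corr)
```

A correlation near zero here would match what I saw: the same variables
carry little extra information about where the first model goes wrong.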
Hope that helps!
On Mon, Feb 10, 2014 at 5:42 PM, Alessandro Gagliardi <
alessandro.gaglia...@glassdoor.com> wrote:
> I got ExtraTreesRegressor running on IPython.parallel (Pyrallel doesn't
> work for me but the example at
> http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/Distributed%20Learning%20of%20Extra%20Trees%20with%20IPython.parallel.ipynb
> did).
> Now I'd like to be able to predict my error (i.e. provide a confidence
> interval?) It doesn't look like ExtraTreesRegressor provides that, so I
> thought I might try training a second model to predict the error.
>
> 1. Is this crazy and/or stupid? I'm using the same factors to predict
> the error as I am to predict the result. I'm afraid there might be a
> circularity there but I can't see it.
> 2. ExtraTreesRegressor is too good! Even if I train on half, my median
> error is .025%. I mostly care if the error is more than 8% but those cases
> are so rare, I can't really train on it. I could set my threshold to 0.1%
> but that is far too strict for my purposes.
>
> I'm actually a little worried that my ExtraTreesRegressor is too good to
> be true. But I can't see anything wrong with my cross validation.
>
> A little more background in case it helps:
>
> I am trying to extrapolate to unknown data. The known dataset is not
> representative of the unknown (i.e. it's skewed) which is why predicting
> the error is important. In other words, my model needs to know when it's
> encountering a situation it doesn't know enough about.
>
> Thanks in advance,
>
> Alessandro Gagliardi| Glassdoor| alessan...@glassdoor.com
>
> ------------------------------------------------------------------------------
> Android apps run on BlackBerry 10
> Introducing the new BlackBerry 10.2.1 Runtime for Android apps.
> Now with support for Jelly Bean, Bluetooth, Mapview and more.
> Get your Android app in front of a whole new audience. Start now.
>
> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>