For error analysis, I usually look at the examples the model breaks on and try to figure out the pattern. That usually suggests new features to engineer.
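A minimal sketch of that first step, collecting the misclassified examples grouped by true label so per-class patterns stand out. The `StubClassifier` is just a self-contained stand-in; in practice `clf` would be any fitted scikit-learn-style estimator with a `predict` method.

```python
from collections import defaultdict

class StubClassifier:
    """Stand-in for a fitted model: always predicts the majority class (0)."""
    def predict(self, X):
        return [0 for _ in X]

def misclassified(clf, X, y):
    """Return (instance, true_label, predicted_label) for every error,
    grouped by true label, for manual inspection."""
    errors = defaultdict(list)
    for x, true in zip(X, y):
        pred = clf.predict([x])[0]
        if pred != true:
            errors[true].append((x, true, pred))
    return errors

X = [[0.1, 1.0], [0.9, 0.2], [0.4, 0.4]]
y = [0, 1, 0]
errs = misclassified(StubClassifier(), X, y)
# Only the single class-1 example fools the majority-class stub.
```

Eyeballing the entries of `errs` class by class is usually enough to spot a recurring pattern worth a new feature.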
On Mon, Oct 1, 2012 at 6:01 PM, Immanuel <[email protected]> wrote:

> Hi Christian,
>
> that's a great question and I'm curious what others have to say.
>
> My impression is that the way to diagnose a trained model (classifier or
> regression) differs a lot between models and also depends on the problem
> at hand. This makes it hard to come up with a general framework.
> Here are some resources:
>
> * ESL [0] contains lots of information on how to interpret linear models.
> * "Advice for applying Machine Learning" [1] gives general
>   recommendations on how to diagnose trained models.
> * Some inspiration on how to gain insight through visualization [2].
> * [3] and [4] deal with functional ANOVA decomposition (still on my
>   reading list).
>
> Best,
> Immanuel
>
> [0] Hastie, T., R. Tibshirani, J. Friedman, and J. Franklin. "The
>     Elements of Statistical Learning: Data Mining, Inference and
>     Prediction." The Mathematical Intelligencer 27, no. 2 (2005): 83-85.
> [1] http://cs229.stanford.edu/materials/ML-advice.pdf
> [2] http://had.co.nz/model-vis/
> [3] Hooker, G. "Diagnostics and Extrapolation in Machine Learning".
>     Stanford University, 2004.
> [4] Roosen, C.B. "Visualization and Exploration of High-dimensional
>     Functions Using the Functional ANOVA Decomposition". Citeseer, 1995.
>
> On 10/01/2012 10:49 PM, Christian Jauvin wrote:
>> Hi everyone,
>>
>> I have this (rather vague) intuition that studying the "reasons" which
>> led a trained classifier to behave like it did on particular instances
>> of a problem might be a good way to increase one's understanding of it.
>> If you have, for instance, a very imbalanced problem, it might be useful
>> to identify the few cases where a (trained) classifier answered right
>> (in terms of classification or probabilistic output) on the least likely
>> class, in order to determine which particular features have played a
>> positive role, and which haven't. The way I see it, this would be a bit
>> like "reverse engineering the features".
>> So my question: is there a mechanism, or maybe an already existing
>> framework or theory, for doing this? And would something approaching it
>> be possible currently with sklearn?
>>
>> Thanks,
>>
>> Christian

--
Joseph Turian, Ph.D. | President, MetaOptimize
"Optimize Profits. Optimize Engagement." http://metaoptimize.com
855-ALL-DATA
The web's most active forum for data scientists: http://metaoptimize.com/qa/

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
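For the linear models that ESL [0] discusses, Christian's "reverse engineering the features" has a direct reading: the contribution of feature j to the decision value for one instance is just coefficient j times the instance's value for feature j. A hedged, self-contained sketch of that idea; the plain lists here are illustrative stand-ins, but in scikit-learn they would correspond to `clf.coef_` and `clf.intercept_` of a fitted linear classifier:

```python
def feature_contributions(coef, intercept, x, names):
    """Per-feature contributions to a linear model's decision value
    for a single instance x: contribution_j = coef_j * x_j."""
    contribs = {name: c * v for name, c, v in zip(names, coef, x)}
    decision = sum(contribs.values()) + intercept
    return contribs, decision

coef = [2.0, -1.0, 0.5]   # e.g. clf.coef_[0] of a fitted linear model
intercept = -0.3          # e.g. clf.intercept_[0]
x = [1.0, 2.0, 0.0]       # the instance whose prediction we want to explain
names = ["f0", "f1", "f2"]

contribs, decision = feature_contributions(coef, intercept, x, names)
# contribs == {"f0": 2.0, "f1": -2.0, "f2": 0.0}; decision == -0.3
```

Sorting `contribs` by magnitude for the rare-class instances the classifier got right shows which features pushed the decision the hardest. For non-linear models this per-feature decomposition doesn't apply directly; that is what the functional ANOVA references [3] and [4] address.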
