For error analysis, I usually look at the examples the model breaks on and try to figure out the pattern. That usually suggests new features to engineer.
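A minimal sketch of that first step, collecting the misclassified examples grouped by true label so per-class patterns stand out. The `StubClassifier` is just a self-contained stand-in; in practice `clf` would be any fitted scikit-learn-style estimator with a `predict` method.

```python
from collections import defaultdict

class StubClassifier:
    """Stand-in for a fitted model: always predicts the majority class (0)."""
    def predict(self, X):
        return [0 for _ in X]

def misclassified(clf, X, y):
    """Return (instance, true_label, predicted_label) for every error,
    grouped by true label, for manual inspection."""
    errors = defaultdict(list)
    for x, true in zip(X, y):
        pred = clf.predict([x])[0]
        if pred != true:
            errors[true].append((x, true, pred))
    return errors

X = [[0.1, 1.0], [0.9, 0.2], [0.4, 0.4]]
y = [0, 1, 0]
errs = misclassified(StubClassifier(), X, y)
# Only the single class-1 example fools the majority-class stub.
```

Eyeballing the entries of `errs` class by class is usually enough to spot a recurring pattern worth a new feature.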
On Mon, Oct 1, 2012 at 6:01 PM, Immanuel <[email protected]> wrote:

> Hi Christian,
>
> that's a great question and I'm curious what others have to say.
>
> My impression is that the way to diagnose a trained model (classifier or
> regression) differs a lot between models and also depends on the problem
> at hand. This makes it hard to come up with a general framework.
> Here are some resources:
>
> * ESL [0] contains lots of information on how to interpret linear models.
> * "Advice for applying Machine Learning" [1] gives general
>   recommendations on how to diagnose trained models.
> * Some inspiration on how to gain insight through visualization [2].
> * [3] and [4] deal with functional ANOVA decomposition (still on my
>   reading list).
>
> Best,
> Immanuel
>
> [0] Hastie, T., R. Tibshirani, J. Friedman, and J. Franklin. "The
>     Elements of Statistical Learning: Data Mining, Inference and
>     Prediction." The Mathematical Intelligencer 27, no. 2 (2005): 83-85.
> [1] http://cs229.stanford.edu/materials/ML-advice.pdf
> [2] http://had.co.nz/model-vis/
> [3] Hooker, G. "Diagnostics and Extrapolation in Machine Learning".
>     Stanford University, 2004.
> [4] Roosen, C.B. "Visualization and Exploration of High-dimensional
>     Functions Using the Functional ANOVA Decomposition". Citeseer, 1995.
>
> On 10/01/2012 10:49 PM, Christian Jauvin wrote:
>> Hi everyone,
>>
>> I have this (rather vague) intuition that studying the "reasons" which
>> led a trained classifier to behave like it did on particular instances
>> of a problem might be a good way to increase one's understanding of it.
>> If you have, for instance, a very imbalanced problem, it might be useful
>> to identify the few cases where a (trained) classifier answered right
>> (in terms of classification or probabilistic output) on the least likely
>> class, in order to determine which particular features have played a
>> positive role, and which haven't. The way I see it, this would be a bit
>> like "reverse engineering the features".
>> So my question: is there a mechanism, or maybe an already existing
>> framework or theory, for doing this? And would something approaching it
>> be possible currently with sklearn?
>>
>> Thanks,
>>
>> Christian

--
Joseph Turian, Ph.D. | President, MetaOptimize
"Optimize Profits. Optimize Engagement." http://metaoptimize.com
855-ALL-DATA
The web's most active forum for data scientists: http://metaoptimize.com/qa/

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
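For the linear models that ESL [0] discusses, Christian's "reverse engineering the features" has a direct reading: the contribution of feature j to the decision value for one instance is just coefficient j times the instance's value for feature j. A hedged, self-contained sketch of that idea; the plain lists here are illustrative stand-ins, but in scikit-learn they would correspond to `clf.coef_` and `clf.intercept_` of a fitted linear classifier:

```python
def feature_contributions(coef, intercept, x, names):
    """Per-feature contributions to a linear model's decision value
    for a single instance x: contribution_j = coef_j * x_j."""
    contribs = {name: c * v for name, c, v in zip(names, coef, x)}
    decision = sum(contribs.values()) + intercept
    return contribs, decision

coef = [2.0, -1.0, 0.5]   # e.g. clf.coef_[0] of a fitted linear model
intercept = -0.3          # e.g. clf.intercept_[0]
x = [1.0, 2.0, 0.0]       # the instance whose prediction we want to explain
names = ["f0", "f1", "f2"]

contribs, decision = feature_contributions(coef, intercept, x, names)
# contribs == {"f0": 2.0, "f1": -2.0, "f2": 0.0}; decision == -0.3
```

Sorting `contribs` by magnitude for the rare-class instances the classifier got right shows which features pushed the decision the hardest. For non-linear models this per-feature decomposition doesn't apply directly; that is what the functional ANOVA references [3] and [4] address.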
