@federico haha! Thanks for the motivation! @Josh, I'm not aware of polling in GitHub, but it sounds very convenient; a polling feature would be a great addition to scikit ;)
@Gael, I also thought that AUC was not suitable for multi-label problems, but recent Kaggle competitions such as this one `http://www.kaggle.com/c/mlsp-2013-birds/forums` use AUC as the evaluation measure for multi-label classification. I thought of a simple way to do it: first binarize the labels. Say y=[[1,2],[1]], which means sample 1 belongs to classes 1 and 2 and sample 2 belongs to class 1; then the binarized form (with classes 0, 1, 2) would be y=[[0,1,1],[0,1,0]]. Finally, this matrix can be flattened into a single vector, and the predicted probabilities, flattened the same way, can be evaluated against it using the binary AUC metric already implemented in scikit. This could be wrong; however, the scores I got this way were quite close to the leader-board. There are quite a number of papers that use AUC for multi-label problems, for example http://www.cse.msu.edu/~rongjin/publications/iccv_camera.pdf
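To make the binarize-then-flatten idea concrete, here is a minimal, dependency-free sketch. The helper names (`binarize`, `flatten`, `auc`) and the probabilities in `P` are made up for illustration; they are not scikit functions or real competition data. The `auc` helper counts correctly ranked (positive, negative) pairs, which is what a binary AUC measures, so in practice the binary AUC already in scikit (`sklearn.metrics.roc_auc_score`) applied to the two flattened vectors should give the same number.

```python
def binarize(label_sets, n_classes):
    """Turn a list of label lists into a 0/1 indicator matrix (list of rows)."""
    rows = []
    for labels in label_sets:
        row = [0] * n_classes
        for k in labels:
            row[k] = 1
        rows.append(row)
    return rows


def flatten(matrix):
    """Concatenate the rows of a matrix into one flat list."""
    return [v for row in matrix for v in row]


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of
    entries, which is fine for a small demonstration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Example from the message, assuming the classes are 0, 1, 2
# (class 0 is never assigned to any sample).
y = [[1, 2], [1]]
Y = binarize(y, n_classes=3)

# Hypothetical predicted probabilities: one row per sample, one column per class.
P = [[0.1, 0.8, 0.6],
     [0.3, 0.7, 0.65]]

# Flatten labels and predictions the same way, then score them as one
# big binary problem.
micro_auc = auc(flatten(Y), flatten(P))
print(micro_auc)
```

Pooling every (sample, class) entry like this is usually called micro-averaging. It weights frequent classes more heavily than a per-class average would, so whether it matches a given leader-board depends on how that competition defines its score.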