On Tue, Oct 04, 2011 at 12:23:59AM +0200, Gael Varoquaux wrote:
> On Mon, Oct 03, 2011 at 06:16:37PM -0400, Satrajit Ghosh wrote:
> > when i used mean_square_error as score_func, it gave me p=.98, when i was
> > pretty positive i had a significant result. but that's because the lower
> > the value is in the distribution the better it is. this obviously reversed
> > when i used explained_variance, where things closer to 1 are better.
>
> > do you think stating that score_func should return a float between 0 and 1
> > would be better or to state that if you have a score_func that ranges from
> > 0 to inf and whose lower bound is a better score, then interpret
> > significance as 1-p_value?
>
> In the scikit, there is a convention that everything that is a 'score' is
> 'bigger is better'. The reason is that it enables black box optimizers to
> tune parameters or select models based on this score. I wouldn't like to
> enforce that it is bound between 0 and 1 because many scores used in real
> life are not bound. Also, in general, you cannot interpret a score (like
> explained_variance) as related to a p-value. I wouldn't try to have a too
> simple message by fitting all the metrics in the framework. I don't think
> that it can work: they test for different things.
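
To make the 'bigger is better' convention concrete, here is a minimal sketch of why a raw error metric gives a p-value near 1 in a permutation test. It assumes a hand-rolled test that shuffles the targets against fixed predictions; `permutation_p_value` and `neg_mse` are hypothetical helpers written for this illustration, not the scikit's API, and the toy data is made up.

```python
import random


def mean_square_error(y_true, y_pred):
    """Mean of squared residuals; lower is better."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def neg_mse(y_true, y_pred):
    """Sign-flipped error, so that bigger is better."""
    return -mean_square_error(y_true, y_pred)


def permutation_p_value(score_func, y_true, y_pred, n_permutations=500, seed=0):
    """Hypothetical helper: fraction of target shufflings that score at least
    as high as the unshuffled targets (with the usual +1 correction).

    Assumption: predictions are held fixed; a real test would usually refit.
    """
    rng = random.Random(seed)
    observed = score_func(y_true, y_pred)
    shuffled = list(y_true)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score_func(shuffled, y_pred) >= observed:
            count += 1
    return (count + 1) / (n_permutations + 1)


# Toy data: predictions that track the targets closely.
y_true = [float(i) for i in range(20)]
y_pred = [v + 0.1 for v in y_true]

# Raw error: almost every shuffle has a "higher score", so p is close to 1.
p_raw_error = permutation_p_value(mean_square_error, y_true, y_pred)

# Negated error: almost no shuffle beats the real targets, so p is small.
p_negated = permutation_p_value(neg_mse, y_true, y_pred)

print(p_raw_error, p_negated)
```

With this toy data the raw error yields a p-value near 1 and the negated error a p-value near 0, which mirrors the p=.98 surprise quoted above. Flipping the sign of the metric keeps the test itself unchanged, whereas reading significance as 1 - p_value changes how the result has to be interpreted.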
A more common one I've seen is -log(pval). This has the benefit that it
looks nothing like a p-value, and may have some numerical stability
advantages in certain situations (in particular, 1 - A_VERY_SMALL_VALUE is
very, very error-prone).

David

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
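
On the -log(pval) point in the reply above, a minimal sketch of the numerical issue, assuming ordinary double-precision floats. The p-values used here are made up for illustration.

```python
import math

p_value = 1e-20  # an illustrative, very small p-value

# 1 - p rounds to exactly 1.0 in double precision, so the size of p is lost.
one_minus_p = 1.0 - p_value

# -log(p) keeps the magnitude of p as an ordinary-sized number.
neg_log_p = -math.log(p_value)

# A smaller p-value still maps to a strictly larger score.
neg_log_smaller_p = -math.log(1e-30)

print(one_minus_p == 1.0)             # True
print(neg_log_p < neg_log_smaller_p)  # True
```

Because -log is monotonically decreasing in p, the transformed value is still 'bigger is better', and two very small p-values remain distinguishable where 1 - p would collapse both to 1.0.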
