On Mon, Oct 3, 2011 at 6:43 PM, Satrajit Ghosh <[email protected]> wrote:
> hi gael,
>
>>
>> In the scikit, there is a convention that everything that is a 'score' is
>> 'bigger is better'. The reason is that it enables black box optimizers to
>> tune parameters or select models based on this score. I wouldn't like
>> to enforce that it is bound between 0 and 1 because many scores used in
>> real life are not bound.
>
> understandable. i just didn't know the convention.
>
>>
>> Also, in general, you cannot interpret a score (like explained_variance)
>> as related to a p-value.
>
> i don't quite understand what you mean by this. i wasn't trying to interpret
> explained_variance as a p-value. that's why i was using the permutation
> test, to give me a p value corresponding to my observed explained_variance
> value. sorry if i wasn't clear.
>
>>
>> I wouldn't try to have a too simple message by fitting all the metrics in
>> the framework. I don't think that it can work: they test for different
>> things.
>
> i think just a note to say 'bigger is better' would suffice wherever a score
> func is used.
> cheers,
> satra
>
>>
>> Gaël

pvalue = (np.sum(permutation_scores >= score) + 1.0) / (n_permutations + 1)

I think it's easier to describe this as the survival probability (sf in
scipy): the probability of observing a value at least as large as the
value in the sample.

It's the p-value for the upper tail, but 1 - pvalue for the lower tail.
For two-sided tests, one can hope for symmetry.

Josef

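A minimal sketch of the upper-tail and lower-tail calculation discussed above, using the same formula. The helper name `permutation_pvalues` and the simulated `permutation_scores` are assumptions made for illustration only; they are not taken from the scikit. The comments note where "1 - pvalue" is only approximate once the +1 correction is applied.

```python
import numpy as np


def permutation_pvalues(score, permutation_scores):
    """Return (upper, lower) tail p-values for an observed score.

    Hypothetical helper: applies the formula from this thread to both tails.
    """
    permutation_scores = np.asarray(permutation_scores, dtype=float)
    n_permutations = permutation_scores.size
    # Upper tail: fraction of permuted scores at least as large as the
    # observed one (the "survival" view), with the +1 correction.
    upper = (np.sum(permutation_scores >= score) + 1.0) / (n_permutations + 1)
    # Lower tail: same idea with the inequality reversed.
    lower = (np.sum(permutation_scores <= score) + 1.0) / (n_permutations + 1)
    return upper, lower


# Simulated null distribution (assumption: stands in for real permuted scores).
rng = np.random.default_rng(0)
permutation_scores = rng.normal(size=999)
score = 1.5  # hypothetical observed score

upper, lower = permutation_pvalues(score, permutation_scores)
# With no ties, upper + lower == (n + 2) / (n + 1), so the lower-tail value
# is 1 - upper plus a small 1 / (n + 1) term coming from the correction.
print(upper, lower, upper + lower)
```

With ties between the observed score and permuted scores, both tails count the tied values, so the sum can exceed the no-ties value.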
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
