2012/2/23 Matthias Ekman <[email protected]>:
> Hi,
>
> I only recently started using sklearn and it's an impressive and well
> documented library. Thanks!
>
> I run into some strange behavior while using the function
> 'permutation_test_score'.
>
> When using permutation_test_score with n_permutations = 50, everything
> looks alright
>
> In [4]: cv_scores, permutation_scores, pval = permutation_test_score(
>     clf, X, Y, zero_one_score, cv=cv, n_permutations=50, n_jobs=4,
>     verbose=1, random_state=0)
> [Parallel(n_jobs=4)]: Done 1 out of 50 | elapsed: 0.0s remaining: 1.5s
> [Parallel(n_jobs=4)]: Done 50 out of 50 | elapsed: 0.2s finished
>
> However, when using the exact same data, but with n_permutations = 200
> I don't get a result and this runs forever.
>
> In [6]: cv_scores, permutation_scores, pval = permutation_test_score(
>     clf, X, Y, zero_one_score, cv=cv, n_permutations=200, n_jobs=4,
>     verbose=1, random_state=0)
> [Parallel(n_jobs=4)]: Done 1 out of 54 | elapsed: 0.0s remaining: 2.0s  # <-- stops here
>
> My code is here: https://gist.github.com/1884451 and the data to
> reproduce the problem is here:
> http://dl.dropbox.com/u/38470419/wired_data.dat # sample x feature matrix
> http://dl.dropbox.com/u/38470419/Y.txt # binary labels
>
> I am using sklearn .10 and joblib 0.6.1.
>
> I am not sure if that can be caused by some irregularities in my data.
> I would be grateful for every pointer.

Strange, it might be related to a problem Vlad is investigating on Mac
OS X Lion:

https://github.com/scikit-learn/scikit-learn/issues/636

Which platform / OS are you using?

> As a related question, as far as I can see permutation_test_score does
> not assure permuted labels, right? Couldn't
>
> pvalue = (np.sum(permutation_scores >= score) + 1.0) / (n_permutations + 1)
>
> be in some cases too conservative? I would count +1 _only_ when the
> true labels are _not_ included in the permutation set.

No idea, maybe @agramfort or @GaelVaroquaux have an opinion?

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
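
For reference, here is a minimal sketch of the p-value formula quoted above, written with plain NumPy rather than the scikit-learn function itself. The helper name permutation_pvalue and the example scores are made up for illustration. The +1 in the numerator and denominator counts the unpermuted labelling as one of the arrangements, so the smallest value this formula can return is 1 / (n_permutations + 1).

```python
import numpy as np


def permutation_pvalue(score, permutation_scores):
    """P-value using the +1 formula quoted in the thread (hypothetical helper)."""
    permutation_scores = np.asarray(permutation_scores, dtype=float)
    n_permutations = permutation_scores.shape[0]
    # Number of permuted-label scores at least as good as the true-label score.
    n_at_least = np.sum(permutation_scores >= score)
    # The +1 terms treat the unpermuted labelling as one extra arrangement.
    return (n_at_least + 1.0) / (n_permutations + 1)


# Illustrative numbers only, not taken from the data in this thread.
pval = permutation_pvalue(0.9, [0.5, 0.6, 0.95, 0.4])
print(pval)
```

With this formula the result can never be exactly zero, even when no permuted score reaches the true score, which is the behaviour the question about conservativeness is concerned with.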
