Hi,

I only recently started using sklearn, and it's an impressive and
well-documented library. Thanks!

I ran into some strange behavior while using the function
'permutation_test_score'.

When using permutation_test_score with n_permutations = 50, everything
looks all right:

In [4]: cv_scores, permutation_scores, pval = permutation_test_score(
            clf, X, Y, zero_one_score, cv=cv,
            n_permutations=50, n_jobs=4, verbose=1, random_state=0)
[Parallel(n_jobs=4)]: Done   1 out of  50 | elapsed:    0.0s remaining:    1.5s
[Parallel(n_jobs=4)]: Done  50 out of  50 | elapsed:    0.2s finished

However, with the exact same data but n_permutations = 200, I don't
get a result and the call runs forever:

In [6]: cv_scores, permutation_scores, pval = permutation_test_score(
            clf, X, Y, zero_one_score, cv=cv,
            n_permutations=200, n_jobs=4, verbose=1, random_state=0)
[Parallel(n_jobs=4)]: Done   1 out of  54 | elapsed:    0.0s remaining:    2.0s  # <-- stops here

My code is here: https://gist.github.com/1884451 and the data to
reproduce the problem is here:
http://dl.dropbox.com/u/38470419/wired_data.dat  # sample x feature matrix
http://dl.dropbox.com/u/38470419/Y.txt  # binary labels

I am using sklearn 0.10 and joblib 0.6.1.

I am not sure whether this could be caused by some irregularity in my
data. I would be grateful for any pointers.

As a related question: as far as I can see, permutation_test_score does
not ensure that the permuted labels actually differ from the true
labels, right? Couldn't

pvalue = (np.sum(permutation_scores >= score) + 1.0) / (n_permutations + 1)

be too conservative in some cases? I would add the +1 _only_ when the
true labels are _not_ included in the permutation set.
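
To make the question concrete, here is a minimal sketch of the two ways
of counting, using plain NumPy and made-up scores. The function names
are my own and not part of sklearn, and the toy data simply assumes
that one of the permutations happened to reproduce the true labeling
(so its score equals the unpermuted score).

```python
import numpy as np


def pvalue_with_plus_one(score, permutation_scores):
    """Formula quoted above: the true labeling is always counted as one
    extra permutation in both numerator and denominator."""
    permutation_scores = np.asarray(permutation_scores, dtype=float)
    n_permutations = permutation_scores.shape[0]
    return (np.sum(permutation_scores >= score) + 1.0) / (n_permutations + 1)


def pvalue_without_plus_one(score, permutation_scores):
    """Hypothetical alternative: no extra count, on the assumption that
    the true labeling is already among the sampled permutations."""
    permutation_scores = np.asarray(permutation_scores, dtype=float)
    n_permutations = permutation_scores.shape[0]
    return np.sum(permutation_scores >= score) / float(n_permutations)


# Made-up example: the third permutation reproduces the true labeling,
# so its score equals the score on the unpermuted data.
score = 0.9
permutation_scores = np.array([0.5, 0.6, 0.9, 0.4])

p_plus_one = pvalue_with_plus_one(score, permutation_scores)
p_plain = pvalue_without_plus_one(score, permutation_scores)

print(p_plus_one)  # (1 + 1) / (4 + 1)
print(p_plain)     # 1 / 4
```

In this toy case the true labeling is effectively counted twice by the
first formula, which is the situation I mean by "too conservative".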

Thanks in advance!
 Matthias

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
