2012/10/26  <[email protected]>:
>
> Dear SciKitters,
>
> I'm rather new to this wonderful toolkit and starting to use it in the
> cheminformatics environment.
>
> I was wondering if I properly defined the grid search in the case of a SVM:
>
> "
> # code snippet
> tuned_parameters = [{'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]
> scores = [('precision', precision_score), ('recall', recall_score)]
> for score_name, score_func in scores:
>     clf = GridSearchCV(SVC(C=1), tuned_parameters,
>                        score_func=score_func, n_jobs=20)
> "

Why do you loop over the scores without storing the `clf.best_params_`
of each iteration? There is no point in running a grid search to find
the optimal parameter values if you don't even look at the parameters
and the average score values in the end :)
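
Something along these lines would keep the results of each search. This
is only a sketch: it assumes a recent scikit-learn, where `GridSearchCV`
lives in `sklearn.model_selection` and takes a `scoring` string instead of
the `score_func` argument used in your snippet. The data from
`make_classification`, the `cv=3` setting and the `results` dict are
placeholders of mine, not your setup.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Placeholder data standing in for the real descriptors (assumption):
# 40 samples with 120 numerical features, binary labels.
X, y = make_classification(n_samples=40, n_features=120,
                           n_informative=10, random_state=0)

tuned_parameters = [{'kernel': ['linear'], 'C': [1, 10, 100, 1000]}]

# Keep the outcome of every search instead of overwriting clf each time.
results = {}
for score_name in ('precision', 'recall'):
    clf = GridSearchCV(SVC(), tuned_parameters, scoring=score_name, cv=3)
    clf.fit(X, y)
    results[score_name] = (clf.best_params_, clf.best_score_)

for score_name, (best_params, best_score) in results.items():
    print(score_name, best_params, round(best_score, 3))
```

`clf.best_score_` is the mean cross-validated score of the best parameter
setting, so it can be compared directly across the values of `C`.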

Also: what is your problem?

What do you expect to get?

What do you get when running this on your data?

> My data set is rather small - for debugging purposes, it just contains 10
> training and 10 testing molecules with 120 numerical descriptors each. I'm
> trying to resolve a binary classification with the help of a SVM.

There is probably no point in parallelizing the grid search of such a
small problem. How long does a single `SVC(C=1).fit(X, y)` take? If it's
less than a couple of seconds you should not bother with
multiprocessing and just leave `n_jobs=1` (i.e. its default value).
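
A quick way to check is to time one fit by hand. The sketch below uses
made-up data of the size you describe (10 samples, 120 descriptors); the
2 second threshold is just my reading of "a couple of seconds", and the
`n_jobs` variable is only there to illustrate the decision.

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Placeholder data (assumption): 10 molecules, 120 numerical descriptors.
X, y = make_classification(n_samples=10, n_features=120,
                           flip_y=0, random_state=0)

# Time a single fit with the same estimator as in the grid search.
start = time.perf_counter()
SVC(C=1).fit(X, y)
elapsed = time.perf_counter() - start
print("single fit: %.4f s" % elapsed)

# Hypothetical rule of thumb: only go parallel when one fit is slow.
n_jobs = 1 if elapsed < 2.0 else -1
print("suggested n_jobs:", n_jobs)
```

On a data set this small the fit should return almost immediately, in
which case the cost of starting worker processes outweighs any gain.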

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
