2012/10/26 <[email protected]>:
>
> Dear SciKitters,
>
> I'm rather new to this wonderful toolkit and starting to use it in the
> cheminformatics environment.
>
> I was wondering if I properly defined the grid search in the case of a SVM:
>
> "
> # code snippet
> tuned_parameters = [{'kernel': ['linear'],'C': [1,10,100,1000]}]
> scores = [ ('precision', precision_score), ('recall', recall_score),]
> for score_name, score_func in scores:
>     clf = GridSearchCV(SVC(C=1), tuned_parameters,
>                        score_func=score_func,n_jobs=20)
> "

Why do you loop over the scores without storing the `clf.best_params_` of
each iteration? There is no point in running a grid search to find the
optimal parameter values if you don't even look at the parameters and
average score values in the end :)

Also: what is your problem? What do you expect to get? What do you get
when running this on your data?

> My data set is rather small - for debugging purposes, it just contains 10
> training and 10 testing molecules with 120 numerical descriptors each. I'm
> trying to resolve a binary classification with the help of a SVM.

There is probably no point in parallelizing the grid search for such a
small problem. How long does a single `SVC(C=1).fit(X, y)` take? If it's
less than a couple of seconds you should not bother with multiprocessing
and just leave `n_jobs=1` (i.e. its default value).

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
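
A minimal sketch of the advice above, not the original poster's code: it
times one fit, then keeps the best parameters and the mean cross-validated
score for each metric. It assumes a recent scikit-learn, where
`GridSearchCV` is imported from `sklearn.model_selection` and takes
`scoring=` with a metric name instead of the `score_func=` argument used
in the quoted snippet. The data is synthetic (`make_classification`) and
`cv=3` is an arbitrary choice; substitute the real descriptors and labels.

```python
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption, not the poster's molecules):
# 40 samples with 120 numerical descriptors each, binary labels.
X, y = make_classification(n_samples=40, n_features=120, random_state=0)

# Time a single fit to judge whether n_jobs > 1 could be worth it.
start = time.perf_counter()
SVC(kernel="linear", C=1).fit(X, y)
fit_seconds = time.perf_counter() - start
print(f"single fit: {fit_seconds:.4f} s")

tuned_parameters = [{"kernel": ["linear"], "C": [1, 10, 100, 1000]}]

# Run one grid search per metric and store what each one found.
results = {}
for score_name in ("precision", "recall"):
    clf = GridSearchCV(SVC(), tuned_parameters,
                       scoring=score_name, cv=3, n_jobs=1)
    clf.fit(X, y)
    results[score_name] = (clf.best_params_, clf.best_score_)

for score_name, (best_params, best_score) in results.items():
    print(f"{score_name}: best params {best_params}, "
          f"mean CV score {best_score:.3f}")
```

With the results kept in a dict, the best `C` for each metric can be
compared after the loop instead of being overwritten on every iteration.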
