Hi everyone!
I have the following question: I'm training an SGDClassifier and doing a
GridSearch to find the best parameters.
If I then take the "best parameters" found by the GridSearch and run a
cross-validation with the same folds I provided to the GridSearch, I get
different results than before.
Also, if I run cross-validations with other parameter combinations, the
results differ from what is saved for the same combination in
grid_search.grid_scores_ (the difference in the example below is small,
but for other parameter combinations it is much larger).
I don't really understand why this is happening - shouldn't they be the
same?
Thanks a lot for any help!
Here is the code I'm trying:
from sklearn import cross_validation, metrics
from sklearn.grid_search import GridSearchCV
from sklearn.linear_model import SGDClassifier

# sample_vector (features) and targets (labels) are defined elsewhere (not shown).
# The same 10 stratified folds are used for the grid search and the check below.
cv = cross_validation.StratifiedKFold(targets, 10)
score_func = metrics.f1_score

parameters = {
    'loss': ('log', 'hinge'),
    'penalty': ['l1', 'l2', 'elasticnet'],
    'alpha': [0.001, 0.0001, 0.00001, 0.000001]
}

grid_search = GridSearchCV(SGDClassifier(), parameters,
                           score_func=score_func, cv=cv)
grid_search.fit(sample_vector, targets)

print "Best %s: %0.3f" % (score_func.__name__,
                          grid_search.best_score_)
print "Best parameters set:"
best_parameters = grid_search.best_estimator_.get_params()
for param_name in sorted(parameters.keys()):
    print "\t%s: %r" % (param_name, best_parameters[param_name])

# Re-run cross-validation with the best parameters on the same folds.
clf = SGDClassifier(**best_parameters)
scores = cross_validation.cross_val_score(clf, sample_vector,
                                          targets,
                                          cv=cv, score_func=score_func)
print 'INFO: %s: %0.2f (+/- %0.2f)' % \
    (score_func.__name__, scores.mean(), scores.std() / 2)
===================OUTPUT==================
Best f1_score: 0.908
Best parameters set:
	alpha: 0.001
	loss: u'log'
	penalty: u'l1'
INFO: f1_score: 0.89 (+/- 0.02)
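
For anyone who wants to try the same comparison on a current scikit-learn release, a minimal sketch follows. It is not the code that produced the output above, and it rests on several assumptions: synthetic data from make_classification stands in for sample_vector and targets, the newer sklearn.model_selection API replaces cross_validation and grid_search, the 'loss' parameter is left out of the grid because its accepted names differ between releases, and random_state=0 is fixed on the classifier so that SGD's shuffling is repeatable. It reads the per-fold scores GridSearchCV recorded for its best candidate and puts them next to the scores from a separate cross_val_score run.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in for sample_vector / targets (assumption, not the original data).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# An unshuffled StratifiedKFold produces the same splits every time it is used.
cv = StratifiedKFold(n_splits=10)

parameters = {
    "penalty": ["l1", "l2", "elasticnet"],
    "alpha": [0.001, 0.0001],
}

# random_state=0 is an assumption: it makes SGD's data shuffling repeatable.
grid_search = GridSearchCV(SGDClassifier(random_state=0), parameters,
                           scoring="f1", cv=cv)
grid_search.fit(X, y)

# Per-fold test scores that the grid search recorded for its best candidate.
best = grid_search.best_index_
grid_fold_scores = np.array([
    grid_search.cv_results_["split%d_test_score" % i][best]
    for i in range(cv.get_n_splits())
])

# Separate cross-validation with the tuned parameters, same folds, same seed.
clf = SGDClassifier(random_state=0, **grid_search.best_params_)
cv_fold_scores = cross_val_score(clf, X, y, cv=cv, scoring="f1")

print("grid search folds:    ", np.round(grid_fold_scores, 3))
print("cross_val_score folds:", np.round(cv_fold_scores, 3))
print("max abs difference: %.6f" % np.abs(grid_fold_scores - cv_fold_scores).max())
```

Comparing fold by fold rather than only the means shows whether a gap comes from the individual fits or from how the fold scores are combined into one number.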