> b) You shouldn't set max_depth=5. Instead, build fully developed trees
> (max_depth=None) or rather tune min_samples_split using
> cross-validation.
Dear Gilles,
I have set up a grid search:
"
tuned_parameters = [{'min_samples_split': [1,2,3,4,5,6,7,8,9]}]
scores = [('precision', precision_score),('recall', recall_score)]
for score_name, score_func in scores:
    print "# Tuning hyper-parameters for %s" % score_name
    clf_RF_gridsearched = GridSearchCV(RandomForestClassifier(), tuned_parameters,
                                       score_func=score_func, n_jobs=1)
    clf_RF_gridsearched = clf_RF_gridsearched.fit(X_train, y_train, cv=5, n_jobs=1)
    print clf_RF_gridsearched.best_estimator_
    print "Grid scores on development set:"
    for params, mean_score, scores in clf_RF_gridsearched.grid_scores_:
        print "%0.3f (+/-%0.03f) for %r" % (mean_score, scores.std() / 2, params)
    y_true, y_pred = y_test, clf_RF_gridsearched.predict(X_test)
    print "Detailed classification report:"
    print classification_report(y_true, y_pred)
"
The best min_samples_split is apparently 7 (at least for the recall score):
"
0.335 (+/-0.072) for {'min_samples_split': 7}
"
Nonetheless, this score is still rather modest.
BTW: How do I output the confusion matrix/overall_accuracy of the
development set?
Cheers,
Paul
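
On the confusion matrix / overall accuracy question above, here is a minimal sketch in plain Python. It assumes you already have one predicted label per development-set sample (for instance predictions collected on the held-out part of each cross-validation split). The helper names and the toy labels are hypothetical and only for illustration; current scikit-learn versions also provide confusion_matrix and accuracy_score in sklearn.metrics, which should give the same numbers.

```python
from collections import Counter


def confusion_matrix_rows(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels (in `labels` order)."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / float(len(y_true))


# Toy development-set labels and predictions (hypothetical data).
y_dev_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_dev_pred = [0, 1, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix_rows(y_dev_true, y_dev_pred, labels=[0, 1])
acc = overall_accuracy(y_dev_true, y_dev_pred)

print(cm)             # one row per true class
print("%0.3f" % acc)  # overall accuracy
```

Note that predictions made on X_train by the refitted best estimator would be optimistic, since the forest has already seen those samples; out-of-fold predictions give a fairer picture of the development set.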