Firstly, thank you to the devs for a great toolkit.
I am using sklearn's GridSearchCV for a classification task, scored with
the F1 metric. GridSearchCV.fit() produces a .cv_scores_ attribute which
lets me view the score for each fold at each point in the grid. However, it
does not let me view the precision and recall from which F1 is calculated,
either for each fold or overall for a grid point.
I can see two ways to make this data available:
* allow the user to provide an arbitrary diagnostics function which is run
on each fold's predictions and whose output is stored with cv_scores_ (or
one could even store the learnt parameters for each fold)
* allow scorers to return multiple values as a tuple, but provide for a
custom aggregator (currently there are effectively two different
aggregation procedures depending on the iid param, but neither will
aggregate triples!)
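To make the second option concrete, here is a rough sketch in plain Python. The names prf_scorer and mean_aggregator are hypothetical and not part of sklearn; the weighted/unweighted switch only mimics my understanding of what the iid param does (weighting each fold by its test-set size), so treat it as an assumption rather than the actual implementation.

```python
def prf_scorer(y_true, y_pred, positive=1):
    """Hypothetical scorer: returns (precision, recall, f1) for one fold."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mean_aggregator(fold_scores, fold_sizes=None):
    """Hypothetical aggregator: element-wise mean of per-fold tuples.

    With fold_sizes given, each fold is weighted by its test-set size
    (assumed to resemble iid=True); without, folds count equally.
    """
    if fold_sizes is None:
        fold_sizes = [1] * len(fold_scores)
    total = sum(fold_sizes)
    width = len(fold_scores[0])
    return tuple(
        sum(score[i] * n for score, n in zip(fold_scores, fold_sizes)) / total
        for i in range(width)
    )


# Toy (y_true, y_pred) pairs standing in for two CV folds.
folds = [
    ([1, 0, 1, 1], [1, 0, 0, 1]),
    ([0, 0, 1, 0], [0, 1, 1, 0]),
]
per_fold = [prf_scorer(y_true, y_pred) for y_true, y_pred in folds]
overall = mean_aggregator(per_fold, fold_sizes=[len(y) for y, _ in folds])
```

Here per_fold would be what gets stored alongside cv_scores_, and overall is the aggregated triple for the grid point. Which element of the tuple drives model selection would still need to be specified somewhere.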
The first approach sounds nice, except that since the Estimator.predict()
(or decision_function or predict_proba) method is now called by the scorer,
it would need to be called again by the diagnostic function.
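One possible way around the double call, sketched in plain Python under the assumption that both the score and the diagnostics only need the output of predict(): compute the predictions once per fold and hand them to both functions. CountingEstimator, evaluate_fold and the two metric helpers are hypothetical stand-ins, not sklearn API.

```python
class CountingEstimator:
    """Toy stand-in for a fitted estimator; counts predict() calls."""

    def __init__(self):
        self.predict_calls = 0

    def predict(self, X):
        self.predict_calls += 1
        return [1 if x > 0 else 0 for x in X]


def accuracy(y_true, y_pred):
    """Fraction of test samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_counts(y_true, y_pred):
    """Example diagnostics function: binary confusion counts."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if p == 1:
            counts["tp" if t == 1 else "fp"] += 1
        else:
            counts["fn" if t == 1 else "tn"] += 1
    return counts


def evaluate_fold(estimator, X_test, y_test, score_func, diagnostics_func):
    """Predict once, then feed the same predictions to both functions."""
    y_pred = estimator.predict(X_test)
    return score_func(y_test, y_pred), diagnostics_func(y_test, y_pred)


est = CountingEstimator()
score, diag = evaluate_fold(
    est, [-1.0, 2.0, 0.5, -3.0], [0, 1, 0, 0], accuracy, confusion_counts
)
```

This does not cover scorers that need decision_function or predict_proba output, so it is only a partial answer to the problem.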
Is one of these preferable? Is there a better solution?
Cheers,
Joel