>
> On 03/10/2013 16:42:44 +0100, Andreas Mueller wrote:
>
> If you have an elegant solution, I'm all ears, though ;)
>
Here's a hacky solution for my particular case, which requires git revert
2d9cb81b8 to work at HEAD. It works by returning a Score object from the
Scorer; the Score pretends to be the fscore value when it is the operand of +
or *:
import sklearn.metrics
import sklearn.grid_search
import sklearn.linear_model
from sklearn.datasets import load_iris

class Score(object):
    __slots__ = ('value', 'meta')
    def __init__(self, value, meta=None):
        self.value = value
        self.meta = meta
    def __add__(self, other):
        return self.value + other
    def __radd__(self, other):
        return other + self.value
    def __mul__(self, other):
        return self.value * other
    def __repr__(self):
        return '<{}{}>'.format(
            self.value, '' if self.meta is None else ' ({})'.format(self.meta))

def prf(*args, **kwargs):
    if 'average' not in kwargs:
        kwargs['average'] = 'weighted'
    p, r, f, support = sklearn.metrics.precision_recall_fscore_support(
        *args, **kwargs)
    return Score(f, {'precision': p, 'recall': r, 'support': support})

iris = load_iris()
clf = sklearn.grid_search.GridSearchCV(
    sklearn.linear_model.LogisticRegression(),
    {'C': [1, 10]}, scoring=sklearn.metrics.Scorer(prf))
clf.fit(iris.data, iris.target == 1)  # binary classification
Then printing clf.cv_scores_ gives:

[CVScoreTuple(parameters={'C': 1},
    mean_validation_score=0.30285714285714288,
    cv_validation_scores=array([
        <0.48 ({'recall': 0.375, 'support': 16, 'precision': 0.66666666666666663})>,
        <0.428571428571 ({'recall': 0.35294117647058826, 'support': 17, 'precision': 0.54545454545454541})>,
        ...]
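The reason this slots into the existing averaging machinery is plain operator
overloading: sum() and division only ever see the proxied value. A minimal
standalone sketch of the trick (no scikit-learn required; the values are made
up for illustration):

```python
class Score(object):
    """Proxy that acts like its numeric value under + and *."""
    __slots__ = ('value', 'meta')

    def __init__(self, value, meta=None):
        self.value = value
        self.meta = meta

    def __add__(self, other):
        return self.value + other

    def __radd__(self, other):
        # sum() starts from the int 0, so this is the path actually taken.
        return other + self.value

    def __mul__(self, other):
        return self.value * other

scores = [Score(0.5, {'support': 16}), Score(0.25, {'support': 17})]
mean = sum(scores) / len(scores)  # the result is a plain float
print(mean)  # 0.375
```

Note that the metadata is lost as soon as any arithmetic happens; only the
objects stored verbatim in cv_validation_scores keep their meta dict.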
More generally, you might loosen the requirement that Scorer.__call__
return a number, so long as it returns something with __float__(), as in
https://github.com/jnothman/scikit-learn/commit/51d3ea. However, converting
to float may mess up results where scores are integers.
In this case, the following would suffice:
import sklearn.metrics
import sklearn.grid_search
import sklearn.linear_model

class FScore(object):
    __slots__ = ('precision', 'recall', 'fscore', 'support')
    def __init__(self, *args, **kwargs):
        if 'average' not in kwargs:
            kwargs['average'] = 'weighted'
        (self.precision, self.recall, self.fscore,
         self.support) = sklearn.metrics.precision_recall_fscore_support(
            *args, **kwargs)
    def __float__(self):
        return self.fscore

clf = sklearn.grid_search.GridSearchCV(
    sklearn.linear_model.LogisticRegression(),
    {'C': [1, 10]}, scoring=sklearn.metrics.Scorer(FScore))
...
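Under that loosened contract, the aggregation side would only ever need
float(score). A standalone sketch of what that contract looks like (the
RichScore/aggregate names are illustrative, not scikit-learn API):

```python
class RichScore(object):
    """Hypothetical rich score: carries extra metrics, coerces via float()."""
    def __init__(self, fscore, precision, recall):
        self.fscore = fscore
        self.precision = precision
        self.recall = recall

    def __float__(self):
        return self.fscore

def aggregate(scores):
    # All the search machinery would need: coerce each score, then average.
    values = [float(s) for s in scores]
    return sum(values) / len(values)

folds = [RichScore(0.5, 0.6, 0.4), RichScore(0.25, 0.3, 0.2)]
print(aggregate(folds))  # 0.375
# The caveat above: float() turns an exact integer score (say 5) into 5.0,
# so integer-valued metrics silently become floats.
```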
> As an aside: if you had all fitted estimators, it would also be quite
> easy to compute the other scores, right?
> Would that be an acceptable solution for you?
>
I guess so (noting that a modified scorer along the above lines could store
the estimator as well)... Perhaps that's a reasonable option -- its main
benefit over the above is less obfuscation -- though I do worry that
storing all estimators in the general case is expensive.
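To make that cost concrete: a scorer only needs a closure to hang on to every
fitted estimator it sees. A hypothetical sketch (none of these names are
scikit-learn API, and the toy estimator/metric stand in for real ones):

```python
def make_recording_scorer(metric):
    """Wrap a metric so each call also records the fitted estimator.

    The price is that every estimator from the search stays referenced
    (and in memory) until `records` is dropped.
    """
    records = []
    def scorer(estimator, X, y):
        value = metric(estimator, X, y)
        records.append((estimator, value))
        return value
    return scorer, records

# Toy demonstration with a stand-in estimator and metric:
class Dummy(object):
    def __init__(self, c):
        self.c = c

metric = lambda est, X, y: est.c * 0.1
scorer, records = make_recording_scorer(metric)
for c in (1, 10):
    scorer(Dummy(c), None, None)
print(len(records))  # 2
```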
- Joel
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general