Hi folks, I went ahead and made a POC for a more complete implementation of option #4:
https://github.com/staple/scikit-learn/commit/e76fa8887cd35ad7a249ee157067cd12c89bdefb

Aaron
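For anyone who wants a feel for the proposed API without reading the commit, here is a rough usage sketch of how an option #4 scorer could be driven from out-of-bag scoring. The get_score call follows the signature sketched in the quoted messages below; the registry lookup get_scorer and the forest attributes oob_proba/oob_mask are assumptions made purely for illustration, not necessarily what the POC does:

    from sklearn.metrics.scorer import get_scorer  # assumed name-based scorer lookup

    def oob_score(forest, y, scoring="roc_auc"):
        # Hypothetical attributes: out-of-bag class probabilities aggregated
        # during fit, and a mask of samples that got at least one OOB vote.
        oob_proba = forest.oob_proba            # shape (n_samples, n_classes)
        oob_mask = forest.oob_mask

        scorer = get_scorer(scoring)
        # Option #4 makes scoring externally computed predictions a public
        # operation, so no dummy estimator is needed.
        return scorer.get_score(y[oob_mask],
                                y_prediction_proba=oob_proba[oob_mask])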
On Tue, Oct 28, 2014 at 11:35 PM, Aaron Staple <aaron.sta...@gmail.com> wrote:

> Following up on Andy’s questions:
>
> The scorer implementation provides a registry of named scorers, and these
> scorers may implement specialized logic such as choosing to call an
> appropriate predictor method or munging the output of predict_proba. My
> task was to make oob scoring support the same set of named scoring metrics
> as cv, so my inclination was to use the existing scorers rather than start
> from scratch. (Writing a separate implementation would be option #1 in my
> list above.)
>
> I’ve also written up some examples (copying details from @mblondel’s
> example earlier).
>
> For #3, the interface might look something like:
>
> class _BaseScorer(…):
>
>     @abstractmethod
>     def __call__(self, estimator, X, y_true, sample_weight=None):
>         pass
>
>     @abstractmethod
>     def _score(self, y_true, y_prediction=None, y_prediction_proba=None,
>                y_decision_function=None):
>         pass
>
> class _ProbaScorer(_BaseScorer):
>
>     …
>
>     def _score(self, y, y_pred=None, y_proba=None, y_decision=None,
>                sample_weight=None):
>         if y_proba is None:
>             raise ValueError("This scorer needs y_proba.")
>         if sample_weight is not None:
>             return self._sign * self._score_func(y, y_proba,
>                                                  sample_weight=sample_weight,
>                                                  **self._kwargs)
>         else:
>             return self._sign * self._score_func(y, y_proba, **self._kwargs)
>
> And then there would be a function:
>
> def getScore(scoring, y_true, y_prediction=None, y_prediction_proba=None,
>              y_decision_function=None):
>     return lookup(scoring)._score(y_true, y_prediction, y_prediction_proba,
>                                   y_decision_function)
>
> (There is more detail in a possible variation of this at
> https://github.com/staple/scikit-learn/blob/3455/sklearn/metrics/scorer.py,
> where the __call__ and _score methods share an implementation.)
>
> For #4:
>
> class _BaseScorer(…):
>
>     @abstractmethod
>     def __call__(self, estimator, X, y_true, sample_weight=None):
>         pass
>
>     @abstractmethod
>     def get_score(self, y_true, y_prediction=None, y_prediction_proba=None,
>                   y_decision_function=None):
>         pass
>
> class _ProbaScorer(_BaseScorer):
>
>     …
>
>     def get_score(self, y, y_pred=None, y_proba=None, y_decision=None,
>                   sample_weight=None):
>         if y_proba is None:
>             raise ValueError("This scorer needs y_proba.")
>         if sample_weight is not None:
>             return self._sign * self._score_func(y, y_proba,
>                                                  sample_weight=sample_weight,
>                                                  **self._kwargs)
>         else:
>             return self._sign * self._score_func(y, y_proba, **self._kwargs)
>
> On Tue, Oct 28, 2014 at 7:10 PM, Mathieu Blondel <math...@mblondel.org> wrote:
>
>> Different metrics require different inputs (results of predict,
>> decision_function, predict_proba). To avoid branching in the grid search
>> and cross-validation code, we introduced the scorer API. A scorer knows
>> what kind of input it needs and calls predict, decision_function, or
>> predict_proba as needed. We would like to reuse the scorer logic for
>> out-of-bag scores as well, in order to avoid branching. The problem is
>> that the scorer API is not suitable if the predictions are already
>> available. RidgeCV works around this by creating a constant predictor,
>> but this is in my opinion an ugly hack. The get_score method I proposed
>> would avoid branching, although it would require computing y_pred,
>> y_decision and y_proba.
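To make the branching that Mathieu mentions concrete, this is roughly the per-metric dispatch that grid search and cross-validation would need to carry around if there were no scorer objects. The snippet only illustrates the problem described above; it is not code from scikit-learn:

    from sklearn.metrics import accuracy_score, log_loss, roc_auc_score

    def score_without_scorers(metric, estimator, X, y):
        # Each metric needs a different kind of estimator output, so every
        # caller ends up with dispatch like this (plus sign handling for
        # losses, which the scorers also take care of).
        if metric == "accuracy":        # needs hard class predictions
            return accuracy_score(y, estimator.predict(X))
        elif metric == "roc_auc":       # needs continuous scores
            return roc_auc_score(y, estimator.decision_function(X))
        elif metric == "log_loss":      # needs probabilities; lower is better
            return -log_loss(y, estimator.predict_proba(X))
        raise ValueError("unknown metric: %r" % metric)

The scorer objects fold this dispatch into the scorer itself, which is exactly the logic the out-of-bag code would like to reuse.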
>> In the classification case, another idea would be to compute out-of-bag
>> probabilities. Then a score would be obtained by calling a
>> get_score_from_proba method, which would be implemented as follows:
>>
>> class _PredictScorer(_BaseScorer):
>>     def get_score_from_proba(self, y, y_proba, classes):
>>         y_pred = classes[np.argmax(y_proba, axis=1)]
>>         return self._sign * self._score_func(y, y_pred, **self._kwargs)
>>
>> class _ProbaScorer(_BaseScorer):
>>     def get_score_from_proba(self, y, y_proba, classes):
>>         return self._sign * self._score_func(y, y_proba, **self._kwargs)
>>
>> The nice thing about predict_proba is that it consistently returns an
>> array of shape (n_samples, n_classes). decision_function is more
>> problematic because it doesn't return an array of shape (n_samples, 2) in
>> the binary case. There was a discussion a long time ago about adding a
>> predict_score method that would be more consistent in this regard, but I
>> don't remember the outcome of that discussion.
>>
>> I don't agree that RidgeCV is an exception. If your labels are binary, it
>> is perfectly valid to train a regressor on them and want to compute
>> ranking metrics like AUC or Average Precision. And there is
>> RidgeClassifierCV too.
>>
>> Mathieu
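As a purely illustrative pairing of this idea with the out-of-bag case from the PR: a random forest fitted with oob_score=True already exposes its aggregated OOB class probabilities as oob_decision_function_, so with a get_score_from_proba method any registered metric could score them directly, with no dummy predictor required. The method itself is only the proposal above, and the get_scorer lookup is assumed here for brevity:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics.scorer import get_scorer  # assumed name-based scorer lookup

    X, y = make_classification(n_samples=500, random_state=0)
    forest = RandomForestClassifier(n_estimators=100, oob_score=True,
                                    random_state=0).fit(X, y)

    # Averaged out-of-bag class probabilities, shape (n_samples, n_classes).
    # (For brevity this ignores samples that never ended up out of bag.)
    oob_proba = forest.oob_decision_function_

    # _PredictScorer path: the scorer takes the argmax over classes itself.
    acc = get_scorer("accuracy").get_score_from_proba(y, oob_proba, forest.classes_)
    # _ProbaScorer path: the scorer consumes the probabilities directly.
    ll = get_scorer("log_loss").get_score_from_proba(y, oob_proba, forest.classes_)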
>> On Wed, Oct 29, 2014 at 3:21 AM, Andy <t3k...@gmail.com> wrote:
>>
>>> Hi.
>>> Can you give a bit more detail on 3 and 4?
>>> And can you give an example use case?
>>> When do you need scorers and out-of-bag samples? The scorers are used in
>>> GridSearchCV and cross_val_score, but the out-of-bag samples basically
>>> replace cross-validation, so I don't quite understand how these would
>>> work together.
>>>
>>> I think it would be great if you could give a use case and some (pseudo)
>>> code showing how it would look with your favourite solution.
>>>
>>> Cheers,
>>> Andy
>>>
>>> On 10/26/2014 10:33 PM, Aaron Staple wrote:
>>>
>>> Greetings sklearn developers,
>>>
>>> I’m a new sklearn contributor, and I’ve been working on a small project
>>> to allow customization of the scoring metric used when scoring out-of-bag
>>> data for random forests (see
>>> https://github.com/scikit-learn/scikit-learn/pull/3723). In this PR,
>>> @mblondel and I have been discussing an architectural issue that we would
>>> like others to weigh in on.
>>>
>>> While working on my implementation, I’ve run into a bit of difficulty
>>> using the scorer implementation as it exists today, in particular with
>>> the interface expressed in _BaseScorer. The current _BaseScorer interface
>>> is callable, accepting an estimator (used as a Predictor) along with some
>>> prediction data points X, and returning a score. The various _BaseScorer
>>> implementations compute a score by calling estimator.predict(X),
>>> estimator.predict_proba(X), or estimator.decision_function(X) as needed,
>>> possibly applying some transformations to the results, and then applying
>>> a score function.
>>>
>>> The issue I’ve run into is that predicting out-of-bag samples is a
>>> rather specialized procedure, because the model used differs for each
>>> training point based on how that point was used during fitting. Computing
>>> these predictions is not well suited to implementation as a Predictor. In
>>> addition, in the PR we’ve been discussing the idea that a random forest
>>> estimator will make its out-of-bag predictions available as attributes,
>>> allowing a user of the estimator to subsequently score these provided
>>> predictions. Also, @mblondel mentioned that for his work on
>>> multiple-metric grid search, he is interested in scoring predictions he
>>> computes outside of a Predictor.
>>>
>>> The difficulty is that the current scorers take an estimator and data
>>> points, and compute predictions internally. They don’t accept externally
>>> computed predictions.
>>>
>>> I’ve written up a series of generalized options for implementing a
>>> system for scoring externally computed predictions (some are likely
>>> undesirable but are provided as points of comparison):
>>>
>>> 1) Add a new implementation that’s completely separate from the existing
>>> _BaseScorer class.
>>>
>>> 2) Use the existing _BaseScorer without changes. This means abusing the
>>> Predictor interface and creating something like a dummy predictor that
>>> ignores X and returns the externally computed predictions (predictions
>>> not inherently based on the X argument, but computed externally from a
>>> known X).
>>>
>>> 3) Add a private API to _BaseScorer for scoring externally computed
>>> predictions. The private API can be called by a public helper function in
>>> scorer.py.
>>>
>>> 4) Change the public API of _BaseScorer to make scoring of externally
>>> computed predictions a public operation alongside the existing
>>> functionality. Also possibly rename _BaseScorer => BaseScorer.
>>>
>>> 5) Change the public API of _BaseScorer so that it only handles
>>> externally computed predictions. The existing functionality would be
>>> implemented by the caller (as a callback, since the required type of
>>> prediction data is not known by the caller).
>>>
>>> So far in the PR we’ve been looking at options 2, 3, and 4, with 4
>>> seeming like a good candidate. Once we decide on one of these options,
>>> I’d like to follow up with stakeholders on the specifics of what the new
>>> interface will look like.
>>>
>>> Thanks,
>>> Aaron Staple
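For readers trying to picture why option #2 above (and the RidgeCV workaround Mathieu mentions) is considered a hack, this is roughly what the dummy-predictor approach looks like. The class is invented here for illustration; it is not from the PR or from scikit-learn:

    class _ConstantPredictions(object):
        # Fake "estimator" that ignores X and replays precomputed predictions.
        def __init__(self, pred=None, proba=None, decision=None):
            self._pred, self._proba, self._decision = pred, proba, decision

        def predict(self, X):
            return self._pred

        def predict_proba(self, X):
            return self._proba

        def decision_function(self, X):
            return self._decision

    # Option #2: push the dummy object through the existing callable API.
    # X is still required and must line up row-for-row with the stored
    # predictions, even though it is never actually used:
    #
    #     score = scorer(_ConstantPredictions(proba=oob_proba), X, y_true)

Options #3 and #4 drop this indirection by letting callers hand the predictions to the scorer directly.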
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general