Hi folks, I went ahead and made a POC for a more complete implementation of option #4:
https://github.com/staple/scikit-learn/commit/e76fa8887cd35ad7a249ee157067cd12c89bdefb

Aaron
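For anyone who wants a feel for the proposed API without reading the commit, here is a rough usage sketch of how an option #4 scorer could be driven from out-of-bag scoring. The get_score call follows the signature sketched in the quoted messages below; the registry lookup get_scorer and the forest attributes oob_proba/oob_mask are assumptions made purely for illustration, not necessarily what the POC does:

    from sklearn.metrics.scorer import get_scorer  # assumed name-based scorer lookup

    def oob_score(forest, y, scoring="roc_auc"):
        # Hypothetical attributes: out-of-bag class probabilities aggregated
        # during fit, and a mask of samples that got at least one OOB vote.
        oob_proba = forest.oob_proba            # shape (n_samples, n_classes)
        oob_mask = forest.oob_mask

        scorer = get_scorer(scoring)
        # Option #4 makes scoring externally computed predictions a public
        # operation, so no dummy estimator is needed.
        return scorer.get_score(y[oob_mask],
                                y_prediction_proba=oob_proba[oob_mask])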
On Tue, Oct 28, 2014 at 11:35 PM, Aaron Staple <aaron.sta...@gmail.com> wrote:

> Following up on Andy’s questions:
>
> The scorer implementation provides a registry of named scorers, and these
> scorers may implement specialized logic such as choosing to call an
> appropriate predictor method or munging the output of predict_proba. My
> task was to make oob scoring support the same set of named scoring metrics
> as cv, so my inclination was to use the existing scorers rather than start
> from scratch. (Writing a separate implementation would be option #1 in my
> list above.)
>
> I’ve also written up some examples (copying details from @mblondel’s
> example earlier).
>
> For #3, the interface might look something like:
>
> class _BaseScorer(…):
>
>     @abstractmethod
>     def __call__(self, estimator, X, y_true, sample_weight=None):
>         pass
>
>     @abstractmethod
>     def _score(self, y_true, y_prediction=None, y_prediction_proba=None,
>                y_decision_function=None):
>         pass
>
> class _ProbaScorer(_BaseScorer):
>
>     …
>
>     def _score(self, y, y_pred=None, y_proba=None, y_decision=None,
>                sample_weight=None):
>         if y_proba is None:
>             raise ValueError("This scorer needs y_proba.")
>         if sample_weight is not None:
>             return self._sign * self._score_func(y, y_proba,
>                                                  sample_weight=sample_weight,
>                                                  **self._kwargs)
>         else:
>             return self._sign * self._score_func(y, y_proba, **self._kwargs)
>
> And then there would be a function:
>
> def getScore(scoring, y_true, y_prediction=None, y_prediction_proba=None,
>              y_decision_function=None):
>     return lookup(scoring)._score(y_true, y_prediction, y_prediction_proba,
>                                   y_decision_function)
>
> (There is more detail in a possible variation of this at
> https://github.com/staple/scikit-learn/blob/3455/sklearn/metrics/scorer.py,
> where the __call__ and _score methods share an implementation.)
>
> For #4:
>
> class _BaseScorer(…):
>
>     @abstractmethod
>     def __call__(self, estimator, X, y_true, sample_weight=None):
>         pass
>
>     @abstractmethod
>     def get_score(self, y_true, y_prediction=None, y_prediction_proba=None,
>                   y_decision_function=None):
>         pass
>
> class _ProbaScorer(_BaseScorer):
>
>     …
>
>     def get_score(self, y, y_pred=None, y_proba=None, y_decision=None,
>                   sample_weight=None):
>         if y_proba is None:
>             raise ValueError("This scorer needs y_proba.")
>         if sample_weight is not None:
>             return self._sign * self._score_func(y, y_proba,
>                                                  sample_weight=sample_weight,
>                                                  **self._kwargs)
>         else:
>             return self._sign * self._score_func(y, y_proba, **self._kwargs)
>
> On Tue, Oct 28, 2014 at 7:10 PM, Mathieu Blondel <math...@mblondel.org> wrote:
>
>> Different metrics require different inputs (results of predict,
>> decision_function, predict_proba). To avoid branching in the grid search
>> and cross-validation code, we introduced the scorer API. A scorer knows
>> what kind of input it needs and calls predict, decision_function, or
>> predict_proba as needed. We would like to reuse the scorer logic for
>> out-of-bag scores as well, in order to avoid branching. The problem is
>> that the scorer API is not suitable if the predictions are already
>> available. RidgeCV works around this by creating a constant predictor,
>> but this is in my opinion an ugly hack. The get_score method I proposed
>> would avoid branching, although it would require computing y_pred,
>> y_decision and y_proba.
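To make the branching that Mathieu mentions concrete, this is roughly the per-metric dispatch that grid search and cross-validation would need to carry around if there were no scorer objects. The snippet only illustrates the problem described above; it is not code from scikit-learn:

    from sklearn.metrics import accuracy_score, log_loss, roc_auc_score

    def score_without_scorers(metric, estimator, X, y):
        # Each metric needs a different kind of estimator output, so every
        # caller ends up with dispatch like this (plus sign handling for
        # losses, which the scorers also take care of).
        if metric == "accuracy":        # needs hard class predictions
            return accuracy_score(y, estimator.predict(X))
        elif metric == "roc_auc":       # needs continuous scores
            return roc_auc_score(y, estimator.decision_function(X))
        elif metric == "log_loss":      # needs probabilities; lower is better
            return -log_loss(y, estimator.predict_proba(X))
        raise ValueError("unknown metric: %r" % metric)

The scorer objects fold this dispatch into the scorer itself, which is exactly the logic the out-of-bag code would like to reuse.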
>> In the classification case, another idea would be to compute out-of-bag
>> probabilities. Then a score would be obtained by calling a
>> get_score_from_proba method, which would be implemented as follows:
>>
>> class _PredictScorer(_BaseScorer):
>>     def get_score_from_proba(self, y, y_proba, classes):
>>         y_pred = classes[np.argmax(y_proba, axis=1)]
>>         return self._sign * self._score_func(y, y_pred, **self._kwargs)
>>
>> class _ProbaScorer(_BaseScorer):
>>     def get_score_from_proba(self, y, y_proba, classes):
>>         return self._sign * self._score_func(y, y_proba, **self._kwargs)
>>
>> The nice thing about predict_proba is that it consistently returns an
>> array of shape (n_samples, n_classes). decision_function is more
>> problematic because it doesn't return an array of shape (n_samples, 2) in
>> the binary case. There was a discussion a long time ago about adding a
>> predict_score method that would be more consistent in this regard, but I
>> don't remember the outcome of that discussion.
>>
>> I don't agree that RidgeCV is an exception. If your labels are binary, it
>> is perfectly valid to train a regressor on them and want to compute
>> ranking metrics like AUC or Average Precision. And there is
>> RidgeClassifierCV too.
>>
>> Mathieu
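As a purely illustrative pairing of this idea with the out-of-bag case from the PR: a random forest fitted with oob_score=True already exposes its aggregated OOB class probabilities as oob_decision_function_, so with a get_score_from_proba method any registered metric could score them directly, with no dummy predictor required. The method itself is only the proposal above, and the get_scorer lookup is assumed here for brevity:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics.scorer import get_scorer  # assumed name-based scorer lookup

    X, y = make_classification(n_samples=500, random_state=0)
    forest = RandomForestClassifier(n_estimators=100, oob_score=True,
                                    random_state=0).fit(X, y)

    # Averaged out-of-bag class probabilities, shape (n_samples, n_classes).
    # (For brevity this ignores samples that never ended up out of bag.)
    oob_proba = forest.oob_decision_function_

    # _PredictScorer path: the scorer takes the argmax over classes itself.
    acc = get_scorer("accuracy").get_score_from_proba(y, oob_proba, forest.classes_)
    # _ProbaScorer path: the scorer consumes the probabilities directly.
    ll = get_scorer("log_loss").get_score_from_proba(y, oob_proba, forest.classes_)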
>> On Wed, Oct 29, 2014 at 3:21 AM, Andy <t3k...@gmail.com> wrote:
>>
>>> Hi.
>>> Can you give a bit more detail on 3 and 4?
>>> And can you give an example use case?
>>> When do you need scorers and out-of-bag samples? The scorers are used in
>>> GridSearchCV and cross_val_score, but the out-of-bag samples basically
>>> replace cross-validation, so I don't quite understand how these would
>>> work together.
>>>
>>> I think it would be great if you could give a use case and some (pseudo)
>>> code showing how it would look with your favourite solution.
>>>
>>> Cheers,
>>> Andy
>>>
>>> On 10/26/2014 10:33 PM, Aaron Staple wrote:
>>>
>>> Greetings sklearn developers,
>>>
>>> I’m a new sklearn contributor, and I’ve been working on a small project
>>> to allow customization of the scoring metric used when scoring out-of-bag
>>> data for random forests (see
>>> https://github.com/scikit-learn/scikit-learn/pull/3723). In this PR,
>>> @mblondel and I have been discussing an architectural issue that we would
>>> like others to weigh in on.
>>>
>>> While working on my implementation, I’ve run into a bit of difficulty
>>> using the scorer implementation as it exists today, in particular with
>>> the interface expressed in _BaseScorer. The current _BaseScorer interface
>>> is callable, accepting an estimator (used as a Predictor) along with some
>>> prediction data points X, and returning a score. The various _BaseScorer
>>> implementations compute a score by calling estimator.predict(X),
>>> estimator.predict_proba(X), or estimator.decision_function(X) as needed,
>>> possibly applying some transformations to the results, and then applying
>>> a score function.
>>>
>>> The issue I’ve run into is that predicting out-of-bag samples is a
>>> rather specialized procedure, because the model used differs for each
>>> training point based on how that point was used during fitting. Computing
>>> these predictions is not well suited to implementation as a Predictor. In
>>> addition, in the PR we’ve been discussing the idea that a random forest
>>> estimator will make its out-of-bag predictions available as attributes,
>>> allowing a user of the estimator to subsequently score these provided
>>> predictions. Also, @mblondel mentioned that for his work on
>>> multiple-metric grid search, he is interested in scoring predictions he
>>> computes outside of a Predictor.
>>>
>>> The difficulty is that the current scorers take an estimator and data
>>> points, and compute predictions internally. They don’t accept externally
>>> computed predictions.
>>>
>>> I’ve written up a series of generalized options for implementing a
>>> system for scoring externally computed predictions (some are likely
>>> undesirable but are provided as points of comparison):
>>>
>>> 1) Add a new implementation that’s completely separate from the existing
>>> _BaseScorer class.
>>>
>>> 2) Use the existing _BaseScorer without changes. This means abusing the
>>> Predictor interface and creating something like a dummy predictor that
>>> ignores X and returns the externally computed predictions (predictions
>>> not inherently based on the X argument, but computed externally from a
>>> known X).
>>>
>>> 3) Add a private API to _BaseScorer for scoring externally computed
>>> predictions. The private API can be called by a public helper function in
>>> scorer.py.
>>>
>>> 4) Change the public API of _BaseScorer to make scoring of externally
>>> computed predictions a public operation alongside the existing
>>> functionality. Also possibly rename _BaseScorer => BaseScorer.
>>>
>>> 5) Change the public API of _BaseScorer so that it only handles
>>> externally computed predictions. The existing functionality would be
>>> implemented by the caller (as a callback, since the required type of
>>> prediction data is not known by the caller).
>>>
>>> So far in the PR we’ve been looking at options 2, 3, and 4, with 4
>>> seeming like a good candidate. Once we decide on one of these options,
>>> I’d like to follow up with stakeholders on the specifics of what the new
>>> interface will look like.
>>>
>>> Thanks,
>>> Aaron Staple
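For readers trying to picture why option #2 above (and the RidgeCV workaround Mathieu mentions) is considered a hack, this is roughly what the dummy-predictor approach looks like. The class is invented here for illustration; it is not from the PR or from scikit-learn:

    class _ConstantPredictions(object):
        # Fake "estimator" that ignores X and replays precomputed predictions.
        def __init__(self, pred=None, proba=None, decision=None):
            self._pred, self._proba, self._decision = pred, proba, decision

        def predict(self, X):
            return self._pred

        def predict_proba(self, X):
            return self._proba

        def decision_function(self, X):
            return self._decision

    # Option #2: push the dummy object through the existing callable API.
    # X is still required and must line up row-for-row with the stored
    # predictions, even though it is never actually used:
    #
    #     score = scorer(_ConstantPredictions(proba=oob_proba), X, y_true)

Options #3 and #4 drop this indirection by letting callers hand the predictions to the scorer directly.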
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general