I forgot to mention that in "Ridge", decision_function is an alias for
predict, precisely to allow grid searching against AUC and other ranking
metrics.
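Concretely, the kind of search this aliasing enables can be sketched by hand (the alpha grid and synthetic dataset below are illustrative, not part of any scikit-learn API):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Binary targets; Ridge simply regresses on the 0/1 values.
X, y = make_classification(n_samples=300, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hand-rolled grid search: rank Ridge's continuous predictions with AUC,
# which is what aliasing decision_function to predict makes possible
# inside the scorer machinery.
scores = {}
for alpha in (0.1, 1.0, 10.0):
    y_scores = Ridge(alpha=alpha).fit(X_tr, y_tr).predict(X_te)
    scores[alpha] = roc_auc_score(y_te, y_scores)

best_alpha = max(scores, key=scores.get)
```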
M.
On Sat, Nov 29, 2014 at 12:50 AM, Mathieu Blondel <math...@mblondel.org>
wrote:
>
>
> On Sat, Nov 29, 2014 at 12:29 AM, Michael Eickenberg <
> michael.eickenb...@gmail.com> wrote:
>
>> Hi Mathieu,
>>
>> is that the right name for this behaviour?
>>
>
> I agree, the name "predict_score" can be misleading. Another name I had in
> mind would be "predict_confidence".
>
>
>>
>> When I read the name, I thought you were proposing a function like
>> `fit_transform` in the sense that by default it would call `predict` and
>> then score the result with a given scorer and some ground truth information
>> (e.g. y_true from a cv fold). Any estimator that could do this better than
>> by following this standard procedure would then get its chance to do so.
>> The signature of this function would then have to take this ground truth
>> data and a scorer as optional inputs.
>>
>> (Secretly I have been wanting this feature but never dared to ask if I
>> can implement it. The function cross_val_score would benefit from it.)
>>
>> What you are proposing seems to group/generalize `predict_proba` and
>> `decision_function` into one. This is useful in many cases, but isn't there
>> a risk of introducing some uncontrollable magic here if several options are
>> available per estimator?
>>
>
> The scorer API is already choosing decision_function arbitrarily when both
> predict_proba and decision_function are available.
>
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/scorer.py#L159
>
> However, except on rare occasions (e.g., SVC, because of Platt
> calibration), predict_proba and decision_function should agree on their
> predictions (i.e., when taking the argmax).
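A minimal check of that agreement, using LogisticRegression (where the predicted probabilities are a monotone transform of the decision values, so the argmaxes coincide by construction):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# On a multiclass problem, both methods return one column per class;
# taking the argmax row-wise yields the same predicted labels.
proba_argmax = clf.predict_proba(X).argmax(axis=1)
df_argmax = clf.decision_function(X).argmax(axis=1)
agree = np.array_equal(proba_argmax, df_argmax)
```

SVC with probability=True is the notable exception: Platt scaling fits a separate calibration model, so its probabilities can occasionally rank differently from its decision values.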
>
> This solution is intended to be "duck-typing" friendly. Personally, I
> think it would make our lives easier if we could just assume that all
> regressors inherit from RegressorMixin.
>
> M.
>
>> Michael
>>
>> On Fri, Nov 28, 2014 at 4:05 PM, Mathieu Blondel <math...@mblondel.org>
>> wrote:
>>
>>> Here's a proof of concept that introduces a new method "predict_score":
>>>
>>> https://github.com/mblondel/scikit-learn/commit/0b06d424ea0fe40148436846c287046549419f03
>>>
>>> The role of this method is to get continuous-output predictions from
>>> both classifiers and regressors in a consistent manner. This way the
>>> predicted continuous outputs can be passed to ranking metrics like
>>> roc_auc_score. The advantage of this solution is that third-party code can
>>> reimplement "predict_score" without depending on scikit-learn.
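A sketch of what that duck-typing contract could look like (every name below, including the scorer helper, is hypothetical and only mirrors the proposal):

```python
import numpy as np

class ThirdPartyRegressor:
    """A regressor written with no scikit-learn dependency."""

    def fit(self, X, y):
        # Plain least-squares fit via numpy only.
        self.coef_, *_ = np.linalg.lstsq(X, y, rcond=None)
        return self

    def predict(self, X):
        return X @ self.coef_

    def predict_score(self, X):
        # For a regressor, the continuous prediction *is* the ranking score.
        return self.predict(X)

def ranking_scores(estimator, X):
    # What a scorer could do under the proposal: no isinstance check,
    # just the presence of the method.
    if hasattr(estimator, "predict_score"):
        return estimator.predict_score(X)
    raise TypeError("estimator provides no continuous scores")

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = X @ np.array([1.0, -2.0, 0.5])
scores = ranking_scores(ThirdPartyRegressor().fit(X, y), X)
```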
>>>
>>> Another solution is to use isinstance(estimator, RegressorMixin) inside
>>> the scorer to detect if an estimator is a regressor and use predict instead
>>> of predict_proba / decision_function. This assumes that the estimator
>>> inherits from RegressorMixin and therefore, the code must depend on
>>> scikit-learn.
>>>
>>> M.
>>>
>>> On Fri, Nov 28, 2014 at 7:40 PM, Mathieu Blondel <math...@mblondel.org>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Nov 28, 2014 at 5:14 PM, Aaron Staple <aaron.sta...@gmail.com>
>>>> wrote:
>>>>
>>>>> [...]
>>>>> However, I tried to run a couple of test cases with 0-1 predictions
>>>>> for RidgeCV and classification with RidgeClassifierCV, and I got some
>>>>> error
>>>>> messages. It looks like one reason for this is that
>>>>> LinearModel._center_data can convert the y values to non-integers. In
>>>>> addition, it appears that in the case of multiclass classification the
>>>>> scorer is applied to the ravel()’ed list of one-vs-all classifiers and not
>>>>> to the actual class predictions. Am I right in thinking that this can
>>>>> affect the classification score for some scorers? For example, consider a
>>>>> simple accuracy scorer and just one prediction. It is possible for some
>>>>> one-vs-all classifiers to be predicted correctly while the overall class
>>>>> prediction is wrong - thus the accuracy score over the one-vs-all
>>>>> classifiers would be nonzero while the overall classification accuracy is
>>>>> zero. (In addition, if I am reading correctly I believe the y_true and
>>>>> y_predicted values are possibly being passed incorrectly to the scorer
>>>>> currently, and are being swapped with each other.)
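One way to make the accuracy example above concrete, with a single sample and three classes (pure numpy, no scikit-learn machinery):

```python
import numpy as np

# One sample, three classes: true class 0, predicted class 1.
y_true_ova = np.array([1, 0, 0])   # one-vs-all indicator row for true class 0
y_pred_ova = np.array([0, 1, 0])   # indicator row for the (wrong) prediction 1

# Scoring the flattened one-vs-all columns with plain accuracy gives a
# nonzero score (the third entry matches) even though the class
# prediction itself is wrong.
ova_accuracy = np.mean(y_true_ova == y_pred_ova)                      # 1/3
class_accuracy = float(y_true_ova.argmax() == y_pred_ova.argmax())    # 0.0
```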
>>>>>
>>>>
>>>>
>>>> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/ridge.py#L800
>>>>
>>>> Shouldn't this line use the unnormalized y? Otherwise, this is
>>>> evaluating a different problem.
>>>>
>>>> BTW, the scorer handling in RidgeCV is currently broken.
>>>>
>>>>
>>>>>
>>>>> Given these observations I wanted to double check 1) that we want to
>>>>> support classification scorers and not just regression scorers at this
>>>>> precise location in this code and 2) that I should start using get_score
>>>>> in
>>>>> this location now, given that I believe at least some additional work will
>>>>> be needed for support of classification scorers.
>>>>>
>>>>
>>>> I was more talking about ranking scorers.
>>>>
>>>> # y contains binary values
>>>> y_pred = RandomForestRegressor().fit(X, y).predict(X)
>>>> print roc_auc_score(y, y_pred)
>>>>
>>>> # y contains ordinal values
>>>> y_pred = RandomForestRegressor().fit(X, y).predict(X)
>>>> print ndcg_score(y, y_pred) # not yet in scikit-learn
>>>>
>>>> For me these two usecases are perfectly legitimate. Now, I would really
>>>> like to use GridSearchCV to tune the RF hyper-parameters against AUC or
>>>> NDCG but the scorer API insists on calling either predict_proba or
>>>> decision_function.
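A runnable version of the first use case above (dataset and forest size are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import roc_auc_score

# y contains binary values; the regressor's continuous predictions are
# used directly as ranking scores for AUC.
X, y = make_classification(n_samples=200, random_state=0)
y_pred = RandomForestRegressor(n_estimators=30, random_state=0).fit(X, y).predict(X)
auc = roc_auc_score(y, y_pred)
```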
>>>>
>>>> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/scorer.py#L159
>>>>
>>>> If we could detect that an estimator is a regressor, we could call
>>>> "predict" instead, but we currently have no way to know that. We can't
>>>> check isinstance(estimator, RegressorMixin), since we can't even expect a
>>>> third-party regression class to inherit from RegressorMixin (as per our
>>>> current API "specification").
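The limitation can be seen directly (MinimalRegressor below is a made-up stand-in for third-party code):

```python
from sklearn.base import RegressorMixin
from sklearn.linear_model import LogisticRegression, Ridge

# The isinstance check a scorer could perform, which only works for
# estimators that opt in to scikit-learn's mixin hierarchy.
ridge_is_reg = isinstance(Ridge(), RegressorMixin)
logreg_is_reg = isinstance(LogisticRegression(), RegressorMixin)

class MinimalRegressor:
    """A regressor written without scikit-learn."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [0.0] * len(X)

# Fails the check even though it quacks like a regressor.
third_party_is_reg = isinstance(MinimalRegressor(), RegressorMixin)
```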
>>>>
>>>> M.
>>>>
>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>>
>>
>