> My previous proposition was mainly for cross_val_score for the time being.
I consider them (cross_val_score and GridSearchCV) almost one and the same
in terms of the information users want out of them. The fact that
cross_val_score is a function, not a class, makes it more difficult to change
the return format, but changing the dimensions of the score array seems
reasonable.
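For illustration, a rough sketch of that shape change (a list-valued scoring
is purely hypothetical here, not an existing option):

from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC
from sklearn.cross_validation import cross_val_score

X, y = make_classification(random_state=0)

# today: one 1-D array of per-fold scores per scorer, one call each
f1 = cross_val_score(LinearSVC(), X, y, scoring="f1", cv=5)        # shape (5,)
auc = cross_val_score(LinearSVC(), X, y, scoring="roc_auc", cv=5)  # shape (5,)

# sketch: a single call returning an (n_folds, n_metrics) array instead
# scores = cross_val_score(LinearSVC(), X, y, scoring=["f1", "roc_auc"], cv=5)
# scores.shape == (5, 2)
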
> My solution would be for scorers to take not a triplet (estimator, X,
y_true) but a pair (y_true, y_score), where y_score is a *continuous*
output (output of decision_function). For metrics which need categorical
predictions, y_score can be converted in the scorer.
I like this idea, broadly. I don't especially like the thought of
deprecating the scorer interface and parameter name already, but I think
this entails doing so. And the fact that it no longer has the same
interface as estimator.score suggests it should have a different name.
> The conversion would rely on the fact that predict in classifiers is
defined as the argmax of decision_function.
Or similar for multilabel, multi-output and binary...
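For illustration, a minimal sketch of what such scorers could look like (the
helper names are made up, not the current scorer.py interface; only the binary
and multiclass conversions are shown, multilabel/multi-output omitted):

import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

def threshold_scorer(metric):
    # metric consumes the continuous scores directly (e.g. roc_auc_score)
    def score(y_true, y_score):
        return metric(y_true, y_score)
    return score

def categorical_scorer(metric, classes):
    # metric needs discrete predictions (e.g. f1_score), so convert y_score
    def score(y_true, y_score):
        y_score = np.asarray(y_score)
        classes_arr = np.asarray(classes)
        if y_score.ndim == 1:
            # binary decision values: threshold at zero
            y_pred = classes_arr[(y_score > 0).astype(int)]
        else:
            # multiclass: predict is the argmax of decision_function
            y_pred = classes_arr[y_score.argmax(axis=1)]
        return metric(y_true, y_pred)
    return score

# usage sketch: decision_function is computed once and shared by both scorers
# y_score = estimator.decision_function(X_test)
# auc = threshold_scorer(roc_auc_score)(y_test, y_score)
# f1 = categorical_scorer(f1_score, estimator.classes_)(y_test, y_score)
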
> This solution assumes that all classifiers have a decision_function. I
think that this is feasible, even for non-parametric estimators like kNN.
It also assumes that decision_function is defined as an alias to predict in
RegressorMixin.
It also assumes that you're not going to cross-validate some other kind of
predictor, such as a clusterer (most don't support predict, and we already
don't handle fit_predict here).
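Rough sketch of the kind of capability check such a conversion layer would
need (nothing like this exists today; the helper name is made up):

def continuous_output(estimator, X):
    # best-effort continuous output for a (y_true, y_score) scorer
    if hasattr(estimator, "decision_function"):
        return estimator.decision_function(X)
    if hasattr(estimator, "predict_proba"):
        return estimator.predict_proba(X)
    if hasattr(estimator, "predict"):
        # regressors, and clusterers such as KMeans, only expose predict
        return estimator.predict(X)
    # e.g. DBSCAN or agglomerative clustering: fit_predict only
    raise TypeError("estimator has no decision_function, predict_proba or predict")
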
On 14 January 2014 21:39, Mathieu Blondel <math...@mblondel.org> wrote:
>
>
> On Tue, Jan 14, 2014 at 4:16 PM, Joel Nothman <joel.noth...@gmail.com> wrote:
>
>>
>>
>> - I like some ideas of your solution, in which you can have multiple
>> objectives and hence best models, i.e. est.best_index_ could be an array,
>> and the corresponding est.best_params_. Yet I think there are many cases
>> where you don't actually want to find the best parameters for each metric
>> (e.g. P and R are only there to explain the F1 objective; multiclass
>> per-class vs average).
>>
>>
> So it seems that we have different use cases. I want to find the
> best-tuned estimator against each metric while you want to reuse
> computations from GridSearchCV to make a multiple metric evaluation report.
> But then I am not completely sure why you need to frame this within
> GridSearchCV.
>
> My previous proposition was mainly for cross_val_score for the time being.
> I actually think that supporting multiple scorers in GridSearchCV would be
> problematic because GridSearchCV needs to behave like a predictor. So, we
> would need a stateful API like:
>
> gs = GridSearchCV(LinearSVC(), param_dict, scoring=["auc", "f1"])
> gs.fit(X, y)
> gs.set_best_estimator(scoring="auc")
> gs.predict(X)
> gs.set_best_estimator(scoring="f1")
> gs.predict(X) # predictions may be different
>
> For this reason, I think that a function that outputs the best estimators
> for each scorer would be better:
>
> best_estimators = multiple_grid_search(LinearSVC(), param_dict,
> scoring=["auc", "f1"])
>
>
>> - Passing a list of scorers doesn't take advantage of already having
>> multiple metrics returned efficiently by a function (e.g. P,R,F; per-class
>> F1), besides the need to do an extra prediction which you already point
>> out. If each scorer were passed individually, you'd need a custom scorer
>> for each class in the per-class F1 case; or the outputs from each scorer
>> can be flattened and hstacked.
>>
> I think evaluating the metric is orders of magnitude faster than
> computing the predictions.
>
>
>>
>> - Using a list of scorer names means this *can* be optimised to do
>> prediction as few times as possible, by grouping together those that
>> require thresholds and those that don't. This of course requires a rewrite
>> of scorer.py and is quite a complex solution.
>>
> But I think that the fact that predictions must be recomputed every time
> is a serious limitation of the current scorer API and should be addressed.
>
> My solution would be for scorers to take not a triplet (estimator, X,
> y_true) but a pair (y_true, y_score), where y_score is a *continuous*
> output (output of decision_function). For metrics which need categorical
> predictions, y_score can be converted in the scorer. The conversion would
> rely on the fact that predict in classifiers is defined as the argmax of
> decision_function.
>
> This solution assumes that all classifiers have a decision_function. I
> think that this is feasible, even for non-parametric estimators like kNN.
> It also assumes that decision_function is defined as an alias to predict in
> RegressorMixin. The log loss is the only metric that specifically needs
> probabilities, but it can be re-implemented so as to take decision_function
> outputs instead.
>
> In any case, I can see the benefit of having a callback system in
> GridSearchCV to let the user reuse some computations.
>
> Mathieu
>
>