I'd definitely like to have support for multiple metrics. My use case is
that I have several methods that I want to evaluate against different
metrics, and I want the hyper-parameters to be tuned against each metric. In
addition, I don't have a test set, so I need to use cross-validation both
for evaluation and for hyper-parameter tuning.

A first change would be for cross_val_score to accept a list of scorers and
to return an n_folds x n_scorers array. This would only support a fixed set
of hyper-parameters, but the change seems straightforward and
non-controversial. It would hopefully also serve as a basis for
multiple-metric grid search (couldn't fit_grid_point be replaced with
cross_val_score?).
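
To make this concrete, here is a rough sketch of what such a call could
look like (the list-valued 'scoring' argument is the proposed change, not
the current API):

    from sklearn.cross_validation import cross_val_score
    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    iris = load_iris()

    # Proposed: pass a list of scorer names instead of a single one;
    # the result would be an (n_folds, n_scorers) array.
    scores = cross_val_score(SVC(), iris.data, iris.target, cv=5,
                             scoring=['accuracy', 'f1'])
    # scores[i, j] = score of fold i under scorer j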

When using multiple metrics, a major limitation of the current scorer API
is that it will recompute the predictions for each scorer. Unfortunately,
for kernel methods or random forests, computing the predictions is really
expensive.
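
In the meantime, one can work around this by computing the predictions once
per fold and reusing them for every metric. A minimal sketch (the helper
name is made up, and it only handles metrics taking (y_true, y_pred)):

    import numpy as np
    from sklearn.base import clone
    from sklearn.cross_validation import KFold
    from sklearn.datasets import load_iris
    from sklearn.metrics import f1_score, precision_score, recall_score
    from sklearn.svm import SVC

    def cross_val_multi_score(estimator, X, y, metrics, cv):
        # Fit and predict once per fold; reuse the predictions for every
        # metric instead of recomputing them per scorer.
        scores = []
        for train, test in cv:
            est = clone(estimator).fit(X[train], y[train])
            y_pred = est.predict(X[test])  # the expensive step, done once
            scores.append([metric(y[test], y_pred) for metric in metrics])
        return np.array(scores)  # shape (n_folds, n_metrics)

    iris = load_iris()
    scores = cross_val_multi_score(SVC(), iris.data, iris.target,
                                   [f1_score, precision_score, recall_score],
                                   KFold(len(iris.target), n_folds=5))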

I will study your solution more carefully when I have more time. Could you
also give a pointer to your previous proposed solution for comparison?

Mathieu


On Thu, Jan 9, 2014 at 4:48 PM, Eustache DIEMERT <eusta...@diemert.fr> wrote:

> +1 for the "diagnostics" attribute
>
> I've struggled with this in the past, and the workaround I found was to
> subclass my estimator to hook in the computation of additional metrics and
> store the results in a new attribute like diagnostics.
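>
> Roughly along these lines (a sketch of that subclassing workaround; the
> class name and diagnostics_ attribute are made up, and the metrics assume
> binary labels):
>
>     from sklearn.metrics import precision_score, recall_score
>     from sklearn.svm import SVC
>
>     class DiagnosticSVC(SVC):
>         def score(self, X, y):
>             # Stash extra metrics on the estimator whenever it is scored.
>             y_pred = self.predict(X)
>             self.diagnostics_ = {'precision': precision_score(y, y_pred),
>                                  'recall': recall_score(y, y_pred)}
>             return super(DiagnosticSVC, self).score(X, y)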
>
> Also, having a default set of diagnostics for different tasks is a must
> for a practitioner-friendly library.
>
> my 2c :)
>
> Eustache
>
>
> 2014/1/9 Joel Nothman <joel.noth...@gmail.com>
>
>> Hi all,
>>
>> I've had enough frustration at having to patch in things from a code fork
>> merely to get back precision and recall while optimising F1 in grid
>> search. This is something I need to do really frequently, and I'm sure
>> others do too.
>>
>> When I wrote and submitted PRs about this problem nine months ago, I
>> proposed relatively sophisticated solutions. Perhaps a simple, flexible
>> solution is appropriate:
>>
>> GridSearchCV, RandomizedSearchCV, cross_val_score, and perhaps anything
>> else supporting 'scoring' should take an additional parameter, e.g.
>> 'diagnostics', which is a callable with the interface:
>> (estimator, X_train, y_train, X_test, y_test) -> object
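>>
>> For concreteness, such a callable might look like this (a sketch; the
>> function name is made up and the 'diagnostics' parameter does not exist
>> yet):
>>
>>     from sklearn.metrics import precision_recall_fscore_support
>>
>>     def clf_diagnostics(estimator, X_train, y_train, X_test, y_test):
>>         # Called once per (parameter setting, fold), after fitting.
>>         y_pred = estimator.predict(X_test)
>>         p, r, f, _ = precision_recall_fscore_support(y_test, y_pred,
>>                                                      average='macro')
>>         return {'precision': p, 'recall': r, 'f1': f}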
>>
>> The results of CV would include a params x folds array (or list of
>> arrays) storing each of these returned objects, with the dtype detected
>> automatically so that the diagnostics can be stored compactly and easily
>> accessed if desired.
>>
>> So when scoring=f1, a diagnostics function can be passed to calculate
>> precision, recall, etc. This means a bit of duplicated scoring work, but
>> no change to the existing scoring interface.
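>>
>> Sketched usage, building on clf_diagnostics above (the 'diagnostics'
>> parameter and 'diagnostics_' attribute are the proposed additions, not
>> existing API):
>>
>>     from sklearn.datasets import load_iris
>>     from sklearn.grid_search import GridSearchCV
>>     from sklearn.svm import SVC
>>
>>     iris = load_iris()
>>     search = GridSearchCV(SVC(), {'C': [0.1, 1, 10]}, scoring='f1',
>>                           diagnostics=clf_diagnostics)  # proposed
>>     search.fit(iris.data, iris.target)
>>     # search.diagnostics_ would then hold a params x folds array of the
>>     # objects returned by clf_diagnostics.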
>>
>> Scikit-learn may indeed provide ready-made diagnostic functions for
>> certain types of tasks. For example:
>>
>>    - a binary classification diagnostic might return P, R, F, AUC and
>>    AvgPrec (see the sketch after this list);
>>    - a multiclass diagnostic might add per-class performance, different
>>    averages and a confusion matrix;
>>    - a linear model diagnostic might measure model sparsity. (Perhaps
>>    the parameter could take a sequence of callables, returning a tuple of
>>    diagnostic results per fold.)
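>>
>> The binary case could look something like this (a sketch only; no such
>> ready-made function ships with scikit-learn today):
>>
>>     from sklearn.metrics import (average_precision_score,
>>                                  precision_recall_fscore_support,
>>                                  roc_auc_score)
>>
>>     def binary_diagnostics(estimator, X_train, y_train, X_test, y_test):
>>         y_pred = estimator.predict(X_test)
>>         # Assumes an estimator with decision_function; predict_proba
>>         # would work too for the ranking metrics.
>>         y_score = estimator.decision_function(X_test)
>>         p, r, f, _ = precision_recall_fscore_support(y_test, y_pred,
>>                                                      average='macro')
>>         return {'P': p, 'R': r, 'F': f,
>>                 'AUC': roc_auc_score(y_test, y_score),
>>                 'AvgPrec': average_precision_score(y_test, y_score)}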
>>
>>
>> As opposed to some of my more intricate proposals, this approach leaves
>> it to the user to do any averaging over folds etc.
>>
>> *SearchCV should also store best_index_ (more important than
>> best_params_) so that this data can be cross-referenced. If the
>> diagnostic output is a list of arrays rather than a single array, the
>> user can manually delete information from the non-best trials before
>> saving the model to disk.
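>>
>> For example, with the names used above (all still hypothetical):
>>
>>     # diagnostics of the winning parameter setting, one entry per fold
>>     best_diag = search.diagnostics_[search.best_index_]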
>>
>> This also implies some refactoring of cross_val_score and fit_grid_point
>> that is overdue.
>>
>> Does this seem the right level of complexity/flexibility? Please help me
>> and the many others who have requested this feature to resolve the issue
>> sooner rather than later. I'd like to submit a PR towards this that
>> actually gets accepted, so some feedback is really welcome.
>>
>> Cheers,
>>
>> - Joel