Re: [scikit-learn] PyCM: Multiclass confusion matrix library in Python

2018-06-04 Thread Joel Nothman
>> Thanks for this -- looks useful. I had to write something similar (for
>> the binary case) and wish scikit had something like this.
>
> Which part of it? I'm not entirely sure I understand what the core
> functionality is.

I think the core functionality is efficiently evaluating the full set of
metrics appropriate to the kind of task. We now support multi-metric
scoring in things like cross_validate and GridSearchCV (but not in other
CV implementations yet; a sketch of current usage follows the list), but:

   1. it's not efficient (there are PRs in progress to work around this,
   but they are definitely work-arounds in the sense that we're still
   repeatedly calling metric functions rather than calculating sufficient
   statistics once), and
   2. we don't have a pre-defined set of scorers appropriate to binary
   classification; or for multiclass classification with 4 classes, one of
   which is the majority "no finding" class, etc.
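
To make that concrete, a minimal sketch of today's multi-metric interface
(the estimator, data, and metric choices here are just placeholders):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate

X, y = make_classification(random_state=0)
scoring = {'accuracy': 'accuracy',
           'roc_auc': 'roc_auc',
           'mcc': make_scorer(matthews_corrcoef)}
# each scorer re-predicts and recomputes from scratch on every split,
# which is the inefficiency in point 1
results = cross_validate(LogisticRegression(), X, y, scoring=scoring, cv=5)
# -> results['test_accuracy'], results['test_roc_auc'], results['test_mcc']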

But assuming we could solve or work around the first issue, having an
interface, in the core library or elsewhere, which gave us a series of
appropriately-named scorers for different task types might be neat and
would avoid code that a lot of people currently repeat:

def get_scorers_for_binary(pos_label, neg_label, proba_thresholds=(0.5,)):
    return {'precision:p>0.5': make_scorer(precision_score,
                                           pos_label=pos_label),
            'accuracy:p>0.5': 'accuracy',
            'roc_auc': 'roc_auc',
            'neg_log_loss': 'neg_log_loss',
            ...
            }

def get_scorers_for_multiclass(pos_labels, neg_labels=()):
    out = {'accuracy': 'accuracy',
           'mcc': make_scorer(matthews_corrcoef),
           'cohen_kappa': make_scorer(cohen_kappa_score),
           'precision_macro': make_scorer(precision_score,
                                          labels=pos_labels,
                                          average='macro'),
           'precision_weighted': make_scorer(precision_score,
                                             labels=pos_labels,
                                             average='weighted'),
           ...}
    if neg_labels:
        # micro-average precision differs from accuracy only if some
        # labels are excluded
        out['precision_micro'] = make_scorer(precision_score,
                                             labels=pos_labels,
                                             average='micro')
    ...
    return out
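
For illustration (a hypothetical usage, assuming the helpers above with
their ellipses filled in, and X, y as in the earlier sketch), these would
drop straight into the existing multi-metric interface:

from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression

scorers = get_scorers_for_binary(pos_label=1, neg_label=0)
search = GridSearchCV(LogisticRegression(), param_grid={'C': [0.1, 1, 10]},
                      scoring=scorers, refit='roc_auc', cv=5)
search.fit(X, y)
# every scorer appears in search.cv_results_, e.g. 'mean_test_roc_auc'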


I note some risk of encouraging bad practice around multiple hypothesis
testing, etc., but generally I think this would be helpful to users.


Re: [scikit-learn] PyCM: Multiclass confusion matrix library in Python

2018-06-04 Thread Andreas Mueller

Is that Jet?!

https://www.youtube.com/watch?v=xAoljeRJ3lU

;)

On 6/4/18 11:56 AM, Brown J.B. via scikit-learn wrote:

> Hello community,
>
>>> I wonder if there's something similar for the binary class case where,
>>> the prediction is a real value (activation) and from this we can also
>>> derive
>>>   - CMs for all prediction cutoffs (or a set of cutoffs?)
>>>   - scores over all cutoffs (AUC, AP, ...)
>>
>> AUC and AP are by definition over all cut-offs. And CMs for all
>> cutoffs doesn't seem a good idea, because that'll be n_samples many
>> in the general case. If you want to specify a set of cutoffs, that
>> would be pretty easy to do.
>> How do you find these cut-offs, though?
>
>>> For me, in analyzing (binary class) performance, reporting scores for
>>> a single cutoff is less useful than seeing how the many scores (tpr,
>>> ppv, mcc, relative risk, chi^2, ...) vary at various false positive
>>> rates, or prediction quantiles.
>
> In terms of finding cut-offs, one could use the idea of metric
> surfaces that I recently proposed
> https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201700127
> and then plot your per-threshold TPR/TNR pairs on the PPV/MCC/etc
> surfaces to determine what conditions you are willing to accept
> against the background of your prediction problem.
>
> I use these surfaces (a) to think about the prediction problem before
> any attempt at modeling is made, and (b) to deconstruct results such
> as "Accuracy=85%" into interpretations in the context of my field and
> the data being predicted.
>
> Hope this contributes a bit of food for thought.
> J.B.



Re: [scikit-learn] PyCM: Multiclass confusion matrix library in Python

2018-06-04 Thread Brown J.B. via scikit-learn
Hello community,

>> I wonder if there's something similar for the binary class case where,
>> the prediction is a real value (activation) and from this we can also
>> derive
>>   - CMs for all prediction cutoffs (or a set of cutoffs?)
>>   - scores over all cutoffs (AUC, AP, ...)
>>
> AUC and AP are by definition over all cut-offs. And CMs for all
> cutoffs doesn't seem a good idea, because that'll be n_samples many
> in the general case. If you want to specify a set of cutoffs, that would
> be pretty easy to do.
> How do you find these cut-offs, though?
>
>>
>> For me, in analyzing (binary class) performance, reporting scores for
>> a single cutoff is less useful than seeing how the many scores (tpr,
>> ppv, mcc, relative risk, chi^2, ...) vary at various false positive
>> rates, or prediction quantiles.
>>
>
In terms of finding cut-offs, one could use the idea of metric surfaces
that I recently proposed
https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201700127
and then plot your per-threshold TPR/TNR pairs on the PPV/MCC/etc surfaces
to determine what conditions you are willing to accept against the
background of your prediction problem.
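
(For the curious, a rough back-of-the-envelope sketch of gridding such a
surface: PPV and MCC are fully determined by TPR, TNR, and the class
prevalence. This is not code from the paper, and the prevalence value is
an arbitrary placeholder.)

import numpy as np

prevalence = 0.2                              # placeholder positive rate
tpr = np.linspace(0.01, 0.99, 99)[:, None]    # rows of the grid
tnr = np.linspace(0.01, 0.99, 99)[None, :]    # columns of the grid

# per-sample confusion-matrix rates implied by (TPR, TNR, prevalence)
tp = prevalence * tpr
fn = prevalence * (1 - tpr)
tn = (1 - prevalence) * tnr
fp = (1 - prevalence) * (1 - tnr)

ppv = tp / (tp + fp)
mcc = (tp * tn - fp * fn) / np.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
# ppv and mcc are 99x99 surfaces over (TPR, TNR); contour-plot them and
# overlay a model's per-threshold (TPR, TNR) pairs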

I use these surfaces (a) to think about the prediction problem before any
attempt at modeling is made, and (b) to deconstruct results such as
"Accuracy=85%" into interpretations in the context of my field and the data
being predicted.

Hope this contributes a bit of food for thought.
J.B.


Re: [scikit-learn] PyCM: Multiclass confusion matrix library in Python

2018-06-04 Thread Andreas Mueller



On 5/31/18 1:26 PM, Stuart Reynolds wrote:

> Hi Sepand,
>
> Thanks for this -- looks useful. I had to write something similar (for
> the binary case) and wish scikit had something like this.

Which part of it? I'm not entirely sure I understand what the core
functionality is.


> I wonder if there's something similar for the binary class case where,
> the prediction is a real value (activation) and from this we can also
> derive
>   - CMs for all prediction cutoffs (or a set of cutoffs?)
>   - scores over all cutoffs (AUC, AP, ...)

AUC and AP are by definition over all cut-offs. And CMs for all
cutoffs doesn't seem a good idea, because that'll be n_samples many
in the general case. If you want to specify a set of cutoffs, that would 
be pretty easy to do.
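
A minimal sketch of that "specified set of cutoffs" version, using only
existing pieces (the cutoff values are arbitrary placeholders):

import numpy as np
from sklearn.metrics import confusion_matrix

def confusion_matrices_at(y_true, y_score, cutoffs=(0.25, 0.5, 0.75)):
    # one 2x2 confusion matrix per requested score cutoff
    return {c: confusion_matrix(y_true,
                                (np.asarray(y_score) >= c).astype(int))
            for c in cutoffs}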

How do you find these cut-offs, though?


> For me, in analyzing (binary class) performance, reporting scores for
> a single cutoff is less useful than seeing how the many scores (tpr,
> ppv, mcc, relative risk, chi^2, ...) vary at various false positive
> rates, or prediction quantiles.

You can totally do that with sklearn right now. Granted, it's not
as convenient as it could be, but we're working on it.
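
Something along these lines works today (a sketch; the metrics and the
quantile grid are arbitrary choices, not a recommended set):

import numpy as np
from sklearn.metrics import matthews_corrcoef, precision_score, recall_score

def metrics_over_thresholds(y_true, y_score, n=50):
    # sweep thresholds at quantiles of the scores, so each point
    # corresponds to a prediction quantile rather than a raw cutoff
    rows = []
    for t in np.percentile(y_score, np.linspace(2, 98, n)):
        y_pred = (y_score >= t).astype(int)
        rows.append((t,
                     recall_score(y_true, y_pred),       # tpr
                     precision_score(y_true, y_pred),    # ppv
                     matthews_corrcoef(y_true, y_pred))) # mcc
    return np.array(rows)  # columns: threshold, tpr, ppv, mcc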

What's really the crucial point for me is how to pick the cut-offs.


Cheers,

Andy



Re: [scikit-learn] PyCM: Multiclass confusion matrix library in Python

2018-06-04 Thread Sepand Haghighi via scikit-learn
Hi Stuart,
Thanks ;-)
Activation thresholds are planned and will be added in the next release
(in the next few weeks).
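
In the meantime, thresholding by hand and passing the labels to PyCM
works; a minimal sketch, with placeholder data and a 0.5 cutoff chosen
purely for illustration:

from pycm import ConfusionMatrix

scores = [0.1, 0.4, 0.35, 0.8]              # model activations (placeholders)
actual = [0, 1, 0, 1]
predict = [int(s >= 0.5) for s in scores]   # manual 0.5 cutoff
cm = ConfusionMatrix(actual_vector=actual, predict_vector=predict)
print(cm)   # matrix plus per-class and overall statistics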


Best Regards,
Sepand Haghighi

On Thursday, May 31, 2018, 9:56:43 PM GMT+4:30, Stuart Reynolds wrote:

Hi Sepand,

Thanks for this -- looks useful. I had to write something similar (for
the binary case) and wish scikit had something like this.

I wonder if there's something similar for the binary class case where,
the prediction is a real value (activation) and from this we can also
derive
 - CMs for all prediction cutoffs (or a set of cutoffs?)
 - scores over all cutoffs (AUC, AP, ...)

For me, in analyzing (binary class) performance, reporting scores for
a single cutoff is less useful than seeing how the many scores (tpr,
ppv, mcc, relative risk, chi^2, ...) vary at various false positive
rates, or prediction quantiles.
Does your library provide any tools for the binary case where we add
an activation threshold?

Thanks again for releasing this and providing pip packaging.
- Stuart


On Thu, May 31, 2018 at 6:05 AM, Sepand Haghighi via scikit-learn wrote:
> PyCM is a multi-class confusion matrix library written in Python that
> supports both input data vectors and direct matrix input, and is a proper
> tool for post-classification model evaluation that supports most per-class
> and overall statistics parameters. PyCM is the Swiss Army knife of
> confusion matrices, targeted mainly at data scientists who need a broad
> array of metrics for predictive models and an accurate evaluation of a
> large variety of classifiers.
>
> Github Repo : https://github.com/sepandhaghighi/pycm
>
> Webpage : http://pycm.shaghighi.ir/
>
> JOSS Paper : https://doi.org/10.21105/joss.00729