Hello community,

>> I wonder if there's something similar for the binary class case where,
>> the prediction is a real value (activation) and from this we can also
>> derive
>>  - CMs for all prediction cutoff (or set of cutoffs?)
>>  - scores over all cutoffs (AUC, AP, ...)
>>
> AUC and AP are by definition over all cut-offs. And CMs for all
> cutoffs doesn't seem a good idea, because that'll be n_samples many
> in the general case. If you want to specify a set of cutoffs, that would
> be pretty easy to do.
> How do you find these cut-offs, though?
>
>>
>> For me, in analyzing (binary class) performance, reporting scores for
>> a single cutoff is less useful than seeing how the many scores (tpr,
>> ppv, mcc, relative risk, chi^2, ...) vary at various false positive
>> rates, or prediction quantiles.
>>
>

In terms of finding cut-offs, one could use the idea of metric surfaces
that I recently proposed
https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201700127
and then plot your per-threshold TPR/TNR pairs on the PPV/MCC/etc
surfaces to determine what conditions you are willing to accept against
the background of your prediction problem.

I use these surfaces (a) to think about the prediction problem before
any attempt at modeling is made, and (b) to deconstruct results such as
"Accuracy=85%" into interpretations in the context of my field and the
data being predicted.

Hope this contributes a bit of food for thought.

J.B.