Hi all, I am currently integrating an anomaly detection feature for ML, and I have a problem choosing the best accuracy measure for the model. I can get the confusion matrix, which consists of true positives, true negatives, false positives, and false negatives. There are a few different measures derived from it, such as sensitivity, accuracy, F1 score, etc. So which measure would be best to report as the accuracy of an anomaly detection model?
Some details about those measures [1]:

Terminology and derivations from a confusion matrix
(https://en.wikipedia.org/wiki/Confusion_matrix):

  true positive (TP)   - eqv. with hit
  true negative (TN)   - eqv. with correct rejection
  false positive (FP)  - eqv. with false alarm, Type I error
  false negative (FN)  - eqv. with miss, Type II error

  sensitivity, recall, hit rate, or true positive rate (TPR):
      TPR = TP / P = TP / (TP + FN)
  specificity (SPC) or true negative rate:
      SPC = TN / N = TN / (TN + FP)
  precision or positive predictive value (PPV):
      PPV = TP / (TP + FP)
  negative predictive value (NPV):
      NPV = TN / (TN + FN)
  fall-out or false positive rate (FPR):
      FPR = FP / N = FP / (FP + TN) = 1 - SPC
  false negative rate (FNR):
      FNR = FN / (TP + FN) = 1 - TPR
  false discovery rate (FDR):
      FDR = FP / (TP + FP) = 1 - PPV

  accuracy (ACC):
      ACC = (TP + TN) / (TP + FP + FN + TN)
  F1 score (the harmonic mean of precision and sensitivity):
      F1 = 2*TP / (2*TP + FP + FN)
  Matthews correlation coefficient (MCC):
      MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
  Informedness:
      TPR + SPC - 1
  Markedness:
      PPV + NPV - 1

Sources: Fawcett (2006) and Powers (2011).

[1] https://en.wikipedia.org/wiki/Sensitivity_and_specificity
[2] https://en.wikipedia.org/wiki/Sensitivity_and_specificity#cite_note-Fawcett2006-1
[3] https://en.wikipedia.org/wiki/Sensitivity_and_specificity#cite_note-Powers2011-2

Thanks and Regards,
Ashen

--
Ashen Weerathunga
Software Engineer - Intern
WSO2 Inc.: http://wso2.com
lean.enterprise.middleware

Email: [email protected]
Mobile: +94 716042995
LinkedIn: http://lk.linkedin.com/in/ashenweerathunga
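For reference, a minimal Python sketch of how the measures listed above can be computed directly from the raw TP/TN/FP/FN counts. The function and argument names are illustrative only (not from any particular library or WSO2 API):

from math import sqrt

def confusion_matrix_measures(tp, tn, fp, fn):
    """Return common evaluation measures derived from TP/TN/FP/FN counts."""
    p = tp + fn                                   # actual positives
    n = tn + fp                                   # actual negatives
    tpr = tp / p if p else 0.0                    # sensitivity / recall
    spc = tn / n if n else 0.0                    # specificity
    ppv = tp / (tp + fp) if (tp + fp) else 0.0    # precision
    npv = tn / (tn + fn) if (tn + fn) else 0.0
    acc = (tp + tn) / (p + n) if (p + n) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "sensitivity (TPR)": tpr,
        "specificity (SPC)": spc,
        "precision (PPV)": ppv,
        "NPV": npv,
        "accuracy (ACC)": acc,
        "F1": f1,
        "MCC": mcc,
        "informedness": tpr + spc - 1,
        "markedness": ppv + npv - 1,
    }

# Example: an imbalanced test set of 1000 records with only 8 true anomalies.
print(confusion_matrix_measures(tp=5, tn=990, fp=2, fn=3))

In this example call, accuracy works out to 0.995 while recall is 0.625 and F1 is about 0.67, which illustrates why plain accuracy alone can be misleading on the heavily imbalanced data that anomaly detection typically deals with.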
