Hi all. Since we are considering anomaly detection, a true positive would be a case where a true anomaly is detected as an anomaly by the model. Since in a real-world anomaly detection scenario, as you said, the positive (anomaly) instances are very rare, we can't rely on a more general measure. So I can summarize the most applicable measures as below:
- Sensitivity (recall) - gives the true positive rate ( TP / (TP + FN) )
- Precision - gives the probability that a positive prediction is a true positive ( TP / (TP + FP) )
- PR curve - the precision-recall curve, which plots precision vs. recall (sensitivity)
- F1 score - gives the harmonic mean of precision and sensitivity ( 2TP / (2TP + FP + FN) )

So precision and sensitivity are the most suitable measures for a model where positive instances are very rare, and the PR curve and F1 score combine both of them. So the PR curve and F1 score can be used to tell how good the model is, IMO. We can also give sensitivity and precision separately.

Thanks everyone for the support.

@Srinath, sure, I will write an article.

Thanks and Regards,
Ashen

On Thu, Sep 17, 2015 at 10:19 AM, madhuka udantha <[email protected]> wrote:

> Hi,
>
> There is a good survey paper on anomaly detection [1]. Given your needs, it
> seems you will not have to go through the whole survey, but a few of its
> subtopics will be very useful for your work.
>
> [1] Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly
> detection: A survey. ACM Comput. Surv. 41, 3, Article 15 (July 2009), 58
> pages. DOI=10.1145/1541880.1541882
> <http://www.researchgate.net/profile/Vipin_Kumar26/publication/220565847_Anomaly_detection_A_survey/links/0deec5161f0ca7302a000000.pdf>
> [Cited by 2458]
>
> On Wed, Sep 16, 2015 at 3:35 PM, Ashen Weerathunga <[email protected]> wrote:
>
>> Hi all,
>>
>> I am currently integrating the anomaly detection feature for ML, and I
>> have the problem of choosing the best accuracy measure for the model. I can
>> get the confusion matrix, which consists of true positives, true negatives,
>> false positives and false negatives, and there are a few different measures
>> such as sensitivity, accuracy, F1 score, etc.
>> So what will be the best measure to give as the model accuracy for an
>> anomaly detection model?
>>
>> Some details about those measures [1]:
>>
>> Terminology and derivations from a confusion matrix:
>> - true positive (TP): eqv. with hit
>> - true negative (TN): eqv. with correct rejection
>> - false positive (FP): eqv. with false alarm, Type I error
>> - false negative (FN): eqv. with miss, Type II error
>>
>> - sensitivity or true positive rate (TPR), eqv. with hit rate, recall:
>>   TPR = TP / P = TP / (TP + FN)
>> - specificity (SPC) or true negative rate:
>>   SPC = TN / N = TN / (TN + FP)
>> - precision or positive predictive value (PPV):
>>   PPV = TP / (TP + FP)
>> - negative predictive value (NPV):
>>   NPV = TN / (TN + FN)
>> - fall-out or false positive rate (FPR):
>>   FPR = FP / N = FP / (FP + TN) = 1 - SPC
>> - false negative rate (FNR):
>>   FNR = FN / (TP + FN) = 1 - TPR
>> - false discovery rate (FDR):
>>   FDR = FP / (TP + FP) = 1 - PPV
>>
>> - accuracy (ACC):
>>   ACC = (TP + TN) / (TP + FP + FN + TN)
>> - F1 score, the harmonic mean of precision and sensitivity:
>>   F1 = 2TP / (2TP + FP + FN)
>> - Matthews correlation coefficient (MCC):
>>   MCC = (TP*TN - FP*FN) / sqrt( (TP+FP)(TP+FN)(TN+FP)(TN+FN) )
>> - Informedness: TPR + SPC - 1
>> - Markedness: PPV + NPV - 1
>>
>> Sources: Fawcett (2006) and Powers (2011).
>>
>> [1] <https://en.wikipedia.org/wiki/Sensitivity_and_specificity>
>>
>> Thanks and Regards,
>> Ashen
>> --
>> *Ashen Weerathunga*
>> Software Engineer - Intern
>> WSO2 Inc.: http://wso2.com
>> lean.enterprise.middleware
>>
>> Email: [email protected]
>> Mobile: +94 716042995
>> LinkedIn: http://lk.linkedin.com/in/ashenweerathunga
>>
>> _______________________________________________
>> Dev mailing list
>> [email protected]
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
> --
> Cheers,
> Madhuka Udantha
> http://madhukaudantha.blogspot.com

--
*Ashen Weerathunga*
Software Engineer - Intern
WSO2 Inc.: http://wso2.com
lean.enterprise.middleware

Email: [email protected]
Mobile: +94 716042995
LinkedIn: http://lk.linkedin.com/in/ashenweerathunga
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
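[Editor's note: the measures discussed in this thread can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only; the function names and the example counts are hypothetical and not taken from the ML product discussed above. The imbalanced example also shows why the thread prefers precision/recall/F1 over plain accuracy.]

```python
# Minimal sketch: the measures discussed in the thread, from raw
# confusion-matrix counts (tp, tn, fp, fn).

def sensitivity(tp, fn):
    """True positive rate (recall): TP / (TP + FN)."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and sensitivity: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

def accuracy(tp, tn, fp, fn):
    """(TP + TN) / (TP + TN + FP + FN) -- misleading on imbalanced data."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical imbalanced test set: 990 normal points, 10 true anomalies.
tp, fn, fp, tn = 8, 2, 20, 970

print(f"sensitivity = {sensitivity(tp, fn):.3f}")       # 8/10   = 0.800
print(f"precision   = {precision(tp, fp):.3f}")         # 8/28   = 0.286
print(f"F1          = {f1_score(tp, fp, fn):.3f}")      # 16/38  = 0.421
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.3f}")  # 978/1000 = 0.978
```

Note how accuracy looks excellent (0.978) even though the detector raises 20 false alarms for 8 true detections, which is exactly why the thread settles on precision, sensitivity, and their combinations (F1, PR curve) for rare-positive problems.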
