Hi all. Since we are considering anomaly detection, a true positive would be a case where a true anomaly is detected as an anomaly by the model. Since in a real-world anomaly detection scenario, as you said, the positive (anomaly) instances are very rare, we can't rely on a more general measure. So I can summarize the most applicable measures as below:
- Sensitivity (recall) - gives the true positive rate ( TP / (TP + FN) )
- Precision - gives the probability that a positive prediction is a true positive ( TP / (TP + FP) )
- PR curve - the precision-recall curve, which plots precision vs. recall (sensitivity)
- F1 score - gives the harmonic mean of precision and sensitivity ( 2TP / (2TP + FP + FN) )

So precision and sensitivity are the most suitable measures for a model where positive instances are very rare, and the PR curve and F1 score combine both of them. So the PR curve and F1 score can be used to tell how good the model is, IMO. We can also give sensitivity and precision separately.

Thanks everyone for the support.

@Srinath, sure, I will write an article.

Thanks and Regards,
Ashen

On Thu, Sep 17, 2015 at 10:19 AM, madhuka udantha <[email protected]> wrote:

> Hi,
>
> There is a good survey paper on anomaly detection [1]. Given your needs, it
> seems you will not have to go through the whole survey, but a few of its
> subtopics will be very useful for your work.
>
> [1] Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly
> detection: A survey. ACM Comput. Surv. 41, 3, Article 15 (July 2009), 58
> pages. DOI=10.1145/1541880.1541882
> <http://www.researchgate.net/profile/Vipin_Kumar26/publication/220565847_Anomaly_detection_A_survey/links/0deec5161f0ca7302a000000.pdf>
> [Cited by 2458]
>
> On Wed, Sep 16, 2015 at 3:35 PM, Ashen Weerathunga <[email protected]> wrote:
>
>> Hi all,
>>
>> I am currently integrating the anomaly detection feature for ML, and I
>> have the problem of choosing the best accuracy measure for the model. I can
>> get the confusion matrix, which consists of true positives, true negatives,
>> false positives and false negatives, and there are a few different measures
>> such as sensitivity, accuracy, F1 score, etc.
>> So what will be the best measure to give as the model accuracy for an
>> anomaly detection model?
>>
>> Some details about those measures [1]:
>>
>> Terminology and derivations from a confusion matrix:
>> - true positive (TP): eqv. with hit
>> - true negative (TN): eqv. with correct rejection
>> - false positive (FP): eqv. with false alarm, Type I error
>> - false negative (FN): eqv. with miss, Type II error
>>
>> - sensitivity or true positive rate (TPR), eqv. with hit rate, recall:
>>   TPR = TP / P = TP / (TP + FN)
>> - specificity (SPC) or true negative rate:
>>   SPC = TN / N = TN / (TN + FP)
>> - precision or positive predictive value (PPV):
>>   PPV = TP / (TP + FP)
>> - negative predictive value (NPV):
>>   NPV = TN / (TN + FN)
>> - fall-out or false positive rate (FPR):
>>   FPR = FP / N = FP / (FP + TN) = 1 - SPC
>> - false negative rate (FNR):
>>   FNR = FN / (TP + FN) = 1 - TPR
>> - false discovery rate (FDR):
>>   FDR = FP / (TP + FP) = 1 - PPV
>>
>> - accuracy (ACC):
>>   ACC = (TP + TN) / (TP + FP + FN + TN)
>> - F1 score, the harmonic mean of precision and sensitivity:
>>   F1 = 2TP / (2TP + FP + FN)
>> - Matthews correlation coefficient (MCC):
>>   MCC = (TP*TN - FP*FN) / sqrt( (TP+FP)(TP+FN)(TN+FP)(TN+FN) )
>> - Informedness: TPR + SPC - 1
>> - Markedness: PPV + NPV - 1
>>
>> Sources: Fawcett (2006) and Powers (2011).
>>
>> [1] <https://en.wikipedia.org/wiki/Sensitivity_and_specificity>
>>
>> Thanks and Regards,
>> Ashen
>> --
>> *Ashen Weerathunga*
>> Software Engineer - Intern
>> WSO2 Inc.: http://wso2.com
>> lean.enterprise.middleware
>>
>> Email: [email protected]
>> Mobile: +94 716042995
>> LinkedIn: http://lk.linkedin.com/in/ashenweerathunga
>>
>> _______________________________________________
>> Dev mailing list
>> [email protected]
>> http://wso2.org/cgi-bin/mailman/listinfo/dev
> --
> Cheers,
> Madhuka Udantha
> http://madhukaudantha.blogspot.com

--
*Ashen Weerathunga*
Software Engineer - Intern
WSO2 Inc.: http://wso2.com
lean.enterprise.middleware

Email: [email protected]
Mobile: +94 716042995
LinkedIn: http://lk.linkedin.com/in/ashenweerathunga
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
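[Editor's note: the measures discussed in this thread can be computed directly from the four confusion-matrix counts. The sketch below is illustrative only; the function names and the example counts are hypothetical and not taken from the ML product discussed above. The imbalanced example also shows why the thread prefers precision/recall/F1 over plain accuracy.]

```python
# Minimal sketch: the measures discussed in the thread, from raw
# confusion-matrix counts (tp, tn, fp, fn).

def sensitivity(tp, fn):
    """True positive rate (recall): TP / (TP + FN)."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and sensitivity: 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)

def accuracy(tp, tn, fp, fn):
    """(TP + TN) / (TP + TN + FP + FN) -- misleading on imbalanced data."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical imbalanced test set: 990 normal points, 10 true anomalies.
tp, fn, fp, tn = 8, 2, 20, 970

print(f"sensitivity = {sensitivity(tp, fn):.3f}")       # 8/10   = 0.800
print(f"precision   = {precision(tp, fp):.3f}")         # 8/28   = 0.286
print(f"F1          = {f1_score(tp, fp, fn):.3f}")      # 16/38  = 0.421
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.3f}")  # 978/1000 = 0.978
```

Note how accuracy looks excellent (0.978) even though the detector raises 20 false alarms for 8 true detections, which is exactly why the thread settles on precision, sensitivity, and their combinations (F1, PR curve) for rare-positive problems.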
