Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-24 Thread Supun Sethunga
Hi Ashen,

In probabilistic models, what we do is compare the predicted output for a
new data point against a cutoff probability, to decide which class it
belongs to. This cutoff probability is decided by the user, and hence is
free to vary between 0 and 1. So for a set of newly arrived data points, we
can change the cutoff probability any number of times (between 0 and 1) and
obtain a series of confusion matrices.

But in this case (from what I understood from the other mail thread, the
logic applied here is): you first cluster the data, then for each incoming
data point you find the nearest cluster, and then compare the distance
between the new point and that cluster's center with the cluster boundary
(please correct me if I'm mistaken). So we have only one static value as the
class boundary, and hence cannot have a series of confusion matrices (which
means no ROC). But again, in the other mail thread you mentioned "select the
percentile value from distances of each clusters as their cluster
boundaries". I'm not really sure what that percentile value is, but if it is
a tunable or user-preferred value, I think we can vary it and do a similar
thing as in the probabilistic case. This means we change the cluster
boundaries and see how the accuracy (or the other measurement statistics)
changes.
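
To make this concrete, here is a minimal sketch of the idea (plain
Python/numpy with made-up labels and probabilities; this is only an
illustration, not the ML implementation): sweeping the cutoff over the
predicted probabilities gives one confusion matrix per cutoff.

import numpy as np

# Hypothetical ground-truth labels (1 = anomaly) and predicted probabilities.
y_true = np.array([0, 0, 1, 0, 1, 0, 0, 1])
y_prob = np.array([0.1, 0.4, 0.8, 0.2, 0.6, 0.3, 0.5, 0.9])

for cutoff in np.linspace(0.1, 0.9, 9):
    y_pred = (y_prob >= cutoff).astype(int)   # class decided by the cutoff
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    print("cutoff=%.1f  TP=%d FP=%d FN=%d TN=%d" % (cutoff, tp, fp, fn, tn))

Each (FPR, TPR) pair computed from these matrices becomes one point of the
ROC curve.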

Regards,
Supun


On Wed, Sep 23, 2015 at 9:17 AM, Ashen Weerathunga  wrote:

> Hi all,
>
> Thanks Mahesan for the suggestion. yes we can give all the measure if It
> is better.
>
> But there is some problem of drawing PR curve or ROC curve. Since we can
> get only one point using the confusion matrix we cant give PR curve or ROC
> curve in the summary of the model. Currently ROC curve provided only in
> probabilistic classification methods. It's also calculated using the model
> itself. But in this scenario we use K means algorithm. after generating the
> clusters we evaluate the model using the test data according to the
> percentile value that user provided. So as a result we can get the
> confusion matrix which consist of TP,TN,FP,FN. But to draw a PR curve or
> ROC curve that is not enough. Does anyone have any suggestions about that?
> or should we drop it?
>
> On Mon, Sep 21, 2015 at 7:05 AM, Sinnathamby Mahesan  > wrote:
>
>> Ashen
>> Here is a situation:
>> Doctors  are testing a person for a disease, say, d.
>> Doctor's point of view +ve means  patient has (d)
>>
>> Which is of the following is worse than the other?
>> (1) The person who does NOT  have (d)  is identified as having (d)  -
>>  (that is, false  positive )
>> (2) The person who does have (d) is identified as NOT having (d)   -
>>  (that is, false negative)
>>
>> Doctors  argument is that  we have to be more concern on reducing case
>>  (2)
>> That is to say,  the sensitivity needs to be high.
>>
>> Anyway, I also thought it is better to display all measures :
>> sensitivity, specificity, precision and F1-Score
>> (suggesting to consider sensitivity for the case of  anomalous being
>> positive.
>>
>> Good Luck
>> Mahesan
>>
>>
>> On 18 September 2015 at 15:27, Ashen Weerathunga  wrote:
>>
>>> Hi all.
>>>
>>> Since we are considering the anomaly detection true positive would be a
>>> case where a true anomaly detected as a anomaly by the model. Since in the
>>> real world scenario of anomaly detection as you said the positive(anomaly)
>>> instances are vary rare we can't go for more general measure. So I can
>>> summarized the most applicable measures as below,
>>>
>>>- Sensitivity(recall) - gives the True Positive Rate. ( TP/(TP + FN)
>>>)
>>>- Precision - gives the probability of predicting a True Positive
>>>from all positive predictions ( TP/(TP+FP) )
>>>- PR cure - Precision recall(Sensitivity) curve - PR curve plots
>>>Precision Vs. Recall.
>>>- F1 score - gives the harmonic mean of Precision and
>>>Sensitivity(recall) ( 2TP / (2TP + FP + FN) )
>>>
>>> So Precision and the Sensitivity are the most suitable measures to
>>> measure a model where positive instances are very less. And PR curve and F1
>>> score are mixtures of both Sensitivity and Precision. So PR curve and F1
>>> score can be used to tell how good is the model IMO. We can give
>>> Sensitivity and Precision also separately.
>>>
>>> Thanks everyone for the support.
>>>
>>> @Srinath, sure, I will write an article.
>>>
>>>
>>> Thanks and Regards,
>>>
>>> Ashen
>>>
>>> On Thu, Sep 17, 2015 at 10:19 AM, madhuka udantha <
>>> madhukaudan...@gmail.com> wrote:
>>>
 Hi,

 This is good survey paper that can be found regard to Anomaly detection
 [1], According to your need; it seems you will no need to go through whole
 the survey papers. But few sub topics will be very useful for you. This
 paper will be useful for your work.

 [1] Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly
 detection: A survey. ACM Comput. Surv. 41, 3, 

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-24 Thread Supun Sethunga
>
> ...test data according to the percentile value that user provided.


Sorry, I missed this part. If so, can we skip asking the user for the
percentile, and instead create the ROC and let them decide the best
percentile by looking at it?
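
A rough sketch of that suggestion (illustrative Python only; roc_points,
distances and y_true are hypothetical names, and for brevity the boundary is
taken as a percentile of the same distance set rather than the per-cluster
training distances): sweep a range of percentiles, collect one (FPR, TPR)
point per candidate boundary, and let the user pick from the resulting curve.

import numpy as np

def roc_points(distances, y_true, percentiles=range(80, 100)):
    # distances: distance of each test point to its nearest cluster center
    # y_true:    1 for a true anomaly, 0 for a normal point
    points = []
    for p in percentiles:
        boundary = np.percentile(distances, p)        # candidate cluster boundary
        y_pred = (distances > boundary).astype(int)   # outside boundary => anomaly
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        tn = np.sum((y_pred == 0) & (y_true == 0))
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        points.append((p, fpr, tpr))
    return points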

On Thu, Sep 24, 2015 at 10:26 AM, Supun Sethunga  wrote:

> Hi Ashen,
>
> In probabilistic models, what we do is, compare the predicted output of a
> new data-point, against a cutoff probability, to decide which class it
> belongs to. And this cutoff probability is decided by the user, hence has
> the freedom to change from 0 to 1. So for a set of newly-arrived data
> points, we can change the "cutoff probability" for any number of times
> (between 0-1) and find a series of confusion matrices.
>
> But in this case, (From what understood from the other mail thread, the
> logic applied here is..) you first cluster the data, then for each incoming
> data, you find the nearest cluster, then compare the distance between the
> new point and the cluster's center, with the cluster-boundary. (please
> correct me if i've mistaken). So we have only one static value as the class
> boundary, and hence cannot have a series of confusion matrices. (which
> means no ROC). But again, in the other mail thread you mentioned "*select
> the* *percentile value from distances of each clusters as their cluster
> boundaries*", Im not really sure what that "percentile" value is, but if
> this is a volatile value or a user preferred value, I think we can change
> that and do a similar thing as in the probabilistic case..  This means we
> are  changing the cluster boundaries and see how the accuracy (or the
> measurement statistics) change.
>
> Regards,
> Supun
>
>
> On Wed, Sep 23, 2015 at 9:17 AM, Ashen Weerathunga  wrote:
>
>> Hi all,
>>
>> Thanks Mahesan for the suggestion. yes we can give all the measure if It
>> is better.
>>
>> But there is some problem of drawing PR curve or ROC curve. Since we can
>> get only one point using the confusion matrix we cant give PR curve or ROC
>> curve in the summary of the model. Currently ROC curve provided only in
>> probabilistic classification methods. It's also calculated using the model
>> itself. But in this scenario we use K means algorithm. after generating the
>> clusters we evaluate the model using the test data according to the
>> percentile value that user provided. So as a result we can get the
>> confusion matrix which consist of TP,TN,FP,FN. But to draw a PR curve or
>> ROC curve that is not enough. Does anyone have any suggestions about that?
>> or should we drop it?
>>
>> On Mon, Sep 21, 2015 at 7:05 AM, Sinnathamby Mahesan <
>> sinnatha...@wso2.com> wrote:
>>
>>> Ashen
>>> Here is a situation:
>>> Doctors  are testing a person for a disease, say, d.
>>> Doctor's point of view +ve means  patient has (d)
>>>
>>> Which is of the following is worse than the other?
>>> (1) The person who does NOT  have (d)  is identified as having (d)  -
>>>  (that is, false  positive )
>>> (2) The person who does have (d) is identified as NOT having (d)   -
>>>  (that is, false negative)
>>>
>>> Doctors  argument is that  we have to be more concern on reducing case
>>>  (2)
>>> That is to say,  the sensitivity needs to be high.
>>>
>>> Anyway, I also thought it is better to display all measures :
>>> sensitivity, specificity, precision and F1-Score
>>> (suggesting to consider sensitivity for the case of  anomalous being
>>> positive.
>>>
>>> Good Luck
>>> Mahesan
>>>
>>>
>>> On 18 September 2015 at 15:27, Ashen Weerathunga  wrote:
>>>
 Hi all.

 Since we are considering the anomaly detection true positive would be a
 case where a true anomaly detected as a anomaly by the model. Since in the
 real world scenario of anomaly detection as you said the positive(anomaly)
 instances are vary rare we can't go for more general measure. So I can
 summarized the most applicable measures as below,

- Sensitivity(recall) - gives the True Positive Rate. ( TP/(TP +
FN) )
- Precision - gives the probability of predicting a True Positive
from all positive predictions ( TP/(TP+FP) )
- PR cure - Precision recall(Sensitivity) curve - PR curve plots
Precision Vs. Recall.
- F1 score - gives the harmonic mean of Precision and
Sensitivity(recall) ( 2TP / (2TP + FP + FN) )

 So Precision and the Sensitivity are the most suitable measures to
 measure a model where positive instances are very less. And PR curve and F1
 score are mixtures of both Sensitivity and Precision. So PR curve and F1
 score can be used to tell how good is the model IMO. We can give
 Sensitivity and Precision also separately.

 Thanks everyone for the support.

 @Srinath, sure, I will write an article.


 Thanks and Regards,

 Ashen

 On Thu, Sep 17, 2015 at 10:19 AM, madhuka udantha <
 

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-24 Thread Ashen Weerathunga
Thanks Dr. Ruvan and Supun for the suggestions!

Yes Supun, in this scenario we take a percentile of all the distances to
identify the cluster boundary, rather than just using the maximum distance.
Right now we are getting that percentile value from the user. Yes, if we
calculate a set of confusion matrices over a set of boundary values, it will
help the user identify the best option. I will work on that. Thanks for the
idea!
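
For reference, a minimal sketch of the per-cluster percentile boundary
(illustrative Python; cluster_boundaries and is_anomaly are hypothetical
helper names, not ML code):

import numpy as np

def cluster_boundaries(distances_per_cluster, percentile=95.0):
    # distances_per_cluster: {cluster_id: distances of that cluster's
    # training points to its center}. Returns {cluster_id: boundary}.
    return {cid: float(np.percentile(d, percentile))
            for cid, d in distances_per_cluster.items()}

def is_anomaly(distance_to_nearest, nearest_cluster, boundaries):
    # A test point is an anomaly if it lies beyond its nearest cluster's boundary.
    return distance_to_nearest > boundaries[nearest_cluster]

Varying the percentile simply moves every boundary in or out, which is what
produces the series of confusion matrices discussed above.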

On Thu, Sep 24, 2015 at 8:05 PM, Supun Sethunga  wrote:

> ...test data according to the percentile value that user provided.
>
>
> Sorry I missed this part. If so, can't we not ask the user the percentile,
> but instead create the ROC and let him decide the best percentile looking
> at the ROC?
>
> On Thu, Sep 24, 2015 at 10:26 AM, Supun Sethunga  wrote:
>
>> Hi Ashen,
>>
>> In probabilistic models, what we do is, compare the predicted output of a
>> new data-point, against a cutoff probability, to decide which class it
>> belongs to. And this cutoff probability is decided by the user, hence has
>> the freedom to change from 0 to 1. So for a set of newly-arrived data
>> points, we can change the "cutoff probability" for any number of times
>> (between 0-1) and find a series of confusion matrices.
>>
>> But in this case, (From what understood from the other mail thread, the
>> logic applied here is..) you first cluster the data, then for each incoming
>> data, you find the nearest cluster, then compare the distance between the
>> new point and the cluster's center, with the cluster-boundary. (please
>> correct me if i've mistaken). So we have only one static value as the class
>> boundary, and hence cannot have a series of confusion matrices. (which
>> means no ROC). But again, in the other mail thread you mentioned "*select
>> the* *percentile value from distances of each clusters as their cluster
>> boundaries*", Im not really sure what that "percentile" value is, but if
>> this is a volatile value or a user preferred value, I think we can change
>> that and do a similar thing as in the probabilistic case..  This means we
>> are  changing the cluster boundaries and see how the accuracy (or the
>> measurement statistics) change.
>>
>> Regards,
>> Supun
>>
>>
>> On Wed, Sep 23, 2015 at 9:17 AM, Ashen Weerathunga 
>> wrote:
>>
>>> Hi all,
>>>
>>> Thanks Mahesan for the suggestion. yes we can give all the measure if It
>>> is better.
>>>
>>> But there is some problem of drawing PR curve or ROC curve. Since we can
>>> get only one point using the confusion matrix we cant give PR curve or ROC
>>> curve in the summary of the model. Currently ROC curve provided only in
>>> probabilistic classification methods. It's also calculated using the model
>>> itself. But in this scenario we use K means algorithm. after generating the
>>> clusters we evaluate the model using the test data according to the
>>> percentile value that user provided. So as a result we can get the
>>> confusion matrix which consist of TP,TN,FP,FN. But to draw a PR curve or
>>> ROC curve that is not enough. Does anyone have any suggestions about that?
>>> or should we drop it?
>>>
>>> On Mon, Sep 21, 2015 at 7:05 AM, Sinnathamby Mahesan <
>>> sinnatha...@wso2.com> wrote:
>>>
 Ashen
 Here is a situation:
 Doctors  are testing a person for a disease, say, d.
 Doctor's point of view +ve means  patient has (d)

 Which is of the following is worse than the other?
 (1) The person who does NOT  have (d)  is identified as having (d)  -
  (that is, false  positive )
 (2) The person who does have (d) is identified as NOT having (d)   -
  (that is, false negative)

 Doctors  argument is that  we have to be more concern on reducing case
  (2)
 That is to say,  the sensitivity needs to be high.

 Anyway, I also thought it is better to display all measures :
 sensitivity, specificity, precision and F1-Score
 (suggesting to consider sensitivity for the case of  anomalous being
 positive.

 Good Luck
 Mahesan


 On 18 September 2015 at 15:27, Ashen Weerathunga 
 wrote:

> Hi all.
>
> Since we are considering the anomaly detection true positive would be
> a case where a true anomaly detected as a anomaly by the model. Since in
> the real world scenario of anomaly detection as you said the
> positive(anomaly) instances are vary rare we can't go for more general
> measure. So I can summarized the most applicable measures as below,
>
>- Sensitivity(recall) - gives the True Positive Rate. ( TP/(TP +
>FN) )
>- Precision - gives the probability of predicting a True Positive
>from all positive predictions ( TP/(TP+FP) )
>- PR cure - Precision recall(Sensitivity) curve - PR curve plots
>Precision Vs. Recall.
>- F1 score - gives the harmonic mean of Precision and
>

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-23 Thread Ashen Weerathunga
Hi all,

Thanks Mahesan for the suggestion. Yes, we can give all the measures if that
is better.

But there is a problem with drawing a PR curve or an ROC curve. Since we can
get only one point from the confusion matrix, we can't give a PR curve or an
ROC curve in the summary of the model. Currently the ROC curve is provided
only for probabilistic classification methods, and it is calculated using the
model itself. In this scenario, however, we use the k-means algorithm: after
generating the clusters, we evaluate the model using the test data according
to the percentile value that the user provided. As a result we get the
confusion matrix, which consists of TP, TN, FP and FN, but that is not enough
to draw a PR curve or an ROC curve. Does anyone have any suggestions about
that, or should we drop it?
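
For reference, a small end-to-end sketch of that evaluation flow
(scikit-learn k-means on synthetic data, purely to illustrate the logic just
described; this is not the ML product code, and the 98th percentile is only
an example value):

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=(500, 2))              # "normal" training data
test = np.vstack([rng.normal(0, 1, size=(95, 2)),    # mostly normal test points
                  rng.normal(6, 1, size=(5, 2))])    # a few true anomalies
y_true = np.array([0] * 95 + [1] * 5)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(train)

# Per-cluster boundary: a user-chosen percentile of the training distances.
train_d = np.linalg.norm(train - km.cluster_centers_[km.labels_], axis=1)
boundaries = {c: np.percentile(train_d[km.labels_ == c], 98) for c in range(3)}

# A test point is an anomaly if it falls outside its nearest cluster's boundary.
nearest = km.predict(test)
test_d = np.linalg.norm(test - km.cluster_centers_[nearest], axis=1)
y_pred = np.array([int(d > boundaries[c]) for d, c in zip(test_d, nearest)])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))
print("TP, TN, FP, FN =", tp, tn, fp, fn)  # one confusion matrix for one percentile

Because the percentile is fixed, this yields a single confusion matrix, i.e.
a single operating point, which is exactly why a full PR or ROC curve is not
available without sweeping the boundary.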

On Mon, Sep 21, 2015 at 7:05 AM, Sinnathamby Mahesan 
wrote:

> Ashen
> Here is a situation:
> Doctors  are testing a person for a disease, say, d.
> Doctor's point of view +ve means  patient has (d)
>
> Which is of the following is worse than the other?
> (1) The person who does NOT  have (d)  is identified as having (d)  -
>  (that is, false  positive )
> (2) The person who does have (d) is identified as NOT having (d)   -
>  (that is, false negative)
>
> Doctors  argument is that  we have to be more concern on reducing case
>  (2)
> That is to say,  the sensitivity needs to be high.
>
> Anyway, I also thought it is better to display all measures : sensitivity,
> specificity, precision and F1-Score
> (suggesting to consider sensitivity for the case of  anomalous being
> positive.
>
> Good Luck
> Mahesan
>
>
> On 18 September 2015 at 15:27, Ashen Weerathunga  wrote:
>
>> Hi all.
>>
>> Since we are considering the anomaly detection true positive would be a
>> case where a true anomaly detected as a anomaly by the model. Since in the
>> real world scenario of anomaly detection as you said the positive(anomaly)
>> instances are vary rare we can't go for more general measure. So I can
>> summarized the most applicable measures as below,
>>
>>- Sensitivity(recall) - gives the True Positive Rate. ( TP/(TP + FN) )
>>- Precision - gives the probability of predicting a True Positive
>>from all positive predictions ( TP/(TP+FP) )
>>- PR cure - Precision recall(Sensitivity) curve - PR curve plots
>>Precision Vs. Recall.
>>- F1 score - gives the harmonic mean of Precision and
>>Sensitivity(recall) ( 2TP / (2TP + FP + FN) )
>>
>> So Precision and the Sensitivity are the most suitable measures to
>> measure a model where positive instances are very less. And PR curve and F1
>> score are mixtures of both Sensitivity and Precision. So PR curve and F1
>> score can be used to tell how good is the model IMO. We can give
>> Sensitivity and Precision also separately.
>>
>> Thanks everyone for the support.
>>
>> @Srinath, sure, I will write an article.
>>
>>
>> Thanks and Regards,
>>
>> Ashen
>>
>> On Thu, Sep 17, 2015 at 10:19 AM, madhuka udantha <
>> madhukaudan...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> This is good survey paper that can be found regard to Anomaly detection
>>> [1], According to your need; it seems you will no need to go through whole
>>> the survey papers. But few sub topics will be very useful for you. This
>>> paper will be useful for your work.
>>>
>>> [1] Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly
>>> detection: A survey. ACM Comput. Surv. 41, 3, Article 15 (July 2009), 58
>>> pages. DOI=10.1145/1541880.1541882
>>> 
>>> [Cited by 2458]
>>>
>>> On Wed, Sep 16, 2015 at 3:35 PM, Ashen Weerathunga 
>>> wrote:
>>>
 Hi all,

 I am currently doing the integration of anomaly detection feature for
 ML. I have a problem of choosing the best accuracy measure for the model. I
 can get the confusion matrix which consists of true positives, true
 negatives, false positives and false negatives. There are few different
 measures such as sensitivity, accuracy, F1 score, etc. So what will be the
 best measure to give as the model accuracy for anomaly detection model.

 [1] Some
 details about those measures.

 [confusion-matrix terminology table from Wikipedia was quoted here; garbled in the archive, trimmed]

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-18 Thread Ashen Weerathunga
Hi all,

Since we are considering anomaly detection, a true positive would be a case
where a true anomaly is detected as an anomaly by the model. Since, in a
real-world anomaly detection scenario, the positive (anomaly) instances are
very rare, as you said, we can't go for the more general measures. So I can
summarize the most applicable measures as below:

   - Sensitivity (recall) - gives the True Positive Rate ( TP/(TP + FN) )
   - Precision - gives the probability of predicting a True Positive out of
   all positive predictions ( TP/(TP+FP) )
   - PR curve - Precision-Recall (Sensitivity) curve - plots Precision vs.
   Recall
   - F1 score - gives the harmonic mean of Precision and Sensitivity (recall)
   ( 2TP / (2TP + FP + FN) )

So Precision and Sensitivity are the most suitable measures for a model where
positive instances are very rare, and the PR curve and F1 score combine both
Sensitivity and Precision. So the PR curve and F1 score can be used to tell
how good the model is, IMO. We can also give Sensitivity and Precision
separately.
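
For completeness, a tiny helper that turns the confusion-matrix counts into
the measures listed above (plain Python, illustrative only; the example
counts are made up):

def anomaly_measures(tp, fp, fn, tn):
    # Sensitivity (recall), precision and F1 from confusion-matrix counts.
    # tn is unused here: these measures deliberately ignore true negatives.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}

# e.g. anomaly_measures(tp=8, fp=4, fn=2, tn=986)
#  -> sensitivity 0.80, precision ~0.67, F1 ~0.73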

Thanks everyone for the support.

@Srinath, sure, I will write an article.


Thanks and Regards,

Ashen

On Thu, Sep 17, 2015 at 10:19 AM, madhuka udantha 
wrote:

> Hi,
>
> This is good survey paper that can be found regard to Anomaly detection
> [1], According to your need; it seems you will no need to go through whole
> the survey papers. But few sub topics will be very useful for you. This
> paper will be useful for your work.
>
> [1] Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly
> detection: A survey. ACM Comput. Surv. 41, 3, Article 15 (July 2009), 58
> pages. DOI=10.1145/1541880.1541882
> 
> [Cited by 2458]
>
> On Wed, Sep 16, 2015 at 3:35 PM, Ashen Weerathunga  wrote:
>
>> Hi all,
>>
>> I am currently doing the integration of anomaly detection feature for ML.
>> I have a problem of choosing the best accuracy measure for the model. I can
>> get the confusion matrix which consists of true positives, true negatives,
>> false positives and false negatives. There are few different measures such
>> as sensitivity, accuracy, F1 score, etc. So what will be the best measure
>> to give as the model accuracy for anomaly detection model.
>>
>> [1] Some
>> details about those measures.
>>
>> [confusion-matrix terminology table from Wikipedia was quoted here; garbled in the archive, trimmed]

[Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-16 Thread Ashen Weerathunga
Hi all,

I am currently integrating the anomaly detection feature into ML. I have the
problem of choosing the best accuracy measure for the model. I can get the
confusion matrix, which consists of true positives, true negatives, false
positives and false negatives. There are a few different measures, such as
sensitivity, accuracy, F1 score, etc. So what would be the best measure to
give as the model accuracy for an anomaly detection model?

[1] Some details about those measures.

Terminology and derivations from a confusion matrix:

  true positive (TP)  - eqv. with hit
  true negative (TN)  - eqv. with correct rejection
  false positive (FP) - eqv. with false alarm, Type I error
  false negative (FN) - eqv. with miss, Type II error

  sensitivity, true positive rate (TPR), hit rate, recall:
      TPR = TP / P = TP / (TP + FN)
  specificity (SPC), true negative rate:
      SPC = TN / N = TN / (TN + FP)
  precision, positive predictive value (PPV):
      PPV = TP / (TP + FP)
  negative predictive value (NPV):
      NPV = TN / (TN + FN)
  fall-out, false positive rate (FPR):
      FPR = FP / N = FP / (FP + TN) = 1 - SPC
  false negative rate (FNR):
      FNR = FN / (TP + FN) = 1 - TPR
  false discovery rate (FDR):
      FDR = FP / (TP + FP) = 1 - PPV

  accuracy (ACC):
      ACC = (TP + TN) / (TP + FP + FN + TN)
  F1 score (harmonic mean of precision and sensitivity):
      F1 = 2TP / (2TP + FP + FN)
  Matthews correlation coefficient (MCC):
      MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
  Informedness: TPR + SPC - 1
  Markedness: PPV + NPV - 1

Sources: Fawcett (2006) and Powers (2011). [1]

[2]


Thanks and Regards,
Ashen
-- 
*Ashen Weerathunga*
Software Engineer - Intern
WSO2 Inc.: http://wso2.com
lean.enterprise.middleware

Email: as...@wso2.com
Mobile: +94 716042995 <94716042995>
LinkedIn:
*http://lk.linkedin.com/in/ashenweerathunga
*


Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-16 Thread Sinnathamby Mahesan
Dear Ashen
Sensitivity - in view of reducing the false negatives
Precision - in view of reducing the false positives

The F1 score combines both, as the harmonic mean of precision and sensitivity.

That's why F1 is normally chosen, and it is simple: F1 = 2TP / (2TP + FN + FP)



By the way, which do you consider the True Positive:
(a) Anomaly - Anomaly
or
(b) Normal - Normal

I think case (a) is better suited with regard to your objective.

Or, if you have trouble choosing which way:

You could consider Accuracy (Acc), which is somewhat similar to F1 but
gives the same weight to TP and TN:
Acc = (TP + TN) / (TP + TN + FN + FP)
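
A quick worked example of those two formulas with hypothetical counts, to
show how they can diverge on imbalanced data:

# Hypothetical counts: 10 true anomalies among 1,000 points; the model catches 5.
TP, FN, FP, TN = 5, 5, 10, 980
F1  = 2 * TP / (2 * TP + FN + FP)        # = 10 / 25   = 0.40
Acc = (TP + TN) / (TP + TN + FN + FP)    # = 985 / 1000 = 0.985
# Acc looks excellent even though half of the anomalies were missed.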



= Good Luck




On 16 September 2015 at 15:35, Ashen Weerathunga  wrote:

> Hi all,
>
> I am currently doing the integration of anomaly detection feature for ML.
> I have a problem of choosing the best accuracy measure for the model. I can
> get the confusion matrix which consists of true positives, true negatives,
> false positives and false negatives. There are few different measures such
> as sensitivity, accuracy, F1 score, etc. So what will be the best measure
> to give as the model accuracy for anomaly detection model.
>
> [1] Some
> details about those measures.
>
> [confusion-matrix terminology table from Wikipedia was quoted here; garbled in the archive, trimmed]
>
> Thanks and Regards,
> Ashen
> --
> *Ashen Weerathunga*
> Software Engineer - Intern
> WSO2 Inc.: http://wso2.com
> lean.enterprise.middleware
>
> Email: as...@wso2.com
> Mobile: +94 716042995 <94716042995>
> LinkedIn:
> *http://lk.linkedin.com/in/ashenweerathunga
> *
>



-- 
Sinnathamby Mahesan

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-16 Thread CD Athuraliya
Hi Ashen,

When selecting evaluation measures, please note the class imbalance that
typically occurs in anomaly data (anomalous data can be very infrequent
compared to normal data in a real-world dataset). Please check how this
imbalance affects the evaluation measures. I found this paper [1] on the
topic.

And since the data clusters play a vital role in this model, it would be
better if we can show some measures on them as well, IMO.

[1] http://marmota.dlsi.uji.es/WebBIB/papers/2007/1_GarciaTamida2007.pdf

Regards,
CD

On Thu, Sep 17, 2015 at 6:18 AM, A. R.Weerasinghe 
wrote:

> I'm sorry, that was a general answer.
>
> For anomaly detection, I'd say sensitivity or specificity (depending what
> is positive and what is negative: Mahesan's point) is more important than
> the others.
>
> For example, in a data set of 10,000 samples, where 100 of these samples
> are labeled positive (anomalous), a predictor that predicts "Negative" for
> every instance it is presented with evaluates to Precision = 100%, Accuracy
> = 99%, and Specificity = 100%. This predictor would be entirely useless,
> and yet these measures show it performs very well. The same predictor would
> evaluate to Recall (sensitivity) = 0%. In this case, Sensitivity seems to
> be most in tune with how well the classifier is actually performing.
>
> The other extreme is a data set where many of the examples are positive
> (normal). For example if 9,900 out of 10,000 instances are positive, and a
> classifier predicts positive on all instances, then Precision = 99%,
> Accuracy = 99%, Specificity = 0%, and Recall = 100%. In this case,
> Specificity shows that this classifier is problematic.
>
> Hope this helps.
>
>
>
> On Thu, Sep 17, 2015 at 6:05 AM, A. R.Weerasinghe 
> wrote:
>
>> Usually F1 measure and area under ROC curve.
>>
>> Ruvan.
>>
>>
>> On Thu, Sep 17, 2015 at 5:20 AM, Sinnathamby Mahesan <
>> sinnatha...@wso2.com> wrote:
>>
>>> Dear Ashen
>>>  Sensitivity  - in view of reducing the false negative
>>> Precision - in view of reducing the false positive
>>>
>>> F1 score combines both as the harmonic mean of precision and sensitivity
>>>
>>> That's why F1 is chosen normally and is simple  (2TP / (2TP + FN + FP))
>>>
>>>
>>>
>>> By the way, which you consider is True positive
>>> (a) Anomaly  - Anomaly
>>> or
>>> (b) Normal - Normal
>>>
>>> I think case (a) is more suited to your with regard to your objective.
>>>
>>> Or If you have trouble in choosing which way:
>>>
>>> You could consider Accuracy (Acc) which is somewhat similar to F1, but
>>> gives same weight to TP and TN
>>> Acc= ( ( TP + TN) / (TP + TN + FN + FP))
>>>
>>>
>>>
>>> = Good Luck
>>>
>>>
>>>
>>>
>>> On 16 September 2015 at 15:35, Ashen Weerathunga  wrote:
>>>
 Hi all,

 I am currently doing the integration of anomaly detection feature for
 ML. I have a problem of choosing the best accuracy measure for the model. I
 can get the confusion matrix which consists of true positives, true
 negatives, false positives and false negatives. There are few different
 measures such as sensitivity, accuracy, F1 score, etc. So what will be the
 best measure to give as the model accuracy for anomaly detection model.

 [1] Some
 details about those measures.

 [confusion-matrix terminology table from Wikipedia was quoted here; garbled in the archive, trimmed]

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-16 Thread Srinath Perera
Ashen, when you conclude this, can you write a blog/article comparing the
different methods and explaining why the chosen one is better?

--Srinath

On Thu, Sep 17, 2015 at 9:59 AM, Srinath Perera  wrote:

> Seshika and myself were talking to forester analyst and he mentioned "Lorenz
> curve" is used in fraud cases.
>
> Please read and find out what it is and how it compare to RoC etc.
>  see
> https://www.quora.com/What-is-the-difference-between-a-ROC-curve-and-a-precision-recall-curve-When-should-I-use-each
>
> On Thu, Sep 17, 2015 at 9:07 AM, CD Athuraliya 
> wrote:
>
>> Hi Ashen,
>>
>> Please note the class imbalance which can typically occur in anomaly data
>> when selecting evaluation measures (anomalous data can be very infrequent
>> compared to normal data in a real-world dataset). Please check how this
>> imbalance affects evaluation measures. I found this paper [1] on this topic.
>>
>> And since the data clusters play a vital role in this model it would be
>> better if we can show some measures on them as well IMO.
>>
>> [1] http://marmota.dlsi.uji.es/WebBIB/papers/2007/1_GarciaTamida2007.pdf
>>
>> Regards,
>> CD
>>
>> On Thu, Sep 17, 2015 at 6:18 AM, A. R.Weerasinghe 
>> wrote:
>>
>>> I'm sorry, that was a general answer.
>>>
>>> For anomaly detection, I'd say sensitivity or specificity (depending
>>> what is positive and what is negative: Mahesan's point) is more important
>>> than the others.
>>>
>>> For example, in a data set of 10,000 samples, where 100 of these samples
>>> are labeled positive (anomalous), a predictor that predicts "Negative" for
>>> every instance it is presented with evaluates to Precision = 100%, Accuracy
>>> = 99%, and Specificity = 100%. This predictor would be entirely useless,
>>> and yet these measures show it performs very well. The same predictor would
>>> evaluate to Recall (sensitivity) = 0%. In this case, Sensitivity seems to
>>> be most in tune with how well the classifier is actually performing.
>>>
>>> The other extreme is a data set where many of the examples are positive
>>> (normal). For example if 9,900 out of 10,000 instances are positive, and a
>>> classifier predicts positive on all instances, then Precision = 99%,
>>> Accuracy = 99%, Specificity = 0%, and Recall = 100%. In this case,
>>> Specificity shows that this classifier is problematic.
>>>
>>> Hope this helps.
>>>
>>>
>>>
>>> On Thu, Sep 17, 2015 at 6:05 AM, A. R.Weerasinghe 
>>> wrote:
>>>
 Usually F1 measure and area under ROC curve.

 Ruvan.


 On Thu, Sep 17, 2015 at 5:20 AM, Sinnathamby Mahesan <
 sinnatha...@wso2.com> wrote:

> Dear Ashen
>  Sensitivity  - in view of reducing the false negative
> Precision - in view of reducing the false positive
>
> F1 score combines both as the harmonic mean of precision and
> sensitivity
>
> That's why F1 is chosen normally and is simple  (2TP / (2TP + FN + FP))
>
>
>
> By the way, which you consider is True positive
> (a) Anomaly  - Anomaly
> or
> (b) Normal - Normal
>
> I think case (a) is more suited to your with regard to your objective.
>
> Or If you have trouble in choosing which way:
>
> You could consider Accuracy (Acc) which is somewhat similar to F1, but
> gives same weight to TP and TN
> Acc= ( ( TP + TN) / (TP + TN + FN + FP))
>
>
>
> = Good Luck
>
>
>
>
> On 16 September 2015 at 15:35, Ashen Weerathunga 
> wrote:
>
>> Hi all,
>>
>> I am currently doing the integration of anomaly detection feature for
>> ML. I have a problem of choosing the best accuracy measure for the 
>> model. I
>> can get the confusion matrix which consists of true positives, true
>> negatives, false positives and false negatives. There are few different
>> measures such as sensitivity, accuracy, F1 score, etc. So what will be 
>> the
>> best measure to give as the model accuracy for anomaly detection model.
>>
>> [1] Some
>> details about those measures.
>>
>> [confusion-matrix terminology table from Wikipedia was quoted here; garbled in the archive, trimmed]

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-16 Thread Srinath Perera
Seshika and I were talking to a Forrester analyst, and he mentioned that the
"Lorenz curve" is used in fraud cases.

Please read up on what it is and how it compares to ROC etc.; see
https://www.quora.com/What-is-the-difference-between-a-ROC-curve-and-a-precision-recall-curve-When-should-I-use-each

On Thu, Sep 17, 2015 at 9:07 AM, CD Athuraliya  wrote:

> Hi Ashen,
>
> Please note the class imbalance which can typically occur in anomaly data
> when selecting evaluation measures (anomalous data can be very infrequent
> compared to normal data in a real-world dataset). Please check how this
> imbalance affects evaluation measures. I found this paper [1] on this topic.
>
> And since the data clusters play a vital role in this model it would be
> better if we can show some measures on them as well IMO.
>
> [1] http://marmota.dlsi.uji.es/WebBIB/papers/2007/1_GarciaTamida2007.pdf
>
> Regards,
> CD
>
> On Thu, Sep 17, 2015 at 6:18 AM, A. R.Weerasinghe 
> wrote:
>
>> I'm sorry, that was a general answer.
>>
>> For anomaly detection, I'd say sensitivity or specificity (depending what
>> is positive and what is negative: Mahesan's point) is more important than
>> the others.
>>
>> For example, in a data set of 10,000 samples, where 100 of these samples
>> are labeled positive (anomalous), a predictor that predicts "Negative" for
>> every instance it is presented with evaluates to Precision = 100%, Accuracy
>> = 99%, and Specificity = 100%. This predictor would be entirely useless,
>> and yet these measures show it performs very well. The same predictor would
>> evaluate to Recall (sensitivity) = 0%. In this case, Sensitivity seems to
>> be most in tune with how well the classifier is actually performing.
>>
>> The other extreme is a data set where many of the examples are positive
>> (normal). For example if 9,900 out of 10,000 instances are positive, and a
>> classifier predicts positive on all instances, then Precision = 99%,
>> Accuracy = 99%, Specificity = 0%, and Recall = 100%. In this case,
>> Specificity shows that this classifier is problematic.
>>
>> Hope this helps.
>>
>>
>>
>> On Thu, Sep 17, 2015 at 6:05 AM, A. R.Weerasinghe 
>> wrote:
>>
>>> Usually F1 measure and area under ROC curve.
>>>
>>> Ruvan.
>>>
>>>
>>> On Thu, Sep 17, 2015 at 5:20 AM, Sinnathamby Mahesan <
>>> sinnatha...@wso2.com> wrote:
>>>
 Dear Ashen
  Sensitivity  - in view of reducing the false negative
 Precision - in view of reducing the false positive

 F1 score combines both as the harmonic mean of precision and sensitivity

 That's why F1 is chosen normally and is simple  (2TP / (2TP + FN + FP))



 By the way, which you consider is True positive
 (a) Anomaly  - Anomaly
 or
 (b) Normal - Normal

 I think case (a) is more suited to your with regard to your objective.

 Or If you have trouble in choosing which way:

 You could consider Accuracy (Acc) which is somewhat similar to F1, but
 gives same weight to TP and TN
 Acc= ( ( TP + TN) / (TP + TN + FN + FP))



 = Good Luck




 On 16 September 2015 at 15:35, Ashen Weerathunga 
 wrote:

> Hi all,
>
> I am currently doing the integration of anomaly detection feature for
> ML. I have a problem of choosing the best accuracy measure for the model. 
> I
> can get the confusion matrix which consists of true positives, true
> negatives, false positives and false negatives. There are few different
> measures such as sensitivity, accuracy, F1 score, etc. So what will be the
> best measure to give as the model accuracy for anomaly detection model.
>
> [1] Some
> details about those measures.
>
> [confusion-matrix terminology table from Wikipedia was quoted here; garbled in the archive, trimmed]

Re: [Dev] [ML] Accuracy Measure for Anomaly Detection?

2015-09-16 Thread madhuka udantha
Hi,

This is a good survey paper on anomaly detection [1]. Given your needs, it
seems you will not need to go through the whole survey, but a few of its
subtopics will be very useful for your work.

[1] Varun Chandola, Arindam Banerjee, and Vipin Kumar. 2009. Anomaly
detection: A survey. ACM Comput. Surv. 41, 3, Article 15 (July 2009), 58
pages. DOI=10.1145/1541880.1541882

[Cited by 2458]

On Wed, Sep 16, 2015 at 3:35 PM, Ashen Weerathunga  wrote:

> Hi all,
>
> I am currently doing the integration of anomaly detection feature for ML.
> I have a problem of choosing the best accuracy measure for the model. I can
> get the confusion matrix which consists of true positives, true negatives,
> false positives and false negatives. There are few different measures such
> as sensitivity, accuracy, F1 score, etc. So what will be the best measure
> to give as the model accuracy for anomaly detection model.
>
> [1] Some
> details about those measures.
>
> [confusion-matrix terminology table from Wikipedia was quoted here; garbled in the archive, trimmed]
>
> Thanks and Regards,
> Ashen
> --
> *Ashen Weerathunga*
> Software Engineer - Intern
> WSO2 Inc.: http://wso2.com
> lean.enterprise.middleware
>
> Email: as...@wso2.com
> Mobile: +94 716042995 <94716042995>
> LinkedIn:
> *http://lk.linkedin.com/in/ashenweerathunga
> *
>
>


-- 
Cheers,
Madhuka Udantha
http://madhukaudantha.blogspot.com