+1 nice work Ashen!!

On Fri, Nov 20, 2015 at 10:14 AM, Nirmal Fernando <[email protected]> wrote:

> I must mention the excellent work done and the support extended by
> Ashen during the whole implementation process. He grasped things
> quickly, sought regular feedback, and improved very fast. Kudos to Ashen,
> and keep up the good work! I hope you have learnt a lot during this process.
>
> On Thu, Nov 19, 2015 at 10:43 AM, Seshika Fernando <[email protected]>
> wrote:
>
>> Well done, Ashen. The documentation looks good too. Will review later.
>>
>> seshi
>>
>> On Thu, Nov 19, 2015 at 10:35 AM, Ashen Weerathunga <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> This feature was implemented on ML and released with WSO2 Machine
>>> Learner 1.1.0 - Milestone 1
>>> <https://github.com/wso2/product-ml/releases/tag/v1.1.0-m1>. Thanks
>>> everyone for your ideas and support. Please find the attachments.
>>>
>>> [1] PR - carbon-ml
>>> [2] PR - product-ml
>>>
>>> [1] https://github.com/wso2/carbon-ml/pull/138
>>> [2] https://github.com/wso2/product-ml/pull/263
>>>
>>> Thanks and Regards,
>>> Ashen
>>>
>>> On Mon, Sep 28, 2015 at 11:12 PM, Ashen Weerathunga <[email protected]>
>>> wrote:
>>>
>>>> Sure, thanks Mahesan!
>>>>
>>>> On Mon, Sep 28, 2015 at 9:51 AM, Sinnathamby Mahesan <
>>>> [email protected]> wrote:
>>>>
>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Sinnathamby Mahesan <[email protected]>
>>>>> Date: 28 September 2015 at 09:50
>>>>> Subject: Re: [Architecture] [ML] Anomaly Detection Feature for WSO2 ML
>>>>> To: [email protected]
>>>>> Cc: Nirmal Fernando <[email protected]>
>>>>>
>>>>>
>>>>> Dear Ashen,
>>>>> I know you have programmed it correctly,
>>>>>
>>>>> but here too
>>>>> it is better to show that
>>>>>
>>>>> if (di > ri) for all i = 1..k  =>  Anomalous
>>>>>
>>>>> where k is the number of clusters,
>>>>> di is the distance between the point under consideration and the
>>>>> cluster centre i,
>>>>> and
>>>>> ri is the percentile radius of cluster i.
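Read as code, the rule marks a point as anomalous only when it lies outside the percentile radius of every cluster. The sketch below is a minimal Python illustration under that reading. The function names and the sample centres and radii are hypothetical and are not taken from the WSO2 ML code base.

```python
import math

def is_anomalous_for_all_clusters(point, centres, radii):
    """Return True when d_i > r_i for every cluster i = 1..k."""
    distances = [math.dist(point, centre) for centre in centres]
    return all(d > r for d, r in zip(distances, radii))

def is_anomalous_for_nearest_cluster(point, centres, radii):
    """Return True when the point lies outside its nearest cluster only."""
    distances = [math.dist(point, centre) for centre in centres]
    nearest = min(range(len(centres)), key=lambda i: distances[i])
    return distances[nearest] > radii[nearest]

# Hypothetical example: a tight cluster next to a wide one.
centres = [(0.0, 0.0), (10.0, 0.0)]
radii = [1.0, 8.0]
point = (4.0, 0.0)  # d_1 = 4 > r_1 = 1, but d_2 = 6 <= r_2 = 8

print(is_anomalous_for_all_clusters(point, centres, radii))     # False
print(is_anomalous_for_nearest_cluster(point, centres, radii))  # True
```

The two checks agree when all clusters have similar radii. They can differ, as in the example, when a point is nearest to a small cluster but still falls inside the radius of a larger one.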
>>>>>
>>>>>
>>>>> [image: Inline images 2]
>>>>>
>>>>> :-)
>>>>> Best Wishes
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 24 September 2015 at 11:43, Ashen Weerathunga <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Variables of the above diagram:
>>>>>>
>>>>>>    - Cc1, Cc2, Cc3 - cluster centers
>>>>>>
>>>>>>
>>>>>>    - r1 - the chosen percentile of the distances from all the points of
>>>>>>    cluster 1 to their cluster center (Cc1)
>>>>>>    (this is considered the boundary of cluster 1)
>>>>>>
>>>>>>
>>>>>>    - d1 - the distance between a particular data point and its closest
>>>>>>    cluster center (Cc1)
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 24, 2015 at 11:25 AM, Ashen Weerathunga <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the suggestion!
>>>>>>>
>>>>>>> This diagram shows how the algorithm detects anomalous behavior. As in
>>>>>>> the diagram, when we do the K-means clustering there will be a set of
>>>>>>> clusters of normal data and some deviated points that behave as
>>>>>>> anomalies. Since we use a percentile distance to identify the cluster
>>>>>>> boundaries, we can exclude those anomalous points from the clusters.
>>>>>>> When a new data point arrives, its closest cluster center is calculated,
>>>>>>> and by comparing distances we can identify whether it belongs to that
>>>>>>> cluster or not. If it does not, the algorithm detects it as an anomaly.
>>>>>>>
>>>>>>> [image: Inline image 3]
>>>>>>> Hope this will give a clearer view of the algorithm.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ashen
>>>>>>>
>>>>>>> On Wed, Sep 23, 2015 at 6:11 PM, Nirmal Fernando <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Ashen! A few diagrams will help readers understand the
>>>>>>>> algorithm better.
>>>>>>>>
>>>>>>>> On Wed, Sep 23, 2015 at 6:03 PM, Ashen Weerathunga <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I am currently integrating an anomaly detection feature into
>>>>>>>>> WSO2 ML. There are some anomaly/fraud detection features already
>>>>>>>>> implemented in CEP/DAS using different approaches, but this one uses
>>>>>>>>> a machine learning approach, namely K-means clustering. Basically, I
>>>>>>>>> have used the K-means algorithm provided by Apache Spark MLlib, which
>>>>>>>>> is already used in WSO2 ML.
>>>>>>>>>
>>>>>>>>> This feature supports both labeled and unlabeled data. The user can
>>>>>>>>> build a model using existing data and use it for prediction.
>>>>>>>>>
>>>>>>>>> The main steps of this feature are as follows:
>>>>>>>>>
>>>>>>>>>    - After the preprocessing steps, the user selects the algorithm.
>>>>>>>>>    There are two algorithms under the Anomaly Detection category:
>>>>>>>>>       - K Means with Unlabeled data
>>>>>>>>>       - K Means with Labeled data - for users who have labeled data
>>>>>>>>>    - If the user selects the K Means with Labeled data option, the
>>>>>>>>>    user must also input the normal label value(s) and the train data
>>>>>>>>>    fraction.
>>>>>>>>>    - In the next step the user inputs three parameters:
>>>>>>>>>       - Maximum number of iterations
>>>>>>>>>       - Number of normal clusters
>>>>>>>>>       - Percentile value
>>>>>>>>>    - The model is then built using those parameters.
>>>>>>>>>    - A model summary is provided for the labeled data option,
>>>>>>>>>    showing the model accuracy measures, confusion matrix, etc.
>>>>>>>>>    - For prediction the user has two options: upload new data as a
>>>>>>>>>    CSV or TSV file, or enter new data values manually. The prediction
>>>>>>>>>    shows whether each new data point is an anomaly or not.
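For the labeled data option, the model summary could be computed along the following lines. This is a sketch, not the WSO2 ML implementation: the helper names, the sample labels, and the choice of accuracy, precision, and recall as the reported measures are assumptions made for illustration.

```python
def to_anomaly_flags(labels, normal_labels):
    """Treat every label outside the user-supplied normal labels as an anomaly."""
    return [label not in normal_labels for label in labels]

def confusion_matrix(actual, predicted):
    """Count outcomes, where 'positive' means anomaly."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for is_anomaly, flagged in zip(actual, predicted):
        if is_anomaly and flagged:
            counts["tp"] += 1
        elif flagged:
            counts["fp"] += 1
        elif is_anomaly:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

def summarize(counts):
    """Derive accuracy, precision, and recall, guarding against empty denominators."""
    total = sum(counts.values())
    flagged = counts["tp"] + counts["fp"]
    anomalies = counts["tp"] + counts["fn"]
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / total if total else 0.0,
        "precision": counts["tp"] / flagged if flagged else 0.0,
        "recall": counts["tp"] / anomalies if anomalies else 0.0,
    }

# Hypothetical test-set labels and model output (True = flagged as anomaly).
labels = ["normal", "normal", "fraud", "fraud", "normal"]
predicted = [False, True, True, False, False]

actual = to_anomaly_flags(labels, normal_labels={"normal"})
counts = confusion_matrix(actual, predicted)
print(counts)             # {'tp': 1, 'fp': 1, 'tn': 2, 'fn': 1}
print(summarize(counts))  # {'accuracy': 0.6, 'precision': 0.5, 'recall': 0.5}
```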
>>>>>>>>>
>>>>>>>>> The methodology used is as follows:
>>>>>>>>>
>>>>>>>>>    - First the dataset is clustered using the K-means algorithm
>>>>>>>>>    according to the hyperparameters that the user provided.
>>>>>>>>>    - Since in real-world anomaly detection scenarios the
>>>>>>>>>    positive (anomaly) instances are very rare, we assume that those
>>>>>>>>>    anomalies lie outside the clusters.
>>>>>>>>>    - So we can detect them by calculating the cluster boundaries.
>>>>>>>>>    This is how we identify the cluster boundaries:
>>>>>>>>>       - First calculate the distances between all data points and
>>>>>>>>>       their respective cluster centers.
>>>>>>>>>       - Then, for each cluster, take the chosen percentile of those
>>>>>>>>>       distances as the cluster boundary.
>>>>>>>>>    - When a new data point arrives, its closest cluster center is
>>>>>>>>>    found by the K-means predict function.
>>>>>>>>>    - Then the distance between the new data point and its cluster
>>>>>>>>>    center is calculated. If it is less than the percentile distance
>>>>>>>>>    value, the point is considered normal. If it is greater than the
>>>>>>>>>    percentile distance value, the point is considered an anomaly,
>>>>>>>>>    since it lies outside the cluster.
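The methodology above can be sketched end to end in a few lines of Python. This is an illustration under stated assumptions, not the code in the pull requests: a plain Lloyd's K-means seeded with the first k points stands in for the Spark MLlib K-means, the percentile uses the nearest-rank definition, a distance exactly equal to the boundary is treated as normal, and all function names and sample points are hypothetical.

```python
import math

def closest(point, centers):
    """Index of the nearest center (the role played by the K-means predict step)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, k, max_iterations):
    """Plain Lloyd's algorithm; the first k points seed the centers (assumption)."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(max_iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[closest(p, centers)].append(p)
        # New center = coordinate-wise mean; an empty cluster keeps its old center.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return centers

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (one of several definitions)."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]

def cluster_boundaries(points, centers, pct):
    """Per cluster: the pct-th percentile of member-to-center distances."""
    per_cluster = [[] for _ in centers]
    for p in points:
        i = closest(p, centers)
        per_cluster[i].append(math.dist(p, centers[i]))
    return [percentile(sorted(d), pct) if d else 0.0 for d in per_cluster]

def is_anomaly(point, centers, boundaries):
    """Anomaly when the point is farther from its closest center than the boundary."""
    i = closest(point, centers)
    return math.dist(point, centers[i]) > boundaries[i]

# Hypothetical training data: two tight groups of normal points.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1), (10, 11), (11, 10), (11, 11)]
centers = kmeans(points, k=2, max_iterations=10)
boundaries = cluster_boundaries(points, centers, pct=95)

print(is_anomaly((5, 5), centers, boundaries))      # True
print(is_anomaly((0.6, 0.4), centers, boundaries))  # False
```

The three user inputs from the thread map onto `max_iterations`, `k` (number of normal clusters), and `pct` (percentile value). A lower percentile tightens each boundary and flags more points.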
>>>>>>>>>
>>>>>>>>> Most of the work has been completed by now. Please let me know if
>>>>>>>>> there are any issues or improvements to be made.
>>>>>>>>> https://github.com/ashensw/carbon-ml/tree/fraud_detection
>>>>>>>>>
>>>>>>>>> Thanks and Regards,
>>>>>>>>> Ashen
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Ashen Weerathunga*
>>>>>>>>> Software Engineer - Intern
>>>>>>>>> WSO2 Inc.: http://wso2.com
>>>>>>>>> lean.enterprise.middleware
>>>>>>>>>
>>>>>>>>> Email: [email protected]
>>>>>>>>> Mobile: +94 716042995 <94716042995>
>>>>>>>>> LinkedIn:
>>>>>>>>> *http://lk.linkedin.com/in/ashenweerathunga
>>>>>>>>> <http://lk.linkedin.com/in/ashenweerathunga>*
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Thanks & regards,
>>>>>>>> Nirmal
>>>>>>>>
>>>>>>>> Team Lead - WSO2 Machine Learner
>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>> Mobile: +94715779733
>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Architecture mailing list
>>>>>> [email protected]
>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> Sinnathamby Mahesan
>>>>>
>>>>>
>>>>>
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>
>
>


-- 
============================
Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
Site: http://people.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902
