Yeah, it was a great learning experience for me. Thanks for all the support and guidance given to me throughout the process.
On Fri, Nov 20, 2015 at 10:14 AM, Nirmal Fernando <[email protected]> wrote:

> I have to mention the excellent work done and the support extended by
> Ashen during the whole implementation process. He was able to grasp things
> quickly, get regular feedback, and improve very fast. Kudos to Ashen, and
> keep up the good work! Hope you have learnt a lot during this process.
>
> On Thu, Nov 19, 2015 at 10:43 AM, Seshika Fernando <[email protected]>
> wrote:
>
>> Well done, Ashen. The documentation looks good too. Will review later.
>>
>> seshi
>>
>> On Thu, Nov 19, 2015 at 10:35 AM, Ashen Weerathunga <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> This feature was implemented in ML and released with WSO2 Machine
>>> Learner 1.1.0 - Milestone 1
>>> <https://github.com/wso2/product-ml/releases/tag/v1.1.0-m1>. Thanks
>>> everyone for your ideas and support. Please find the pull requests below.
>>>
>>> [1] PR - carbon-ml
>>> [2] PR - product-ml
>>>
>>> [1] https://github.com/wso2/carbon-ml/pull/138
>>> [2] https://github.com/wso2/product-ml/pull/263
>>>
>>> Thanks and Regards,
>>> Ashen
>>>
>>> On Mon, Sep 28, 2015 at 11:12 PM, Ashen Weerathunga <[email protected]>
>>> wrote:
>>>
>>>> Sure, thanks Mahesan!
>>>>
>>>> On Mon, Sep 28, 2015 at 9:51 AM, Sinnathamby Mahesan <
>>>> [email protected]> wrote:
>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Sinnathamby Mahesan <[email protected]>
>>>>> Date: 28 September 2015 at 09:50
>>>>> Subject: Re: [Architecture] [ML] Anomaly Detection Feature for WSO2 ML
>>>>> To: [email protected]
>>>>> Cc: Nirmal Fernando <[email protected]>
>>>>>
>>>>> Dear Ashen,
>>>>> I know you have programmed it correctly,
>>>>> but here too it is better to show that
>>>>>
>>>>>     if (di > ri) for all i = 1..k => Anomalous
>>>>>
>>>>> where k is the number of clusters,
>>>>> di is the distance between the point under consideration and
>>>>> cluster centre i, and
>>>>> ri is the percentile radius of cluster i.
>>>>>
>>>>> [image: Inline images 2]
>>>>>
>>>>> :-)
>>>>> Best Wishes
>>>>>
>>>>> On 24 September 2015 at 11:43, Ashen Weerathunga <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Variables of the above diagram:
>>>>>>
>>>>>> - Cc1, Cc2, Cc3 - cluster centers
>>>>>> - r1 - the percentile distance of cluster 1, i.e. the selected
>>>>>>   percentile of the distances from all the points of cluster 1 to
>>>>>>   their cluster center (Cc1); this is considered the boundary of
>>>>>>   cluster 1
>>>>>> - d1 - the distance between a particular data point and its closest
>>>>>>   cluster center (Cc1)
>>>>>>
>>>>>> On Thu, Sep 24, 2015 at 11:25 AM, Ashen Weerathunga <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the suggestion!
>>>>>>>
>>>>>>> This diagram shows how the algorithm detects anomalous behavior. As
>>>>>>> in the diagram, when we do the K means clustering there will be a
>>>>>>> set of clusters of normal data and some deviated points which
>>>>>>> behave as anomalies. Since we consider a percentile distance to
>>>>>>> identify cluster boundaries, we can eliminate those anomalous data
>>>>>>> from the clusters. So when a new data point comes, the closest
>>>>>>> cluster center will be calculated, and then by comparing distances
>>>>>>> we can identify whether it belongs to the cluster or not. If it
>>>>>>> does not, the algorithm detects it as an anomaly.
>>>>>>>
>>>>>>> [image: Inline image 3]
>>>>>>>
>>>>>>> Hope this will give a clearer view of the algorithm.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ashen
>>>>>>>
>>>>>>> On Wed, Sep 23, 2015 at 6:11 PM, Nirmal Fernando <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Ashen! A few diagrams will help readers understand the
>>>>>>>> algorithm better.
>>>>>>>>
>>>>>>>> On Wed, Sep 23, 2015 at 6:03 PM, Ashen Weerathunga <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I am currently doing the integration of the anomaly detection
>>>>>>>>> feature into WSO2 ML. There are some anomaly/fraud detection
>>>>>>>>> features already implemented in CEP/DAS using different
>>>>>>>>> approaches. But this will be done using a machine learning
>>>>>>>>> approach, namely K means clustering. Basically I have used the
>>>>>>>>> K means algorithm provided by Apache Spark MLlib, which is
>>>>>>>>> already used in WSO2 ML.
>>>>>>>>>
>>>>>>>>> This feature supports both labeled and unlabeled data. The user
>>>>>>>>> can build a model using existing data and use it for prediction.
>>>>>>>>>
>>>>>>>>> The main steps of this feature are as follows:
>>>>>>>>>
>>>>>>>>> - After doing the preprocessing steps, the user will have to
>>>>>>>>>   select the algorithm. There will be two algorithms under the
>>>>>>>>>   Anomaly Detection category:
>>>>>>>>>    - K Means with Unlabeled data
>>>>>>>>>    - K Means with Labeled data - if the user has labeled data,
>>>>>>>>>      the user can go for this option
>>>>>>>>>    - If the user selects the K Means with Labeled data option,
>>>>>>>>>      the user should input the normal label value(s) and the
>>>>>>>>>      train data fraction as well.
>>>>>>>>> - In the next step the user will have to input three parameters:
>>>>>>>>>    - Maximum number of iterations
>>>>>>>>>    - Number of normal clusters
>>>>>>>>>    - Percentile value
>>>>>>>>> - Then the model will be built using those parameters.
>>>>>>>>> - A model summary will be provided for the labeled data option,
>>>>>>>>>   showing the model accuracy measures, confusion matrix, etc.
>>>>>>>>> - In the prediction part the user will have two options: input
>>>>>>>>>   new data as a CSV or TSV file, or manually enter new data
>>>>>>>>>   values. The prediction shows whether the new data point is an
>>>>>>>>>   anomaly or not.
>>>>>>>>>
>>>>>>>>> The methodology used is as follows:
>>>>>>>>>
>>>>>>>>> - First the dataset will be clustered using the K means
>>>>>>>>>   algorithm according to the hyperparameters that the user
>>>>>>>>>   provided.
>>>>>>>>> - Since in real-world anomaly detection scenarios the positive
>>>>>>>>>   (anomaly) instances are very rare, we assume that those
>>>>>>>>>   anomalies will lie outside the clusters.
>>>>>>>>> - So we can detect them by calculating the cluster boundaries.
>>>>>>>>>   This is how we identify the cluster boundaries:
>>>>>>>>>    - First calculate all the distances between data points and
>>>>>>>>>      their respective cluster centers.
>>>>>>>>>    - Then select the percentile value from the distances of each
>>>>>>>>>      cluster as its cluster boundary.
>>>>>>>>> - When a new data point comes, the closest cluster center will
>>>>>>>>>   be calculated by the K means predict function.
>>>>>>>>> - Then the distance between the new data point and its cluster
>>>>>>>>>   center will be calculated. If it is less than the percentile
>>>>>>>>>   distance value, it is considered normal data. If it is greater
>>>>>>>>>   than the percentile distance value, it is considered an
>>>>>>>>>   anomaly since it is outside the cluster.
>>>>>>>>>
>>>>>>>>> Most of the work has been completed by now. Please let me know
>>>>>>>>> if there are any issues or improvements to be done.
>>>>>>>>> https://github.com/ashensw/carbon-ml/tree/fraud_detection
>>>>>>>>>
>>>>>>>>> Thanks and Regards,
>>>>>>>>> Ashen
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Ashen Weerathunga*
>>>>>>>>> Software Engineer - Intern
>>>>>>>>> WSO2 Inc.: http://wso2.com
>>>>>>>>> lean.enterprise.middleware
>>>>>>>>>
>>>>>>>>> Email: [email protected]
>>>>>>>>> Mobile: +94 716042995
>>>>>>>>> LinkedIn: http://lk.linkedin.com/in/ashenweerathunga
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Thanks & regards,
>>>>>>>> Nirmal
>>>>>>>>
>>>>>>>> Team Lead - WSO2 Machine Learner
>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>> Mobile: +94715779733
>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>
>>>>>> _______________________________________________
>>>>>> Architecture mailing list
>>>>>> [email protected]
>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>
>>>>> --
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> Sinnathamby Mahesan
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

--
*Ashen Weerathunga*
Software Engineer - Intern
WSO2 Inc.: http://wso2.com
lean.enterprise.middleware

Email: [email protected]
Mobile: +94 716042995
LinkedIn: http://lk.linkedin.com/in/ashenweerathunga
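For anyone reading this thread later, the percentile-boundary method described in the original message can be sketched roughly as below. This is only a minimal plain-Python illustration, not the carbon-ml code: the real feature uses the K means implementation from Apache Spark MLlib, whereas here a tiny hand-written K means with fixed starting centers stands in for it. All function names (`kmeans`, `cluster_boundaries`, `is_anomaly`, `outside_all_clusters`), the nearest-rank percentile, and the toy data are assumptions made for the example.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest(point, centers):
    """Index of the nearest cluster center (the 'predict' step)."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))

def kmeans(points, init_centers, max_iterations):
    """Tiny Lloyd-style K means; a stand-in for Spark MLlib's KMeans."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(max_iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[closest(p, centers)].append(p)
        # Move each center to the mean of its points; keep it if empty.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def percentile(sorted_values, pct):
    """Nearest-rank percentile (an assumed choice of definition)."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]

def cluster_boundaries(points, centers, pct):
    """Per cluster: the pct-th percentile of point-to-center distances."""
    per_cluster = [[] for _ in centers]
    for p in points:
        i = closest(p, centers)
        per_cluster[i].append(dist(p, centers[i]))
    return [percentile(sorted(d), pct) if d else 0.0 for d in per_cluster]

def is_anomaly(point, centers, boundaries):
    """Closest-cluster rule: anomaly if farther than that cluster's boundary."""
    i = closest(point, centers)
    return dist(point, centers[i]) > boundaries[i]

def outside_all_clusters(point, centers, boundaries):
    """All-clusters rule: anomaly only if outside every cluster's radius."""
    return all(dist(point, c) > r for c, r in zip(centers, boundaries))

# Toy training data: two tight groups of "normal" points.
train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
         (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
centers = kmeans(train, init_centers=[(0.0, 0.0), (10.0, 10.0)], max_iterations=20)
boundaries = cluster_boundaries(train, centers, pct=95)

print(is_anomaly((0.5, 0.6), centers, boundaries))  # False
print(is_anomaly((5.0, 5.0), centers, boundaries))  # True
```

Note that the closest-cluster rule and the all-clusters rule can disagree when cluster radii differ a lot: a point may be nearest to a small cluster yet still fall inside a larger neighbouring cluster's radius. Which rule is appropriate depends on how the boundaries are meant to be read.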
