Hi Maheshakya,

Google has accepted my proof of enrollment. Do I need to proceed further with the project now? I have been working with Spark MLLib and trying to implement those two algorithms. Can you please tell me what the next step should be, or do I need to wait? Thank you.

Regards,
Mahesh.

On Fri, Mar 25, 2016 at 10:40 PM, Mahesh Dananjaya <[email protected]> wrote:

> Hi Maheshakya,
> Thank you very much for the support given during the last couple of weeks. I have finally submitted the proposal to the site, and I am looking forward to contributing to WSO2 ML. Thank you.
> Regards,
> Mahesh.
>
> On Fri, Mar 25, 2016 at 7:49 PM, Mahesh Dananjaya <[email protected]> wrote:
>
>> Hi Maheshakya,
>> I added the timeline according to my knowledge and uploaded it. Please check. Thank you.
>> Regards,
>> Mahesh.
>>
>> On Fri, Mar 25, 2016 at 7:09 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Can you add the timeline of the project as I've mentioned? It's one of the crucial parts of the proposal, because it allows us to evaluate the feasibility of the project within the time period given by Google.
>>>
>>> Best regards.
>>>
>>> On Fri, Mar 25, 2016 at 6:53 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>
>>>> ---------- Forwarded message ----------
>>>> From: Mahesh Dananjaya <[email protected]>
>>>> Date: Fri, Mar 25, 2016 at 7:02 PM
>>>> Subject: Re: [Dev] Fwd: GSOC2016: Proposal 6: [ML]
>>>> To: Maheshakya Wijewardena <[email protected]>
>>>>
>>>> Hi Maheshakya,
>>>> I have uploaded my final submission. Here it is. Please check it and let me know if there is anything I need to change. Thank you.
>>>> BR,
>>>> Mahesh.
>>>>
>>>> On Fri, Mar 25, 2016 at 6:28 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> Thank you very much. I will update the proposal with those changes and submit it now. Thank you.
>>>>> Regards,
>>>>> Mahesh.
>>>>>
>>>>> On Fri, Mar 25, 2016 at 6:07 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> In the title, please include both tags, [ML] and [CEP].
>>>>>>
>>>>>> Best regards.
>>>>>>
>>>>>> On Fri, Mar 25, 2016 at 5:49 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>
>>>>>>> Also, please include an introduction to yourself (university, department), past experience in machine learning, language proficiency, etc. at the beginning of the proposal.
>>>>>>>
>>>>>>> Best regards.
>>>>>>>
>>>>>>> On Fri, Mar 25, 2016 at 5:47 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Mahesh,
>>>>>>>>
>>>>>>>> Thank you for sending the draft. Please submit it as soon as possible.
>>>>>>>>
>>>>>>>> A few high-level comments:
>>>>>>>>
>>>>>>>> In the proposal, you must specifically mention that this will be implemented as a Siddhi extension that can operate directly on incoming streams.
>>>>>>>>
>>>>>>>> Also, you need to have a timeline for the project. A sample looks like:
>>>>>>>>
>>>>>>>> May 1 - May 20 - Community bonding period - Getting familiar with the platform and discussing implementation methods.
>>>>>>>> May 20 - May 30 - Implementing streaming k-means
>>>>>>>> -----
>>>>>>>> -----
>>>>>>>> July 20-24 - Writing examples
>>>>>>>> July 24-18 - Documentation
>>>>>>>>
>>>>>>>> This should end before the pencils-down date. Refer to the correct timeline given on the GSoC site.
>>>>>>>>
>>>>>>>> The implementation details of the streaming algorithms look fine.
>>>>>>>>
>>>>>>>> Best regards.
>>>>>>>>
>>>>>>>> On Fri, Mar 25, 2016 at 5:23 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Maheshakya,
>>>>>>>>> This is my draft proposal:
>>>>>>>>> https://docs.google.com/document/d/1apZfEXZXEH5GwSwS7hARINbGw5_zinxWdZjEmyqfKu4/edit?usp=sharing
>>>>>>>>> Can you please check this and see whether it is correct? Thank you.
>>>>>>>>> BR,
>>>>>>>>> Mahesh
>>>>>>>>>
>>>>>>>>> On Mon, Mar 21, 2016 at 1:15 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>
>>>>>>>>>> The deadline for submitting proposals is March 25th, 2016, so please start writing the proposal and get feedback.
>>>>>>>>>>
>>>>>>>>>> Best regards.
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 15, 2016 at 4:14 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>> OK. I have been trying some examples, splitting the data sets and training incrementally. I am still working on that, and I have been adding the examples to my GitHub repo too: https://github.com/dananjayamahesh/GSOC2016 . I saw that Spark has only Scala API support for those streaming algorithms, so my task is to develop a Java API. I will let you know my progress. Thank you very much.
>>>>>>>>>>> BR,
>>>>>>>>>>> Mahesh
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 15, 2016 at 3:21 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>
>>>>>>>>>>>> No, you don't need to use Hadoop at any stage in this project. Everything you need is in Spark (regarding ML algorithms).
>>>>>>>>>>>> You can also use Spark MLLib's methods to randomly split datasets.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 14, 2016 at 1:28 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>> I am writing some Java programs that break the dataset into several pieces and train a model repeatedly with those pieces using Spark MLLib. Do I have to do anything with Hadoop at this stage? I am working in standalone mode. Thank you.
>>>>>>>>>>>>> BR,
>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Mar 13, 2016 at 6:30 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You don't have to look into carbon-ml.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Mar 13, 2016 at 5:49 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>> I am working on some examples related to Spark and ML. Is there anything I need to do with carbon-ml? I think I don't need to look into that one, do I?
>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>> Mahesh
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Mar 8, 2016 at 11:55 AM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Is that Scala API part of your current product or repo?"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> No, we don't have the Scala API included. What we want is to design Java implementations of those algorithms that train with mini-batches of streaming data with the help of the aforementioned methods, so that we can include them as a CEP extension.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To clarify, please try to write a simple Java program using Spark MLLib linear regression and k-means clustering with a sample data set (you can find a lot of data sets in the UCI repo [1]). You need to break the dataset into several pieces and train a model repeatedly with those.
>>>>>>>>>>>>>>>> After each training run, save the model information (such as weights and intercepts for regression, and cluster centers for clustering; please check the arguments of the methods I have mentioned and save the required information of the model).
>>>>>>>>>>>>>>>> When training a model with a new piece of data, use those methods to initialize it, passing the saved values as the arguments. This way you can start from where you stopped in the previous run.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let us know your observations, and feel free to ask if you need to know anything more on this.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We'll let you know what needs to be done to include this in CEP.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Mar 8, 2016 at 10:59 AM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>> Great, thank you. I already have ML and CEP and am working more towards it. Is that Scala API part of your current product or repo?
>>>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Mar 6, 2016 at 5:49 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please find the comments inline.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is the data stream taken into ML in the event publisher's format through the event publisher? Or can we use the direct traffic that comes to the event receiver, or else as streams?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We intend to use the direct data as event streams.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 1.) Is the data coming from WSO2 DAS to ML coming as streams?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> No, WSO2 ML doesn't use any event stream. The data stored in tables in DAS is loaded into ML.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2.) Are there any incremental learning algorithms currently active in ML? You mentioned that there are and that they are with the Scala API. So there is streaming support with that Scala API. In that API, in which format is the data acquired into ML?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> No, there are no incremental learning algorithms in ML. The Scala API is about Spark MLLib. MLLib supports streaming k-means and generalized linear models (linear regression variants and logistic regression) with the Scala API. What those implementations basically do is retrain the trained models with mini-batches as data arrives sequentially. There, the breaking of streaming data into mini-batches is done with the help of Spark Streaming. But we do not intend to use Spark Streaming in our implementation. What we need to do is implement a similar behavior for event streams using the Java API.
>>>>>>>>>>>>>>>>>> The Java API has the following methods:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - *createModel <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html#createModel%28org.apache.spark.mllib.linalg.Vector,%20double%29>* (Vector <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/Vector.html> weights, double intercept) - for GLMs
>>>>>>>>>>>>>>>>>> - *setInitialModel <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html#setInitialModel%28org.apache.spark.mllib.clustering.KMeansModel%29>* (KMeansModel <http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeansModel.html> model) - for k-means
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> With the help of these methods, we can train models again with newly arriving data, keeping the characteristics learned from the previous data. When implementing this, we need to pay attention to other parameters of incremental learning, such as data horizon and data obsolescence (indicated in the project ideas page).
>>>>>>>>>>>>>>>>>> We need to discuss how to add these with CEP event streams. I have added Suho to the thread for more clarification.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best regards.
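The save-and-resume idea in the message above (train on one piece of data, keep the weights and intercept, start the next run from them) can be sketched without Spark. What follows is only a minimal plain-Java illustration and not the Spark MLLib API: the class name IncrementalLinearRegression, its methods, the step sizes and the toy data are all hypothetical, chosen just to show the warm start.

```java
import java.util.Arrays;

/** Plain-Java sketch (no Spark): per-sample SGD for linear regression with a warm start. */
class IncrementalLinearRegression {
    double[] weights;   // model state carried from one piece of data to the next
    double intercept;   // model state carried from one piece of data to the next

    /** Starts from previously saved values instead of from scratch. */
    IncrementalLinearRegression(double[] initialWeights, double initialIntercept) {
        this.weights = initialWeights.clone();
        this.intercept = initialIntercept;
    }

    double predict(double[] features) {
        double sum = intercept;
        for (int j = 0; j < weights.length; j++) {
            sum += weights[j] * features[j];
        }
        return sum;
    }

    /** Runs a few SGD passes over one piece of data, continuing from the current state. */
    void trainOnBatch(double[][] x, double[] y, double stepSize, int epochs) {
        for (int e = 0; e < epochs; e++) {
            for (int i = 0; i < x.length; i++) {
                double error = predict(x[i]) - y[i];
                for (int j = 0; j < weights.length; j++) {
                    weights[j] -= stepSize * error * x[i][j];
                }
                intercept -= stepSize * error;
            }
        }
    }

    public static void main(String[] args) {
        // Two pieces of a toy data set generated from y = 2x + 1 (hypothetical data).
        double[][] x1 = {{0.0}, {1.0}, {2.0}};
        double[] y1 = {1.0, 3.0, 5.0};
        double[][] x2 = {{3.0}, {4.0}, {5.0}};
        double[] y2 = {7.0, 9.0, 11.0};

        IncrementalLinearRegression first = new IncrementalLinearRegression(new double[]{0.0}, 0.0);
        first.trainOnBatch(x1, y1, 0.05, 500);

        // "Save" the model information, then initialize the next run with it.
        double[] savedWeights = first.weights.clone();
        double savedIntercept = first.intercept;
        IncrementalLinearRegression second = new IncrementalLinearRegression(savedWeights, savedIntercept);
        second.trainOnBatch(x2, y2, 0.01, 500);

        System.out.println(Arrays.toString(second.weights) + " intercept=" + second.intercept);
    }
}
```

In a Spark-based version, the saved weights and intercept would presumably be passed to the methods listed above rather than to a hand-written constructor; that wiring is not shown here.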
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Mar 5, 2016 at 5:15 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>>>> As we are planning to use WSO2 CEP to handle streaming data and implement the machine learning algorithms with Spark MLLib, is the data stream taken into ML in the event publisher's format through the event publisher? Or can we use the direct traffic that comes to the event receiver, or else as streams? I am referring to https://docs.wso2.com/display/CEP410/User+Guide
>>>>>>>>>>>>>>>>>>> 1.) Is the data coming from WSO2 DAS to ML coming as streams?
>>>>>>>>>>>>>>>>>>> 2.) Are there any incremental learning algorithms currently active in ML? You mentioned that there are and that they are with the Scala API. So there is streaming support with that Scala API. In that API, in which format is the data acquired into ML?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Mar 4, 2016 at 2:03 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We had to modify the project scope a little to best suit the requirements. We will update the project idea with those concerns soon and let you know.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We do not support streaming data in WSO2 Machine Learner at the moment.
>>>>>>>>>>>>>>>>>>>> The new concern is to use WSO2 CEP to handle streaming data and implement the machine learning algorithms with Spark MLLib. You can look at the streaming k-means and streaming linear regression implementations in MLLib. Currently, the API is only for Scala. Our need is to get the Java APIs of k-means and generalized linear models to support incremental learning with streaming data. This has to be done as mini-batch learning, since these algorithms operate as stochastic gradient descent, so that any learning with new data can be done on top of the previously learned models. So please go through those APIs [1][2][3] and try to get an idea.
>>>>>>>>>>>>>>>>>>>> Also, please try to understand how event streams work in WSO2 CEP [4][5].
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best regards.
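For the clustering side, learning "on top of the previously learned models" can be pictured as carrying the cluster centers (and how many points each has absorbed) from one mini-batch to the next. The sketch below is plain Java, not Spark MLLib; the class IncrementalKMeans, its fields and the decay parameter are hypothetical names introduced only for illustration, with decay as one assumed way to let older data count for less.

```java
/** Plain-Java sketch (no Spark): k-means whose centers carry over between mini-batches. */
class IncrementalKMeans {
    final double[][] centers; // cluster centers kept between batches
    final double[] counts;    // effective number of points absorbed by each cluster
    final double decay;       // 1.0 keeps all history; smaller values forget old data faster

    IncrementalKMeans(double[][] initialCenters, double decay) {
        centers = new double[initialCenters.length][];
        for (int c = 0; c < centers.length; c++) {
            centers[c] = initialCenters[c].clone();
        }
        counts = new double[centers.length];
        this.decay = decay;
    }

    /** Index of the center closest to the point (squared Euclidean distance). */
    int nearest(double[] point) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double distance = 0.0;
            for (int j = 0; j < point.length; j++) {
                double diff = point[j] - centers[c][j];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    /** Assigns the batch to the current centers, then moves each center toward its new points. */
    void update(double[][] batch) {
        int dim = centers[0].length;
        double[][] sums = new double[centers.length][dim];
        int[] batchCounts = new int[centers.length];
        for (double[] point : batch) {
            int c = nearest(point);
            batchCounts[c]++;
            for (int j = 0; j < dim; j++) {
                sums[c][j] += point[j];
            }
        }
        for (int c = 0; c < centers.length; c++) {
            double oldWeight = counts[c] * decay;   // discounted weight of earlier data
            if (batchCounts[c] == 0) {
                counts[c] = oldWeight;              // no new points: only the weight decays
                continue;
            }
            double total = oldWeight + batchCounts[c];
            for (int j = 0; j < dim; j++) {
                // Weighted mean of the old center and the new points of this cluster.
                centers[c][j] = (centers[c][j] * oldWeight + sums[c][j]) / total;
            }
            counts[c] = total;
        }
    }

    public static void main(String[] args) {
        IncrementalKMeans model = new IncrementalKMeans(new double[][]{{0.0, 0.0}, {10.0, 10.0}}, 1.0);
        model.update(new double[][]{{1.0, 1.0}, {3.0, 3.0}, {9.0, 9.0}});
        model.update(new double[][]{{5.0, 5.0}, {11.0, 11.0}});
        System.out.println(java.util.Arrays.deepToString(model.centers));
    }
}
```

With decay set to 1.0 each center is simply the running mean of every point assigned to it so far; values below 1.0 shrink the weight of earlier batches before each update.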
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1] http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html
>>>>>>>>>>>>>>>>>>>> [2] http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/clustering/KMeans.html
>>>>>>>>>>>>>>>>>>>> [3] http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/classification/LogisticRegressionWithSGD.html
>>>>>>>>>>>>>>>>>>>> [4] https://docs.wso2.com/display/CEP310/Working+with+Event+Streams
>>>>>>>>>>>>>>>>>>>> [5] https://docs.wso2.com/display/CEP310/Working+with+Execution+Plans
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Fri, Mar 4, 2016 at 11:26 AM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>>>>>> Give me some time to go through your ML package. Does the current product have any streaming data support? I did some university projects related to machine learning, with regression, modelling, factor analysis, cluster analysis and classification problems (discriminant analysis) using SVMs (support vector machines), neural networks, LS classification and ML (maximum likelihood). Give me some time to see how the WSO2 architecture works; then I can come up with a good architecture. Thank you.
>>>>>>>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Mar 2, 2016 at 2:41 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>>>>>>>>> Thank you for the resources.
>>>>>>>>>>>>>>>>>>>>>> I will go through these, and I am looking forward to this proposed project. Thank you.
>>>>>>>>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Mar 2, 2016 at 1:52 PM, Maheshakya Wijewardena <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thank you for your interest in this project.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> We would like to know what type of similar projects you have worked on. You may have seen that WSO2 Machine Learner supports several learning algorithms at the moment [1]. This project intends to leverage the existing algorithms in WSO2 Machine Learner to support streaming data. As a first step, you can get an idea of what WSO2 Machine Learner does and how it operates. You can download WSO2 Machine Learner from the product page [2] and the source code [3]. ML uses Apache Spark MLLib [4] for its algorithms, so it's better to read and understand what it does as well.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> In order to get an idea about the deliverables and the scope of this project, try to understand how Spark Streaming [5] (see examples) handles streaming data. Also, have a look at the streaming algorithms [6][7] supported by MLLib.
>>>>>>>>>>>>>>>>>>>>>>> There are two approaches discussed in the project proposals page to employ incremental learning in ML. These streaming algorithms can be used directly in the first approach. For the other approach, your implementation should contain a procedure to create mini-batches from streaming data with relevant sizes (i.e. a moving window) and do periodic retraining of the same algorithm.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> To start with the project, you will need to come up with a suitable plan and an architecture first.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Please watch the video referenced in the proposal (reference: 5). It will help you get a better idea about machine learning algorithms with streaming data.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Let us know if you need any help with these.
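The second approach mentioned above (mini-batches cut from a moving window, with periodic retraining) could look roughly like the following. This is a sketch in plain Java under stated assumptions, not WSO2 CEP or Siddhi code: the class MovingWindowBatcher, its parameters windowSize and retrainEvery, and the trainer callback are hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Plain-Java sketch: keep recent events in a moving window and trigger retraining periodically. */
class MovingWindowBatcher<T> {
    private final int windowSize;             // maximum number of recent events kept
    private final int retrainEvery;           // hand out a batch after this many new events
    private final Consumer<List<T>> trainer;  // receives a snapshot of the current window
    private final Deque<T> window = new ArrayDeque<>();
    private int sinceLastBatch = 0;

    MovingWindowBatcher(int windowSize, int retrainEvery, Consumer<List<T>> trainer) {
        this.windowSize = windowSize;
        this.retrainEvery = retrainEvery;
        this.trainer = trainer;
    }

    /** Called once per incoming event. */
    void onEvent(T event) {
        window.addLast(event);
        if (window.size() > windowSize) {
            window.removeFirst();             // the oldest event leaves the window
        }
        sinceLastBatch++;
        if (sinceLastBatch >= retrainEvery) {
            sinceLastBatch = 0;
            trainer.accept(new ArrayList<>(window)); // copy, so later events do not alter the batch
        }
    }

    public static void main(String[] args) {
        // Here the "trainer" only prints each mini-batch; a real one would retrain a model.
        MovingWindowBatcher<Integer> batcher =
                new MovingWindowBatcher<>(4, 2, batch -> System.out.println("retrain on " + batch));
        for (int i = 1; i <= 6; i++) {
            batcher.onEvent(i);
        }
    }
}
```

The window size bounds how far back the training data reaches, while retrainEvery controls how often the model is refreshed; both would need tuning for a real event stream.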
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [1] https://docs.wso2.com/display/ML110/Machine+Learner+Algorithms
>>>>>>>>>>>>>>>>>>>>>>> [2] http://wso2.com/products/machine-learner/
>>>>>>>>>>>>>>>>>>>>>>> [3] https://docs.wso2.com/display/ML110/Building+from+Source#BuildingfromSource-Downloadingthesourcecheckout
>>>>>>>>>>>>>>>>>>>>>>> [4] https://spark.apache.org/docs/1.4.1/mllib-guide.html
>>>>>>>>>>>>>>>>>>>>>>> [5] https://spark.apache.org/docs/1.4.1/streaming-programming-guide.html
>>>>>>>>>>>>>>>>>>>>>>> [6] https://spark.apache.org/docs/1.4.1/mllib-linear-methods.html#streaming-linear-regression
>>>>>>>>>>>>>>>>>>>>>>> [7] https://spark.apache.org/docs/1.4.1/mllib-clustering.html#streaming-k-means
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Mar 2, 2016 at 1:19 PM, Mahesh Dananjaya <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>> I am interested in contributing to proposal 6: "Predictive analytic with online data for WSO2 Machine Learner" for GSOC2 this time. Since I have been engaged in some similar projects, I think it will be a great experience for me. Please let me know what you think and what you suggest. I have been going through your documents. Thank you.
>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>> Mahesh Dananjaya.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>>> Dev mailing list
>>>>>>>>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>>>>>>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>> Pruthuvi Maheshakya Wijewardena
>>>>>>>>>>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>>>>>>>>>>> +94711228855
