Hi Maheshakya,
Did you check the repo? I will add my recent work today. I have also been going through the Java docs related to the Spark Streaming work. It comes along with the Scala API. Thank you.
Regards,
Mahesh.

On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java docs and run some of the Spark examples in the
> Spark shell, which are of paramount importance for our work. Since then I have
> been writing code to try out linear regression and k-means for streaming.
> Please check my git repo [1]. I think I now have to ask on dev about
> capturing event streams for our work. I will push my recent changes to
> git. Check the park-example directory for the Java code; the examples I ran in
> the shell are not included there. In my case, I think I have to build
> mini-batches from data streams that arrive as individual samples. I am now
> working on code to collect mini-batches from data streams. Thank you.
> Regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016
>
>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Thank you. I will update the repo today. I changed the carbon-ml
>>> Siddhi extension to see what effect the changes have.
>>> I will report my progress as soon as possible. Thank you. I had a problem
>>> with the Spark MLlib dependency, which I was fixing.
>>> Regards,
>>> Mahesh.
>>> P.S. Do I need to maintain a blog?
>>>
>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Sorry for the late reply.
>>>>
>>>> Thank you for the update. I believe you have implemented some of the
>>>> Spark MLlib algorithms in a streaming fashion, as we discussed.
>>>> If so, can you please share your code in a GitHub repo?
>>>>
>>>>> Now I want to implement some machine learning algorithms by
>>>>> importing MLlib and run them within your code base.
>>>>
>>>> For the moment, you can try editing the same class,
>>>> PredictStreamProcessor, in the Siddhi extension in carbon-ml. Later we will
>>>> add this separately. You should be able to add classes from
>>>> org.apache.spark.mllib there.
>>>>
>>>>> I also want to see how event streams arrive from CEP. I think they are
>>>>> not in RDD format, since they arrive as individual samples. I
>>>>> will send an email to dev asking how to get the streams.
>>>>
>>>> Please pay attention to the length [1] and lengthbatch [1] inbuilt windows in
>>>> Siddhi. What you need to write are functions similar to a custom aggregate
>>>> function [2].
>>>> When you send the email to the dev list, explain your requirement. You need
>>>> to get a set of events from a stream with a specified window size
>>>> (number of events), then build a model within that function. You also need
>>>> to retain the data (learned weights, cluster centers, etc.) from the
>>>> previous window for use in the current window. Ask which of the available
>>>> Siddhi extension types is the most suitable option for this.
>>>>
>>>> Best regards.
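
The windowing requirement described above (collect a fixed number of events, build or update a model on that batch, and keep the learned state for the next window) can be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions: LengthBatchSketch and its methods are hypothetical names, it does not use the Siddhi or carbon-ml APIs, and a per-feature running mean stands in for real model state such as learned weights or cluster centers.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Siddhi or carbon-ml): buffers single
 * events and processes them as a mini-batch every windowSize events, while
 * a running model state (here, per-feature means) survives across windows.
 */
final class LengthBatchSketch {
    private final int windowSize;
    private final List<double[]> buffer = new ArrayList<>();
    private final double[] runningMean; // state retained between windows
    private long seen = 0;              // events folded into runningMean so far

    LengthBatchSketch(int windowSize, int numFeatures) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
        this.runningMean = new double[numFeatures];
    }

    /** Adds one event; returns true when a full window was just processed. */
    boolean onEvent(double[] event) {
        buffer.add(event.clone());
        if (buffer.size() < windowSize) {
            return false;
        }
        updateModel(buffer);
        buffer.clear();
        return true;
    }

    /** Folds one mini-batch into the retained state (incremental mean). */
    private void updateModel(List<double[]> batch) {
        for (double[] x : batch) {
            seen++;
            for (int j = 0; j < runningMean.length; j++) {
                runningMean[j] += (x[j] - runningMean[j]) / seen;
            }
        }
    }

    /** Copy of the state carried over from all completed windows. */
    double[] state() {
        return runningMean.clone();
    }

    /** Number of events waiting for the current window to fill. */
    int pending() {
        return buffer.size();
    }
}
```

In a real extension the body of updateModel would be replaced by the actual training step, and the window size would come from the query parameters rather than a constructor argument.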
>>>>
>>>> [1] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>>> [2] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>>>
>>>> On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>> Date: Wed, May 11, 2016 at 1:43 PM
>>>>> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>
>>>>> Hi Maheshakya,
>>>>> Sorry for not updating you. I did what you asked me to do. I checked the
>>>>> code base and the train functions, and I went through the Java docs. I went
>>>>> through the current carbon-ml implementation of LG and k-means. I also have
>>>>> Apache Spark and tried several examples with it. Now I want to implement
>>>>> some machine learning algorithms by importing MLlib and run them
>>>>> within your code base. Can you help me with that?
>>>>> I also want to see how event streams arrive from CEP. I think they are
>>>>> not in RDD format, since they arrive as individual samples. I
>>>>> will send an email to dev asking how to get the streams. I have debugged
>>>>> many of the functions in the code base, so I need further instructions to
>>>>> proceed. Thank you.
>>>>> Regards,
>>>>> Mahesh.
>>>>>
>>>>> On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> Any update on your progress?
>>>>>>
>>>>>> Best regards.
>>>>>>
>>>>>> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>>> Does "Put break points in train methods in Linear Regression
>>>>>>>> class" mean the spark/algorithms/LinearRegression.java class in
>>>>>>>> org.wso2.carbon.ml.core?
>>>>>>>> Is that the correct file?
>>>>>>>
>>>>>>> Yes, this is the correct place.
>>>>>>>
>>>>>>> You can refer to the Spark programming guide [1][2] as well as our ML
>>>>>>> code base when you try those algorithms out. Please try to do rough
>>>>>>> implementations, in plain Java, of the streaming versions of linear
>>>>>>> regression, logistic regression and k-means clustering, as we discussed
>>>>>>> in the proposal. It would be better if you could create a git repo and
>>>>>>> share your code once you have made some progress.
>>>>>>>
>>>>>>> Were you able to debug and understand the flow of the ML Siddhi
>>>>>>> extension? I hope you haven't encountered more errors after switching to
>>>>>>> the released version of CEP.
>>>>>>>
>>>>>>> Is this Friday okay for you? Afternoon at 2:00 pm?
>>>>>>>
>>>>>>> Best regards.
>>>>>>>
>>>>>>> [1] http://spark.apache.org/docs/latest/programming-guide.html
>>>>>>> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>>>>>>>
>>>>>>> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> I have been looking into some algorithms based on stochastic
>>>>>>>> gradient descent. If there is anything I should focus on, please let me
>>>>>>>> know. I will also be available for a call this week and next
>>>>>>>> week. Thank you.
>>>>>>>> BR,
>>>>>>>> Mahesh.
>>>>>>>>
>>>>>>>> On Tue, May 3, 2016 at 5:05 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Maheshakya,
>>>>>>>>> Thank you, that's good. I have been trying to fix that for a couple
>>>>>>>>> of days. Please let me know when it is fixed. I have now been
>>>>>>>>> testing the ML algorithms and trying to identify the flow and the hierarchy.
>>>>>>>>> Does "Put break points in train methods in Linear Regression class" mean
>>>>>>>>> the spark/algorithms/LinearRegression.java class in
>>>>>>>>> org.wso2.carbon.ml.core? Is that the correct file?
>>>>>>>>> I am also planning to write some programs that use the Apache Spark
>>>>>>>>> MLlib algorithms. I am referring to [1] and some WSO2 documentation to
>>>>>>>>> get an idea of the ML structure. Thank you.
>>>>>>>>>
>>>>>>>>> BR,
>>>>>>>>> Mahesh.
>>>>>>>>>
>>>>>>>>> [1] nirmalfdo.blogspot.com
>>>>>>>>>
>>>>>>>>> On Tue, May 3, 2016 at 4:36 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>
>>>>>>>>>> I have checked. It seems the issue you encountered occurs
>>>>>>>>>> only in the current development branch of product-cep, which doesn't
>>>>>>>>>> identify the ML Siddhi extension as an extension. The ML Siddhi
>>>>>>>>>> extension works fine in the latest release of CEP (4.1.0) [1].
>>>>>>>>>> Until we figure out the reason and come up with a solution, can
>>>>>>>>>> you use the latest CEP release for your work? It's fine to use that,
>>>>>>>>>> since you haven't started actual development yet.
>>>>>>>>>>
>>>>>>>>>> Best regards.
>>>>>>>>>>
>>>>>>>>>> [1] http://wso2.com/products/complex-event-processor/
>>>>>>>>>>
>>>>>>>>>> On Tue, May 3, 2016 at 3:19 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>
>>>>>>>>>>>> Is it vital to use the local repo in my upcoming
>>>>>>>>>>>> implementation?
>>>>>>>>>>>
>>>>>>>>>>> Yes. The remote p2-repo contains the p2-repos of released
>>>>>>>>>>> versions. What you have to develop on is the current master of
>>>>>>>>>>> carbon-ml and product-ml. You can try it out with the modification I
>>>>>>>>>>> suggested.
>>>>>>>>>>> In the meantime, I'll verify whether the current repos are
>>>>>>>>>>> working as expected.
>>>>>>>>>>>
>>>>>>>>>>>> I am also trying to debug the carbon-ml
>>>>>>>>>>>> org.wso2.carbon.ml.core by putting some breakpoints in
>>>>>>>>>>>> spark/algorithms/Linear Regression
>>>>>>>>>>>
>>>>>>>>>>> It's great that you have started looking at the implementation
>>>>>>>>>>> of linear regression as well. Put breakpoints in the train methods of
>>>>>>>>>>> the LinearRegression class. This is used when you run linear
>>>>>>>>>>> regression from the UI.
>>>>>>>>>>>
>>>>>>>>>>>> I can see some comments left behind for a streaming algo as
>>>>>>>>>>>> well. Thank you.
>>>>>>>>>>>
>>>>>>>>>>> You may be referring to the linear regression with SGD model.
>>>>>>>>>>> There is no retraining with streaming data involved there. SGD with
>>>>>>>>>>> mini-batches is used to train the model on the data set only once.
>>>>>>>>>>> What you have to do is create a similar mechanism that takes in
>>>>>>>>>>> streaming data and retrains models. We will get to that part once you
>>>>>>>>>>> are comfortable with Siddhi extensions.
>>>>>>>>>>>
>>>>>>>>>>> BTW, is it possible for you to join a call this Friday or
>>>>>>>>>>> next week? We'll try to resolve your current issues and discuss the
>>>>>>>>>>> project further.
>>>>>>>>>>>
>>>>>>>>>>> Best regards.
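
The difference between training once and retraining on streaming data, as described above, can be illustrated with a small plain-Java sketch. It is an assumption-laden illustration only: StreamingLinearRegressionSketch is a hypothetical class, it is not the carbon-ml LinearRegression class and does not call Spark MLlib, and it uses one plain gradient step per mini-batch. The point is that the weights live in the object, so every new mini-batch continues from where the previous one stopped.

```java
/**
 * Hypothetical sketch (not the carbon-ml LinearRegression class and not
 * Spark MLlib): a linear model whose weights persist between calls, so each
 * new mini-batch continues training instead of starting from scratch.
 */
final class StreamingLinearRegressionSketch {
    private final double[] weights; // retained across mini-batches
    private double bias = 0.0;      // retained across mini-batches
    private final double stepSize;

    StreamingLinearRegressionSketch(int numFeatures, double stepSize) {
        this.weights = new double[numFeatures];
        this.stepSize = stepSize;
    }

    double predict(double[] x) {
        double y = bias;
        for (int j = 0; j < weights.length; j++) {
            y += weights[j] * x[j];
        }
        return y;
    }

    /** One gradient step on half the mean squared error of one mini-batch. */
    void trainOn(double[][] features, double[] labels) {
        int n = features.length;
        if (n == 0) {
            return;
        }
        double[] grad = new double[weights.length];
        double gradBias = 0.0;
        for (int i = 0; i < n; i++) {
            double error = predict(features[i]) - labels[i];
            for (int j = 0; j < weights.length; j++) {
                grad[j] += error * features[i][j];
            }
            gradBias += error;
        }
        for (int j = 0; j < weights.length; j++) {
            weights[j] -= stepSize * grad[j] / n;
        }
        bias -= stepSize * gradBias / n;
    }

    double weight(int j) {
        return weights[j];
    }

    double bias() {
        return bias;
    }
}
```

Calling trainOn once per completed window gives the retraining behaviour; a step size that is too large for the feature scale would make the updates diverge, so it would need tuning on real data.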
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 3, 2016 at 1:03 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>> Is it OK to go with the p2 repo at
>>>>>>>>>>>> http://product-dist.wso2.com/p2/carbon/releases/wilkes/features/
>>>>>>>>>>>> rather than the p2-repo at
>>>>>>>>>>>> product-ml/modules/p2-profile/target/p2-repo in the
>>>>>>>>>>>> local repo? What is the impact? Is it vital to use the local repo in
>>>>>>>>>>>> my upcoming implementation? I was trying to give the remote p2 repo to
>>>>>>>>>>>> a CEP built from source and debug the CEP ML extension, and I got the
>>>>>>>>>>>> same error as yesterday. But the pre-built product is working fine.
>>>>>>>>>>>> Therefore I am now trying what you described in the last email.
>>>>>>>>>>>>
>>>>>>>>>>>> I am also trying to debug the carbon-ml
>>>>>>>>>>>> org.wso2.carbon.ml.core by putting some breakpoints in
>>>>>>>>>>>> spark/algorithms/Linear Regression. I am trying to trigger it from the
>>>>>>>>>>>> product-ml project with a data set. Does the linear regression in the
>>>>>>>>>>>> UI also use those Spark algorithms, or is it in another place? I can
>>>>>>>>>>>> see some comments left behind for a streaming algo as well. Thank you.
>>>>>>>>>>>> BR,
>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, May 3, 2016 at 9:35 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>
>>>>>>>>>>>>> The earlier error you mentioned may be due to
>>>>>>>>>>>>> incompatible Siddhi versions in the ML p2-repo and CEP when you add
>>>>>>>>>>>>> the p2-repo from the product-ml you built.
>>>>>>>>>>>>> The current Siddhi version in product-cep is 3.0.6-SNAPSHOT[1],
>>>>>>>>>>>>> but in ML it's 3.0.2.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you try changing the siddhi.version in carbon-ml/pom.xml
>>>>>>>>>>>>> to 3.0.6-SNAPSHOT, building carbon-ml, and then building product-ml
>>>>>>>>>>>>> again? After this, add the p2-repo as a local repository to a fresh
>>>>>>>>>>>>> CEP pack again and try it out.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, May 2, 2016 at 7:02 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>> I have now remotely debugged the CEP extension for ML predictions.
>>>>>>>>>>>>>> I have the pre-built versions of CEP and ML, so I
>>>>>>>>>>>>>> used the pre-built CEP and did the same thing I was doing with the
>>>>>>>>>>>>>> source code. I think the only change I made was to install the
>>>>>>>>>>>>>> packages from the remote p2 repo. This worked fine, and I debugged the
>>>>>>>>>>>>>> carbon-ml org.wso2.carbon.ml.siddhi.extension as described in [1]. So
>>>>>>>>>>>>>> now I have to try the same thing with the build from source. Thank you.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>> Mahesh
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Congratulations and welcome to GSoC 2016. You did a great
>>>>>>>>>>>>>>> job in preparing the proposal. Now it's time to dig deep and get
>>>>>>>>>>>>>>> started with the project.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> First of all, you need to familiarize yourself with the code base. We
>>>>>>>>>>>>>>> have agreed to implement this with CEP event streams.
>>>>>>>>>>>>>>> We already have a CEP
>>>>>>>>>>>>>>> extension for predictions [1][2]. Go through this implementation and
>>>>>>>>>>>>>>> familiarize yourself with it. You need to understand how:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Event streams are consumed
>>>>>>>>>>>>>>> 2. Predictions are made from individual events
>>>>>>>>>>>>>>> 3. Results are sent back
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Get the WSO2 ML and CEP sources (you may use the latest released
>>>>>>>>>>>>>>> version of CEP) and build the products. Get both the carbon-ml [3] and
>>>>>>>>>>>>>>> product-ml [4] masters and create new branches for your work from them.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After you build the products, you may need to do remote
>>>>>>>>>>>>>>> debugging [5] to understand the flow. So please follow an example of
>>>>>>>>>>>>>>> real-time prediction with ML under the debugger to get some idea. The
>>>>>>>>>>>>>>> component you need to debug is org.wso2.carbon.ml.siddhi.extension.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The next tasks would be implementing online learning algorithms
>>>>>>>>>>>>>>> in plain Java with Spark MLlib and integrating them into ML. We also
>>>>>>>>>>>>>>> need to come up with a proper, detailed architecture for employing
>>>>>>>>>>>>>>> those algorithms in ML. Getting familiar with the aforementioned
>>>>>>>>>>>>>>> sections should give you some insight into how this should be
>>>>>>>>>>>>>>> implemented.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So please try to get a quick grasp; then you can start the
>>>>>>>>>>>>>>> implementation. Let us know if you have any questions or get stuck
>>>>>>>>>>>>>>> somewhere.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, please always include the WSO2 developers' list when
>>>>>>>>>>>>>>> you communicate with us regarding the project, so that you can get
>>>>>>>>>>>>>>> opinions and feedback from others as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1] https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>>>>>>>>>>>>> [2] https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension
>>>>>>>>>>>>>>> [3] https://github.com/wso2/carbon-ml
>>>>>>>>>>>>>>> [4] https://github.com/wso2/product-ml
>>>>>>>>>>>>>>> [5] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>> Thank you for accepting my GSoC 2016 proposal. I am
>>>>>>>>>>>>>>>> looking forward to further instructions and to continuing the
>>>>>>>>>>>>>>>> project. Thank you very much.
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Pruthuvi Maheshakya Wijewardena
>>>>>>>>>>>>>>> mahesha...@wso2.com
>>>>>>>>>>>>>>> +94711228855
>>>>>
>>>>> _______________________________________________
>>>>> Dev mailing list
>>>>> Dev@wso2.org
>>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev