Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

Can I use external data sources, such as a database or HDFS, to generate events in the CEP event simulator instead of giving it a file? I saw "Switch to upload file for simulation" under Input Data By Data Source in the event simulator. How can I feed data in real time from other sources, or directly from a remote server that generates it as JSON or a similar format? What format should the database be in? This is just for my own knowledge. Thank you.

Regards,
Mahesh.

On Wed, Jun 22, 2016 at 10:59 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Nirmal,
>
> *This is what I have done so far in GSoC 2016:*
>
> - Prior research on SGD (Stochastic Gradient Descent) optimization techniques and mini-batch processing
> - Getting familiar with Siddhi and writing extensions for it
> - Wrote Stream Processor extensions for streaming applications and machine learning algorithms (Linear Regression, KMeans and Logistic Regression)
> - Developed a Streaming Linear Regression class that periodically retrains models using mini-batch processing with SGD
> - Extended the functionality to moving-window mini-batch processing with SGD, providing windowShift, which controls the data horizon and data obsolescence
> - Performance evaluation of the implementation
> - Adding the Streaming Linear Regression class and the Stream Processor extension to carbon-ml
>
> *As a next step:*
>
> - Persisting temporal models for applications such as prediction
> - Completing the Streaming KMeans clustering and Logistic Regression classes
> - Improving the batching and streaming mechanisms
> - Improving visualization (optional)
> - Writing examples and documentation
>
> Regards,
> Mahesh.
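The moving-window mini-batch SGD retraining listed in the status summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not the carbon-ml implementation: the class name MovingWindowSgd, the iterations and stepSize parameters, and the plain per-event gradient update are all hypothetical. Only batchSize and windowShift are names taken from the thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch only: linear regression retrained with SGD over a moving window of events. */
class MovingWindowSgd {
    private final int batchSize;    // events needed in the window before each retrain
    private final int windowShift;  // oldest events dropped after each retrain
    private final int iterations;   // passes over the window per retrain (hypothetical)
    private final double stepSize;  // SGD learning rate (hypothetical)
    private final Deque<double[]> window = new ArrayDeque<>(); // rows: {y, x1..xp}
    final double[] weights;         // weights[0] is the intercept

    MovingWindowSgd(int features, int batchSize, int windowShift, int iterations, double stepSize) {
        this.batchSize = batchSize;
        this.windowShift = windowShift;
        this.iterations = iterations;
        this.stepSize = stepSize;
        this.weights = new double[features + 1];
    }

    private double predict(double[] row) {
        double yHat = weights[0];
        for (int j = 1; j < row.length; j++) {
            yHat += weights[j] * row[j];
        }
        return yHat;
    }

    /** Buffers one event; returns the window MSE after a retrain, or NaN while still filling. */
    double add(double y, double... x) {
        if (x.length != weights.length - 1) {
            throw new IllegalArgumentException("unexpected number of features");
        }
        double[] row = new double[x.length + 1];
        row[0] = y;
        System.arraycopy(x, 0, row, 1, x.length);
        window.addLast(row);
        if (window.size() < batchSize) {
            return Double.NaN;
        }
        // Retrain on the current window, warm-starting from the previous weights.
        for (int it = 0; it < iterations; it++) {
            for (double[] r : window) {
                double err = predict(r) - r[0];
                weights[0] -= stepSize * err;
                for (int j = 1; j < r.length; j++) {
                    weights[j] -= stepSize * err * r[j];
                }
            }
        }
        double sse = 0.0;
        for (double[] r : window) {
            double err = predict(r) - r[0];
            sse += err * err;
        }
        double mse = sse / window.size();
        // Slide the window: a larger windowShift forgets old data faster.
        for (int i = 0; i < windowShift && !window.isEmpty(); i++) {
            window.removeFirst();
        }
        return mse;
    }

    public static void main(String[] args) {
        MovingWindowSgd model = new MovingWindowSgd(1, 4, 2, 200, 0.1);
        for (int i = 0; i < 12; i++) {
            double x = (i % 4) * 0.25;
            double mse = model.add(2.0 + 3.0 * x, x); // synthetic data: y = 2 + 3x
            if (!Double.isNaN(mse)) {
                System.out.println("retrained, mse=" + mse);
            }
        }
        System.out.println("intercept=" + model.weights[0] + ", beta1=" + model.weights[1]);
    }
}
```

With windowShift equal to batchSize the window never overlaps (plain mini-batches); smaller values keep part of the previous batch, which is one way to trade data horizon against obsolescence.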
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Nirmal,

*This is what I have done so far in GSoC 2016:*

- Prior research on SGD (Stochastic Gradient Descent) optimization techniques and mini-batch processing
- Getting familiar with Siddhi and writing extensions for it
- Wrote Stream Processor extensions for streaming applications and machine learning algorithms (Linear Regression, KMeans and Logistic Regression)
- Developed a Streaming Linear Regression class that periodically retrains models using mini-batch processing with SGD
- Extended the functionality to moving-window mini-batch processing with SGD, providing windowShift, which controls the data horizon and data obsolescence
- Performance evaluation of the implementation
- Adding the Streaming Linear Regression class and the Stream Processor extension to carbon-ml

*As a next step:*

- Persisting temporal models for applications such as prediction
- Completing the Streaming KMeans clustering and Logistic Regression classes
- Improving the batching and streaming mechanisms
- Improving visualization (optional)
- Writing examples and documentation

Regards,
Mahesh.

On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
> Sorry, you need to put the returned values of the function into the output stream:
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
> select mse
> insert into LinregOutput;
>
> or
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
> select *
> insert into LinregOutput;
>
> where the LinregOutput stream definition contains all attributes: mse, intercept, beta1, ...
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Sorry, you need to put the returned values of the function into the output stream:

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select mse
insert into LinregOutput;

or

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into LinregOutput;

where the LinregOutput stream definition contains all attributes: mse, intercept, beta1, ...

On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
> Hi Mahesh,
>
> In your output stream, you need to list all the attributes that are returned from the streamlinreg function: mse, intercept, beta1, ...
> Can you try that?
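The advice above amounts to keeping the output stream definition in step with the values the extension emits ([mse, intercept, beta1 ... betap]). A minimal Java sketch of that bookkeeping follows. The helper class LinRegOutputSchema and its methods are hypothetical, and the choice of four coefficients in main is only an assumption for illustration; the thread does not say how many betas streamlinreg returns for this query, nor whether select * also carries the input attributes.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: derive the output stream definition from the emitted value layout. */
class LinRegOutputSchema {

    /** Attribute names in the order of the emitted array: mse, intercept, beta1..betaP. */
    static List<String> attributeNames(int coefficients) {
        List<String> names = new ArrayList<>();
        names.add("mse");
        names.add("intercept");
        for (int i = 1; i <= coefficients; i++) {
            names.add("beta" + i);
        }
        return names;
    }

    /** Builds a Siddhi stream definition that declares every attribute as double. */
    static String streamDefinition(String streamName, int coefficients) {
        StringBuilder sb = new StringBuilder("define stream ").append(streamName).append(" (");
        List<String> names = attributeNames(coefficients);
        for (int i = 0; i < names.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(names.get(i)).append(" double");
        }
        return sb.append(");").toString();
    }

    public static void main(String[] args) {
        // Assumed: four coefficients, purely for illustration.
        System.out.println(streamDefinition("LinregOutput", 4));
    }
}
```

Generating the definition from the same list that orders the output array makes it harder for the two to drift apart when the number of predictors changes.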
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

In your output stream, you need to list all the attributes that are returned from the streamlinreg function: mse, intercept, beta1, ...
Can you try that?

On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> This is the full query I used.
>
> @Import('LinRegInput:1.0.0')
> define stream LinRegInput (salary double, rbi double, walks double, strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.0')
> define stream LinregOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
> select *
> insert into mse;
>
> But I am sending [mse, intercept, beta1 ... betap] as the outputData Object[]. So how can I publish all of this information through the event publisher?
> Regards,
> Mahesh.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

This is the full query I used:

@Import('LinRegInput:1.0.0')
define stream LinRegInput (salary double, rbi double, walks double, strikeouts double, errors double);

@Export('LinRegOutput:1.0.0')
define stream LinregOutput (mse double);

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into mse;

But I am sending [mse, intercept, beta1 ... betap] as the outputData Object[]. So how can I publish all of this information through the event publisher?

Regards,
Mahesh.

On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:
> Hi Mahesh,
>
> Can you summarize the work we have done so far and the remaining work
> items please?
>
> Thanks.
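A minimal sketch of how an output row of the form [mse, intercept, beta1 ... betap] can come out of one retraining step. It is an illustration only, not the code of the streamlinreg extension: plain gradient descent on a single mini-batch stands in for the SGD retraining, and the names train_mini_batch and output_attribute_names are hypothetical. The point is that the row has 2 + p values, so the output stream needs one attribute per value.

```python
from typing import List, Sequence, Tuple

# One training sample: (feature values, observed response).
Sample = Tuple[Sequence[float], float]


def predict(intercept: float, betas: Sequence[float], features: Sequence[float]) -> float:
    """Linear model: intercept + sum(beta_j * x_j)."""
    return intercept + sum(b * x for b, x in zip(betas, features))


def train_mini_batch(batch: Sequence[Sample], intercept: float, betas: Sequence[float],
                     step_size: float, num_iterations: int) -> List[float]:
    """Run gradient descent on one mini-batch (hypothetical helper).

    Returns the row [mse, intercept, beta1, ..., betap] for the updated model.
    """
    n = len(batch)
    betas = list(betas)
    for _ in range(num_iterations):
        grad_intercept = 0.0
        grad_betas = [0.0] * len(betas)
        for features, target in batch:
            error = predict(intercept, betas, features) - target
            # Gradient of the mean squared error over the batch.
            grad_intercept += 2.0 * error / n
            for j, x in enumerate(features):
                grad_betas[j] += 2.0 * error * x / n
        intercept -= step_size * grad_intercept
        betas = [b - step_size * g for b, g in zip(betas, grad_betas)]
    mse = sum((predict(intercept, betas, f) - t) ** 2 for f, t in batch) / n
    return [mse, intercept, *betas]


def output_attribute_names(num_features: int) -> List[str]:
    """Attribute names matching the row above, one per value (assumed naming)."""
    return ["mse", "intercept"] + [f"beta{j}" for j in range(1, num_features + 1)]
```

With p features the output stream definition would then list mse, intercept and beta1 through betap as double attributes, instead of mse alone.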
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

Can you summarize the work we have done so far and the remaining work items, please?

Thanks.

On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> I have updated the repo [2], and up-to-date documents can be found at [1]. Thank you.
> Regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I have updated the repo [2], and up-to-date documents can be found at [1]. Thank you.

Regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
[2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml

On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>
> ---------- Forwarded message ----------
> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
> Date: Tue, Jun 21, 2016 at 5:08 PM
> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>
> Hi Maheshakya,
> The new query looks like this, adding support for moving window methods:
>
> @Import('LinRegInput:1.0.1')
> define stream LinRegInput (salary double, rbi double, walks double, strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.1')
> define stream LinRegOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
> select *
> insert into mse;
>
> 1 = learnType
> 2 = windowShift
> 4 = batchSize...
>
> windowShift is added to configure the amount of shift. I have added log.info(mse) to view the MSE.
> Mahesh.
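A minimal sketch of what a moving window with batchSize and windowShift could look like. The exact semantics inside the extension are not spelled out in the thread, so this assumes one plausible reading: once batchSize events are buffered, the whole window is handed to training, and the oldest windowShift events are then dropped. The function name moving_window_batches is hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Sequence


def moving_window_batches(events: Iterable[Sequence[float]],
                          batch_size: int,
                          window_shift: int) -> Iterator[List[Sequence[float]]]:
    """Yield overlapping mini-batches from a stream of events.

    Assumed semantics (not confirmed by the thread): emit the window as soon
    as it holds batch_size events, then discard the oldest window_shift events.
    """
    if batch_size <= 0 or not 0 < window_shift <= batch_size:
        raise ValueError("need batch_size > 0 and 0 < window_shift <= batch_size")
    window: Deque[Sequence[float]] = deque()
    for event in events:
        window.append(event)
        if len(window) == batch_size:
            yield list(window)  # snapshot of the current window for training
            for _ in range(window_shift):
                window.popleft()  # shift the window forward
```

Under this reading, batchSize = 4 and windowShift = 2 means each batch shares its two oldest events with the previous batch. A windowShift equal to batchSize gives non-overlapping batches, and a smaller shift keeps older data in play for longer.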
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

If you are installing features from the new p2 repo into a new CEP pack, you won't need to replace those jars. If you have already installed them in the CEP from a previous p2 repo, you would have to uninstall those features and reinstall them from the new p2 repo. But you don't need to do that, because you can just replace the jar. It's easy.

Best regards.

On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> If I build carbon-ml and then product-ml, and point the new p2 repository to the CEP features, do I need to copy that org.wso2.carbon.ml.siddhi.extension1.1. thing into the cep_home/repository/component/... place?
> Regards,
> Mahesh.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

If I build carbon-ml and then product-ml, and point the new p2 repository to the CEP features, do I need to copy that org.wso2.carbon.ml.siddhi.extension1.1. thing into the cep_home/repository/component/... place?

Regards,
Mahesh.

On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> In MLModelhandler there's a persistModel method
> debug that method while trying to train a model from ML
> you can see the steps it takes
> don't use the deep learning algorithm
> any other algorithm would work
> from line 777 is the section for creating the serializable object from the trained model and saving it
>
> I think you don't need to use the ML model handler directly
> you need to use the code in it for persisting models in the streaming algorithm
> so you can add a utils class in the streaming folder
> then add the persisting logic there
> ignore the deep learning section in that
> only focus on persisting spark mod
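The advice above (turn the trained model into a serializable object and save it, with the logic kept in a small utils class) can be illustrated in miniature. This is a language-neutral sketch written in Python, not the carbon-ml code, which is Java. The functions persist_model and load_model and the dictionary-shaped model are assumptions made only for the example.

```python
import pickle
from pathlib import Path
from typing import Any


def persist_model(model: Any, path: Path) -> None:
    """Serialize a trained model object to disk (hypothetical utility)."""
    with path.open("wb") as handle:
        pickle.dump(model, handle)


def load_model(path: Path) -> Any:
    """Load a previously persisted model, e.g. to serve predictions."""
    with path.open("rb") as handle:
        return pickle.load(handle)
```

Keeping these two functions apart from the training code mirrors the suggestion to put the persisting logic in its own utils class, so the streaming algorithm only calls it after each retraining step.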
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I pushed the StreamingLinearRegression modules to my forked carbon-ml repo, on branch wso2_gsoc_ml6_cml [1]. I am working on persisting the model. Thank you.

Mahesh.

[1] https://github.com/dananjayamahesh/carbon-ml

On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> yes
> you should develop in the forked repo
> clone your forked repo
> then go into that
> then add the upstream repo as the original wso2 repo
> see the remote tracking branches by
> git remote -v
> you will see the origin as your forked repo
> to add upstream
> git remote add upstream
> when you change something create a new branch by
> git checkout -b new_branch_name
> then add and commit to this branch
> after that push to the forked repo by
> git push origin new_branch_name
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

The above error was due to a simple mistake: I had not provided my local p2 repo. Now it is working, and I debugged the StreamingLinearRegression model with the CEP.

Regards,
Mahesh.

On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> I did what you recommended. But when I add the query, the following error appears:
> No extension exist for StreamFunctionExtension{namespace='ml'} in execution plan "NewExecutionPlan"
>
> My query is as follows:
>
> @Import('LinRegInput:1.0.0')
> define stream LinRegInput (salary double, rbi double, walks double, strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.0')
> define stream LinRegOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
> select *
> insert into mse;
>
> I have added my files as follows:
>
> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>
> and added the following line to ml.siddhiext:
>
> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>
> Then I built carbon-ml and replaced the jar file you asked me to replace, with the name changed. Any thoughts?
> Regards,
> Mahesh.

_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I did what you recommended. But when I add the query, the following error appears:

No extension exist for StreamFunctionExtension{namespace='ml'} in execution plan "NewExecutionPlan"

My query is as follows:

@Import('LinRegInput:1.0.0')
define stream LinRegInput (salary double, rbi double, walks double, strikeouts double, errors double);

@Export('LinRegOutput:1.0.0')
define stream LinRegOutput (mse double);

from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into mse;

I have added my files as follows:

org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;

and added the following line to ml.siddhiext:

streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor

Then I built carbon-ml and replaced the jar file you asked me to replace, with the name changed. Any thoughts?

Regards,
Mahesh.

On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
> Hi Mahesh,
>
> You don't need to add a new p2 repo.
> In the /repository/components/plugins folder, you will find org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
> First rename the jar in the target folder to the jar name in the plugins folder, then replace it (make sure of this, otherwise it will not work).
> Your updates will be in the CEP after this.
>
> Best regards.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

You don't need to add a new p2 repo. In the /repository/components/plugins folder you will find org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace it with carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar. First rename the jar in the target folder to the name of the jar in the plugins folder, then replace the old one (make sure the names match, otherwise it will not work). Your updates will be available in CEP after this.

Best regards.

On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya wrote:
> Hi Maheshakya,
> Do I need to add the p2 local repos of ML into CEP after I make changes to
> the ML extensions, or will they be updated automatically? I am trying to
> debug my extension with CEP. Thank you.
> regards,
> Mahesh.

--
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855
___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
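The jar swap described in the reply above can also be scripted. What follows is a minimal Java sketch, not something that ships with carbon-ml or CEP: the class name ExtensionJarSwap and both path arguments are hypothetical, and it assumes the plugins folder holds exactly one bundle whose file name starts with org.wso2.carbon.ml.siddhi.extension_. Copying the built jar over the deployed file has the same effect as renaming it and replacing the old one by hand.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of carbon-ml or CEP.
final class ExtensionJarSwap {

    // File-name prefix of the deployed bundle, taken from the jar name quoted in the thread.
    static final String BUNDLE_PREFIX = "org.wso2.carbon.ml.siddhi.extension_";

    static boolean isExtensionBundle(String fileName) {
        return fileName.startsWith(BUNDLE_PREFIX) && fileName.endsWith(".jar");
    }

    // Copies builtJar over the deployed bundle, so the deployed file name is kept.
    static Path replaceDeployedJar(Path builtJar, Path pluginsDir) throws IOException {
        Path deployed;
        try (Stream<Path> files = Files.list(pluginsDir)) {
            deployed = files
                    .filter(p -> isExtensionBundle(p.getFileName().toString()))
                    .findFirst()
                    .orElseThrow(() -> new IOException("no extension bundle in " + pluginsDir));
        }
        Files.copy(builtJar, deployed, StandardCopyOption.REPLACE_EXISTING);
        return deployed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ExtensionJarSwap <built-jar> <plugins-dir>");
            return;
        }
        Path replaced = replaceDeployedJar(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Replaced " + replaced);
    }
}
```

The server most likely needs a restart before the replaced bundle is picked up; that step is outside this sketch.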
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
Do I need to add the p2 local repos of ML into CEP after I make changes to the ML extensions, or will they be updated automatically? I am trying to debug my extension with CEP. Thank you.
regards,
Mahesh.

On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena wrote:
> Mahesh, when you add your work to carbon-ml, follow the guidelines below;
> they will help to keep the code clean.
>
> - Add only the source files you have newly added or changed.
> - Do not use the add . (add all) command in git. Only use add filename.
>
> I have seen in your gsoc repo that there are gitignore files, IDEA-related
> files and the target folder. These should not be in the source code, only
> the source files you add.
>
> - Commit when you have completed some major activity. Do not add a commit
>   every time you make a change.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

You can add a new folder for streaming algorithms in the siddhi extension. There, keep the stream processors and the algorithm classes separate. We can arrange a hangout tomorrow.

Best regards.

On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> May I put the classes separately into ml and extensions in carbon-core? I
> can put the streaming extensions in extensions, and
> Algorithms/StreamingLinearRegression and StreamingKMeans in ml core. What is
> the suitable layout? I will commit my changes today as a separate branch in
> my forked carbon-ml local repo. Thank you.
> regards,
> Mahesh.
> p.s. It would be better if you could meet me via hangout.

--
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
May I put the classes separately into ml and extensions in carbon-core? I can put the streaming extensions in extensions, and Algorithms/StreamingLinearRegression and StreamingKMeans in ml core. What is the suitable layout? I will commit my changes today as a separate branch in my forked carbon-ml local repo. Thank you.
regards,
Mahesh.
p.s. It would be better if you could meet me via hangout.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
OK. I have spent the last couple of days on implementing streaming clustering in an efficient way, and I have found a couple of methods. Initially I am developing k batch k means for streaming. I will let you know the progress within the next couple of days. I have already added a parameter in the query for the window shift. I will add it to the repo tomorrow morning.
Thank you.
Mahesh.

On 6/12/16, Maheshakya Wijewardena wrote:
> Hi Mahesh,
>
> Since you have already implemented the streaming algorithms as separate
> siddhi extensions, our next task is to include them in the carbon-ml siddhi
> extensions. Please start by adding streaming linear regression first.
> You also need to persist the models that are trained.
> Refer to method [1] in carbon-ml to see how model persistence is done.
>
> Best regards.
>
> [1] https://github.com/wso2/carbon-ml/blob/5211f8b1d662778af832c54fbbcc81fe4aa78e1e/components/ml/org.wso2.carbon.ml.core/src/main/java/org/wso2/carbon/ml/core/impl/MLModelHandler.java#L727
>
> On Sat, Jun 11, 2016 at 10:58 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>> Hi Mahesh,
>>
>> Regarding your question:
>>
>>> my outputData Object[] array is in the format of
>>> [mse, beta0, beta1, betap]. But it seems that CEP does not understand it.
>>
>> Did you create an output stream first for the publisher? You need to
>> create a stream with attributes: mse double, beta1 double, ...
>> and point to that from the publisher.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

Regarding your question:

On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya wrote:
> my outputData Object[] array is in the format of
> [mse, beta0, beta1, betap]. But it seems that CEP does not understand it.

Did you create an output stream for the publisher first? You need to create a stream with the attributes mse double, beta1 double, ... and point to it from the publisher.
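The advice above comes down to keeping the output stream definition and the outputData array in the same order and of the same length. The Java sketch below illustrates that bookkeeping only. It is an assumption-laden example, not Siddhi or carbon-ml code: the class LinRegOutputSchema and its methods are hypothetical, the attribute names follow the [mse, beta0, beta1, betap] layout mentioned in the thread, and the stream name regResults is taken from the queries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Siddhi or carbon-ml.
final class LinRegOutputSchema {

    // Attribute names in the order the thread describes: mse, beta0, beta1, ..., betap.
    static List<String> attributeNames(int featureCount) {
        List<String> names = new ArrayList<>();
        names.add("mse");
        for (int i = 0; i <= featureCount; i++) {
            names.add("beta" + i);
        }
        return names;
    }

    // Builds a stream definition in which every attribute is a double.
    static String streamDefinition(String streamName, int featureCount) {
        StringBuilder sb = new StringBuilder("define stream ").append(streamName).append(" (");
        List<String> names = attributeNames(featureCount);
        for (int i = 0; i < names.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(names.get(i)).append(" double");
        }
        return sb.append(");").toString();
    }

    // Fails fast when the array handed to the output stream does not match the schema.
    static void checkOutput(Object[] outputData, int featureCount) {
        int expected = attributeNames(featureCount).size();
        if (outputData.length != expected) {
            throw new IllegalArgumentException(
                    "expected " + expected + " values but got " + outputData.length);
        }
    }

    public static void main(String[] args) {
        // Assumes four predictors (rbi, walks, strikeouts, errors) with salary as the response.
        System.out.println(streamDefinition("regResults", 4));
    }
}
```

Which attribute is the response variable is not settled in the thread, so the feature count of 4 in main is only a guess.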
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
In my last message, the example query for streaming linear regression should be:

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

miniBatchFraction must be given as a double (1.0, not 1). I wrote it incorrectly when I documented it. Thank you.

On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya wrote:
> *1. Streaming Linear Regression*
>
> from LinRegInputStream#streaming:streaminglr((learnType), (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction), (ci), salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
> from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1, 0.95, salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
You can find the details of the queries in this README [1]. I have made some changes, so the previous queries may no longer be valid. Please use the new queries in the README.

*1. Streaming Linear Regression*

from LinRegInputStream#streaming:streaminglr((learnType), (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction), (ci), salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

Example:

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

*2. Streaming KMeans Clustering*

from LinRegInputStream#streaming:streamingkm((learnType), (batchSize/timeFrame), (numClusters), (numIterations), (alpha), (ci), salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

Example:

from KMeansInputStream#streaming:streamingkm(0, 3, 0.95, 2, 10, 1, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

I also need help with returning the outputData of my program to CEP. Currently you will not find the stream output in the event publisher, but you can see the output in the console. I want to understand the final step of putting the output data back onto the output stream once the batch is complete and the algorithm has finished. You may find that the following code throws an exception. I actually have no clue about the outputData format that needs to be given to the output stream.

Object[] outputData = streamingLinearRegression.regress(eventData);

if (outputData == null) {
    streamEventChunk.remove();
} else {
    complexEventPopulater.populateComplexEvent(complexEvent, outputData);
}

My outputData Object[] array is in the format [mse, beta0, beta1, betap], but it seems that CEP does not understand it. I did this by following the time series stream processor extension at [2]. Can you please help me with this?
regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
[2] https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java

On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena wrote:
> Hi Mahesh,
>
> Great work so far.
>
> Regarding the queries:
>
> streamingkm(0, 2, 2, 20, 1, 0.95, salary, rbi, walks, strikeouts, errors)
>
> Can you give me the definitions of the first few parameters, in order?
> Also, in the previous supervised cases (linear regression), what is the
> response variable, etc.?
> I'll go through the code and give you feedback.
>
> After this, we need to merge this implementation into the carbon-ml siddhi
> extension. Please also do a similar implementation for logistic regression,
> because we need a streaming version for classification as well.
>
> Best regards.
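The streaminglr parameters listed in the message above (a batch or window size, numIterations, stepSize) and the "return null until the batch is complete" contract of regress() can be sketched as follows. This is a hedged illustration, not the StreamingLinearRegression class from the GSOC2016 repo: the class name, the windowShift argument, and the event layout [response, x1, ..., xp] are assumptions, and it runs plain gradient descent over the whole window (the miniBatchFraction = 1.0 case) rather than the Spark SGD implementation the thread refers to.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; not the StreamingLinearRegression class discussed in the thread.
final class WindowedLinearRegression {
    private final int windowSize;
    private final int windowShift;
    private final int numIterations;
    private final double stepSize;
    private final double[] weights;                       // [beta0, beta1, ..., betap]
    private final Deque<double[]> window = new ArrayDeque<>();
    private int eventsSinceTraining = 0;

    WindowedLinearRegression(int featureCount, int windowSize, int windowShift,
                             int numIterations, double stepSize) {
        this.windowSize = windowSize;
        this.windowShift = windowShift;
        this.numIterations = numIterations;
        this.stepSize = stepSize;
        this.weights = new double[featureCount + 1];
    }

    // Each event is [response, x1, ..., xp]. Returns null until a retrain is due,
    // otherwise [mse, beta0, beta1, ..., betap].
    double[] regress(double[] event) {
        window.addLast(event.clone());
        if (window.size() > windowSize) {
            window.removeFirst();                         // the oldest point leaves the window
        }
        eventsSinceTraining++;
        if (window.size() < windowSize || eventsSinceTraining < windowShift) {
            return null;
        }
        eventsSinceTraining = 0;
        for (int i = 0; i < numIterations; i++) {
            gradientStep();                               // warm start from the previous weights
        }
        double[] out = new double[weights.length + 1];
        out[0] = meanSquaredError();
        System.arraycopy(weights, 0, out, 1, weights.length);
        return out;
    }

    private double predict(double[] event) {
        double y = weights[0];
        for (int j = 1; j < weights.length; j++) {
            y += weights[j] * event[j];
        }
        return y;
    }

    private void gradientStep() {
        double[] grad = new double[weights.length];
        for (double[] e : window) {
            double err = predict(e) - e[0];
            grad[0] += err;
            for (int j = 1; j < weights.length; j++) {
                grad[j] += err * e[j];
            }
        }
        for (int j = 0; j < weights.length; j++) {
            weights[j] -= stepSize * grad[j] / window.size();
        }
    }

    private double meanSquaredError() {
        double sum = 0.0;
        for (double[] e : window) {
            double err = predict(e) - e[0];
            sum += err * err;
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        WindowedLinearRegression model = new WindowedLinearRegression(1, 4, 2, 5000, 0.05);
        for (int x = 0; x < 6; x++) {
            double[] out = model.regress(new double[] {2.0 * x + 1.0, x});
            System.out.println(x + " -> " + (out == null ? "no output yet" : java.util.Arrays.toString(out)));
        }
    }
}
```

A stream processor built on this would drop the event when regress() returns null and populate the output event otherwise, which is the branch shown in the snippet quoted above. With windowShift equal to windowSize the sketch behaves like non-overlapping batches; a smaller shift gives an overlapping moving window.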
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

Great work so far.

Regarding the queries:

streamingkm(0, 2, 2, 20, 1, 0.95, salary, rbi, walks, strikeouts, errors)

Can you give me the definitions of the first few parameters, in order? Also, in the previous supervised cases (linear regression), what is the response variable, etc.? I'll go through the code and give you feedback.

After this, we need to merge this implementation into the carbon-ml siddhi extension. Please also do a similar implementation for logistic regression, because we need a streaming version for classification as well.

Best regards.

On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya wrote:
> Hi Maheshakya,
> I have changed the siddhi query for our StreamingKMeansClustering by
> adding Alpha, which we can use to control the data horizon (how quickly a
> most recent data point becomes a part of the model) and data obsolescence
> (how long it takes a past data point to become irrelevant to the model) in
> the streaming clustering algorithms. I have pushed the new changes to the
> repo [1], introducing the StreamingKMeansClusteringModel and
> StreamingKMeansClustering classes to the project. The new siddhi query is as
> follows.
>
> from Stream8Input#streaming:streamingkm(0, 2, 2, 20, 1, 0.95, salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
> regards,
> Mahesh.
>
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I have changed the Siddhi query for our StreamingKMeansClustering by adding Alpha, which we can use to control the data horizon (how quickly the most recent data point becomes part of the model) and data obsolescence (how long it takes a past data point to become irrelevant to the model) in the streaming clustering algorithms. I have pushed the new changes to the repo [1], introducing the StreamingKMeansClusteringModel and StreamingKMeansClustering classes to the project. The new Siddhi query is as follows.

from Stream8Input#streaming:streamingkm(0, 2, 2, 20, 1, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

Regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc

On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
>
> As we have discussed the architecture of the project, I have already developed a couple of essential components. During the last week I finished writing the CEP Siddhi extensions for our streaming algorithms, which are designed to learn incrementally from past experience. I have written the Siddhi extensions as StreamProcessor extensions for StreamingLinearRegression and StreamingKMeansClustering, with the relevant parameters to call them from a Siddhi query. I also did some research on developing mini-batch KMeans clustering for our StreamingKMeansClustering, and I added the moving window option to the usual batch processing. Currently I am working on the time-based incremental re-training method for Siddhi streams. On the StreamingClustering side I have already done part of the StreamingKMeansClustering with the mini-batch KMeans clustering. All the work I did has been pushed to my GitHub repo [1]; you can find the development in the gsoc/ directory.
>
> Also, as the ML team and Supun asked, I have done some timing and performance analysis for our SGD (Stochastic Gradient Descent) algorithms for LinearRegression. Those results have also been added to my repo [2]. Now I am developing the rest and looking into other research on predictive analysis for online big data. I am also doing some work related to mini-batch KMeans clustering, and I have been working on the performance analysis, accuracy and a basic comparison between mini-batch algorithms and moving window algorithms for streaming and periodic re-training of the ML model. Thank you.
>
> BR,
> Mahesh.
>
> [1] https://github.com/dananjayamahesh/GSOC2016
> [2] https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
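Note: the thread does not show how Alpha enters the model update. A minimal sketch of one common choice follows, assuming Alpha acts as a decay factor on the weight of past data, similar in spirit to the decay factor in Spark MLlib's streaming k-means. The class DecayedCentroid and its fields are hypothetical names for illustration and are not taken from the GSOC2016 repository.

```java
/** Hypothetical helper for illustration; not taken from the GSOC2016 repository. */
final class DecayedCentroid {
    final double[] center; // current centroid coordinates
    double weight;         // effective number of points behind the centroid

    DecayedCentroid(double[] center, double weight) {
        this.center = center.clone();
        this.weight = weight;
    }

    /**
     * Folds one mini-batch of points already assigned to this cluster into the centroid.
     * alpha = 1 keeps all history; alpha = 0 forgets everything before this batch.
     */
    void update(double[][] batch, double alpha) {
        double decayed = alpha * weight; // old evidence, discounted
        int m = batch.length;
        if (m == 0) {                    // nothing new: only age the old evidence
            weight = decayed;
            return;
        }
        double[] batchMean = new double[center.length];
        for (double[] point : batch) {
            for (int j = 0; j < center.length; j++) {
                batchMean[j] += point[j] / m;
            }
        }
        double total = decayed + m;
        for (int j = 0; j < center.length; j++) {
            // Weighted average of the discounted old centroid and the batch mean.
            center[j] = (decayed * center[j] + m * batchMean[j]) / total;
        }
        weight = total;
    }

    public static void main(String[] args) {
        DecayedCentroid c = new DecayedCentroid(new double[] {0.0, 0.0}, 10.0);
        c.update(new double[][] {{2.0, 2.0}, {4.0, 4.0}}, 0.5);
        System.out.println(java.util.Arrays.toString(c.center) + " weight=" + c.weight);
    }
}
```

Under this assumption a smaller Alpha shortens the data horizon (new points dominate sooner) and speeds up obsolescence (old points lose influence faster), which matches the two effects described in the message.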
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

As we have discussed the architecture of the project, I have already developed a couple of essential components. During the last week I finished writing the CEP Siddhi extensions for our streaming algorithms, which are designed to learn incrementally from past experience. I have written the Siddhi extensions as StreamProcessor extensions for StreamingLinearRegression and StreamingKMeansClustering, with the relevant parameters to call them from a Siddhi query. I also did some research on developing mini-batch KMeans clustering for our StreamingKMeansClustering, and I added the moving window option to the usual batch processing. Currently I am working on the time-based incremental re-training method for Siddhi streams. On the StreamingClustering side I have already done part of the StreamingKMeansClustering with the mini-batch KMeans clustering. All the work I did has been pushed to my GitHub repo [1]; you can find the development in the gsoc/ directory.

Also, as the ML team and Supun asked, I have done some timing and performance analysis for our SGD (Stochastic Gradient Descent) algorithms for LinearRegression. Those results have also been added to my repo [2]. Now I am developing the rest and looking into other research on predictive analysis for online big data. I am also doing some work related to mini-batch KMeans clustering, and I have been working on the performance analysis, accuracy and a basic comparison between mini-batch algorithms and moving window algorithms for streaming and periodic re-training of the ML model. Thank you.

BR,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016
[2] https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg

On Sat, Jun 4, 2016 at 8:50 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
>
> If you want to run it, please use the following queries.
>
> 1. StreamingLinearRegression
>
> from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
> 2. StreamingKMeansClustering
>
> from Stream8Input#streaming:streamingkm(0, 2, 0.95, 2, 20, salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
> In both cases the first parameter lets you decide which learning method you want: moving window, batch processing or time-based model learning.
>
> BR,
> Mahesh.
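Note: the update above refers to SGD for LinearRegression trained on mini-batches. As a minimal sketch of what a single update looks like, the following performs one gradient step on the mean squared error of one mini-batch. It is a plain hand-written step for illustration, not Spark's implementation and not the code in the repo; the class MiniBatchSgd and its methods are hypothetical names.

```java
/** Hypothetical illustration of one mini-batch gradient step; not the repo's code. */
final class MiniBatchSgd {

    /** Linear prediction; weights[0] is the intercept, weights[j + 1] pairs with x[j]. */
    static double predict(double[] weights, double[] x) {
        double sum = weights[0];
        for (int j = 0; j < x.length; j++) {
            sum += weights[j + 1] * x[j];
        }
        return sum;
    }

    /** Mean squared error of the model over a batch. */
    static double mse(double[] weights, double[][] features, double[] labels) {
        double total = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double err = predict(weights, features[i]) - labels[i];
            total += err * err;
        }
        return labels.length == 0 ? 0.0 : total / labels.length;
    }

    /** Returns new weights after one step against the averaged gradient of the batch. */
    static double[] step(double[] weights, double[][] features, double[] labels, double stepSize) {
        int n = labels.length;
        double[] next = weights.clone();
        if (n == 0) {
            return next; // empty batch: model unchanged
        }
        double[] grad = new double[weights.length];
        for (int i = 0; i < n; i++) {
            double err = predict(weights, features[i]) - labels[i];
            grad[0] += err;
            for (int j = 0; j < features[i].length; j++) {
                grad[j + 1] += err * features[i][j];
            }
        }
        for (int j = 0; j < next.length; j++) {
            next[j] -= stepSize * grad[j] / n;
        }
        return next;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0}, {2.0}};
        double[] y = {2.0, 4.0};
        double[] w = step(new double[] {0.0, 0.0}, x, y, 0.1);
        System.out.println("intercept=" + w[0] + " beta1=" + w[1] + " mse=" + mse(w, x, y));
    }
}
```

In a streaming setting the same step would be applied to each incoming mini-batch, starting from the weights left by the previous batch, which is what lets the model learn incrementally.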
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

If you want to run it, please use the following queries.

1. StreamingLinearRegression

from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

2. StreamingKMeansClustering

from Stream8Input#streaming:streamingkm(0, 2, 0.95, 2, 20, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

In both cases the first parameter lets you decide which learning method you want: moving window, batch processing or time-based model learning.

BR,
Mahesh.

On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
>
> I have added the moving window method and updated the previous StreamingLinearRegression [1], which only performed batch processing with streaming data. I also added the StreamingKMeansClustering [1] for our purposes and debugged both. Thank you.
>
> Regards,
> Mahesh.
>
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I have added the moving window method and updated the previous StreamingLinearRegression [1], which only performed batch processing with streaming data. I also added the StreamingKMeansClustering [1] for our purposes and debugged both. Thank you.

Regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming

On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga wrote:
> Thanks Mahesh! The graphs look promising! :)
>
> So, looking at the graph, LR with SGD can train a model within 60 secs (6*10^10 nanosec) using about 900,000 data points. This means the online training can handle events/data points arriving at a rate of 15,000 per second (or more), if the batch size is set to 900,000 (or less) or the window size is set to 60 secs (or less). This is great IMO!
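Note: the thread does not describe how the moving window method buffers events. The following is a minimal sketch of one way to do it, assuming a fixed window size and a shift that drops the oldest events after each training pass. The class MovingWindowBuffer and the parameter names windowSize and windowShift are hypothetical here and may not match the repo.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Hypothetical moving-window buffer for illustration; not taken from the repo. */
final class MovingWindowBuffer {
    private final int windowSize;
    private final int windowShift;
    private final Deque<double[]> window = new ArrayDeque<>();

    MovingWindowBuffer(int windowSize, int windowShift) {
        if (windowSize <= 0 || windowShift <= 0 || windowShift > windowSize) {
            throw new IllegalArgumentException("need 0 < windowShift <= windowSize");
        }
        this.windowSize = windowSize;
        this.windowShift = windowShift;
    }

    /**
     * Adds one event. Returns a copy of the full window when it is time to retrain,
     * or an empty list while the window is still filling.
     */
    List<double[]> add(double[] event) {
        window.addLast(event);
        if (window.size() < windowSize) {
            return Collections.emptyList();
        }
        List<double[]> snapshot = new ArrayList<>(window);
        for (int i = 0; i < windowShift; i++) {
            window.removeFirst(); // the oldest events become obsolete
        }
        return snapshot;
    }

    public static void main(String[] args) {
        MovingWindowBuffer buffer = new MovingWindowBuffer(4, 2);
        for (int i = 1; i <= 6; i++) {
            List<double[]> batch = buffer.add(new double[] {i});
            if (!batch.isEmpty()) {
                System.out.println("retrain on events " + batch.get(0)[0]
                        + " to " + batch.get(batch.size() - 1)[0]);
            }
        }
    }
}
```

With windowShift equal to windowSize this degenerates to plain batch processing with no overlap; a smaller shift retrains more often and keeps older events in the model for longer.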
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Thanks Mahesh! The graphs look promising! :)

So, looking at the graph, LR with SGD can train a model within 60 secs (6*10^10 nanosec) using about 900,000 data points. This means the online training can handle events/data points arriving at a rate of 15,000 per second (or more), if the batch size is set to 900,000 (or less) or the window size is set to 60 secs (or less). This is great IMO!

On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
>
> Sure thing. As you requested, I can change other parameters as well, such as the feature size (p); initially I did it with p=3. You can see and run the code if you want; the source is at [1]. The timing test is run with random data, as you requested, if you set args[0] to 1. You can find the extension and streaming algorithms in the gsoc/ directory [2]. Thank you.
>
> BR,
> Mahesh.
>
> [1] https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
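Note: the arithmetic behind Supun's estimate, written out using only the figures quoted in his message (60 secs of training time for about 900,000 data points):

```latex
\[
6 \times 10^{10}\ \text{ns} = 60\ \text{s},
\qquad
\frac{900{,}000\ \text{data points}}{60\ \text{s}} = 15{,}000\ \text{data points/s}
\]
```

The reading is that a batch of 900,000 events collected over 60 secs can be trained in roughly the time the next batch needs to arrive. This assumes the measured training time is representative of the deployment hardware and data.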
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I have looked into the Spark Streaming fundamentals and k-means clustering in order to develop streaming k-means clustering for stream data. The references are at [1] and [2]. I will commit the new changes to my repo today, including the basic implementation of streaming k-means clustering. Thank you.

Regards,
Mahesh.

[1] http://spark.apache.org/docs/latest/streaming-programming-guide.html
[2] http://spark.apache.org/docs/latest/mllib-clustering.html

On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
>
> Sure thing. As you requested, I can change other parameters as well, such as the feature size (p); initially I did it with p=3. You can see and run the code if you want; the source is at [1]. The timing test is run with random data, as you requested, if you set args[0] to 1. You can find the extension and streaming algorithms in the gsoc/ directory [2]. Thank you.
>
> BR,
> Mahesh.
>
> [1] https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

Sure thing. As you requested, I can change other parameters as well, such as the feature size (p); initially I did it with p=3. You can see and run the code if you want; the source is at [1]. The timing test is run with random data, as you requested, if you set args[0] to 1. You can find the extension and streaming algorithms in the gsoc/ directory [2]. Thank you.

BR,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
[2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc

On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Supun,
>
> Though I pushed it yesterday, there were some problems with the network. Now you can see them in the repo location [1]. I added some Matlab plots so you can see the pattern there; you can use ML also. OK, sure thing, I can prepare a report or else a blog post if you want. The files are as follows. The y axis is in ns and the x axis is the batch size. I also added two plots as JPEGs [2], so you can easily compare.
>
> lr_timing_1000.txt -> batch size incremented by 1000
> lr_timing_1.txt -> batch size incremented by 1
> lr_timing_power10.txt -> batch size incremented by powers of 10
>
> Here the only independent variable is the batch size. If you want, I can send you results with other parameters, such as step size, number of iterations and feature vector size, as independent variables. Please let me know if you want further info. Thank you.
>
> Regards,
> Mahesh.
>
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
> [2] https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
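Note: the timing test described above (random data, feature size p, time in ns against batch size) could be sketched roughly as follows. This is an illustration only: plain gradient descent over each batch stands in for Spark's SGD, and the class SgdTiming, its methods, the seed, the step size and the iteration count are all hypothetical choices, not values from the repo.

```java
import java.util.Random;

/** Hypothetical timing harness for illustration; not the test in the GSOC2016 repo. */
final class SgdTiming {

    /** Gradient descent on mean squared error over the whole batch; w[0] is the intercept. */
    static double[] train(double[][] x, double[] y, int iterations, double stepSize) {
        int n = y.length;
        int p = n == 0 ? 0 : x[0].length;
        double[] w = new double[p + 1];
        for (int it = 0; it < iterations && n > 0; it++) {
            double[] grad = new double[p + 1];
            for (int i = 0; i < n; i++) {
                double err = w[0] - y[i];
                for (int j = 0; j < p; j++) {
                    err += w[j + 1] * x[i][j];
                }
                grad[0] += err;
                for (int j = 0; j < p; j++) {
                    grad[j + 1] += err * x[i][j];
                }
            }
            for (int j = 0; j <= p; j++) {
                w[j] -= stepSize * grad[j] / n;
            }
        }
        return w;
    }

    /** Returns elapsed nanoseconds for training on batchSize random rows with p features. */
    static long timeTraining(int batchSize, int p, int iterations, double stepSize, long seed) {
        Random rnd = new Random(seed);
        double[][] x = new double[batchSize][p];
        double[] y = new double[batchSize];
        for (int i = 0; i < batchSize; i++) {
            for (int j = 0; j < p; j++) {
                x[i][j] = rnd.nextDouble();
                y[i] += (j + 1) * x[i][j]; // arbitrary "true" coefficients 1, 2, ..., p
            }
        }
        long start = System.nanoTime();
        train(x, y, iterations, stepSize);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Batch size is the only independent variable, as in the lr_timing_1000 run.
        for (int batchSize = 1000; batchSize <= 5000; batchSize += 1000) {
            System.out.println(batchSize + "\t" + timeTraining(batchSize, 3, 100, 0.1, 42L));
        }
    }
}
```

Single measurements like these are noisy on the JVM because of JIT warm-up and garbage collection, so repeating each batch size several times and discarding the first runs would give a more trustworthy curve.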
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Supun,

Although I pushed them yesterday, there were some network problems. You can now see the files in the repo location [1]. I added some Matlab plots, so you can see the pattern there. You can use ML also. OK, sure thing: I can prepare a report or a blog post if you want.

The files are as follows. The y axis is in ns and the x axis is the batch size. I also added two plots as JPEGs [2], so you can easily compare.

lr_timing_1000.txt -> batch size incremented by 1000
lr_timing_1.txt -> batch size incremented by 1
lr_timing_power10.txt -> batch size incremented by powers of 10

Here the only independent variable is the batch size. If you want, I can also send results with other parameters, such as step size, number of iterations and feature vector size, as the independent variable. Please let me know if you want further info. Thank you.

Regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
[2] https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
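The timing files mentioned in this message record training time in ns against batch size. A minimal sketch of how such a sweep could be produced is shown below. It is an illustration only: `sgd_linear_regression` and `time_training` are hypothetical names, and the trainer is a plain-Python stand-in, not the Spark-based implementation in the repo.

```python
import time


def sgd_linear_regression(xs, ys, iterations=10, step=0.01):
    """Stand-in trainer: gradient steps for y ~ w*x + b over one batch."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= step * grad_w
        b -= step * grad_b
    return w, b


def time_training(batch_sizes):
    """Return (batch_size, elapsed_ns) pairs, one per batch size."""
    rows = []
    for size in batch_sizes:
        # Synthetic data on the line y = 2x + 1 (an assumption for the demo).
        xs = [i / size for i in range(size)]
        ys = [2.0 * x + 1.0 for x in xs]
        start = time.perf_counter_ns()
        sgd_linear_regression(xs, ys)
        rows.append((size, time.perf_counter_ns() - start))
    return rows


# Sweep in powers of 10, similar in spirit to lr_timing_power10.txt.
for size, elapsed_ns in time_training([10 ** k for k in range(1, 5)]):
    print(f"{size}\t{elapsed_ns}")
```

Only the batch size varies here; step size, iteration count or feature count could be swept the same way by holding the batch size fixed.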
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

> I have added those timing reports to my repo [1].

What's the file name? :)

Btw, can you compile a simple doc (gdoc) with the above results and bring everything into one place? That way it is easy to compare and keep track.

Thanks,
Supun
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I have added those timing reports to my repo [1]. Please have a look. There are three files. One uses an increment of 1000 for the batch sizes (lr_timing_1000). Another uses an increment of 1 (lr_timing_1), up to 1 million in both scenarios. You can see the reports and figures at location [2] in the repo. I also added the streaminglinearregression classes to the gsoc folder in the repo. Thank you.

Regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016
[2] https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

Thank you for the update. I will look into your implementation.

> And I will be able to send you the timing/performance analysis report
> tomorrow for the SGD functions

Great. Send those ASAP so that we can proceed.

Best regards.

On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have implemented linear regression on the CEP Siddhi event stream, taking
> the batch size as a parameter from CEP. Now we can try the moving window
> method too. Before that, I think I should get your opinion on the data
> structures for storing the streaming data. Please check the /gsoc/ folder of
> my repo [1]; there you can find all the new things I added. The extensions
> are in the extension folder there. I will be able to send you the
> timing/performance analysis report for the SGD functions tomorrow. Thank you.
> Regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
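On the question of which data structure could hold the streaming data for the moving window method, one option is a bounded double-ended queue that drops the oldest event as each new one arrives. The sketch below is an assumption for illustration, not the structure used in the repo; `MovingWindow`, `size` and `shift` are hypothetical names, with `shift` playing the role of the number of events the window advances between retraining runs.

```python
from collections import deque


class MovingWindow:
    """Keep the last `size` events; signal a retrain every `shift` new events."""

    def __init__(self, size, shift):
        if not 0 < shift <= size:
            raise ValueError("shift must be between 1 and size")
        self.events = deque(maxlen=size)  # oldest events fall off automatically
        self.shift = shift
        self.since_last = 0  # events received since the last emitted window

    def add(self, event):
        """Add one event; return a window snapshot when a retrain is due, else None."""
        self.events.append(event)
        self.since_last += 1
        window_full = len(self.events) == self.events.maxlen
        if window_full and self.since_last >= self.shift:
            self.since_last = 0
            return list(self.events)
        return None


window = MovingWindow(size=4, shift=2)
for value in range(1, 9):
    snapshot = window.add(value)
    if snapshot is not None:
        print(snapshot)
```

With `shift` equal to 1 the model would be retrained on every event; a larger `shift` trades freshness for fewer training runs.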
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

I have written some Siddhi extensions and am trying to develop one of my own. In the time series example in [1], can you please explain the input format and the query lines for my understanding?

from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

I just want to know how I give a set of data to this extension, and what baseballData is. Is it an input stream as usual, or a data file? How can I find that data set to create a dummy input stream like baseballData?

Thank you.
Regards,
Mahesh.

[1] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,

Today I got the Siddhi source and debugged the math extension, then made some changes and checked them. Now I am trying to write the same kind of extension in my own code base. I added the dependencies and it built fine. I am trying to debug my extension the same way as in the previous case. CEP is sending data, but my extension does not hit the relevant breakpoint.

1. So how can I debug my new Siddhi extension? (You can see it in my example repo.)

I think if I do this correctly we can build the extension for our purpose. I will send the relevant timing report for the SGD algorithms very soon, as Supun asked. Thank you.

Regards,
Mahesh.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Also note that there is a calculation interval in the Siddhi time series regression function [1]. You may be able to get some insight from that as well.

[1] https://docs.wso2.com/display/CEP400/Regression
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

As we discussed offline, we can use a similar mechanism to train linear regression models, logistic regression models and k-means clustering models.

> It is very interesting that I have found something that we can make use of
> in our work. In the CEP 4.0 documentation there is a Custom Stream
> Processor Extension program [1]. There is an example of
> LinearRegressionStreamProcessor [1].

As we have to train predictive models with Spark, you can write wrappers around the regression/clustering models of Spark. Refer to the Siddhi time series regression source code [1][2]. You can write a streaming linear regression class for ML in a similar fashion by wrapping the Spark MLlib implementations. You can use the methods "addEvent", "removeEvent", etc. (they may have to be changed according to the requirements) for a similar purpose. You can introduce trainLinearRegression/LogisticRegression/Kmeans methods that do something similar to createLinearRegression in those time series functions. In the processData method you can use Spark MLlib classes to actually train the models and return the model weights and evaluation metrics. So converting streams into RDDs and retrieving information from the trained models should happen in this method.

In the stream processor extension example, you can retrieve those values and then use them to train new models with new batches. Weights/cluster centers may be passed as initialization parameters for the wrappers.

Please note that we have to figure out the best Siddhi extension type for this process. In the Siddhi query, we define the batch size, the type of algorithm and the number of features (there can be more). After a batch-size number of events has been received, train a model, save the parameters and return the evaluation metric. With the next batch, retrain the model initialized with the previously learned parameters.

We may also need to test the same scenario with a moving window, but I suspect that approach may become very slow, as a model is trained each time an event is received. So we may have to change the number of slots the moving window moves at a time (e.g. not one by one, but ten by ten).

Once this is resolved, the majority of the research part will be finished, and all that will be left to do is implementing wrappers around the 3 learning algorithms we consider.

Best regards.

[1] https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
[2] https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java

On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya wrote:
> Hi Maheshakya,
> Shall we use [1] for our work? I am checking the possibility.
> BR,
> Mahesh.
>
> [1] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
> [2] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
> [3] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
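The batch-by-batch retraining described above (train on one batch, keep the parameters, start the next batch from them) can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `WarmStartLinearRegression` and its methods are hypothetical, it does not use Spark MLlib or any carbon-ml class, and it runs plain gradient steps over each mini-batch instead of delegating to a Spark trainer.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch (not carbon-ml or Spark MLlib code): linear regression
 * that is retrained batch by batch, starting each time from the weights
 * learned on the previous batch.
 */
final class WarmStartLinearRegression {
    private final double[] weights;   // weights[0] is the intercept
    private final double stepSize;
    private final int iterations;

    WarmStartLinearRegression(int numFeatures, double stepSize, int iterations) {
        this.weights = new double[numFeatures + 1]; // starts at all zeros
        this.stepSize = stepSize;
        this.iterations = iterations;
    }

    /** Runs gradient steps over one mini-batch and returns the batch MSE afterwards. */
    double trainBatch(double[][] features, double[] labels) {
        int n = features.length;
        for (int it = 0; it < iterations; it++) {
            // Gradient of half the mean squared error over this batch.
            double[] gradient = new double[weights.length];
            for (int i = 0; i < n; i++) {
                double error = predict(features[i]) - labels[i];
                gradient[0] += error;
                for (int j = 0; j < features[i].length; j++) {
                    gradient[j + 1] += error * features[i][j];
                }
            }
            // The weights are never reset, so the next batch starts from here.
            for (int j = 0; j < weights.length; j++) {
                weights[j] -= stepSize * gradient[j] / n;
            }
        }
        return meanSquaredError(features, labels);
    }

    double predict(double[] x) {
        double y = weights[0];
        for (int j = 0; j < x.length; j++) {
            y += weights[j + 1] * x[j];
        }
        return y;
    }

    double meanSquaredError(double[][] features, double[] labels) {
        double sum = 0.0;
        for (int i = 0; i < features.length; i++) {
            double error = predict(features[i]) - labels[i];
            sum += error * error;
        }
        return sum / features.length;
    }

    double[] getWeights() {
        return weights.clone();
    }

    public static void main(String[] args) {
        // Two batches drawn from the same line y = 1 + 2x.
        WarmStartLinearRegression model = new WarmStartLinearRegression(1, 0.5, 100);
        double mse1 = model.trainBatch(
                new double[][] {{0.0}, {0.25}, {0.5}, {0.75}, {1.0}},
                new double[] {1.0, 1.5, 2.0, 2.5, 3.0});
        double mse2 = model.trainBatch(
                new double[][] {{0.1}, {0.3}, {0.5}, {0.7}, {0.9}},
                new double[] {1.2, 1.6, 2.0, 2.4, 2.8});
        System.out.println("MSE after batch 1: " + mse1);
        System.out.println("MSE after batch 2: " + mse2);
        System.out.println("Weights: " + Arrays.toString(model.getWeights()));
    }
}
```

In a real extension the body of `trainBatch` would presumably be replaced by a call into Spark MLlib with the stored weights passed as initial weights; the point of the sketch is only that the model object outlives a single batch.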
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
Shall we use [1] for our work? I am checking the possibility.
BR,
Mahesh.

[1] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
[2] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
[3] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function

On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya wrote:
> Hi Maheshakya,
> It is very interesting that I have found something that we can make use of
> in our work. In the CEP 4.0 documentation there is a Custom Stream
> Processor Extension program [1]. There is an example of
> LinearRegressionStreamProcessor [1], and I also saw
> private int batchSize = 10; I am going through this one.
> Please check whether we can use it. Will there be any compatibility or
> support issues?
> regards,
> Mahesh.
>
> [1] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
It is very interesting that I have found something that we can make use of in our work. In the CEP 4.0 documentation there is a Custom Stream Processor Extension program [1]. There is an example of LinearRegressionStreamProcessor [1], and I also saw
private int batchSize = 10; I am going through this one.
Please check whether we can use it. Will there be any compatibility or support issues?
regards,
Mahesh.

[1] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension

On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> How can I test a Siddhi extension after writing it, without
> integrating it into CEP? Can you please explain the procedure? I am
> referring to [1] [2] [3] [4]. Thank you.
> BR,
> Mahesh.
>
> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
> [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
> [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
How can I test a Siddhi extension after writing it, without integrating it into CEP? Can you please explain the procedure? I am referring to [1] [2] [3] [4]. Thank you.
BR,
Mahesh.

[1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
[2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
[3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
[4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi

On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> Thank you for the feedback. I have added the datasets to the repo
> (data-sets/lr). I am fine with next week. Now I am writing some examples
> to collect samples, build mini-batches and run the algorithms on those
> mini-batches. I will add those to the repo soon. I am still working
> on that Siddhi extension. I will let you know the progress.
> BR,
> Mahesh.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
Thank you for the feedback. I have added the datasets to the repo (data-sets/lr). I am fine with next week. Now I am writing some examples to collect samples, build mini-batches and run the algorithms on those mini-batches. I will add those to the repo soon. I am still working on that Siddhi extension. I will let you know the progress.
BR,
Mahesh.

On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
> Hi Mahesh,
>
> I've looked into your code sample of streaming linear regression. It looks
> good to me, apart from a few issues in coding practices which we can improve
> when you're doing the implementations in carbon-ml and during the code
> reviews. You are using a set of files as mini-batches of data, right? Can you
> also send us the datasets you've been using? I'd like to run this.
>
>> Is the CEP problem that we were trying to fix all right now? I am
>> still using those pre-built versions. If so, I can merge with the latest one.
>
> I'll check this and let you know.
>
> Can we arrange a meeting (preferably in the WSO2 offices) next week with the
> ML team members as well? The coding period begins next Monday, so it's better
> to get overall feedback from others and discuss the project further. Let
> me know convenient time slots for you. I'll arrange a meeting with the ML team.
>
> Best regards.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

I've looked into your code sample of streaming linear regression. It looks good to me, apart from a few issues in coding practices which we can improve when you're doing the implementations in carbon-ml and during the code reviews. You are using a set of files as mini-batches of data, right? Can you also send us the datasets you've been using? I'd like to run this.

> Is the CEP problem that we were trying to fix all right now? I am
> still using those pre-built versions. If so, I can merge with the latest one.

I'll check this and let you know.

Can we arrange a meeting (preferably in the WSO2 offices) next week with the ML team members as well? The coding period begins next Monday, so it's better to get overall feedback from others and discuss the project further. Let me know convenient time slots for you. I'll arrange a meeting with the ML team.

Best regards.

On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya wrote:
> Hi Maheshakya,
> OK, I will check it. You have sent me the relevant references and I am
> working on that. Thank you. Is the CEP problem that we were trying to fix
> all right now? I am still using those pre-built versions. If
> so, I can merge with the latest one. Thanks.
> BR,
> Mahesh.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
OK, I will check it. You have sent me the relevant references and I am working on that. Thank you. Is the CEP problem that we were trying to fix all right now? I am still using those pre-built versions. If so, I can merge with the latest one. Thanks.
BR,
Mahesh.

On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena wrote:
> Hi Mahesh,
>
> You don't actually have to implement anything in Spark Streaming. Try to
> understand how streaming data is handled in it and the specifics of the
> underlying streaming algorithms.
> What we want is to have similar algorithms that support CEP event
> streams with Siddhi.
>
> Best regards.
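Several messages in this thread come back to the same step: collecting individual events into mini-batches, either batch by batch or with a moving window that shifts by more than one event. The following is a minimal sketch of that buffering in plain Java, independent of Siddhi. The class name `ShiftingWindowBatcher` and the parameters `windowSize` and `windowShift` are hypothetical names chosen for this illustration; a real extension would receive events through the Siddhi extension API instead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical sketch: buffers single events and hands out a full window
 * every windowShift events. With windowShift == windowSize the batches do
 * not overlap; with windowShift == 1 a batch is produced for every event
 * once the window is full.
 */
final class ShiftingWindowBatcher {
    private final int windowSize;
    private final int windowShift;
    private final Deque<double[]> window = new ArrayDeque<>();

    ShiftingWindowBatcher(int windowSize, int windowShift) {
        if (windowSize < 1 || windowShift < 1 || windowShift > windowSize) {
            throw new IllegalArgumentException("need 1 <= windowShift <= windowSize");
        }
        this.windowSize = windowSize;
        this.windowShift = windowShift;
    }

    /** Buffers one event; returns the full window when a batch is due, otherwise an empty list. */
    List<double[]> addEvent(double[] event) {
        window.addLast(event);
        if (window.size() < windowSize) {
            return new ArrayList<>();
        }
        List<double[]> batch = new ArrayList<>(window);
        for (int i = 0; i < windowShift; i++) {
            window.removeFirst(); // expire the oldest events
        }
        return batch;
    }

    public static void main(String[] args) {
        // Window of 4 events that moves 2 events at a time.
        ShiftingWindowBatcher batcher = new ShiftingWindowBatcher(4, 2);
        for (int i = 0; i < 8; i++) {
            List<double[]> batch = batcher.addEvent(new double[] {i});
            if (!batch.isEmpty()) {
                int first = (int) batch.get(0)[0];
                int last = (int) batch.get(batch.size() - 1)[0];
                System.out.println("train on events " + first + ".." + last);
            }
        }
    }
}
```

A larger `windowShift` means fewer retraining runs per event, which is the trade-off raised earlier in the thread about a moving window becoming slow when a model is trained for every single event.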
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

You don't actually have to implement anything in Spark Streaming. Try to understand how streaming data is handled in Spark Streaming and the specifics of the underlying algorithms in streaming. What we want is to have similar algorithms that support CEP event streams with Siddhi.

Best regards.

On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya wrote:
> Hi Maheshakya,
> Did you check the repo? I will add my recent work today. I was also going
> through the Java docs related to the Spark Streaming work. It is with that
> Scala API. Thank you.
> regards,
> Mahesh.
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

I'll review your code sample and give you our feedback as soon as possible. In the meantime, please go through the documentation for writing Siddhi extensions and get an idea of how they work. It would be better if you try writing some simple Siddhi extensions yourself and test them to get a good understanding.

Best regards.

On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> I have gone through the Java docs and run some of the Spark examples on the
> Spark shell, which are very important for our work. Then I have been
> writing my own code to try out linear regression and k-means for streaming.
> Please check my git repo [1]. I think I now have to ask on dev about
> capturing event streams for our work. I will update the recent things on
> git. Check the park-example directory for Java; examples run on git shell
> are not included there. In my case I think I have to build mini-batches
> from data streams that arrive as individual samples. I am now working on
> some code to collect mini-batches from data streams. Thank you.
> regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
Did you check the repo? I will add my recent work today. I was also going through the Java docs related to the Spark Streaming work. It is with that Scala API. Thank you.
regards,
Mahesh.

On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> I have gone through the Java docs and run some of the Spark examples on the
> Spark shell, which are very important for our work. Then I have been
> writing my own code to try out linear regression and k-means for streaming.
> Please check my git repo [1]. I think I now have to ask on dev about
> capturing event streams for our work. I will update the recent things on
> git. Check the park-example directory for Java; examples run on git shell
> are not included there. In my case I think I have to build mini-batches
> from data streams that arrive as individual samples. I am now working on
> some code to collect mini-batches from data streams. Thank you.
> regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
I have gone through the Java docs and run some of the Spark examples on the Spark shell, which are very important for our work. Then I have been writing my own code to try out linear regression and k-means for streaming. Please check my git repo [1]. I think I now have to ask on dev about capturing event streams for our work. I will update the recent things on git. Check the park-example directory for Java; examples run on git shell are not included there. In my case I think I have to build mini-batches from data streams that arrive as individual samples. I am now working on some code to collect mini-batches from data streams. Thank you.
regards,
Mahesh.
[1] https://github.com/dananjayamahesh/GSOC2016

On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
> Hi Maheshakya,
> Thank you. I will update the repo today. I changed the carbon-ml Siddhi
> extension to see how the changes take effect. I will update you on the
> progress as soon as possible. I had some problems with the Spark MLlib
> dependency and was fixing that.
> regards,
> Mahesh.
> p.s. Do I need to maintain a blog?
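
Building mini-batches from samples that arrive one at a time, as described in the 17 May message above, could be sketched roughly as follows. This is an illustration only and not code from the GSOC2016 repo: the class name MiniBatchCollector and its add method are hypothetical, and in a real Siddhi extension the batch size would come from the stream processor's parameters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups individual samples into fixed-size mini-batches. */
final class MiniBatchCollector {
    private final int batchSize;
    private List<double[]> buffer = new ArrayList<>();

    MiniBatchCollector(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Adds one sample; returns a full batch once batchSize samples are buffered, otherwise null. */
    List<double[]> add(double[] sample) {
        buffer.add(sample.clone());          // defensive copy of the event payload
        if (buffer.size() < batchSize) {
            return null;                     // batch not complete yet
        }
        List<double[]> full = buffer;
        buffer = new ArrayList<>();          // start collecting the next batch
        return full;
    }

    public static void main(String[] args) {
        MiniBatchCollector collector = new MiniBatchCollector(3);
        for (int i = 1; i <= 7; i++) {
            List<double[]> batch = collector.add(new double[] {i, 2.0 * i});
            if (batch != null) {
                System.out.println("batch of " + batch.size() + " completed at sample " + i);
            }
        }
    }
}
```

With a batch size of 3 and seven samples, two batches are emitted and the seventh sample stays in the buffer until two more arrive.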
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Maheshakya,
Thank you. I will update the repo today. I changed the carbon-ml Siddhi extension to see how the changes take effect. I will update you on the progress as soon as possible. I had some problems with the Spark MLlib dependency and was fixing that.
regards,
Mahesh.
p.s. Do I need to maintain a blog?

On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
> Hi Mahesh,
>
> Sorry for replying late.
>
> Thank you for the update. I believe you have done some implementations
> with Spark MLlib algorithms in a streaming fashion, as we discussed.
> If so, can you please share your code in a GitHub repo?
>
>> Now I want to implement some machine learning algorithms by importing
>> MLlib and run them within your code base.
>
> For the moment you can try editing the same class, PredictStreamProcessor,
> in the Siddhi extension in carbon-ml. Later we will add this separately.
> You should be able to add org.apache.spark.mllib classes there.
>
>> I also want to see how event streams arrive from CEP. I think they are
>> not in RDD format, since they arrive as individual samples. I will send
>> an email to dev asking how to get the streams.
>
> Please pay attention to the length[1] and lengthBatch[1] inbuilt windows in
> Siddhi. What you need to write are functions similar to a custom aggregate
> function[2].
> When you send the email to the dev list, explain your requirement. You need
> to get a set of events from a stream with a specified window size (number
> of events), then build a model within that function. You also need to
> retain the data (learned weights, cluster centers, etc.) from the previous
> window to use in the current window. Ask which of the available Siddhi
> extension types is the most suitable option for this.
>
> Best regards.
>
> [1] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
> [2] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi Mahesh,

Sorry for replying late.

Thank you for the update. I believe you have done some implementations with Spark MLlib algorithms in a streaming fashion, as we discussed. If so, can you please share your code in a GitHub repo?

> Now I want to implement some machine learning algorithms by importing
> MLlib and run them within your code base.

For the moment you can try editing the same class, PredictStreamProcessor, in the Siddhi extension in carbon-ml. Later we will add this separately. You should be able to add org.apache.spark.mllib classes there.

> I also want to see how event streams arrive from CEP. I think they are
> not in RDD format, since they arrive as individual samples. I will send
> an email to dev asking how to get the streams.

Please pay attention to the length[1] and lengthBatch[1] inbuilt windows in Siddhi. What you need to write are functions similar to a custom aggregate function[2].
When you send the email to the dev list, explain your requirement. You need to get a set of events from a stream with a specified window size (number of events), then build a model within that function. You also need to retain the data (learned weights, cluster centers, etc.) from the previous window to use in the current window. Ask which of the available Siddhi extension types is the most suitable option for this.

Best regards.

[1] https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
[2] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
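
The requirement stated in the 16 May reply (take one window of events, build the model inside the function, and carry the learned weights over to the next window) can be illustrated with a small plain-Java sketch. Everything here is an assumption made for illustration: the class StreamingLinearRegression, its trainOnWindow method, and the single gradient step per window are hypothetical and are not the carbon-ml or Spark MLlib implementation.

```java
/** Hypothetical sketch: linear regression updated by one mini-batch gradient step per window. */
final class StreamingLinearRegression {
    private final double[] weights;   // learned weights, retained across windows
    private double intercept;         // learned intercept, retained across windows
    private final double stepSize;

    StreamingLinearRegression(int numFeatures, double stepSize) {
        this.weights = new double[numFeatures];
        this.stepSize = stepSize;
    }

    double predict(double[] x) {
        double y = intercept;
        for (int j = 0; j < weights.length; j++) {
            y += weights[j] * x[j];
        }
        return y;
    }

    /** Applies one gradient step using the window; returns the MSE measured before the update. */
    double trainOnWindow(double[][] features, double[] labels) {
        int n = labels.length;
        double[] grad = new double[weights.length];
        double gradIntercept = 0.0;
        double squaredError = 0.0;
        for (int i = 0; i < n; i++) {
            double err = predict(features[i]) - labels[i];
            squaredError += err * err;
            gradIntercept += err;
            for (int j = 0; j < weights.length; j++) {
                grad[j] += err * features[i][j];
            }
        }
        // Averaged gradient of 0.5 * MSE; the model state survives into the next window.
        intercept -= stepSize * gradIntercept / n;
        for (int j = 0; j < weights.length; j++) {
            weights[j] -= stepSize * grad[j] / n;
        }
        return squaredError / n;
    }

    public static void main(String[] args) {
        StreamingLinearRegression model = new StreamingLinearRegression(1, 0.5);
        double[][] xs = {{0.0}, {0.25}, {0.5}, {0.75}, {1.0}};
        double[] ys = {1.0, 1.5, 2.0, 2.5, 3.0};   // y = 2x + 1
        double mse = 0.0;
        for (int window = 0; window < 200; window++) {
            mse = model.trainOnWindow(xs, ys);      // same data reused as successive windows
        }
        System.out.println("MSE before last update: " + mse);
        System.out.println("prediction at x = 0.5: " + model.predict(new double[] {0.5}));
    }
}
```

Because the weights and intercept are instance fields, each call to trainOnWindow starts from the model left by the previous window, which is the state retention the reply asks for.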
[Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
-- Forwarded message --
From: Mahesh Dananjaya
Date: Wed, May 11, 2016 at 1:43 PM
Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
To: Maheshakya Wijewardena

Hi Maheshakya,
Sorry for not updating. I did what you wanted me to do. I checked the code base and the train functions, went through the Java docs, and went through the current carbon-ml implementation of LG and K-Means. I also have Apache Spark and tried several examples with it. Now I want to implement some machine learning algorithms by importing MLlib and run them within your code base. Can you help me with that?
I also want to see how event streams arrive from CEP. I think they are not in RDD format, since they arrive as individual samples. I will send an email to dev asking how to get the streams. I debugged many of those functions in the code base, so I need further instructions to proceed. Thank you.
regards,
Mahesh.

On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
> Hi Mahesh,
>
> Any update on your progress?
>
> Best regards.
>
> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>> Hi Mahesh,
>>
>>> Does "Put break points in train methods in Linear Regression class"
>>> mean the spark/algorithms/ LinearRegrassion.java class in
>>> org.wso2.carbon.ml.core? Is that the correct file?
>>
>> Yes, this is the correct place.
>>
>> You can refer to the Spark programming guide[1][2] as well as our ML code
>> base when you try those algorithms out. Please try to do rough
>> implementations of the streaming versions of linear regression, logistic
>> regression and k-means clustering in plain Java, as we discussed in the
>> proposal. It would be better if you create a git repo and share your code
>> once you have made some progress.
>>
>> Were you able to debug and understand the flow of the ML Siddhi extension?
>> I hope you haven't encountered more errors after switching to the released
>> version of CEP.
>>
>> Is this Friday okay for you? Afternoon at 2:00 pm?
>>
>> Best regards.
>>
>> [1] http://spark.apache.org/docs/latest/programming-guide.html
>> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>>
>> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>> Hi Maheshakya,
>>> I have been looking into some algorithms based on stochastic gradient
>>> descent. If there is anything I should focus on, please let me know.
>>> I will also be available for a call this week and next week. Thank you.
>>> BR,
>>> Mahesh.
>>>
>>> On Tue, May 3, 2016 at 5:05 PM, Mahesh Dananjaya <dananjayamah...@gmail.com> wrote:
>>>> Hi Maheshakya,
>>>> Thank you, that's good. I have been trying to fix that for a couple of
>>>> days. Please inform me when it is fixed. I have now been testing the ML
>>>> algorithms and trying to identify the flow and the hierarchy. Does
>>>> "Put break points in train methods in Linear Regression class" mean the
>>>> spark/algorithms/ LinearRegrassion.java class in
>>>> org.wso2.carbon.ml.core? Is that the correct file? I am also planning
>>>> to write some programs that use Apache Spark MLlib algorithms, and I
>>>> refer to [1] and some WSO2 documentation to get an idea of the ML
>>>> structure. Thank you.
>>>> BR,
>>>> Mahesh.
>>>> [1] nirmalfdo.blogspot.com
>>>>
>>>> On Tue, May 3, 2016 at 4:36 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>> Hi Mahesh,
>>>>>
>>>>> I have checked. It seems the issue you have encountered is caused only
>>>>> in the current development branch of product-cep. It doesn't identify
>>>>> the ML Siddhi extension as an extension. The ML Siddhi extension works
>>>>> fine in the latest release of CEP (4.1.0) [1].
>>>>> Until we figure out the reason and come up with a solution, can you
>>>>> use the latest CEP release for your work? It's fine to use that since
>>>>> you haven't started actual development yet.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> [1] http://wso2.com/products/complex-event-processor/
>>>>>
>>>>> On Tue, May 3, 2016 at 3:19 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>>>>> Hi Mahesh,
>>>>>>
>>>>>>> Is it vital to use those local repos in my upcoming implementation?
>>>>>>
>>>>>> Yes. The remote p2-repo contains the p2-repos of released versions.
>>>>>> What you have to develop on is the current master of carbon-ml and
>>>>>> product-ml. You can try out the modification I have suggested. In the
>>>>>> meantime, I'll verify whether the current repos are working as expected.
>>>>>>
>>>>>>> And also I am trying to debug the carbon-ml org.wso2.carbon.ml.core
>>>>>>> by putting some break points in the spark/algorithms/Linear Regression
>>>>>>
>>>>>> It's great that you have started looking at the implementation
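
The 4 May request quoted above asks for rough plain-Java streaming versions of the algorithms, including k-means clustering. One simple way to update cluster centers from individual samples is the sequential (per-sample) k-means update sketched below. This is an assumption chosen for illustration: the class OnlineKMeans and its methods are hypothetical, and the update rule is the plain running-mean variant, not the decay-based mini-batch update that Spark's streaming k-means uses.

```java
/** Hypothetical sketch: sequential k-means, moving the nearest center toward each new sample. */
final class OnlineKMeans {
    private final double[][] centers;  // cluster centers, retained between samples
    private final long[] counts;       // number of samples assigned to each center so far

    OnlineKMeans(double[][] initialCenters) {
        centers = new double[initialCenters.length][];
        for (int k = 0; k < centers.length; k++) {
            centers[k] = initialCenters[k].clone();
        }
        counts = new long[centers.length];
    }

    /** Index of the center closest to x in squared Euclidean distance. */
    int nearest(double[] x) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < centers.length; k++) {
            double d = 0.0;
            for (int j = 0; j < x.length; j++) {
                double diff = x[j] - centers[k][j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = k;
            }
        }
        return best;
    }

    /** Assigns x to its nearest center and updates that center as a running mean. */
    int update(double[] x) {
        int k = nearest(x);
        counts[k]++;
        double rate = 1.0 / counts[k];   // the first assigned sample replaces the initial guess
        for (int j = 0; j < x.length; j++) {
            centers[k][j] += rate * (x[j] - centers[k][j]);
        }
        return k;
    }

    double[] center(int k) {
        return centers[k].clone();
    }

    public static void main(String[] args) {
        OnlineKMeans model = new OnlineKMeans(new double[][] {{0.0, 0.0}, {10.0, 10.0}});
        double[][] stream = {{1.0, 1.0}, {3.0, 3.0}, {9.0, 9.0}};
        for (double[] sample : stream) {
            System.out.println("assigned to cluster " + model.update(sample));
        }
    }
}
```

Since the initial centers carry a count of zero, each center becomes the running mean of the samples assigned to it; a windowed variant would apply the same update once per mini-batch instead of once per event.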
[Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner
Hi maheshakya, I have installed them correctly.now I am trying to debug the siddhi extention with the cep as the [1] describes. But when i created an input stream and a predictionStream (output stream). when i was trying to create new execution plan with above streams i got error when i clicked "Validate Query Expression".Error was, Error: No extension exist for StreamFunctionExtension{namespace='ml'} in execution plan "ExecutionPlan" and my expression is like a /* Enter a unique ExecutionPlan */ @Plan:name('ExecutionPlan') /* Enter a unique description for ExecutionPlan */ -- @Plan:description('ExecutionPlan') /* define streams/tables and write queries here ... */ @Import('InputStream:1.0.0') define stream InputStream (NumPregnancies double, TSFT double, DPF double, BMI double, DBP double, PG2 double, Age double, SI2 double); @Export('PredictionStream:1.0.0') define stream PredictionSTream (NumPregnancies double, TSFT double, DPF double, BMI double, DBP double, PG2 double, Age double, SI2 double, Class double); from InputStream#ml:predict('file:///home/mahesh/GSOC/WSO2/data-set/pima-indian-diabetes.data','double') select * insert into PredictionStream i used file instead of registry. And i referred to the [2] and there they mention that solution for fixing CEP is running on distributed mode with apache Storm cluster. 1. Is that CEP i built is originally run as distributed mode? 2. Is this cuased by an not having sudo privilleges in current user when installing ML features onto CEP? 3.Is this the correct way to give file to CEP. [1] https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension [2]https://wso2.org/jira/browse/CEP-1400 BR, Mahesh. On Mon, May 2, 2016 at 12:35 PM, Maheshakya Wijewardenawrote: > Hi Mahesh, > > If you have built product-ml, you can find the P2-repo at > product-ml/modules/p2-profile/target/p2-repo > Add this folder as a local repository. 
> After that, you should be able to see the ML features.
>
> Best regards.
>
> On Mon, May 2, 2016 at 12:24 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Since I already have carbon-ml built on my PC, can I use my local
>> repository to install those features into CEP? Is that correct? Thank you.
>> regards,
>> Mahesh.
>>
>> On Mon, May 2, 2016 at 12:20 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Can you please tell me how to find the most recent p2 repository URL to
>>> add Machine Learner Core, Machine Learner Commons, Machine Learner Database
>>> Service and ML Siddhi Extension as features in CEP, as described in
>>> [1]? When I use the
>>> http://product-dist.wso2.com/p2/carbon/releases/4.2.0/ URL, those
>>> features are not visible in CEP. Is that not the most recent one?
>>> BR,
>>> Mahesh.
>>>
>>> [1]
>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>
>>> On Mon, May 2, 2016 at 11:28 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>

Hi Maheshakya,
sorry for the incomplete message. I have set up the dev environment and now I am trying to debug remotely. I did the following steps:

1. Built product-cep, carbon-ml and product-ml from source.
2. Went through their code bases, trying to understand the approach and the flow you developed.
3. Set break points in org.wso2.carbon.ml.siddhi.extension in carbon-ml.
4. Started ./wso2server.sh debug 5005 in the SNAPSHOT directory of product-ml.
5. Tried to trigger the break points following [1]. The break points are placed in the PredictStreamProcessor.java file within the extension.

This is the way I followed. I was trying to remotely debug the ML core by putting break points in the Spark Java files of the ML core (org.wso2.carbon.ml.core). Is this the right way to do those things?
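
A break point in PredictStreamProcessor.java can only be hit if the bundle that contains it is actually deployed in the server being debugged. Below is a minimal Python sketch for checking that, assuming the bundle's jar file name contains the package name org.wso2.carbon.ml.siddhi.extension; that naming, the helper name find_bundles, and the server directory passed to it are illustrative assumptions, not something confirmed in this thread.

```python
from pathlib import Path


def find_bundles(server_home, name_fragment="org.wso2.carbon.ml.siddhi.extension"):
    """Return the sorted .jar files under server_home whose file name contains name_fragment.

    Assumption: the extension is shipped as a jar whose file name includes
    its package name. Adjust name_fragment if your build names it differently.
    """
    root = Path(server_home)
    # rglob walks the whole directory tree, so the jar is found wherever it was installed.
    return sorted(p for p in root.rglob("*.jar") if name_fragment in p.name)
```

Calling find_bundles on the unpacked server directory returns an empty list when no matching jar is present, in which case the extension is not installed in that server and the break points cannot be reached there.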
[1] https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension

On Mon, May 2, 2016 at 11:19 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi maheshakya,
> I have set up the dev environment and now i am trying to remotely
> debug. The following steps were done.
> 1. build product-cep, carbon-ml and product-ml by source.
> 2. go through their code bases and trying to understand the way and
> the flow you developed.
> 3. i have set up break point in
>
>
> On Thu, Apr 28, 2016 at 7:05 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> ok. i got it. thank you.
>> regards,
>> Mahesh.
>>
>> On Thu, Apr 28, 2016 at 6:56 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>>