Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-07-26 Thread Maheshakya Wijewardena
> wrote:
>>>>
>>>>> Hi Nirmal,
>>>>> *This is what i have done so far in the GSOC2016,*
>>>>>
>>>>>    - Prior research on SGD (Stochastic Gradient Descent)
>>>>>    optimization techniques and mini-batch processing
>>>>>    - Getting familiar with Siddhi and writing extensions for it
>>>>>    - Wrote a Stream Processor extension for streaming applications
>>>>>    and machine learning algorithms (Linear Regression, KMeans & Logistic
>>>>>    Regression)
>>>>>    - Developed a Streaming Linear Regression class that periodically
>>>>>    retrains models via mini-batch processing with SGD
>>>>>    - Extended the functionality to Moving Window Mini Batch Processing
>>>>>    with SGD, providing a windowShift parameter that controls the data
>>>>>    horizon and data obsolescence
>>>>>    - Performance evaluation of the implementation
>>>>>    - Adding the Streaming Linear Regression class and Stream Processor
>>>>>    extension to carbon-ml
>>>>>
>>>>>
>>>>> *As a next step,*
>>>>>
>>>>>    - Persisting temporal models for applications such as
>>>>>    prediction
>>>>>    - Completing the Streaming KMeans clustering and Logistic Regression
>>>>>    classes
>>>>>    - Improving the batching and streaming mechanisms
>>>>>    - Improving visualization (optional)
>>>>>    - Writing examples and documentation
>>>>>
>>>>> regards,
>>>>>
>>>>> Mahesh.
>>>>>
>>>>> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Sorry, you need to put the returned values of the function into the
>>>>>> output stream
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>>
>>>>>>
>>>>>>
>>>>>> select mse
>>>>>> insert into LinregOutput;
>>>>>> or
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>> select *
>>>>>> insert into LinregOutput;
>>>>>>
>>>>>> where LinregOutput stream definition contains all attributes: mse,
>>>>>> intercept, beta1, 
>>>>>>
>>>>>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>>>>>> mahesha...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> In your output stream, you need to list all the attributes that are
>>>>>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>>>>>> Can you try that?
>>>>>>>
>>>>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> This is the full query i used.
>>>>>>>>
>>>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>>>
>>>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>>>> strikeouts double, errors double);
>>>>>>>>
>>>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>>>
>>>>>>>> define stream LinregOutput (mse double);
>>>>>>>>
>>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0,
>>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>
>>>>>>>> select *
>>>>>>>> insert into mse;
>>>>>>>>
>>>>>>>> but I am sending [mse, intercept, beta1...betap] as the outputData
>>>>>>>> Object[]. So how can I publish all of this information through the event publisher?
>>>>>>>> regards,
>>>>>>>> Mahesh.
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 6:10 PM, 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-07-26 Thread Mahesh Dananjaya
ng Linear Regression class for periodically
>>>>retrain models as mini batch processing with SGD
>>>>- Extend the functionality for Moving Window Mini Batch Processing
>>>>with SGD providing windowShift which control data horizon and data
>>>>obsolescences
>>>>- Performance evaluation of the implementation
>>>>- Adding Streaming Linear Regression class and Stream Processor
>>>>extension to carbon-ml
>>>>
>>>>
>>>> *As a next step,*
>>>>
>>>>- Adding Persisting temporal models for applications such as
>>>>prediction
>>>>- complete Streaming Kmeans clustering and Logistic Regression
>>>>classes
>>>>- Improve batching and streaming mechanisms
>>>>- improve visualization(optional)
>>>>- and writing examples and documentation
>>>>
>>>> regards,
>>>>
>>>> Mahesh.
>>>>
>>>> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Sorry, you need to put the returned values of the function into the
>>>>> output stream
>>>>>
>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>
>>>>>
>>>>>
>>>>> select mse
>>>>> insert into LinregOutput;
>>>>> or
>>>>>
>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>> salary, rbi, walks, strikeouts, errors)
>>>>> select *
>>>>> insert into LinregOutput;
>>>>>
>>>>> where LinregOutput stream definition contains all attributes: mse,
>>>>> intercept, beta1, 
>>>>>
>>>>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> In your output stream, you need to list all the attributes that are
>>>>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>>>>> Can you try that?
>>>>>>
>>>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Maheshakya,
>>>>>>> This is the full query i used.
>>>>>>>
>>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>>
>>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>>> strikeouts double, errors double);
>>>>>>>
>>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>>
>>>>>>> define stream LinregOutput (mse double);
>>>>>>>
>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0,
>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>
>>>>>>> select *
>>>>>>> insert into mse;
>>>>>>>
>>>>>>> but i am sending [mse,intercept,beta1betap] as a outputData
>>>>>>> Object[]. SO how can i publish all these infomation on event publisher.
>>>>>>> regards,
>>>>>>> Mahesh.
>>>>>>>
>>>>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Mahesh,
>>>>>>>>
>>>>>>>> Can you summarize the work we have done so far and the remaining
>>>>>>>> work items please?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Maheshakya,
>>>>>>>>> I have updated the repo [2] and upto date documents can be found
>>>>>>>>> at [1].thank you.
>>>>>>>>> regards,
>>>>>>>>> Mahesh.
>>>>>>>>> [1]

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-07-25 Thread Nirmal Fernando
e attributes that are
>>>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>>>> Can you try that?
>>>>>
>>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> This is the full query i used.
>>>>>>
>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>
>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>> strikeouts double, errors double);
>>>>>>
>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>
>>>>>> define stream LinregOutput (mse double);
>>>>>>
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>>
>>>>>> select *
>>>>>> insert into mse;
>>>>>>
>>>>>> but I am sending [mse, intercept, beta1...betap] as the outputData
>>>>>> Object[]. So how can I publish all of this information through the event publisher?
>>>>>> regards,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> Can you summarize the work we have done so far and the remaining
>>>>>>> work items please?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> I have updated the repo [2] and upto date documents can be found at
>>>>>>>> [1].thank you.
>>>>>>>> regards,
>>>>>>>> Mahesh.
>>>>>>>> [1]
>>>>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>>>>> [2]
>>>>>>>> https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- Forwarded message --
>>>>>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic
>>>>>>>>> with online data for WSO2 Machine Learner
>>>>>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Maheshakya,
>>>>>>>>> The new query is like this, adding support for moving window methods.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> @Import('LinRegInput:1.0.1')
>>>>>>>>> define stream LinRegInput (salary double, rbi double, walks
>>>>>>>>> double, strikeouts double, errors double);
>>>>>>>>>
>>>>>>>>> @Export('LinRegOutput:1.0.1')
>>>>>>>>> define stream LinRegOutput (mse double);
>>>>>>>>>
>>>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0,
>>>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>> select *
>>>>>>>>> insert into mse;
>>>>>>>>> 1=learnType
>>>>>>>>> 2=windowShift
>>>>>>>>> 4=batchSize...
>>>>>>>>>
>>>>>>>>> windowShift is added to configure the amount of shift. I have
>>>>>>>>> added log.info(mse) to view the MSE.
>>>>>>>>> Mahesh.
>>>>>>>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-07-13 Thread Mahesh Dananjaya
>>>>> strikeouts double, errors double);
>>>>>
>>>>> @Export('LinRegOutput:1.0.0')
>>>>>
>>>>> define stream LinregOutput (mse double);
>>>>>
>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>
>>>>> select *
>>>>> insert into mse;
>>>>>
>>>>> but I am sending [mse, intercept, beta1...betap] as the outputData
>>>>> Object[]. So how can I publish all of this information through the event publisher?
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> Can you summarize the work we have done so far and the remaining work
>>>>>> items please?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Maheshakya,
>>>>>>> I have updated the repo [2] and upto date documents can be found at
>>>>>>> [1].thank you.
>>>>>>> regards,
>>>>>>> Mahesh.
>>>>>>> [1]
>>>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>>>> [2]
>>>>>>> https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> -- Forwarded message --
>>>>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic
>>>>>>>> with online data for WSO2 Machine Learner
>>>>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> The new query is like this, adding support for moving window methods.
>>>>>>>>
>>>>>>>>
>>>>>>>> @Import('LinRegInput:1.0.1')
>>>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>>>> strikeouts double, errors double);
>>>>>>>>
>>>>>>>> @Export('LinRegOutput:1.0.1')
>>>>>>>> define stream LinRegOutput (mse double);
>>>>>>>>
>>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0,
>>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>> select *
>>>>>>>> insert into mse;
>>>>>>>> 1=learnType
>>>>>>>> 2=windowShift
>>>>>>>> 4=batchSize...
>>>>>>>>
>>>>>>>> windowShift is added to configure the amount of shift. I have added
>>>>>>>> log.info(mse) to view the MSE.
>>>>>>>> Mahesh.
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Mahesh,
>>>>>>>>>
>>>>>>>>> If you are installing features  from new p2 repo into a new CEP
>>>>>>>>> pack, then you wont need to replace those jars.
>>>>>>>>> If you have already installed those in the CEP from a previous
>>>>>>>>> p2-repo, then you have to un-install those features and reinstall 
>>>>>>>>> with new
>>>>>>>>> p2 repo. But you don't need to do this because you can just replace 
>>>>>>>>> the
>>>>>>>>> jar. It's easy.
>>>>>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-29 Thread Mahesh Dananjaya
>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>>> salary, rbi, walks, strikeouts, errors)
>>>>>> select *
>>>>>> insert into LinregOutput;
>>>>>>
>>>>>> where LinregOutput stream definition contains all attributes: mse,
>>>>>> intercept, beta1, 
>>>>>>
>>>>>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>>>>>> mahesha...@wso2.com> wrote:
>>>>>>
>>>>>>> Hi Mahesh,
>>>>>>>
>>>>>>> In your output stream, you need to list all the attributes that are
>>>>>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>>>>>> Can you try that?
>>>>>>>
>>>>>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Maheshakya,
>>>>>>>> This is the full query i used.
>>>>>>>>
>>>>>>>> @Import('LinRegInput:1.0.0')
>>>>>>>>
>>>>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>>>>> strikeouts double, errors double);
>>>>>>>>
>>>>>>>> @Export('LinRegOutput:1.0.0')
>>>>>>>>
>>>>>>>> define stream LinregOutput (mse double);
>>>>>>>>
>>>>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0,
>>>>>>>> 0.95, salary, rbi, walks, strikeouts, errors)
>>>>>>>>
>>>>>>>> select *
>>>>>>>> insert into mse;
>>>>>>>>
>>>>>>>> but I am sending [mse, intercept, beta1...betap] as the outputData
>>>>>>>> Object[]. So how can I publish all of this information through the event publisher?
>>>>>>>> regards,
>>>>>>>> Mahesh.
>>>>>>>>
>>>>>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Mahesh,
>>>>>>>>>
>>>>>>>>> Can you summarize the work we have done so far and the remaining
>>>>>>>>> work items please?
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>> I have updated the repo [2] and upto date documents can be found
>>>>>>>>>> at [1].thank you.
>>>>>>>>>> regards,
>>>>>>>>>> Mahesh.
>>>>>>>>>> [1]
>>>>>>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>>>>>>> [2]
>>>>>>>>>> https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- Forwarded message --
>>>>>>>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>>>>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>>>>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic
>>>>>>>>>>> with online data for WSO2 Machine Learner
>>>>>>>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>> The new query is like this, adding support for moving window methods.

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-22 Thread Mahesh Dananjaya
Hi Maheshakya,
Can I give external data sources, such as data from a database or from HDFS,
to generate events in the CEP event simulator rather than giving a file? I
saw "Switch to upload file for simulation" in the Input Data By Data Source
section of the event simulator. How can I feed data in real time from other
sources, or directly as data generated from a remote server as JSON, etc.? What
format should the database be in? This is just for my knowledge. Thank you.
regards,
Mahesh.

On Wed, Jun 22, 2016 at 10:59 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Nirmal,
> *This is what i have done so far in the GSOC2016,*
>
>    - Prior research on SGD (Stochastic Gradient Descent) optimization
>    techniques and mini-batch processing
>    - Getting familiar with Siddhi and writing extensions for it
>    - Wrote a Stream Processor extension for streaming applications and
>    machine learning algorithms (Linear Regression, KMeans & Logistic Regression)
>    - Developed a Streaming Linear Regression class that periodically
>    retrains models via mini-batch processing with SGD
>    - Extended the functionality to Moving Window Mini Batch Processing
>    with SGD, providing a windowShift parameter that controls the data horizon
>    and data obsolescence
>    - Performance evaluation of the implementation
>    - Adding the Streaming Linear Regression class and Stream Processor
>    extension to carbon-ml
>
>
> *As a next step,*
>
>    - Persisting temporal models for applications such as prediction
>    - Completing the Streaming KMeans clustering and Logistic Regression classes
>    - Improving the batching and streaming mechanisms
>    - Improving visualization (optional)
>    - Writing examples and documentation
>
> regards,
>
> Mahesh.
>
> On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Sorry, you need to put the returned values of the function into the
>> output stream
>>
>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>>
>>
>>
>> select mse
>> insert into LinregOutput;
>> or
>>
>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>> select *
>> insert into LinregOutput;
>>
>> where LinregOutput stream definition contains all attributes: mse,
>> intercept, beta1, 
>>
>> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> In your output stream, you need to list all the attributes that are
>>> returned from the streamlinreg function: mse, intercept, beta1, 
>>> Can you try that?
>>>
>>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> This is the full query i used.
>>>>
>>>> @Import('LinRegInput:1.0.0')
>>>>
>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>> strikeouts double, errors double);
>>>>
>>>> @Export('LinRegOutput:1.0.0')
>>>>
>>>> define stream LinregOutput (mse double);
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>>
>>>> select *
>>>> insert into mse;
>>>>
>>>> but I am sending [mse, intercept, beta1...betap] as the outputData
>>>> Object[]. So how can I publish all of this information through the event publisher?
>>>> regards,
>>>> Mahesh.
>>>>
>>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> Can you summarize the work we have done so far and the remaining work
>>>>> items please?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> I have updated the repo [2] and upto date documents can be found at
>>>>>> [1].thank you.
>>>>>> regards,
>>>>>> Mahesh.
>>>>>> [1]
>>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>>> [2]

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Nirmal,
*This is what I have done so far in GSOC2016:*

   - Prior research on SGD (Stochastic Gradient Descent) optimization
   techniques and mini-batch processing
   - Getting familiar with Siddhi and writing extensions for it
   - Wrote a Stream Processor extension for streaming applications and
   machine learning algorithms (Linear Regression, KMeans & Logistic Regression)
   - Developed a Streaming Linear Regression class that periodically retrains
   models via mini-batch processing with SGD
   - Extended the functionality to Moving Window Mini Batch Processing with
   SGD, providing a windowShift parameter that controls the data horizon and
   data obsolescence (a sketch of this update follows the list)
   - Performance evaluation of the implementation
   - Adding the Streaming Linear Regression class and Stream Processor
   extension to carbon-ml
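
For reference, below is a minimal, self-contained Java sketch of the moving-window
mini-batch SGD update referred to in the list above, assuming squared-error loss and
a fixed learning rate. All class and field names are illustrative; this is not the
actual carbon-ml StreamingLinearRegression implementation.

import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative moving-window mini-batch SGD for linear regression (sketch only).
public class MiniBatchSgdSketch {

    private final double[] weights;        // beta1..betap
    private double intercept;
    private final double learningRate;
    private final int batchSize;           // events needed before a retraining round
    private final int windowShift;         // old events dropped each round (data horizon)
    private final Deque<double[]> window = new ArrayDeque<>();  // each row: [y, x1..xp]

    public MiniBatchSgdSketch(int numFeatures, int batchSize, int windowShift, double learningRate) {
        this.weights = new double[numFeatures];
        this.batchSize = batchSize;
        this.windowShift = windowShift;
        this.learningRate = learningRate;
    }

    // Add one event; retrain once a full mini batch has accumulated. Returns MSE or NaN.
    public double addEvent(double y, double[] x) {
        double[] row = new double[x.length + 1];
        row[0] = y;
        System.arraycopy(x, 0, row, 1, x.length);
        window.addLast(row);
        if (window.size() < batchSize) {
            return Double.NaN;             // not enough data for a mini batch yet
        }
        double mse = trainOnWindow();
        // Shift the window: discard the oldest windowShift events (data obsolescence).
        for (int i = 0; i < windowShift && !window.isEmpty(); i++) {
            window.removeFirst();
        }
        return mse;
    }

    // One SGD pass over the current window; returns the mean squared error of the batch.
    private double trainOnWindow() {
        double squaredError = 0.0;
        for (double[] row : window) {
            double y = row[0];
            double predicted = intercept;
            for (int j = 0; j < weights.length; j++) {
                predicted += weights[j] * row[j + 1];
            }
            double error = predicted - y;
            squaredError += error * error;
            intercept -= learningRate * error;                     // gradient step for the intercept
            for (int j = 0; j < weights.length; j++) {
                weights[j] -= learningRate * error * row[j + 1];   // gradient step for beta_j
            }
        }
        return squaredError / window.size();
    }
}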


*As a next step,*

   - Persisting temporal models for applications such as prediction
   - Completing the Streaming KMeans clustering and Logistic Regression classes
   (a sketch follows this list)
   - Improving the batching and streaming mechanisms
   - Improving visualization (optional)
   - Writing examples and documentation
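
As an illustration of the planned Streaming KMeans item above, here is a sketch of a
standard mini-batch k-means centroid update in Java. It shows one common formulation
only; it is not the project's actual StreamingKMeans class, and every name in it is
made up for the example.

import java.util.List;

// Illustrative mini-batch streaming k-means update (sketch only).
public class StreamingKMeansSketch {

    private final double[][] centroids;   // k x d centroid matrix
    private final long[] counts;          // points assigned to each centroid so far

    public StreamingKMeansSketch(double[][] initialCentroids) {
        this.centroids = initialCentroids;
        this.counts = new long[initialCentroids.length];
    }

    // Assign each point in the mini batch to its nearest centroid and move that centroid.
    public void update(List<double[]> miniBatch) {
        for (double[] point : miniBatch) {
            int nearest = nearestCentroid(point);
            counts[nearest]++;
            double eta = 1.0 / counts[nearest];          // per-centroid learning rate
            double[] c = centroids[nearest];
            for (int j = 0; j < c.length; j++) {
                c[j] += eta * (point[j] - c[j]);          // move centroid towards the point
            }
        }
    }

    private int nearestCentroid(double[] point) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < centroids.length; i++) {
            double distance = 0.0;
            for (int j = 0; j < point.length; j++) {
                double diff = point[j] - centroids[i][j];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }
}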

regards,

Mahesh.

On Wed, Jun 22, 2016 at 10:28 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Sorry, you need to put the returned values of the function into the output
> stream
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
>
>
>
> select mse
> insert into LinregOutput;
> or
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into LinregOutput;
>
> where LinregOutput stream definition contains all attributes: mse,
> intercept, beta1, 
>
> On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> In your output stream, you need to list all the attributes that are
>> returned from the streamlinreg function: mse, intercept, beta1, 
>> Can you try that?
>>
>> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> This is the full query i used.
>>>
>>> @Import('LinRegInput:1.0.0')
>>>
>>> define stream LinRegInput (salary double, rbi double, walks double,
>>> strikeouts double, errors double);
>>>
>>> @Export('LinRegOutput:1.0.0')
>>>
>>> define stream LinregOutput (mse double);
>>>
>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>>
>>> select *
>>> insert into mse;
>>>
>>> but I am sending [mse, intercept, beta1...betap] as the outputData
>>> Object[]. So how can I publish all of this information through the event publisher?
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com>
>>> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Can you summarize the work we have done so far and the remaining work
>>>> items please?
>>>>
>>>> Thanks.
>>>>
>>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I have updated the repo [2] and upto date documents can be found at
>>>>> [1].thank you.
>>>>> regards,
>>>>> Mahesh.
>>>>> [1]
>>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>>> [2]
>>>>> https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>>
>>>>>
>>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> -- Forwarded message --
>>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>>>>> online data for WSO2 Machine Learner
>>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>>
>>>>>>
>>>>>> Hi Maheshakya,
>>>>>> The new query is like this, adding support for moving window methods.
>>>>>>
>>>>>>
>>>>>> @Import('LinRegInput:1.0.1')
>>>>>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Maheshakya Wijewardena
Sorry, you need to put the returned values of the function into the output
stream

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)



select mse
insert into LinregOutput;
or

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)
select *
insert into LinregOutput;

where LinregOutput stream definition contains all attributes: mse,
intercept, beta1, 
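
To make the mapping concrete: the Object[] returned by the extension and the exported
stream definition must agree attribute-for-attribute, in the same order. A small
illustrative Java snippet follows; the attribute names beta1..beta5 are assumed here
for the five features in the query above, and this is not the actual extension code.

// Illustrative only: builds the outputData array in the order the output stream expects.
// The matching (assumed) stream definition would then be:
//   define stream LinregOutput (mse double, intercept double, beta1 double,
//                               beta2 double, beta3 double, beta4 double, beta5 double);
public class OutputDataSketch {
    public static Object[] buildOutputData(double mse, double intercept, double[] betas) {
        Object[] outputData = new Object[betas.length + 2];
        outputData[0] = mse;            // first attribute of the output stream
        outputData[1] = intercept;      // second attribute
        for (int i = 0; i < betas.length; i++) {
            outputData[i + 2] = betas[i];   // beta1..betap, in order
        }
        return outputData;
    }
}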

On Wed, Jun 22, 2016 at 10:24 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> In your output stream, you need to list all the attributes that are
> returned from the streamlinreg function: mse, intercept, beta1, 
> Can you try that?
>
> On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> This is the full query i used.
>>
>> @Import('LinRegInput:1.0.0')
>>
>> define stream LinRegInput (salary double, rbi double, walks double,
>> strikeouts double, errors double);
>>
>> @Export('LinRegOutput:1.0.0')
>>
>> define stream LinregOutput (mse double);
>>
>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>>
>> select *
>> insert into mse;
>>
>> but I am sending [mse, intercept, beta1...betap] as the outputData Object[].
>> So how can I publish all of this information through the event publisher?
>> regards,
>> Mahesh.
>>
>> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Can you summarize the work we have done so far and the remaining work
>>> items please?
>>>
>>> Thanks.
>>>
>>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> I have updated the repo [2] and upto date documents can be found at
>>>> [1].thank you.
>>>> regards,
>>>> Mahesh.
>>>> [1]
>>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>>> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>>
>>>>
>>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>>>> online data for WSO2 Machine Learner
>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>
>>>>>
>>>>> Hi Maheshakya,
>>>>> The new query is like this, adding support for moving window methods.
>>>>>
>>>>>
>>>>> @Import('LinRegInput:1.0.1')
>>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>>> strikeouts double, errors double);
>>>>>
>>>>> @Export('LinRegOutput:1.0.1')
>>>>> define stream LinRegOutput (mse double);
>>>>>
>>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>>> salary, rbi, walks, strikeouts, errors)
>>>>> select *
>>>>> insert into mse;
>>>>> 1=learnType
>>>>> 2=windowShift
>>>>> 4=batchSize...
>>>>>
>>>>> windowShift is added to configure the amount of shift. I have added
>>>>> log.info(mse) to view the MSE.
>>>>> Mahesh.
>>>>>
>>>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>>>> mahesha...@wso2.com> wrote:
>>>>>
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> If you are installing features  from new p2 repo into a new CEP pack,
>>>>>> then you wont need to replace those jars.
>>>>>> If you have already installed those in the CEP from a previous
>>>>>> p2-repo, then you have to un-install those features and reinstall with 
>>>>>> new
>>>>>> p2 repo. But you don't need to do this because you can just replace the
>>>>>> jar. It's easy.
>>>>>>
>>>>>> Best regards.

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Maheshakya Wijewardena
Hi Mahesh,

In your output stream, you need to list all the attributes that are
returned from the streamlinreg function: mse, intercept, beta1, 
Can you try that?

On Wed, Jun 22, 2016 at 10:06 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> This is the full query i used.
>
> @Import('LinRegInput:1.0.0')
>
> define stream LinRegInput (salary double, rbi double, walks double,
> strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.0')
>
> define stream LinregOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
>
> select *
> insert into mse;
>
> but I am sending [mse, intercept, beta1...betap] as the outputData Object[].
> So how can I publish all of this information through the event publisher?
> regards,
> Mahesh.
>
> On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Can you summarize the work we have done so far and the remaining work
>> items please?
>>
>> Thanks.
>>
>> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have updated the repo [2] and upto date documents can be found at
>>> [1].thank you.
>>> regards,
>>> Mahesh.
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>>
>>>
>>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>>
>>>> -- Forwarded message --
>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>>> online data for WSO2 Machine Learner
>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>
>>>>
>>>> Hi Maheshakya,
>>>> The new query is like this, adding support for moving window methods.
>>>>
>>>>
>>>> @Import('LinRegInput:1.0.1')
>>>> define stream LinRegInput (salary double, rbi double, walks double,
>>>> strikeouts double, errors double);
>>>>
>>>> @Export('LinRegOutput:1.0.1')
>>>> define stream LinRegOutput (mse double);
>>>>
>>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>>> salary, rbi, walks, strikeouts, errors)
>>>> select *
>>>> insert into mse;
>>>> 1=learnType
>>>> 2=windowShift
>>>> 4=batchSize...
>>>>
>>>> windowShift is added to configure the amount of shift. I have added
>>>> log.info(mse) to view the MSE.
>>>> Mahesh.
>>>>
>>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>
>>>>> Hi Mahesh,
>>>>>
>>>>> If you are installing features  from new p2 repo into a new CEP pack,
>>>>> then you wont need to replace those jars.
>>>>> If you have already installed those in the CEP from a previous
>>>>> p2-repo, then you have to un-install those features and reinstall with new
>>>>> p2 repo. But you don't need to do this because you can just replace the
>>>>> jar. It's easy.
>>>>>
>>>>> Best regards.
>>>>>
>>>>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> If i built the carbon-ml then product-ml and point new p2 repository
>>>>>> to cep features, do i need to copy that
>>>>>> org.wso2.carbon.ml.siddhi.extension1.1. thing into
>>>>>> cep_home/repository/component/... place.
>>>>>> regards,
>>>>>> Mahesh.
>>>>>>
>>>>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> In MLModelhandler there's persistModel method
>>>>>>> debug that method while trying to train a model from ML
>>>>>>> you can see the steps it takes
>>>>>>> don't use deep

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Maheshakya,
This is the full query I used.

@Import('LinRegInput:1.0.0')

define stream LinRegInput (salary double, rbi double, walks double,
strikeouts double, errors double);

@Export('LinRegOutput:1.0.0')

define stream LinregOutput (mse double);

from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
salary, rbi, walks, strikeouts, errors)

select *
insert into mse;

but I am sending [mse, intercept, beta1...betap] as the outputData Object[].
So how can I publish all of this information through the event publisher?
regards,
Mahesh.

On Tue, Jun 21, 2016 at 6:10 PM, Nirmal Fernando <nir...@wso2.com> wrote:

> Hi Mahesh,
>
> Can you summarize the work we have done so far and the remaining work
> items please?
>
> Thanks.
>
> On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have updated the repo [2] and upto date documents can be found at
>> [1].thank you.
>> regards,
>> Mahesh.
>> [1]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>>
>>
>> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>>
>>> -- Forwarded message ----------
>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>> Date: Tue, Jun 21, 2016 at 5:08 PM
>>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>>> online data for WSO2 Machine Learner
>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>
>>>
>>> Hi Maheshakya,
>>> The new query is like this, adding support for moving window methods.
>>>
>>>
>>> @Import('LinRegInput:1.0.1')
>>> define stream LinRegInput (salary double, rbi double, walks double,
>>> strikeouts double, errors double);
>>>
>>> @Export('LinRegOutput:1.0.1')
>>> define stream LinRegOutput (mse double);
>>>
>>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into mse;
>>> 1=learnType
>>> 2=windowShift
>>> 4=batchSize...
>>>
>>> windowShift is added to configure the amount of shift. I have added
>>> log.info(mse) to view the MSE.
>>> Mahesh.
>>>
>>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> If you are installing features  from new p2 repo into a new CEP pack,
>>>> then you wont need to replace those jars.
>>>> If you have already installed those in the CEP from a previous p2-repo,
>>>> then you have to un-install those features and reinstall with new p2 repo.
>>>> But you don't need to do this because you can just replace the jar. It's
>>>> easy.
>>>>
>>>> Best regards.
>>>>
>>>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> If i built the carbon-ml then product-ml and point new p2 repository
>>>>> to cep features, do i need to copy that
>>>>> org.wso2.carbon.ml.siddhi.extension1.1. thing into
>>>>> cep_home/repository/component/... place.
>>>>> regards,
>>>>> Mahesh.
>>>>>
>>>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> In MLModelhandler there's persistModel method
>>>>>> debug that method while trying to train a model from ML
>>>>>> you can see the steps it takes
>>>>>> don't use deep learning algorithm
>>>>>> any other algorithm would work
>>>>>> from line 777 is the section for creating the serializable object
>>>>>> from trained model and saving it
>>>>>>
>>>>>>
>>>>>> I think you don't need to directly use ML model handler
>>>>>> you need to use the code in that for persisting models in the
>>>>>> streaming algorithm
>>>>>> so you can add a utils class in the streaming folder
>>>>>> then add the persisting logic there
>>>>>> ignore the deeplearning section in that
>>>>>> only focus

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Nirmal Fernando
Hi Mahesh,

Can you summarize the work we have done so far and the remaining work items
please?

Thanks.

On Tue, Jun 21, 2016 at 5:56 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> I have updated the repo [2] and upto date documents can be found at
> [1].thank you.
> regards,
> Mahesh.
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
> [2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml
>
>
> On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> -- Forwarded message --
>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>> Date: Tue, Jun 21, 2016 at 5:08 PM
>> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
>> online data for WSO2 Machine Learner
>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>
>>
>> Hi Maheshakya,
>> The new query is like this, adding support for moving window methods.
>>
>>
>> @Import('LinRegInput:1.0.1')
>> define stream LinRegInput (salary double, rbi double, walks double,
>> strikeouts double, errors double);
>>
>> @Export('LinRegOutput:1.0.1')
>> define stream LinRegOutput (mse double);
>>
>> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>> select *
>> insert into mse;
>> 1=learnType
>> 2=windowShift
>> 4=batchSize...
>>
>> windowShift is added to configure the amount of shift. I have added
>> log.info(mse) to view the MSE.
>> Mahesh.
>>
>> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> If you are installing features  from new p2 repo into a new CEP pack,
>>> then you wont need to replace those jars.
>>> If you have already installed those in the CEP from a previous p2-repo,
>>> then you have to un-install those features and reinstall with new p2 repo.
>>> But you don't need to do this because you can just replace the jar. It's
>>> easy.
>>>
>>> Best regards.
>>>
>>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> Hi Maheshakya,
>>>> If i built the carbon-ml then product-ml and point new p2 repository to
>>>> cep features, do i need to copy that
>>>> org.wso2.carbon.ml.siddhi.extension1.1. thing into
>>>> cep_home/repository/component/... place.
>>>> regards,
>>>> Mahesh.
>>>>
>>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> In MLModelhandler there's persistModel method
>>>>> debug that method while trying to train a model from ML
>>>>> you can see the steps it takes
>>>>> don't use deep learning algorithm
>>>>> any other algorithm would work
>>>>> from line 777 is the section for creating the serializable object from
>>>>> trained model and saving it
>>>>>
>>>>>
>>>>> I think you don't need to directly use ML model handler
>>>>> you need to use the code in that for persisting models in the
>>>>> streaming algorithm
>>>>> so you can add a utils class in the streaming folder
>>>>> then add the persisting logic there
>>>>> ignore the deeplearning section in that
>>>>> only focus on persisting spark mod
>>>>>
>>>>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> Hi Maheshakya,
>>>>>> I pushed the StreamingLinearRegression modules into my forked
>>>>>> carbon-ml repo at branch wso2_gsoc_ml6_cml [1]. I am working on 
>>>>>> persisting
>>>>>> model.thank you.
>>>>>> Mahesh.
>>>>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>>>>
>>>>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>
>>>>>>> yes
>>>>>>> you should develop in tha fork repo
>>>>>>> clone your forked repo
>>>>>>> then go into that
>>>>>>> then ad

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Maheshakya,
I have updated the repo [2], and up-to-date documents can be found at
[1]. Thank you.
regards,
Mahesh.
[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
[2] https://github.com/dananjayamahesh/carbon-ml/tree/wso2_gsoc_ml6_cml


On Tue, Jun 21, 2016 at 5:08 PM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

>
> -- Forwarded message --
> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
> Date: Tue, Jun 21, 2016 at 5:08 PM
> Subject: Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with
> online data for WSO2 Machine Learner
> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>
>
> Hi Maheshakya,
> The new query is like this, adding support for moving window methods.
>
>
> @Import('LinRegInput:1.0.1')
> define stream LinRegInput (salary double, rbi double, walks double,
> strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.1')
> define stream LinRegOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(1, 2, 4, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into mse;
> 1=learnType
> 2=windowShift
> 4=batchSize...
>
> windowShift is added to configure the amount of shift. I have added
> log.info(mse) to view the MSE.
> Mahesh.
>
> On Tue, Jun 21, 2016 at 2:33 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> If you are installing features  from new p2 repo into a new CEP pack,
>> then you wont need to replace those jars.
>> If you have already installed those in the CEP from a previous p2-repo,
>> then you have to un-install those features and reinstall with new p2 repo.
>> But you don't need to do this because you can just replace the jar. It's
>> easy.
>>
>> Best regards.
>>
>> On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> If i built the carbon-ml then product-ml and point new p2 repository to
>>> cep features, do i need to copy that
>>> org.wso2.carbon.ml.siddhi.extension1.1. thing into
>>> cep_home/repository/component/... place.
>>> regards,
>>> Mahesh.
>>>
>>> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
>>>> In MLModelhandler there's persistModel method
>>>> debug that method while trying to train a model from ML
>>>> you can see the steps it takes
>>>> don't use deep learning algorithm
>>>> any other algorithm would work
>>>> from line 777 is the section for creating the serializable object from
>>>> trained model and saving it
>>>>
>>>>
>>>> I think you don't need to directly use ML model handler
>>>> you need to use the code in that for persisting models in the streaming
>>>> algorithm
>>>> so you can add a utils class in the streaming folder
>>>> then add the persisting logic there
>>>> ignore the deeplearning section in that
>>>> only focus on persisting spark mod
>>>>
>>>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
>>>> dananjayamah...@gmail.com> wrote:
>>>>
>>>>> Hi Maheshakya,
>>>>> I pushed the StreamingLinearRegression modules into my forked
>>>>> carbon-ml repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting
>>>>> model.thank you.
>>>>> Mahesh.
>>>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>>>
>>>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>
>>>>>> yes
>>>>>> you should develop in tha fork repo
>>>>>> clone your forked repo
>>>>>> then go into that
>>>>>> then add upstream repo as original wso2 repo
>>>>>> see the remote tracking branchs by
>>>>>> git remote -v
>>>>>> you will see the origin as your forked repo
>>>>>> to add upstream
>>>>>> git remote add upstream 
>>>>>> when you change something create a new branch by
>>>>>> git checkout -b new_branch_name
>>>>>> then add and commit to this branch
>>>>>> after that push to the forked by
>>>>>> git push origin new_branch_name
>>>>>>
>>>>>> On

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Maheshakya Wijewardena
Hi Mahesh,

If you are installing features from the new p2 repo into a new CEP pack, then
you won't need to replace those jars.
If you have already installed those in the CEP from a previous p2-repo,
then you have to un-install those features and reinstall with new p2 repo.
But you don't need to do this because you can just replace the jar. It's
easy.

Best regards.

On Tue, Jun 21, 2016 at 2:26 PM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> If i built the carbon-ml then product-ml and point new p2 repository to
> cep features, do i need to copy that
> org.wso2.carbon.ml.siddhi.extension1.1. thing into
> cep_home/repository/component/... place.
> regards,
> Mahesh.
>
> On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> In MLModelhandler there's persistModel method
>> debug that method while trying to train a model from ML
>> you can see the steps it takes
>> don't use deep learning algorithm
>> any other algorithm would work
>> from line 777 is the section for creating the serializable object from
>> trained model and saving it
>>
>>
>> I think you don't need to directly use ML model handler
>> you need to use the code in that for persisting models in the streaming
>> algorithm
>> so you can add a utils class in the streaming folder
>> then add the persisting logic there
>> ignore the deeplearning section in that
>> only focus on persisting spark mod
>>
>> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I pushed the StreamingLinearRegression modules into my forked carbon-ml
>>> repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting
>>> model.thank you.
>>> Mahesh.
>>> [1] https://github.com/dananjayamahesh/carbon-ml
>>>
>>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 yes
 you should develop in tha fork repo
 clone your forked repo
 then go into that
 then add upstream repo as original wso2 repo
 see the remote tracking branchs by
 git remote -v
 you will see the origin as your forked repo
 to add upstream
 git remote add upstream 
 when you change something create a new branch by
 git checkout -b new_branch_name
 then add and commit to this branch
 after that push to the forked by
 git push origin new_branch_name

 On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> the above error is due to a simple mistake of not providing my local
> p2 repo.Now it is working and i debugged the StreamingLinearRegression
> model cep.
> regards,
> Mahesh.
>
> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I did what you recommend. But when i am adding the query the
>> following error is appearing.
>> No extension exist for StreamFunctionExtension{namespace='ml'} in
>> execution plan "NewExecutionPlan"
>>
>> *My query is as follows,
>> @Import('LinRegInput:1.0.0')
>> define stream LinRegInput (salary double, rbi double, walks double,
>> strikeouts double, errors double);
>>
>> @Export('LinRegOutput:1.0.0')
>> define stream LinRegOutput (mse double);
>>
>> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95,
>> salary, rbi, walks, strikeouts, errors)
>> select *
>> insert into mse;
>>
>> I have added my files as follows,
>>
>> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
>> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>>
>> and add following lines to ml.siddhiext
>>
>> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>>
>> .Then i build the carbon-ml. The replace the jar file you asked me
>> replace with the name changed.any thoughts?
>> regards,
>> Mahesh.
>>
>> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> You don't need to add new p2 repo.
>>> In the /repository/components/plugins folder, you will
>>> find org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this
>>> with
>>> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
>>> First rename this jar in the target folder to the jar name in the 
>>> plugins
>>> folder then replace (Make sure, otherwise will not work).
>>> Your updates will be there in the CEP after this.
>>>
>>> Best regards.
>>>
>>> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 Do i need to add p2 local repos 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-21 Thread Mahesh Dananjaya
Hi Maheshakya,
If I build carbon-ml and then product-ml, and point the new p2 repository to the CEP
features, do I need to copy that
org.wso2.carbon.ml.siddhi.extension1.1. artifact into the
cep_home/repository/component/... location?
regards,
Mahesh.

On Thu, Jun 16, 2016 at 6:39 PM, Mahesh Dananjaya  wrote:

> In MLModelhandler there's persistModel method
> debug that method while trying to train a model from ML
> you can see the steps it takes
> don't use deep learning algorithm
> any other algorithm would work
> from line 777 is the section for creating the serializable object from
> trained model and saving it
>
>
> I think you don't need to directly use ML model handler
> you need to use the code in that for persisting models in the streaming
> algorithm
> so you can add a utils class in the streaming folder
> then add the persisting logic there
> ignore the deeplearning section in that
> only focus on persisting spark mod
>
> On Wed, Jun 15, 2016 at 4:11 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I pushed the StreamingLinearRegression modules into my forked carbon-ml
>> repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting
>> model.thank you.
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/carbon-ml
>>
>> On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> yes
>>> you should develop in tha fork repo
>>> clone your forked repo
>>> then go into that
>>> then add upstream repo as original wso2 repo
>>> see the remote tracking branchs by
>>> git remote -v
>>> you will see the origin as your forked repo
>>> to add upstream
>>> git remote add upstream 
>>> when you change something create a new branch by
>>> git checkout -b new_branch_name
>>> then add and commit to this branch
>>> after that push to the forked by
>>> git push origin new_branch_name
>>>
>>> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 the above error is due to a simple mistake of not providing my local p2
 repo.Now it is working and i debugged the StreamingLinearRegression model
 cep.
 regards,
 Mahesh.

 On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I did what you recommend. But when i am adding the query the following
> error is appearing.
> No extension exist for StreamFunctionExtension{namespace='ml'} in
> execution plan "NewExecutionPlan"
>
> *My query is as follows,
> @Import('LinRegInput:1.0.0')
> define stream LinRegInput (salary double, rbi double, walks double,
> strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.0')
> define stream LinRegOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95,
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into mse;
>
> I have added my files as follows,
>
> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>
> and add following lines to ml.siddhiext
>
> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>
> Then I build carbon-ml, and replace the jar file you asked me to
> replace, with the name changed. Any thoughts?
> regards,
> Mahesh.
>
> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> You don't need to add new p2 repo.
>> In the /repository/components/plugins folder, you will find
>> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
>> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
>> First rename this jar in the target folder to the jar name in the plugins
>> folder then replace (Make sure, otherwise will not work).
>> Your updates will be there in the CEP after this.
>>
>> Best regards.
>>
>> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Do i need to add p2 local repos of ML into CEP after i made changes
>>> to ml extensions. Or will it be automatically updated. I am trying to 
>>> debug
>>> my extension with the cep.thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Mahesh, when you add your work to carbon-ml, follow the below
 guidelines, it will help to keep the code clean.


- Add only the sources code file you have newly added or
changed.
- Do not use add . (add all) 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-15 Thread Mahesh Dananjaya
Hi Maheshakya,
I pushed the StreamingLinearRegression modules into my forked carbon-ml
repo at branch wso2_gsoc_ml6_cml [1]. I am working on persisting the
model. Thank you.
Mahesh.
[1] https://github.com/dananjayamahesh/carbon-ml
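
For the "working on persisting model" part above, below is a minimal sketch of
persisting a trained streaming model with plain Java serialization, assuming the
model class implements Serializable. It is illustrative only and is not carbon-ml's
MLModelHandler/persistModel code.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative persistence helper for a streaming model (sketch only).
public final class ModelPersistenceSketch {

    // Write the trained model to disk so it can be reloaded later for prediction.
    public static void saveModel(Serializable model, String path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(model);
        }
    }

    // Read a previously persisted model back into memory.
    public static Object loadModel(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(path))) {
            return in.readObject();
        }
    }
}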

On Tue, Jun 14, 2016 at 5:56 PM, Mahesh Dananjaya  wrote:

> yes
> you should develop in the fork repo
> clone your forked repo
> then go into that
> then add upstream repo as original wso2 repo
> see the remote tracking branches by
> git remote -v
> you will see the origin as your forked repo
> to add upstream
> git remote add upstream 
> when you change something create a new branch by
> git checkout -b new_branch_name
> then add and commit to this branch
> after that push to the forked repo by
> git push origin new_branch_name
>
> On Tue, Jun 14, 2016 at 5:32 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> the above error is due to a simple mistake of not providing my local p2
>> repo.Now it is working and i debugged the StreamingLinearRegression model
>> cep.
>> regards,
>> Mahesh.
>>
>> On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I did what you recommend. But when i am adding the query the following
>>> error is appearing.
>>> No extension exist for StreamFunctionExtension{namespace='ml'} in
>>> execution plan "NewExecutionPlan"
>>>
>>> *My query is as follows,
>>> @Import('LinRegInput:1.0.0')
>>> define stream LinRegInput (salary double, rbi double, walks double,
>>> strikeouts double, errors double);
>>>
>>> @Export('LinRegOutput:1.0.0')
>>> define stream LinRegOutput (mse double);
>>>
>>> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95,
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into mse;
>>>
>>> I have added my files as follows,
>>>
>>> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
>>> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>>>
>>> and add following lines to ml.siddhiext
>>>
>>> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>>>
>>> .Then i build the carbon-ml. The replace the jar file you asked me
>>> replace with the name changed.any thoughts?
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 You don't need to add new p2 repo.
 In the /repository/components/plugins folder, you will find
 org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
 carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
 First rename this jar in the target folder to the jar name in the plugins
 folder then replace (Make sure, otherwise will not work).
 Your updates will be there in the CEP after this.

 Best regards.

 On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> Do i need to add p2 local repos of ML into CEP after i made changes to
> ml extensions. Or will it be automatically updated. I am trying to debug 
> my
> extension with the cep.thank you.
> regards,
> Mahesh.
>
> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Mahesh, when you add your work to carbon-ml, follow the below
>> guidelines, it will help to keep the code clean.
>>
>>
>>- Add only the sources code file you have newly added or changed.
>>- Do not use add . (add all) command in git. Only use add filename
>>
>> I have seen in your gsoc repo that there are gitignore files, idea
>> related files and the target folder is there. These should not be in the
>> source code, only the source files you add.
>>
>>- Commit when you have done some major activity. Do not add
>>commits always when you make a change.
>>
>>
>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> May i seperately put the classes to ml and extensions in
>>> carbon-core. I can put Streaming Extensions to extensions and
>>> Algorithms/StreamingLinear Regression and StreamingKMeans in ml core. 
>>> what
>>> is the suitable format. I will commit my changes today as seperate 
>>> branch
>>> in my forked carbon-ml local repo.thank you.
>>> regards,
>>> Mahesh.
>>> p.s: better if you can meet me via hangout.
>>>
>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>


 --
 Pruthuvi Maheshakya Wijewardena
 mahesha...@wso2.com
 +94711228855



>>>
>>
>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
The above error was due to a simple mistake of not providing my local p2
repo. Now it is working, and I have debugged the StreamingLinearRegression model
in the CEP.
regards,
Mahesh.

On Tue, Jun 14, 2016 at 3:19 PM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> I did what you recommend. But when i am adding the query the following
> error is appearing.
> No extension exist for StreamFunctionExtension{namespace='ml'} in
> execution plan "NewExecutionPlan"
>
> *My query is as follows,
> @Import('LinRegInput:1.0.0')
> define stream LinRegInput (salary double, rbi double, walks double,
> strikeouts double, errors double);
>
> @Export('LinRegOutput:1.0.0')
> define stream LinRegOutput (mse double);
>
> from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95, salary,
> rbi, walks, strikeouts, errors)
> select *
> insert into mse;
>
> I have added my files as follows,
>
> org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
> org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;
>
> and add following lines to ml.siddhiext
>
> streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor
>
> .Then i build the carbon-ml. The replace the jar file you asked me replace
> with the name changed.any thoughts?
> regards,
> Mahesh.
>
> On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> You don't need to add new p2 repo.
>> In the /repository/components/plugins folder, you will find
>> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
>> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
>> First rename this jar in the target folder to the jar name in the plugins
>> folder then replace (Make sure, otherwise will not work).
>> Your updates will be there in the CEP after this.
>>
>> Best regards.
>>
>> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Do i need to add p2 local repos of ML into CEP after i made changes to
>>> ml extensions. Or will it be automatically updated. I am trying to debug my
>>> extension with the cep.thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Mahesh when you add your work to carbon-ml follow the bellow
 guidelines, it will help to keep the code clean.


- Add only the sources code file you have newly added or changed.
- Do not use add . (add all) command in git. Only use add filename

 I have seen in your gsoc repo that there are gitignore files, idea
 related files and the target folder is there. These should not be in the
 source code, only the source files you add.

- Commit when you have done some major activity. Do not add commits
always when you make a change.


 On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> May i seperately put the classes to ml and extensions in carbon-core.
> I can put Streaming Extensions to extensions and 
> Algorithms/StreamingLinear
> Regression and StreamingKMeans in ml core. what is the suitable format. I
> will commit my changes today as seperate branch in my forked carbon-ml
> local repo.thank you.
> regards,
> Mahesh.
> p.s: better if you can meet me via hangout.
>



 --
 Pruthuvi Maheshakya Wijewardena
 mahesha...@wso2.com
 +94711228855



>>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
I did what you recommended, but when I add the query the following
error appears:
No extension exist for StreamFunctionExtension{namespace='ml'} in execution
plan "NewExecutionPlan"

My query is as follows:
@Import('LinRegInput:1.0.0')
define stream LinRegInput (salary double, rbi double, walks double,
strikeouts double, errors double);

@Export('LinRegOutput:1.0.0')
define stream LinRegOutput (mse double);

from LinRegInput#ml:streamlinreg(0, 2, 100, 0.0001, 1.0, 0.95, salary,
rbi, walks, strikeouts, errors)
select *
insert into mse;

I have added my files as follows:

org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegression;
org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel;

and added the following line to ml.siddhiext:

streamlinreg=org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor

Then I built carbon-ml and replaced the jar file as you asked, with the name
changed. Any thoughts?
regards,
Mahesh.

On Tue, Jun 14, 2016 at 2:43 PM, Maheshakya Wijewardena  wrote:

> Hi Mahesh,
>
> You don't need to add new p2 repo.
> In the /repository/components/plugins folder, you will find
> org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
> carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
> First rename this jar in the target folder to the jar name in the plugins
> folder then replace (Make sure, otherwise will not work).
> Your updates will be there in the CEP after this.
>
> Best regards.
>
> On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Do i need to add p2 local repos of ML into CEP after i made changes to ml
>> extensions. Or will it be automatically updated. I am trying to debug my
>> extension with the cep.thank you.
>> regards,
>> Mahesh.
>>
>> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Mahesh when you add your work to carbon-ml follow the bellow guidelines,
>>> it will help to keep the code clean.
>>>
>>>
>>>- Add only the sources code file you have newly added or changed.
>>>- Do not use add . (add all) command in git. Only use add filename
>>>
>>> I have seen in your gsoc repo that there are gitignore files, idea
>>> related files and the target folder is there. These should not be in the
>>> source code, only the source files you add.
>>>
>>>- Commit when you have done some major activity. Do not add commits
>>>always when you make a change.
>>>
>>>
>>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 May i seperately put the classes to ml and extensions in carbon-core. I
 can put Streaming Extensions to extensions and Algorithms/StreamingLinear
 Regression and StreamingKMeans in ml core. what is the suitable format. I
 will commit my changes today as seperate branch in my forked carbon-ml
 local repo.thank you.
 regards,
 Mahesh.
 p.s: better if you can meet me via hangout.

>>>
>>>
>>>
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> mahesha...@wso2.com
>>> +94711228855
>>>
>>>
>>>
>>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Maheshakya Wijewardena
Hi Mahesh,

You don't need to add a new p2 repo.
In the /repository/components/plugins folder, you will find
org.wso2.carbon.ml.siddhi.extension_some_version.jar. Replace this with
carbon-ml/components/extensions/org.wso2.carbon.ml.siddhi.extension/target/org.wso2.carbon.ml.siddhi.extension-1.1.2-SNAPSHOT.jar.
First rename the jar in the target folder to the jar name in the plugins
folder, then replace it (make sure to do this, otherwise it will not work).
Your updates will be available in the CEP after this.

Best regards.

On Tue, Jun 14, 2016 at 2:37 PM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> Do i need to add p2 local repos of ML into CEP after i made changes to ml
> extensions. Or will it be automatically updated. I am trying to debug my
> extension with the cep.thank you.
> regards,
> Mahesh.
>
> On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Mahesh when you add your work to carbon-ml follow the bellow guidelines,
>> it will help to keep the code clean.
>>
>>
>>- Add only the sources code file you have newly added or changed.
>>- Do not use add . (add all) command in git. Only use add filename
>>
>> I have seen in your gsoc repo that there are gitignore files, idea
>> related files and the target folder is there. These should not be in the
>> source code, only the source files you add.
>>
>>- Commit when you have done some major activity. Do not add commits
>>always when you make a change.
>>
>>
>> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> May i seperately put the classes to ml and extensions in carbon-core. I
>>> can put Streaming Extensions to extensions and Algorithms/StreamingLinear
>>> Regression and StreamingKMeans in ml core. what is the suitable format. I
>>> will commit my changes today as seperate branch in my forked carbon-ml
>>> local repo.thank you.
>>> regards,
>>> Mahesh.
>>> p.s: better if you can meet me via hangout.
>>>
>>
>>
>>
>> --
>> Pruthuvi Maheshakya Wijewardena
>> mahesha...@wso2.com
>> +94711228855
>>
>>
>>
>


-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
Do I need to add the local p2 repos of ML into the CEP after I make changes to the ml
extensions, or will they be updated automatically? I am trying to debug my
extension with the CEP. Thank you.
regards,
Mahesh.

On Tue, Jun 14, 2016 at 1:57 PM, Maheshakya Wijewardena  wrote:

> Mahesh, when you add your work to carbon-ml, follow the guidelines below;
> they will help to keep the code clean.
>
>
>- Add only the source code files you have newly added or changed.
>- Do not use the add . (add all) command in git. Only use add filename.
>
> I have seen in your gsoc repo that there are gitignore files, IDEA-related
> files, and the target folder. These should not be in the source
> code, only the source files you add.
>
>- Commit when you have done some major activity. Do not add commits
>every time you make a change.
>
>
> On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> May i seperately put the classes to ml and extensions in carbon-core. I
>> can put Streaming Extensions to extensions and Algorithms/StreamingLinear
>> Regression and StreamingKMeans in ml core. what is the suitable format. I
>> will commit my changes today as seperate branch in my forked carbon-ml
>> local repo.thank you.
>> regards,
>> Mahesh.
>> p.s: better if you can meet me via hangout.
>>
>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> mahesha...@wso2.com
> +94711228855
>
>
>


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Maheshakya Wijewardena
Hi Mahesh,

You can add a new folder for streaming algorithms in the siddhi extension.
There, keep the stream processors and the algorithm classes separately.
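For example, a layout along these lines (the names here are only a suggestion):

org.wso2.carbon.ml.siddhi.extension.streaming.StreamingLinearRegressionStreamProcessor  (stream processor)
org.wso2.carbon.ml.siddhi.extension.streaming.algorithm.StreamingLinearModel  (algorithm/model class)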

We can arrange a hangout tomorrow.

Best regards.

On Tue, Jun 14, 2016 at 12:22 PM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> May i seperately put the classes to ml and extensions in carbon-core. I
> can put Streaming Extensions to extensions and Algorithms/StreamingLinear
> Regression and StreamingKMeans in ml core. what is the suitable format. I
> will commit my changes today as seperate branch in my forked carbon-ml
> local repo.thank you.
> regards,
> Mahesh.
> p.s: better if you can meet me via hangout.
>



-- 
Pruthuvi Maheshakya Wijewardena
mahesha...@wso2.com
+94711228855


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-14 Thread Mahesh Dananjaya
Hi Maheshakya,
May I separately put the classes into ml and extensions in carbon-core? I can
put the streaming extensions under extensions and the algorithms (StreamingLinearRegression
and StreamingKMeans) in the ml core. What is the suitable format? I
will commit my changes today as a separate branch in my forked carbon-ml
local repo. Thank you.
regards,
Mahesh.
p.s.: it would be better if you could meet me via hangout.


Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-12 Thread Mahesh Dananjaya
Hi Maheshakya,
OK. These couple of days I have spent on implementing streaming
clustering in an efficient way. I have found a couple of methods; initially
I am developing mini-batch k-means for streaming. I will let you know the
progress within the next couple of days. I have already added a parameter in the
query for the window shift. I will add it to the repo tomorrow morning.
Thank you.
Mahesh.

On 6/12/16, Maheshakya Wijewardena  wrote:
> Hi Mahesh,
>
> Since you have already implemented the streaming algorithms as separate
> siddhi extensions, our next task is to include them in the carbon-ml siddhi
> extensions. Please start that by adding streaming linear regression first.
> You also need to persist models that are trained.
> Refer to method [1] in carbon-ml to see how model persistence is done.
>
> Best regards.
>
> [1]
> https://github.com/wso2/carbon-ml/blob/5211f8b1d662778af832c54fbbcc81fe4aa78e1e/components/ml/org.wso2.carbon.ml.core/src/main/java/org/wso2/carbon/ml/core/impl/MLModelHandler.java#L727
>
> On Sat, Jun 11, 2016 at 10:58 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Regarding your question:
>>
>> my outputData Object[]array is in the format of
>>> [mse,beta0,beta1,betap].But seems to be that cep does not understand
>>> it.
>>
>>
>> Did you create an output stream first for the publisher? You need to
>> create a stream with attributes: mse double, beta1 double, ...
>> and point to that from the publisher.
>>
>>
>>
>> On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> you can find the details of the queries in this ReadMe [1]. i have add
>>> some changes . so previous querirs may not valid.please use these new
>>> queries in the README.
>>> *1.Streaming Linear regression*
>>> from LinRegInputStream#streaming:streaminglr((learnType),
>>> (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
>>> (ci), salary, rbi, walks, strikeouts, errors)
>>> select *
>>>
>>>
>>>
>>>
>>> *insert into regResults; from LinRegInputStream#streaming:streaminglr(0,
>>> 2, 100, 0.0001, 1, 0.95, salary, rbi, walks, strikeouts,
>>> errors)select
>>> *insert into regResults*;
>>>
>>> *2.Streaming KMeans Clustering*
>>> from LinRegInputStream#streaming:streamingkm((learnType),
>>> (batchSize/timeFrame), (numClusters), (numIterations),(alpha), (ci),
>>> salary, rbi, walks, strikeouts, errors)
>>> select *
>>> insert into regResults;
>>>
>>>
>>>
>>> *from
>>> KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)select
>>> *insert into regResults*
>>>
>>>  And i need a help in returning the outputData of my program back to
>>> cep.
>>> therefore currenlt you may not find the stream output in event
>>> publish.but
>>> you can see the output in the console. i want to understand the final
>>> stepd
>>> of putting the output data back to output stream after the batch size is
>>> completed and the algorithms is completed. you may find that following
>>> line
>>> passes an exception. Thats have actually no clue of outputData format
>>> that
>>> need to give for Output stream.
>>>
>>> Object[] outputData = streamingLinearRegression.regress(eventData);
>>>
>>>
>>> if (outputData == null) {
>>> streamEventChunk.remove();
>>> } else {
>>> complexEventPopulater.populateComplexEvent(complexEvent,
>>> outputData);
>>> }
>>>
>>> my outputData Object[]array is in the format of
>>> [mse,beta0,beta1,betap].But seems to be that cep does not understand
>>> it. i do it by looking at the time series stream rpocessor extension at
>>> [2].can you please help me with this.
>>> regards,
>>> Mahesh.
>>>
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
>>> [2]
>>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java
>>>
>>> On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 Great work so far.

 Regarding the queries:

 streamingkm(0, 2,2,20,1,0.95 salary, rbi, walks, strikeouts, errors)


 Can you give me the definitions of the first few entities in the order.
 Also in previous supervised cases (linear regression), what is the
 response
 variable, etc.
 I'll go through the code and give you a feedback.

  After this, we need to me this implementation into carbon-ml siddhi
 extension. Please also do a similar implementation for logistic
 regression
 as well because we need to have a streaming version for classification
 as
 well.

 Best regards.



 On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshkya,
> I have changed the siddhi query for our StreamingKMeansClustering by
> adding 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-11 Thread Maheshakya Wijewardena
Hi Mahesh,

Regarding your question:

> my outputData Object[] array is in the format of
> [mse, beta0, beta1, ..., betap]. But it seems that the CEP does not understand it.


Did you create an output stream first for the publisher? You need to create
a stream with attributes (mse double, beta0 double, beta1 double, ...) and
point to that from the publisher.
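
As a rough sketch, assuming the extension exposes output attributes named mse, beta0, beta1 (extend these to the actual number of coefficients your model returns), the stream and query could look like:

define stream LinRegModelStream (mse double, beta0 double, beta1 double);

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select mse, beta0, beta1
insert into LinRegModelStream;

The event publisher is then pointed at LinRegModelStream.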



On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya 
wrote:

> Hi Maheshakya,
> you can find the details of the queries in this ReadMe [1]. i have add
> some changes . so previous querirs may not valid.please use these new
> queries in the README.
> *1.Streaming Linear regression*
> from LinRegInputStream#streaming:streaminglr((learnType),
> (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
> (ci), salary, rbi, walks, strikeouts, errors)
> select *
>
>
>
>
> *insert into regResults; from LinRegInputStream#streaming:streaminglr(0,
> 2, 100, 0.0001, 1, 0.95, salary, rbi, walks, strikeouts, errors)select
> *insert into regResults*;
>
> *2.Streaming KMeans Clustering*
> from LinRegInputStream#streaming:streamingkm((learnType),
> (batchSize/timeFrame), (numClusters), (numIterations),(alpha), (ci),
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
>
>
> *from
> KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)select
> *insert into regResults*
>
>  And i need a help in returning the outputData of my program back to cep.
> therefore currenlt you may not find the stream output in event publish.but
> you can see the output in the console. i want to understand the final stepd
> of putting the output data back to output stream after the batch size is
> completed and the algorithms is completed. you may find that following line
> passes an exception. Thats have actually no clue of outputData format that
> need to give for Output stream.
>
> Object[] outputData = streamingLinearRegression.regress(eventData);
>
>
> if (outputData == null) {
> streamEventChunk.remove();
> } else {
> complexEventPopulater.populateComplexEvent(complexEvent, outputData);
> }
>
> my outputData Object[]array is in the format of
> [mse,beta0,beta1,betap].But seems to be that cep does not understand
> it. i do it by looking at the time series stream rpocessor extension at
> [2].can you please help me with this.
> regards,
> Mahesh.
>
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
> [2]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java
>
> On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Great work so far.
>>
>> Regarding the queries:
>>
>> streamingkm(0, 2,2,20,1,0.95 salary, rbi, walks, strikeouts, errors)
>>
>>
>> Can you give me the definitions of the first few entities in the order.
>> Also in previous supervised cases (linear regression), what is the response
>> variable, etc.
>> I'll go through the code and give you a feedback.
>>
>>  After this, we need to me this implementation into carbon-ml siddhi
>> extension. Please also do a similar implementation for logistic regression
>> as well because we need to have a streaming version for classification as
>> well.
>>
>> Best regards.
>>
>>
>>
>> On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkya,
>>> I have changed the siddhi query for our StreamingKMeansClustering by
>>> adding Alpha into the picture which we can use to make data horizon (how
>>> quickly a most recent data point becomes a part of the model) and data
>>> obsolescence (how long does it take a past data point to become irrelevant
>>> to the model)in the streaming clustering algorithms.i have added new
>>> changes to repo [1] introducing StreamingKMeansClusteringModel and
>>> StreamingKMeansCLustering classes to project.new siddhi query is as follows.
>>>
>>> from Stream8Input#streaming:streamingkm(0, 2,2,20,1,0.95 salary, rbi,
>>> walks, strikeouts, errors)
>>>
>>> select *
>>> insert into regResults;
>>>
>>> regrads,
>>> Mahesh.
>>>
>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>
>>> On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 As we have discussed the architecture of the project i have already
 developed a couple of essential components for our project. During last
 week i completed the writing cep siddhi extension for our streaming
 algorithms which are developed to learn incrementally with past
 experiences. I have written the siddhi extensions with StreamProcessor
 extension for StreamingLinearRegerssion and StreamingKMeansClustering with
 the relevant parameters to call it as siddhi query. On the other hand i did
 some research on 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-08 Thread Mahesh Dananjaya
Hi Maheshakya,
In the last one, the example query mentioned for streaming linear regression
should be:

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

miniBatchFraction should be given in double format. I wrote it wrong when I
documented it. Thank you.


On Wed, Jun 8, 2016 at 1:48 PM, Mahesh Dananjaya 
wrote:

> Hi Maheshakya,
> you can find the details of the queries in this ReadMe [1]. i have add
> some changes . so previous querirs may not valid.please use these new
> queries in the README.
> *1.Streaming Linear regression*
> from LinRegInputStream#streaming:streaminglr((learnType),
> (batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
> (ci), salary, rbi, walks, strikeouts, errors)
> select *
>
>
>
>
> *insert into regResults; from LinRegInputStream#streaming:streaminglr(0,
> 2, 100, 0.0001, 1, 0.95, salary, rbi, walks, strikeouts, errors)select
> *insert into regResults*;
>
> *2.Streaming KMeans Clustering*
> from LinRegInputStream#streaming:streamingkm((learnType),
> (batchSize/timeFrame), (numClusters), (numIterations),(alpha), (ci),
> salary, rbi, walks, strikeouts, errors)
> select *
> insert into regResults;
>
>
>
> *from
> KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)select
> *insert into regResults*
>
>  And i need a help in returning the outputData of my program back to cep.
> therefore currenlt you may not find the stream output in event publish.but
> you can see the output in the console. i want to understand the final stepd
> of putting the output data back to output stream after the batch size is
> completed and the algorithms is completed. you may find that following line
> passes an exception. Thats have actually no clue of outputData format that
> need to give for Output stream.
>
> Object[] outputData = streamingLinearRegression.regress(eventData);
>
>
> if (outputData == null) {
> streamEventChunk.remove();
> } else {
> complexEventPopulater.populateComplexEvent(complexEvent, outputData);
> }
>
> my outputData Object[]array is in the format of
> [mse,beta0,beta1,betap].But seems to be that cep does not understand
> it. i do it by looking at the time series stream rpocessor extension at
> [2].can you please help me with this.
> regards,
> Mahesh.
>
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
> [2]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java
>
> On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Great work so far.
>>
>> Regarding the queries:
>>
>> streamingkm(0, 2,2,20,1,0.95 salary, rbi, walks, strikeouts, errors)
>>
>>
>> Can you give me the definitions of the first few entities in the order.
>> Also in previous supervised cases (linear regression), what is the response
>> variable, etc.
>> I'll go through the code and give you a feedback.
>>
>>  After this, we need to me this implementation into carbon-ml siddhi
>> extension. Please also do a similar implementation for logistic regression
>> as well because we need to have a streaming version for classification as
>> well.
>>
>> Best regards.
>>
>>
>>
>> On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkya,
>>> I have changed the siddhi query for our StreamingKMeansClustering by
>>> adding Alpha into the picture which we can use to make data horizon (how
>>> quickly a most recent data point becomes a part of the model) and data
>>> obsolescence (how long does it take a past data point to become irrelevant
>>> to the model)in the streaming clustering algorithms.i have added new
>>> changes to repo [1] introducing StreamingKMeansClusteringModel and
>>> StreamingKMeansCLustering classes to project.new siddhi query is as follows.
>>>
>>> from Stream8Input#streaming:streamingkm(0, 2,2,20,1,0.95 salary, rbi,
>>> walks, strikeouts, errors)
>>>
>>> select *
>>> insert into regResults;
>>>
>>> regrads,
>>> Mahesh.
>>>
>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>
>>> On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 As we have discussed the architecture of the project i have already
 developed a couple of essential components for our project. During last
 week i completed the writing cep siddhi extension for our streaming
 algorithms which are developed to learn incrementally with past
 experiences. I have written the siddhi extensions with StreamProcessor
 extension for StreamingLinearRegerssion and StreamingKMeansClustering with
 the relevant parameters to call it as siddhi query. On the other hand i did

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-08 Thread Mahesh Dananjaya
Hi Maheshakya,
You can find the details of the queries in this README [1]. I have added some
changes, so the previous queries may not be valid. Please use these new queries from
the README.
1. Streaming Linear Regression
from LinRegInputStream#streaming:streaminglr((learnType),
(batchSize/timeFrame), (numIterations), (stepSize), (miniBatchFraction),
(ci), salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

For example:

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

2. Streaming KMeans Clustering
from LinRegInputStream#streaming:streamingkm((learnType),
(batchSize/timeFrame), (numClusters), (numIterations), (alpha), (ci),
salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

For example:

from KMeansInputStream#streaming:streamingkm(0,3,0.95,2,10,1,salary,rbi,walks,strikeouts,errors)
select *
insert into regResults;

And I need help in returning the outputData of my program back to the CEP;
therefore, currently you may not find the stream output in the event publisher, but
you can see the output in the console. I want to understand the final steps
of putting the output data back into the output stream once the batch is
complete and the algorithm has finished. You may find that the following line
throws an exception. I actually have no clue of the outputData format that
needs to be given for the output stream.

// Run the streaming regression on the collected event data
// (returns null until a full batch has been processed).
Object[] outputData = streamingLinearRegression.regress(eventData);

if (outputData == null) {
    // No model update for this event: drop it from the output chunk.
    streamEventChunk.remove();
} else {
    // Append the model outputs (mse, beta0, beta1, ...) to the event;
    // their order must match the extra attributes declared by the extension.
    complexEventPopulater.populateComplexEvent(complexEvent, outputData);
}

My outputData Object[] array is in the format of
[mse, beta0, beta1, ..., betap], but it seems that the CEP does not understand
it. I did this by looking at the time series stream processor extension at
[2]. Can you please help me with this?
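
For reference, a sketch of the stream/query pair I would expect this to need, assuming the extension appends attributes named mse, beta0, ..., beta4 after the five input attributes (the same pattern as the timeseries LinearRegressionStreamProcessor in [2]; the number of beta attributes is an assumption and must match the length of outputData):

define stream regResults (salary double, rbi double, walks double, strikeouts double, errors double, mse double, beta0 double, beta1 double, beta2 double, beta3 double, beta4 double);

from LinRegInputStream#streaming:streaminglr(0, 2, 100, 0.0001, 1.0, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;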
regards,
Mahesh.

[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming
[2]
https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/LinearRegressionStreamProcessor.java

On Tue, Jun 7, 2016 at 10:42 PM, Maheshakya Wijewardena  wrote:

> Hi Mahesh,
>
> Great work so far.
>
> Regarding the queries:
>
> streamingkm(0, 2,2,20,1,0.95 salary, rbi, walks, strikeouts, errors)
>
>
> Can you give me the definitions of the first few entities in the order.
> Also in previous supervised cases (linear regression), what is the response
> variable, etc.
> I'll go through the code and give you a feedback.
>
>  After this, we need to me this implementation into carbon-ml siddhi
> extension. Please also do a similar implementation for logistic regression
> as well because we need to have a streaming version for classification as
> well.
>
> Best regards.
>
>
>
> On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkya,
>> I have changed the siddhi query for our StreamingKMeansClustering by
>> adding Alpha into the picture which we can use to make data horizon (how
>> quickly a most recent data point becomes a part of the model) and data
>> obsolescence (how long does it take a past data point to become irrelevant
>> to the model)in the streaming clustering algorithms.i have added new
>> changes to repo [1] introducing StreamingKMeansClusteringModel and
>> StreamingKMeansCLustering classes to project.new siddhi query is as follows.
>>
>> from Stream8Input#streaming:streamingkm(0, 2,2,20,1,0.95 salary, rbi,
>> walks, strikeouts, errors)
>>
>> select *
>> insert into regResults;
>>
>> regrads,
>> Mahesh.
>>
>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>> On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> As we have discussed the architecture of the project i have already
>>> developed a couple of essential components for our project. During last
>>> week i completed the writing cep siddhi extension for our streaming
>>> algorithms which are developed to learn incrementally with past
>>> experiences. I have written the siddhi extensions with StreamProcessor
>>> extension for StreamingLinearRegerssion and StreamingKMeansClustering with
>>> the relevant parameters to call it as siddhi query. On the other hand i did
>>> some research on developing Mini Batch KMeans clustering for our
>>> StreamingKMeansClustering. And also i added the moving window addition to
>>> usual batch processing. And currently i am working on the time based
>>> incremental  re-trainign method for siddhi streams. On the
>>> StreamingClustering side i have already part of th
>>> StreamingKMeansClustering with the mini batch KMeans clustering. All the
>>> work i did were pushed to my repo in github [1]. you can find the
>>> development on gsoc/ directory.
>>>  And also as the ml team and supun was asked, i have did some timing and
>>> performance analysis for 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-07 Thread Maheshakya Wijewardena
Hi Mahesh,

Great work so far.

Regarding the queries:

streamingkm(0, 2,2,20,1,0.95 salary, rbi, walks, strikeouts, errors)


Can you give me the definitions of the first few entities, in order?
Also, in the previous supervised cases (linear regression), what is the response
variable, etc.?
I'll go through the code and give you feedback.

 After this, we need to move this implementation into the carbon-ml siddhi
extension. Please also do a similar implementation for logistic regression,
because we need to have a streaming version for classification as
well.

Best regards.



On Tue, Jun 7, 2016 at 5:50 PM, Mahesh Dananjaya 
wrote:

> Hi Maheshkya,
> I have changed the siddhi query for our StreamingKMeansClustering by
> adding Alpha into the picture which we can use to make data horizon (how
> quickly a most recent data point becomes a part of the model) and data
> obsolescence (how long does it take a past data point to become irrelevant
> to the model)in the streaming clustering algorithms.i have added new
> changes to repo [1] introducing StreamingKMeansClusteringModel and
> StreamingKMeansCLustering classes to project.new siddhi query is as follows.
>
> from Stream8Input#streaming:streamingkm(0, 2,2,20,1,0.95 salary, rbi,
> walks, strikeouts, errors)
>
> select *
> insert into regResults;
>
> regrads,
> Mahesh.
>
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>
> On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> As we have discussed the architecture of the project i have already
>> developed a couple of essential components for our project. During last
>> week i completed the writing cep siddhi extension for our streaming
>> algorithms which are developed to learn incrementally with past
>> experiences. I have written the siddhi extensions with StreamProcessor
>> extension for StreamingLinearRegerssion and StreamingKMeansClustering with
>> the relevant parameters to call it as siddhi query. On the other hand i did
>> some research on developing Mini Batch KMeans clustering for our
>> StreamingKMeansClustering. And also i added the moving window addition to
>> usual batch processing. And currently i am working on the time based
>> incremental  re-trainign method for siddhi streams. On the
>> StreamingClustering side i have already part of th
>> StreamingKMeansClustering with the mini batch KMeans clustering. All the
>> work i did were pushed to my repo in github [1]. you can find the
>> development on gsoc/ directory.
>>  And also as the ml team and supun was asked, i have did some timing and
>> performance analysis for our SGD (Stochastic Gradient Descent) algorithms
>> for LinearRegression. Those results also add to my repo in [2]. Now i am
>> developing the rest for our purpose and trying to looked into other
>> researches on predictive analysis for online big data. Ans also doing some
>> work related to mini batch KMEans Clustering. And also i have been working
>> on the performance analysis, accuracy and basic comparison between mini
>> batch algorithms and moving window algorithms for streaming and periodic
>> re-training of ML model. thank you.
>> BR,
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/GSOC2016
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>
>>
>> On Sat, Jun 4, 2016 at 8:50 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkya,
>>> If you want to run it please use following queries.
>>> 1. StreamingLInearRegression
>>>
>>> from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi,
>>> walks, strikeouts, errors)
>>>
>>> select *
>>>
>>> insert into regResults;
>>>
>>> from Stream8Input#streaming:streamingkm(0, 2, 0.95,2,20, salary, rbi,
>>> walks, strikeouts, errors)
>>>
>>> select *
>>> insert into regResults;
>>>
>>> in both case the first parameter let you to decide which learning methos
>>> you want, moving window, batch processing or time based model learning.
>>> BR,
>>> Mahesh.
>>>
>>> On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshkaya,
 I have added the moving window method and update the previos
 StreamingLinearRegression [1] which only performed batch processing with
 streaming data. and also i added the StreamingKMeansClustering [1] for our
 purposes and debugged them.thank you.
 regards,
 Mahesh.
 [1]
 https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming

 On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga  wrote:

> Thanks Mahesh! The graphs look promising! :)
>
> So by looking at graph, LR with SGD can train  a model within 60 secs
> (6*10^10 nano sec), using about 900,000 data points . Means, this online
> training can handle 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-07 Thread Mahesh Dananjaya
Hi Maheshakya,
I have changed the siddhi query for our StreamingKMeansClustering by adding
alpha into the picture, which we can use to control the data horizon (how quickly the
most recent data point becomes a part of the model) and data obsolescence
(how long it takes a past data point to become irrelevant to the
model) in the streaming clustering algorithms. I have added the new changes to the
repo [1], introducing the StreamingKMeansClusteringModel and
StreamingKMeansClustering classes to the project. The new siddhi query is as follows:

from Stream8Input#streaming:streamingkm(0, 2, 2, 20, 1, 0.95, salary, rbi,
walks, strikeouts, errors)
select *
insert into regResults;
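
Here the parameter order is assumed to follow the README list ((learnType), (batchSize/timeFrame), (numClusters), (numIterations), (alpha), (ci)); aligning the two forms for clarity:

from Stream8Input#streaming:streamingkm((learnType), (batchSize/timeFrame), (numClusters), (numIterations), (alpha), (ci), salary, rbi, walks, strikeouts, errors)
from Stream8Input#streaming:streamingkm(0,           2,                     2,             20,              1,       0.95, salary, rbi, walks, strikeouts, errors)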

regards,
Mahesh.

[1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc

On Mon, Jun 6, 2016 at 6:31 PM, Mahesh Dananjaya 
wrote:

> Hi Maheshakya,
> As we have discussed the architecture of the project i have already
> developed a couple of essential components for our project. During last
> week i completed the writing cep siddhi extension for our streaming
> algorithms which are developed to learn incrementally with past
> experiences. I have written the siddhi extensions with StreamProcessor
> extension for StreamingLinearRegerssion and StreamingKMeansClustering with
> the relevant parameters to call it as siddhi query. On the other hand i did
> some research on developing Mini Batch KMeans clustering for our
> StreamingKMeansClustering. And also i added the moving window addition to
> usual batch processing. And currently i am working on the time based
> incremental  re-trainign method for siddhi streams. On the
> StreamingClustering side i have already part of th
> StreamingKMeansClustering with the mini batch KMeans clustering. All the
> work i did were pushed to my repo in github [1]. you can find the
> development on gsoc/ directory.
>  And also as the ml team and supun was asked, i have did some timing and
> performance analysis for our SGD (Stochastic Gradient Descent) algorithms
> for LinearRegression. Those results also add to my repo in [2]. Now i am
> developing the rest for our purpose and trying to looked into other
> researches on predictive analysis for online big data. Ans also doing some
> work related to mini batch KMEans Clustering. And also i have been working
> on the performance analysis, accuracy and basic comparison between mini
> batch algorithms and moving window algorithms for streaming and periodic
> re-training of ML model. thank you.
> BR,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016
> [2]
> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>
>
> On Sat, Jun 4, 2016 at 8:50 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkya,
>> If you want to run it please use following queries.
>> 1. StreamingLInearRegression
>>
>> from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi,
>> walks, strikeouts, errors)
>>
>> select *
>>
>> insert into regResults;
>>
>> from Stream8Input#streaming:streamingkm(0, 2, 0.95,2,20, salary, rbi,
>> walks, strikeouts, errors)
>>
>> select *
>> insert into regResults;
>>
>> in both case the first parameter let you to decide which learning methos
>> you want, moving window, batch processing or time based model learning.
>> BR,
>> Mahesh.
>>
>> On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkaya,
>>> I have added the moving window method and update the previos
>>> StreamingLinearRegression [1] which only performed batch processing with
>>> streaming data. and also i added the StreamingKMeansClustering [1] for our
>>> purposes and debugged them.thank you.
>>> regards,
>>> Mahesh.
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
>>>
>>> On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga  wrote:
>>>
 Thanks Mahesh! The graphs look promising! :)

 So by looking at graph, LR with SGD can train  a model within 60 secs
 (6*10^10 nano sec), using about 900,000 data points . Means, this online
 training can handle events/data points coming at rate of 15,000 per second
 (or more) , if the batch size is set to 900,000 (or less) or window size is
 set to 60 secs (or less). This is great IMO!

 On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> As you requested i can change other parameters as well such as feature
> size(p). Initially i did it with p=3;sure thing. Anyway you can see and 
> run
> the code if you want. source is at [1]. the test timing is called with
> random data as you requested if you set args[0] to 1. And you can find the
> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
> BR,
> Mahesh.
> [1]
> 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-06 Thread Mahesh Dananjaya
Hi Maheshakya,
As we discussed in the project architecture, I have already
developed a couple of essential components for our project. During last
week I completed writing the CEP siddhi extensions for our streaming
algorithms, which are developed to learn incrementally from past
experience. I have written the siddhi extensions with the StreamProcessor
extension for StreamingLinearRegression and StreamingKMeansClustering, with
the relevant parameters to call them as siddhi queries. On the other hand, I did
some research on developing mini-batch KMeans clustering for our
StreamingKMeansClustering. I also added the moving-window addition to the
usual batch processing. Currently I am working on the time-based
incremental re-training method for siddhi streams. On the
StreamingClustering side, I have already completed part of the
StreamingKMeansClustering with the mini-batch KMeans clustering. All the
work I did has been pushed to my repo on github [1]; you can find the
development in the gsoc/ directory.
 Also, as the ML team and Supun asked, I have done some timing and
performance analysis for our SGD (Stochastic Gradient Descent) algorithms
for LinearRegression. Those results were also added to my repo [2]. Now I am
developing the rest for our purpose and looking into other
research on predictive analysis for online big data, and doing some
work related to mini-batch KMeans clustering. I have also been working
on the performance analysis, accuracy, and a basic comparison between mini-batch
algorithms and moving-window algorithms for streaming and periodic
re-training of the ML model. Thank you.
BR,
Mahesh.
[1] https://github.com/dananjayamahesh/GSOC2016
[2]
https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg


On Sat, Jun 4, 2016 at 8:50 PM, Mahesh Dananjaya 
wrote:

> Hi Maheshkya,
> If you want to run it please use following queries.
> 1. StreamingLInearRegression
>
> from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi,
> walks, strikeouts, errors)
>
> select *
>
> insert into regResults;
>
> from Stream8Input#streaming:streamingkm(0, 2, 0.95,2,20, salary, rbi,
> walks, strikeouts, errors)
>
> select *
> insert into regResults;
>
> in both case the first parameter let you to decide which learning methos
> you want, moving window, batch processing or time based model learning.
> BR,
> Mahesh.
>
> On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkaya,
>> I have added the moving window method and update the previos
>> StreamingLinearRegression [1] which only performed batch processing with
>> streaming data. and also i added the StreamingKMeansClustering [1] for our
>> purposes and debugged them.thank you.
>> regards,
>> Mahesh.
>> [1]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
>>
>> On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga  wrote:
>>
>>> Thanks Mahesh! The graphs look promising! :)
>>>
>>> So by looking at graph, LR with SGD can train  a model within 60 secs
>>> (6*10^10 nano sec), using about 900,000 data points . Means, this online
>>> training can handle events/data points coming at rate of 15,000 per second
>>> (or more) , if the batch size is set to 900,000 (or less) or window size is
>>> set to 60 secs (or less). This is great IMO!
>>>
>>> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 As you requested i can change other parameters as well such as feature
 size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
 the code if you want. source is at [1]. the test timing is called with
 random data as you requested if you set args[0] to 1. And you can find the
 extension and streaming algorithms in gsoc/ directiry[2]. thank you.
 BR,
 Mahesh.
 [1]
 https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
 [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc

 On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi supun,
> Though i pushed it yesterday, there was some problems with the
> network. now you can see them in the repo location [1].I added some Matlab
> plot you can see the patter there.you can use ml also. Ok sure thing. I 
> can
> prepare a report or else blog if you want. files are as follows. The y 
> axis
> is in ns and x axis is in batch size. And also i added two pplots as
> jpegs[2], so you can easily compare.
> lr_timing_1000.txt -> batch size incremented by 1000
> lr_timing_1.txt -> batch size incremented by 1
> lr_timing_power10.txt -> batch size incremented by power of 10
>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-04 Thread Mahesh Dananjaya
Hi Maheshakya,
If you want to run it, please use the following queries.
1. StreamingLinearRegression

from Stream4InputStream#streaming:streaminglr(0, 2, 0.95, salary, rbi,
walks, strikeouts, errors)
select *
insert into regResults;

2. StreamingKMeansClustering

from Stream8Input#streaming:streamingkm(0, 2, 0.95,2,20, salary, rbi,
walks, strikeouts, errors)
select *
insert into regResults;

In both cases, the first parameter lets you decide which learning method
you want: moving window, batch processing, or time-based model learning.
BR,
Mahesh.

On Sat, Jun 4, 2016 at 8:45 PM, Mahesh Dananjaya 
wrote:

> Hi Maheshkaya,
> I have added the moving window method and update the previos
> StreamingLinearRegression [1] which only performed batch processing with
> streaming data. and also i added the StreamingKMeansClustering [1] for our
> purposes and debugged them.thank you.
> regards,
> Mahesh.
> [1]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming
>
> On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga  wrote:
>
>> Thanks Mahesh! The graphs look promising! :)
>>
>> So by looking at graph, LR with SGD can train  a model within 60 secs
>> (6*10^10 nano sec), using about 900,000 data points . Means, this online
>> training can handle events/data points coming at rate of 15,000 per second
>> (or more) , if the batch size is set to 900,000 (or less) or window size is
>> set to 60 secs (or less). This is great IMO!
>>
>> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> As you requested i can change other parameters as well such as feature
>>> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
>>> the code if you want. source is at [1]. the test timing is called with
>>> random data as you requested if you set args[0] to 1. And you can find the
>>> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
>>> BR,
>>> Mahesh.
>>> [1]
>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
>>> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>
>>> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi supun,
 Though i pushed it yesterday, there was some problems with the network.
 now you can see them in the repo location [1].I added some Matlab plot you
 can see the patter there.you can use ml also. Ok sure thing. I can prepare
 a report or else blog if you want. files are as follows. The y axis is in
 ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
 you can easily compare.
 lr_timing_1000.txt -> batch size incremented by 1000
 lr_timing_1.txt -> batch size incremented by 1
 lr_timing_power10.txt -> batch size incremented by power of 10

 In here independent variable is only tha batch size.If you want i can
 send you making other parameters such as step size, number of iteration,
 feature vector size as independent variables. please let me know if you
 want further info. thank you.
 regards,
 Mahesh.


 [1
 ]https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
 [2]
 https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg

 On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga  wrote:

> Hi Mahesh,
>
> I have added those timing reports to my repo [1].
>
> Whats the file name? :)
>
> Btw, can you compile simple doc (gdoc) with the above results, and
> bring everything to one place? That way it is easy to compare, and keep
> track.
>
> Thanks,
> Supun
>
> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkya,
>> I have added those timing reports to my repo [1].please have a look
>> at. three files are there. one is using incremet as 1000 for batch sizes
>> (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
>> upto 1 million in both scenarios.you can see the reports and figures in 
>> the
>> location [2] in the repo. i also added the streaminglinearregression
>> classes in the repo gsoc folder.thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>
>> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Thank you for the update. I will look into your implementation.
>>>
>>> And i will be able to send you the 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-04 Thread Mahesh Dananjaya
Hi Maheshakya,
I have added the moving-window method and updated the previous
StreamingLinearRegression [1], which only performed batch processing with
streaming data. I also added the StreamingKMeansClustering [1] for our
purposes and debugged them. Thank you.
regards,
Mahesh.
[1]
https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc/siddhi/extension/streaming/src/main/java/org/gsoc/siddhi/extension/streaming

On Sat, Jun 4, 2016 at 5:58 PM, Supun Sethunga  wrote:

> Thanks Mahesh! The graphs look promising! :)
>
> So by looking at graph, LR with SGD can train  a model within 60 secs
> (6*10^10 nano sec), using about 900,000 data points . Means, this online
> training can handle events/data points coming at rate of 15,000 per second
> (or more) , if the batch size is set to 900,000 (or less) or window size is
> set to 60 secs (or less). This is great IMO!
>
> On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> As you requested i can change other parameters as well such as feature
>> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
>> the code if you want. source is at [1]. the test timing is called with
>> random data as you requested if you set args[0] to 1. And you can find the
>> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
>> BR,
>> Mahesh.
>> [1]
>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
>> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi supun,
>>> Though i pushed it yesterday, there was some problems with the network.
>>> now you can see them in the repo location [1].I added some Matlab plot you
>>> can see the patter there.you can use ml also. Ok sure thing. I can prepare
>>> a report or else blog if you want. files are as follows. The y axis is in
>>> ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
>>> you can easily compare.
>>> lr_timing_1000.txt -> batch size incremented by 1000
>>> lr_timing_1.txt -> batch size incremented by 1
>>> lr_timing_power10.txt -> batch size incremented by power of 10
>>>
>>> In here independent variable is only tha batch size.If you want i can
>>> send you making other parameters such as step size, number of iteration,
>>> feature vector size as independent variables. please let me know if you
>>> want further info. thank you.
>>> regards,
>>> Mahesh.
>>>
>>>
>>> [1
>>> ]https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>> [2]
>>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>>
>>> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga  wrote:
>>>
 Hi Mahesh,

 I have added those timing reports to my repo [1].

 Whats the file name? :)

 Btw, can you compile simple doc (gdoc) with the above results, and
 bring everything to one place? That way it is easy to compare, and keep
 track.

 Thanks,
 Supun

 On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshkya,
> I have added those timing reports to my repo [1].please have a look
> at. three files are there. one is using incremet as 1000 for batch sizes
> (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
> upto 1 million in both scenarios.you can see the reports and figures in 
> the
> location [2] in the repo. i also added the streaminglinearregression
> classes in the repo gsoc folder.thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
> [2]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>
> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Thank you for the update. I will look into your implementation.
>>
>> And i will be able to send you the timing/performances analysis
>>> report tomorrow for the SGD functions
>>>
>>
>> Great. Sent those asap so that we can proceed.
>>
>> Best regards.
>>
>> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>>
>>> Hi maheshakay,
>>> I have implemented the linear regression with cep siddhi event
>>> stream with  taking batch sizes as parameters from the cep. Now we can
>>> trying the moving window method to. Before that i think i should get 
>>> your
>>> opinion on data structures to save the streaming data.please check my 
>>> repo
>>> [1]  /gsoc/ folder there you can find all new things i add.. there in 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-04 Thread Supun Sethunga
Thanks Mahesh! The graphs look promising! :)

So, looking at the graph, LR with SGD can train a model within 60 secs
(6*10^10 ns) using about 900,000 data points. That means this online
training can handle events/data points arriving at a rate of 15,000 per second
(or more), provided the batch size is set to 900,000 (or fewer) or the window size is
set to 60 secs (or less). This is great IMO!

On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> As you requested i can change other parameters as well such as feature
> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
> the code if you want. source is at [1]. the test timing is called with
> random data as you requested if you set args[0] to 1. And you can find the
> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
> BR,
> Mahesh.
> [1]
> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>
> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi supun,
>> Though i pushed it yesterday, there was some problems with the network.
>> now you can see them in the repo location [1].I added some Matlab plot you
>> can see the patter there.you can use ml also. Ok sure thing. I can prepare
>> a report or else blog if you want. files are as follows. The y axis is in
>> ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
>> you can easily compare.
>> lr_timing_1000.txt -> batch size incremented by 1000
>> lr_timing_1.txt -> batch size incremented by 1
>> lr_timing_power10.txt -> batch size incremented by power of 10
>>
>> In here independent variable is only tha batch size.If you want i can
>> send you making other parameters such as step size, number of iteration,
>> feature vector size as independent variables. please let me know if you
>> want further info. thank you.
>> regards,
>> Mahesh.
>>
>>
>> [1
>> ]https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>
>> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga  wrote:
>>
>>> Hi Mahesh,
>>>
>>> I have added those timing reports to my repo [1].
>>>
>>> Whats the file name? :)
>>>
>>> Btw, can you compile simple doc (gdoc) with the above results, and bring
>>> everything to one place? That way it is easy to compare, and keep track.
>>>
>>> Thanks,
>>> Supun
>>>
>>> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshkya,
 I have added those timing reports to my repo [1].please have a look at.
 three files are there. one is using incremet as 1000 for batch sizes
 (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
 upto 1 million in both scenarios.you can see the reports and figures in the
 location [2] in the repo. i also added the streaminglinearregression
 classes in the repo gsoc folder.thank you.
 regards,
 Mahesh.
 [1]https://github.com/dananjayamahesh/GSOC2016
 [2]
 https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output

 On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> Thank you for the update. I will look into your implementation.
>
> And i will be able to send you the timing/performances analysis report
>> tomorrow for the SGD functions
>>
>
> Great. Sent those asap so that we can proceed.
>
> Best regards.
>
> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> Hi maheshakay,
>> I have implemented the linear regression with cep siddhi event stream
>> with  taking batch sizes as parameters from the cep. Now we can trying 
>> the
>> moving window method to. Before that i think i should get your opinion on
>> data structures to save the streaming data.please check my repo [1]  
>> /gsoc/
>> folder there you can find all new things i add.. there in the extension
>> folder you can find those extension. And i will be able to send you the
>> timing/performances analysis report tomorrow for the SGD functions. thank
>> you.
>> regards,
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>>
>> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi maheshkaya,
>>> i have written some siddhi extension and trying to develop a one for
>>> my one. In time series example in the [1], can you please explain me the
>>> input format and query 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-04 Thread Mahesh Dananjaya
Hi Maheshakya,
I have looked into the Spark Streaming fundamentals and k-means clustering
in order to develop streaming k-means clustering for stream data; those can be
found at [1] and [2]. I will commit new changes to my repo today, including
the basic implementation of streaming k-means clustering. Thank you.
regards,
Mahesh.
[1] http://spark.apache.org/docs/latest/streaming-programming-guide.html
[2] http://spark.apache.org/docs/latest/mllib-clustering.html
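For reference, a minimal sketch of the per-batch clustering step, assuming the Spark MLlib 1.x Java API (this is not the repo's StreamingKMeansClustering class; the class name and data below are made up): cluster one mini-batch of buffered events with k-means and read back the centers, which the extension could emit downstream or use to seed the next batch.

import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.clustering.KMeans;
import org.apache.spark.mllib.clustering.KMeansModel;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

public class KMeansBatchSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("kmeans-batch-sketch").setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // a tiny mini-batch standing in for events buffered from the Siddhi stream
        List<Vector> batch = Arrays.asList(
                Vectors.dense(0.0, 0.1), Vectors.dense(0.2, 0.0),
                Vectors.dense(9.0, 9.1), Vectors.dense(9.2, 8.9));
        JavaRDD<Vector> data = sc.parallelize(batch);

        int k = 2;
        int maxIterations = 20;
        KMeansModel model = KMeans.train(data.rdd(), k, maxIterations);

        // the centers (and cost) are what the extension would emit downstream;
        // they could also be carried over to initialize the next mini-batch
        for (Vector center : model.clusterCenters()) {
            System.out.println("center: " + center);
        }
        System.out.println("WSSSE: " + model.computeCost(data.rdd()));
        sc.stop();
    }
}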

On Sat, Jun 4, 2016 at 10:51 AM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> As you requested i can change other parameters as well such as feature
> size(p). Initially i did it with p=3;sure thing. Anyway you can see and run
> the code if you want. source is at [1]. the test timing is called with
> random data as you requested if you set args[0] to 1. And you can find the
> extension and streaming algorithms in gsoc/ directiry[2]. thank you.
> BR,
> Mahesh.
> [1]
> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
> [2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>
> On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi supun,
>> Though i pushed it yesterday, there was some problems with the network.
>> now you can see them in the repo location [1].I added some Matlab plot you
>> can see the patter there.you can use ml also. Ok sure thing. I can prepare
>> a report or else blog if you want. files are as follows. The y axis is in
>> ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
>> you can easily compare.
>> lr_timing_1000.txt -> batch size incremented by 1000
>> lr_timing_1.txt -> batch size incremented by 1
>> lr_timing_power10.txt -> batch size incremented by power of 10
>>
>> In here independent variable is only tha batch size.If you want i can
>> send you making other parameters such as step size, number of iteration,
>> feature vector size as independent variables. please let me know if you
>> want further info. thank you.
>> regards,
>> Mahesh.
>>
>>
>> [1
>> ]https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>>
>> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga  wrote:
>>
>>> Hi Mahesh,
>>>
>>> I have added those timing reports to my repo [1].
>>>
>>> Whats the file name? :)
>>>
>>> Btw, can you compile simple doc (gdoc) with the above results, and bring
>>> everything to one place? That way it is easy to compare, and keep track.
>>>
>>> Thanks,
>>> Supun
>>>
>>> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshkya,
 I have added those timing reports to my repo [1].please have a look at.
 three files are there. one is using incremet as 1000 for batch sizes
 (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
 upto 1 million in both scenarios.you can see the reports and figures in the
 location [2] in the repo. i also added the streaminglinearregression
 classes in the repo gsoc folder.thank you.
 regards,
 Mahesh.
 [1]https://github.com/dananjayamahesh/GSOC2016
 [2]
 https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output

 On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> Thank you for the update. I will look into your implementation.
>
> And i will be able to send you the timing/performances analysis report
>> tomorrow for the SGD functions
>>
>
> Great. Sent those asap so that we can proceed.
>
> Best regards.
>
> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> Hi maheshakay,
>> I have implemented the linear regression with cep siddhi event stream
>> with  taking batch sizes as parameters from the cep. Now we can trying 
>> the
>> moving window method to. Before that i think i should get your opinion on
>> data structures to save the streaming data.please check my repo [1]  
>> /gsoc/
>> folder there you can find all new things i add.. there in the extension
>> folder you can find those extension. And i will be able to send you the
>> timing/performances analysis report tomorrow for the SGD functions. thank
>> you.
>> regards,
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>>
>> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi maheshkaya,
>>> i have written some siddhi extension and trying to develop a one for
>>> my one. In time series example in 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-03 Thread Mahesh Dananjaya
Hi Maheshakya,
As you requested, I can vary other parameters as well, such as the feature
size (p); initially I ran it with p = 3. Anyway, you can see and run
the code if you want; the source is at [1]. The timing test runs on
random data, as you requested, if you set args[0] to 1 (a rough sketch of the
harness is shown after the links below). You can find the
extension and the streaming algorithms in the gsoc/ directory [2]. Thank you.
BR,
Mahesh.
[1]
https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/src/main/java/org/sparkexample/StreamingLinearRegression.java
[2] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
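A rough sketch of such a timing harness, assuming the Spark MLlib 1.x Java API (this is not the repo's exact code; the class name, data generation and parameter values are illustrative): generate random LabeledPoints with p features and time LinearRegressionWithSGD over increasing batch sizes, which is what the lr_timing_* reports plot (y-axis in ns, x-axis batch size).

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.regression.LinearRegressionModel;
import org.apache.spark.mllib.regression.LinearRegressionWithSGD;

public class LrTimingSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("lr-timing-sketch").setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        Random rnd = new Random(42);
        int p = 3;                 // feature vector size, as in the initial runs
        int numIterations = 100;
        double stepSize = 0.0001;

        for (int batchSize = 1000; batchSize <= 10000; batchSize += 1000) {
            // build a random mini-batch with a known linear relationship plus noise
            List<LabeledPoint> batch = new ArrayList<>(batchSize);
            for (int i = 0; i < batchSize; i++) {
                double[] x = new double[p];
                for (int j = 0; j < p; j++) {
                    x[j] = rnd.nextDouble();
                }
                double label = 2.0 * x[0] - x[1] + 0.5 * x[2] + 0.01 * rnd.nextGaussian();
                batch.add(new LabeledPoint(label, Vectors.dense(x)));
            }
            JavaRDD<LabeledPoint> rdd = sc.parallelize(batch).cache();

            long start = System.nanoTime();
            LinearRegressionModel model =
                    LinearRegressionWithSGD.train(rdd.rdd(), numIterations, stepSize);
            long elapsed = System.nanoTime() - start;
            System.out.println(batchSize + "\t" + elapsed + "\t" + model.weights());
        }
        sc.stop();
    }
}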

On Sat, Jun 4, 2016 at 10:39 AM, Mahesh Dananjaya  wrote:

> Hi supun,
> Though i pushed it yesterday, there was some problems with the network.
> now you can see them in the repo location [1].I added some Matlab plot you
> can see the patter there.you can use ml also. Ok sure thing. I can prepare
> a report or else blog if you want. files are as follows. The y axis is in
> ns and x axis is in batch size. And also i added two pplots as jpegs[2], so
> you can easily compare.
> lr_timing_1000.txt -> batch size incremented by 1000
> lr_timing_1.txt -> batch size incremented by 1
> lr_timing_power10.txt -> batch size incremented by power of 10
>
> In here independent variable is only tha batch size.If you want i can send
> you making other parameters such as step size, number of iteration, feature
> vector size as independent variables. please let me know if you want
> further info. thank you.
> regards,
> Mahesh.
>
>
> [1
> ]https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
> [2]
> https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg
>
> On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga  wrote:
>
>> Hi Mahesh,
>>
>> I have added those timing reports to my repo [1].
>>
>> Whats the file name? :)
>>
>> Btw, can you compile simple doc (gdoc) with the above results, and bring
>> everything to one place? That way it is easy to compare, and keep track.
>>
>> Thanks,
>> Supun
>>
>> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkya,
>>> I have added those timing reports to my repo [1].please have a look at.
>>> three files are there. one is using incremet as 1000 for batch sizes
>>> (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
>>> upto 1 million in both scenarios.you can see the reports and figures in the
>>> location [2] in the repo. i also added the streaminglinearregression
>>> classes in the repo gsoc folder.thank you.
>>> regards,
>>> Mahesh.
>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>> [2]
>>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>>
>>> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 Thank you for the update. I will look into your implementation.

 And i will be able to send you the timing/performances analysis report
> tomorrow for the SGD functions
>

 Great. Sent those asap so that we can proceed.

 Best regards.

 On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

>
> Hi maheshakay,
> I have implemented the linear regression with cep siddhi event stream
> with  taking batch sizes as parameters from the cep. Now we can trying the
> moving window method to. Before that i think i should get your opinion on
> data structures to save the streaming data.please check my repo [1]  
> /gsoc/
> folder there you can find all new things i add.. there in the extension
> folder you can find those extension. And i will be able to send you the
> timing/performances analysis report tomorrow for the SGD functions. thank
> you.
> regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>
>
> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshkaya,
>> i have written some siddhi extension and trying to develop a one for
>> my one. In time series example in the [1], can you please explain me the
>> input format and query lines in that example for my understanding.
>>
>> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi,
>> walks, strikeouts, errors)
>> select *
>> insert into regResults;
>>
>> i just want to knwo how i give a set of data into this extension and
>> what is baseballData. Is it input stream as usual.or any data file?how 
>> can
>> i find that data set to create dummy input stream like baseballData?
>>
>> thank you.
>> regards,
>> Mahesh.
>> [1]
>> 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-03 Thread Mahesh Dananjaya
Hi Supun,
Although I pushed it yesterday, there were some problems with the network; now
you can see the files in the repo location [1]. I added some MATLAB plots so you can
see the pattern there (you can use ML as well). Sure thing, I can prepare a
report or a blog post if you want. The files are as follows; the y-axis is in ns
and the x-axis is the batch size. I also added two plots as JPEGs [2], so
you can compare them easily.
lr_timing_1000.txt -> batch size incremented by 1000
lr_timing_1.txt -> batch size incremented by 1
lr_timing_power10.txt -> batch size incremented by powers of 10

Here the only independent variable is the batch size. If you want, I can also
send results with other parameters, such as step size, number of iterations, and
feature vector size, as independent variables. Please let me know if you need
further info. Thank you.
regards,
Mahesh.


[1
]https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
[2]
https://github.com/dananjayamahesh/GSOC2016/blob/master/spark-examples/first-example/output/lr_timing_1.jpg

On Sat, Jun 4, 2016 at 9:58 AM, Supun Sethunga  wrote:

> Hi Mahesh,
>
> I have added those timing reports to my repo [1].
>
> Whats the file name? :)
>
> Btw, can you compile simple doc (gdoc) with the above results, and bring
> everything to one place? That way it is easy to compare, and keep track.
>
> Thanks,
> Supun
>
> On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkya,
>> I have added those timing reports to my repo [1].please have a look at.
>> three files are there. one is using incremet as 1000 for batch sizes
>> (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
>> upto 1 million in both scenarios.you can see the reports and figures in the
>> location [2] in the repo. i also added the streaminglinearregression
>> classes in the repo gsoc folder.thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>> [2]
>> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>>
>> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Thank you for the update. I will look into your implementation.
>>>
>>> And i will be able to send you the timing/performances analysis report
 tomorrow for the SGD functions

>>>
>>> Great. Sent those asap so that we can proceed.
>>>
>>> Best regards.
>>>
>>> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>

 Hi maheshakay,
 I have implemented the linear regression with cep siddhi event stream
 with  taking batch sizes as parameters from the cep. Now we can trying the
 moving window method to. Before that i think i should get your opinion on
 data structures to save the streaming data.please check my repo [1]  /gsoc/
 folder there you can find all new things i add.. there in the extension
 folder you can find those extension. And i will be able to send you the
 timing/performances analysis report tomorrow for the SGD functions. thank
 you.
 regards,
 Mahesh.
 [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc


 On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi maheshkaya,
> i have written some siddhi extension and trying to develop a one for
> my one. In time series example in the [1], can you please explain me the
> input format and query lines in that example for my understanding.
>
> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi,
> walks, strikeouts, errors)
> select *
> insert into regResults;
>
> i just want to knwo how i give a set of data into this extension and
> what is baseballData. Is it input stream as usual.or any data file?how can
> i find that data set to create dummy input stream like baseballData?
>
> thank you.
> regards,
> Mahesh.
> [1]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>
> On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> today i got the siddhi and debug the math extention. then did some
>> changes and check. Now i am trying to write same kind of extension in my
>> code base. so i add dependencies and it was built fine. Now i am trying 
>> to
>> debug my extension and i did the same thing as i did in previous case. 
>> Cep
>> is sending data, bu my extension is not firing in relevant break point.
>> 1. So how can i debug the siddhi extension in my new extension.(you
>> can see it in my example repoo)
>>
>> I think if i do it correctly we can built the extension for our
>> purpose. And i will send the relevant timing report of SGD algorithms 
>> very
>> 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-03 Thread Supun Sethunga
Hi Mahesh,

> I have added those timing reports to my repo [1].

What's the file name? :)

Btw, can you compile a simple doc (gdoc) with the above results and bring
everything into one place? That way it is easy to compare and keep track.

Thanks,
Supun

On Fri, Jun 3, 2016 at 7:23 PM, Mahesh Dananjaya 
wrote:

> Hi Maheshkya,
> I have added those timing reports to my repo [1].please have a look at.
> three files are there. one is using incremet as 1000 for batch sizes
> (lr_timing_1000). Otherone is using incremet by 1 (lr_timing_1)
> upto 1 million in both scenarios.you can see the reports and figures in the
> location [2] in the repo. i also added the streaminglinearregression
> classes in the repo gsoc folder.thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
> [2]
> https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output
>
> On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Thank you for the update. I will look into your implementation.
>>
>> And i will be able to send you the timing/performances analysis report
>>> tomorrow for the SGD functions
>>>
>>
>> Great. Sent those asap so that we can proceed.
>>
>> Best regards.
>>
>> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>>
>>> Hi maheshakay,
>>> I have implemented the linear regression with cep siddhi event stream
>>> with  taking batch sizes as parameters from the cep. Now we can trying the
>>> moving window method to. Before that i think i should get your opinion on
>>> data structures to save the streaming data.please check my repo [1]  /gsoc/
>>> folder there you can find all new things i add.. there in the extension
>>> folder you can find those extension. And i will be able to send you the
>>> timing/performances analysis report tomorrow for the SGD functions. thank
>>> you.
>>> regards,
>>> Mahesh.
>>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>>
>>>
>>> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi maheshkaya,
 i have written some siddhi extension and trying to develop a one for my
 one. In time series example in the [1], can you please explain me the input
 format and query lines in that example for my understanding.

 from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi,
 walks, strikeouts, errors)
 select *
 insert into regResults;

 i just want to knwo how i give a set of data into this extension and
 what is baseballData. Is it input stream as usual.or any data file?how can
 i find that data set to create dummy input stream like baseballData?

 thank you.
 regards,
 Mahesh.
 [1]
 https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension

 On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> today i got the siddhi and debug the math extention. then did some
> changes and check. Now i am trying to write same kind of extension in my
> code base. so i add dependencies and it was built fine. Now i am trying to
> debug my extension and i did the same thing as i did in previous case. Cep
> is sending data, bu my extension is not firing in relevant break point.
> 1. So how can i debug the siddhi extension in my new extension.(you
> can see it in my example repoo)
>
> I think if i do it correctly we can built the extension for our
> purpose. And i will send the relevant timing report of SGD algorithms very
> soon as supun was asking me. thank you.
> regards,
> Mahesh.
>
> On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Also note that there is a calculation interval in the siddhi time
>> series regression function[1]. You maybe able get some insight for this
>> from that as well.
>>
>> [1] https://docs.wso2.com/display/CEP400/Regression
>>
>> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> As we discussed offline, we can use similar mechanism to train
>>> linear regression models, logistic regression models and k-means 
>>> clustering
>>> models.
>>>
>>> It is very interesting that i have found that somethings that can
 make use of our work. In the cep 4.0 documentation there is a Custom 
 Stream
 Processor Extention program [1]. There is a example of
 LinearRegressionStreamProcessor [1].

>>>
>>> As we have to train predictive models with Spark, you can write
>>> wrappers around regression/clustering models of Spark. Refer to Siddhi 
>>> time
>>> series regression source codes[1][2]. 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-06-03 Thread Mahesh Dananjaya
Hi Maheshakya,
I have added those timing reports to my repo [1]; please have a look.
There are three files: one uses an increment of 1000 for the batch sizes
(lr_timing_1000), and another uses an increment of 1 (lr_timing_1),
up to 1 million in both scenarios. You can see the reports and figures at
location [2] in the repo. I also added the StreamingLinearRegression
classes to the gsoc folder in the repo. Thank you.
regards,
Mahesh.
[1]https://github.com/dananjayamahesh/GSOC2016
[2]
https://github.com/dananjayamahesh/GSOC2016/tree/master/spark-examples/first-example/output

On Mon, May 30, 2016 at 9:24 AM, Maheshakya Wijewardena  wrote:

> Hi Mahesh,
>
> Thank you for the update. I will look into your implementation.
>
> And i will be able to send you the timing/performances analysis report
>> tomorrow for the SGD functions
>>
>
> Great. Sent those asap so that we can proceed.
>
> Best regards.
>
> On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> Hi maheshakay,
>> I have implemented the linear regression with cep siddhi event stream
>> with  taking batch sizes as parameters from the cep. Now we can trying the
>> moving window method to. Before that i think i should get your opinion on
>> data structures to save the streaming data.please check my repo [1]  /gsoc/
>> folder there you can find all new things i add.. there in the extension
>> folder you can find those extension. And i will be able to send you the
>> timing/performances analysis report tomorrow for the SGD functions. thank
>> you.
>> regards,
>> Mahesh.
>> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>>
>>
>> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi maheshkaya,
>>> i have written some siddhi extension and trying to develop a one for my
>>> one. In time series example in the [1], can you please explain me the input
>>> format and query lines in that example for my understanding.
>>>
>>> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi,
>>> walks, strikeouts, errors)
>>> select *
>>> insert into regResults;
>>>
>>> i just want to knwo how i give a set of data into this extension and
>>> what is baseballData. Is it input stream as usual.or any data file?how can
>>> i find that data set to create dummy input stream like baseballData?
>>>
>>> thank you.
>>> regards,
>>> Mahesh.
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>>
>>> On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 today i got the siddhi and debug the math extention. then did some
 changes and check. Now i am trying to write same kind of extension in my
 code base. so i add dependencies and it was built fine. Now i am trying to
 debug my extension and i did the same thing as i did in previous case. Cep
 is sending data, bu my extension is not firing in relevant break point.
 1. So how can i debug the siddhi extension in my new extension.(you can
 see it in my example repoo)

 I think if i do it correctly we can built the extension for our
 purpose. And i will send the relevant timing report of SGD algorithms very
 soon as supun was asking me. thank you.
 regards,
 Mahesh.

 On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Also note that there is a calculation interval in the siddhi time
> series regression function[1]. You maybe able get some insight for this
> from that as well.
>
> [1] https://docs.wso2.com/display/CEP400/Regression
>
> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> As we discussed offline, we can use similar mechanism to train linear
>> regression models, logistic regression models and k-means clustering 
>> models.
>>
>> It is very interesting that i have found that somethings that can
>>> make use of our work. In the cep 4.0 documentation there is a Custom 
>>> Stream
>>> Processor Extention program [1]. There is a example of
>>> LinearRegressionStreamProcessor [1].
>>>
>>
>> As we have to train predictive models with Spark, you can write
>> wrappers around regression/clustering models of Spark. Refer to Siddhi 
>> time
>> series regression source codes[1][2]. You can write a streaming linear
>> regression class for ML in a similar fashion by wrapping Spark mllib
>> implementations. You can use the methods "addEvent", "removeEvent", etc.
>> (may have to be changed according to requirements) for the similar 
>> purpose.
>> You can introduce trainLinearRegression/LogisticRegression/Kmeans which
>> does a similar thing as in createLinearRegression in those time series
>> functions. In the 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-29 Thread Maheshakya Wijewardena
Hi Mahesh,

Thank you for the update. I will look into your implementation.

And i will be able to send you the timing/performances analysis report
> tomorrow for the SGD functions
>

Great. Send those as soon as possible so that we can proceed.

Best regards.

On Sun, May 29, 2016 at 6:56 PM, Mahesh Dananjaya  wrote:

>
> Hi maheshakay,
> I have implemented the linear regression with cep siddhi event stream
> with  taking batch sizes as parameters from the cep. Now we can trying the
> moving window method to. Before that i think i should get your opinion on
> data structures to save the streaming data.please check my repo [1]  /gsoc/
> folder there you can find all new things i add.. there in the extension
> folder you can find those extension. And i will be able to send you the
> timing/performances analysis report tomorrow for the SGD functions. thank
> you.
> regards,
> Mahesh.
> [1] https://github.com/dananjayamahesh/GSOC2016/tree/master/gsoc
>
>
> On Fri, May 27, 2016 at 12:56 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshkaya,
>> i have written some siddhi extension and trying to develop a one for my
>> one. In time series example in the [1], can you please explain me the input
>> format and query lines in that example for my understanding.
>>
>> from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi, walks,
>> strikeouts, errors)
>> select *
>> insert into regResults;
>>
>> i just want to knwo how i give a set of data into this extension and what
>> is baseballData. Is it input stream as usual.or any data file?how can i
>> find that data set to create dummy input stream like baseballData?
>>
>> thank you.
>> regards,
>> Mahesh.
>> [1]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>
>> On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> today i got the siddhi and debug the math extention. then did some
>>> changes and check. Now i am trying to write same kind of extension in my
>>> code base. so i add dependencies and it was built fine. Now i am trying to
>>> debug my extension and i did the same thing as i did in previous case. Cep
>>> is sending data, bu my extension is not firing in relevant break point.
>>> 1. So how can i debug the siddhi extension in my new extension.(you can
>>> see it in my example repoo)
>>>
>>> I think if i do it correctly we can built the extension for our purpose.
>>> And i will send the relevant timing report of SGD algorithms very soon as
>>> supun was asking me. thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Also note that there is a calculation interval in the siddhi time
 series regression function[1]. You maybe able get some insight for this
 from that as well.

 [1] https://docs.wso2.com/display/CEP400/Regression

 On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> As we discussed offline, we can use similar mechanism to train linear
> regression models, logistic regression models and k-means clustering 
> models.
>
> It is very interesting that i have found that somethings that can make
>> use of our work. In the cep 4.0 documentation there is a Custom Stream
>> Processor Extention program [1]. There is a example of
>> LinearRegressionStreamProcessor [1].
>>
>
> As we have to train predictive models with Spark, you can write
> wrappers around regression/clustering models of Spark. Refer to Siddhi 
> time
> series regression source codes[1][2]. You can write a streaming linear
> regression class for ML in a similar fashion by wrapping Spark mllib
> implementations. You can use the methods "addEvent", "removeEvent", etc.
> (may have to be changed according to requirements) for the similar 
> purpose.
> You can introduce trainLinearRegression/LogisticRegression/Kmeans which
> does a similar thing as in createLinearRegression in those time series
> functions. In the processData method you can use Spark mllib classes to
> actually train models and return the model weights, evaluation metrics. 
> So,
> converting streams into RDDs and retrieving information from the trained
> models shall happen in this method.
>
> In the stream processor extension example, you can retrieve those
> values then use them to train new models with new batches. Weights/cluster
> centers maybe passed as initialization parameters for the wrappers.
>
> Please note that we have to figure out the best siddhi extension type
> for this process. In the siddhi query, we define batch size, type of
> algorithm and number of features (there can be more). After batch size
> number of events received, train a model and save 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-27 Thread Mahesh Dananjaya
Hi Maheshakya,
I have written some Siddhi extensions and am trying to develop one of my
own. For the time series example in [1], can you please explain the input
format and the query lines in that example, for my understanding?

from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi, walks,
strikeouts, errors)
select *
insert into regResults;

I just want to know how I feed a set of data into this extension and what
baseballData is. Is it an input stream as usual, or a data file? How can I
find that data set, or create a dummy input stream like baseballData?

thank you.
regards,
Mahesh.
[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
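For reference, a minimal sketch of how the example query can be exercised from a standalone Java test (assuming the Siddhi 3.x API shipped with CEP 4.0 and that the timeseries extension jar is on the classpath; the class name and attribute values are made up, and the regress(...) parameters are copied from the example above): baseballData is defined as an ordinary input stream inside the execution plan and fed dummy events programmatically.

import java.util.Random;
import org.wso2.siddhi.core.ExecutionPlanRuntime;
import org.wso2.siddhi.core.SiddhiManager;
import org.wso2.siddhi.core.event.Event;
import org.wso2.siddhi.core.stream.input.InputHandler;
import org.wso2.siddhi.core.stream.output.StreamCallback;
import org.wso2.siddhi.core.util.EventPrinter;

public class TimeSeriesRegressTest {
    public static void main(String[] args) throws InterruptedException {
        String plan = "define stream baseballData (salary double, rbi double, walks double, "
                + "strikeouts double, errors double); "
                + "from baseballData#timeseries:regress(2, 1, 0.95, salary, rbi, walks, strikeouts, errors) "
                + "select * "
                + "insert into regResults;";

        SiddhiManager siddhiManager = new SiddhiManager();
        ExecutionPlanRuntime runtime = siddhiManager.createExecutionPlanRuntime(plan);
        runtime.addCallback("regResults", new StreamCallback() {
            @Override
            public void receive(Event[] events) {
                EventPrinter.print(events);   // prints the regression output events
            }
        });
        InputHandler input = runtime.getInputHandler("baseballData");
        runtime.start();

        // push some dummy baseball-stat rows (all values arbitrary)
        Random rnd = new Random(7);
        for (int i = 0; i < 20; i++) {
            input.send(new Object[]{1000000.0 + 500000.0 * rnd.nextDouble(),
                    50.0 * rnd.nextDouble(), 40.0 * rnd.nextDouble(),
                    120.0 * rnd.nextDouble(), 20.0 * rnd.nextDouble()});
        }
        Thread.sleep(500);
        runtime.shutdown();
    }
}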

On Thu, May 26, 2016 at 2:58 PM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> today i got the siddhi and debug the math extention. then did some changes
> and check. Now i am trying to write same kind of extension in my code base.
> so i add dependencies and it was built fine. Now i am trying to debug my
> extension and i did the same thing as i did in previous case. Cep is
> sending data, bu my extension is not firing in relevant break point.
> 1. So how can i debug the siddhi extension in my new extension.(you can
> see it in my example repoo)
>
> I think if i do it correctly we can built the extension for our purpose.
> And i will send the relevant timing report of SGD algorithms very soon as
> supun was asking me. thank you.
> regards,
> Mahesh.
>
> On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Also note that there is a calculation interval in the siddhi time series
>> regression function[1]. You maybe able get some insight for this from that
>> as well.
>>
>> [1] https://docs.wso2.com/display/CEP400/Regression
>>
>> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> As we discussed offline, we can use similar mechanism to train linear
>>> regression models, logistic regression models and k-means clustering models.
>>>
>>> It is very interesting that i have found that somethings that can make
 use of our work. In the cep 4.0 documentation there is a Custom Stream
 Processor Extention program [1]. There is a example of
 LinearRegressionStreamProcessor [1].

>>>
>>> As we have to train predictive models with Spark, you can write wrappers
>>> around regression/clustering models of Spark. Refer to Siddhi time series
>>> regression source codes[1][2]. You can write a streaming linear regression
>>> class for ML in a similar fashion by wrapping Spark mllib implementations.
>>> You can use the methods "addEvent", "removeEvent", etc. (may have to be
>>> changed according to requirements) for the similar purpose. You can
>>> introduce trainLinearRegression/LogisticRegression/Kmeans which does a
>>> similar thing as in createLinearRegression in those time series functions.
>>> In the processData method you can use Spark mllib classes to actually train
>>> models and return the model weights, evaluation metrics. So, converting
>>> streams into RDDs and retrieving information from the trained models shall
>>> happen in this method.
>>>
>>> In the stream processor extension example, you can retrieve those values
>>> then use them to train new models with new batches. Weights/cluster centers
>>> maybe passed as initialization parameters for the wrappers.
>>>
>>> Please note that we have to figure out the best siddhi extension type
>>> for this process. In the siddhi query, we define batch size, type of
>>> algorithm and number of features (there can be more). After batch size
>>> number of events received, train a model and save parameters, return
>>> evaluation metric. With the next batch, retrain the model initialized with
>>> previously learned parameters.
>>>
>>> We also may need to test the same scenario with a moving window, but I
>>> suspect that that approach may become so slow as a model is trained each
>>> time an event is received. So, we may have to change the number of slots
>>> the moving window moves at a time (eg: not one by one, but ten by ten).
>>>
>>> Once this is resolved, majority of the research part will be finished
>>> and all we will be left to do is implementing wrappers around the 3
>>> learning algorithms we consider.
>>>
>>> Best regards.
>>>
>>> [1]
>>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
>>> [2]
>>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java
>>>
>>>
>>> On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshkya,
 shall we use [1] for our work? i am checking the possibility.
 BR,
 Mahesh.
 [1]
 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-26 Thread Mahesh Dananjaya
Hi Maheshakya,
Today I got Siddhi and debugged the math extension, then made some changes
and checked them. Now I am trying to write the same kind of extension in my own code base,
so I added the dependencies and it built fine. I am now trying to debug my
extension, and I did the same thing as in the previous case: CEP is
sending data, but my extension is not firing at the relevant breakpoint.
1. So how can I debug the Siddhi extension in my new project? (You can see
it in my example repo.)

I think if I do this correctly we can build the extension for our purpose.
I will also send the relevant timing report for the SGD algorithms very soon, as
Supun was asking me. Thank you.
regards,
Mahesh.

On Tue, May 24, 2016 at 11:07 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Also note that there is a calculation interval in the siddhi time series
> regression function[1]. You maybe able get some insight for this from that
> as well.
>
> [1] https://docs.wso2.com/display/CEP400/Regression
>
> On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> As we discussed offline, we can use similar mechanism to train linear
>> regression models, logistic regression models and k-means clustering models.
>>
>> It is very interesting that i have found that somethings that can make
>>> use of our work. In the cep 4.0 documentation there is a Custom Stream
>>> Processor Extention program [1]. There is a example of
>>> LinearRegressionStreamProcessor [1].
>>>
>>
>> As we have to train predictive models with Spark, you can write wrappers
>> around regression/clustering models of Spark. Refer to Siddhi time series
>> regression source codes[1][2]. You can write a streaming linear regression
>> class for ML in a similar fashion by wrapping Spark mllib implementations.
>> You can use the methods "addEvent", "removeEvent", etc. (may have to be
>> changed according to requirements) for the similar purpose. You can
>> introduce trainLinearRegression/LogisticRegression/Kmeans which does a
>> similar thing as in createLinearRegression in those time series functions.
>> In the processData method you can use Spark mllib classes to actually train
>> models and return the model weights, evaluation metrics. So, converting
>> streams into RDDs and retrieving information from the trained models shall
>> happen in this method.
>>
>> In the stream processor extension example, you can retrieve those values
>> then use them to train new models with new batches. Weights/cluster centers
>> maybe passed as initialization parameters for the wrappers.
>>
>> Please note that we have to figure out the best siddhi extension type for
>> this process. In the siddhi query, we define batch size, type of algorithm
>> and number of features (there can be more). After batch size number of
>> events received, train a model and save parameters, return evaluation
>> metric. With the next batch, retrain the model initialized with previously
>> learned parameters.
>>
>> We also may need to test the same scenario with a moving window, but I
>> suspect that that approach may become so slow as a model is trained each
>> time an event is received. So, we may have to change the number of slots
>> the moving window moves at a time (eg: not one by one, but ten by ten).
>>
>> Once this is resolved, majority of the research part will be finished and
>> all we will be left to do is implementing wrappers around the 3 learning
>> algorithms we consider.
>>
>> Best regards.
>>
>> [1]
>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
>> [2]
>> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java
>>
>>
>> On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshkya,
>>> shall we use [1] for our work? i am checking the possibility.
>>> BR,
>>> Mahesh.
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>> [2]
>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>> [3]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>>
>>> On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 It is very interesting that i have found that somethings that can make
 use of our work. In the cep 4.0 documentation there is a Custom Stream
 Processor Extention program [1]. There is a example of
 LinearRegressionStreamProcessor [1] and also i saw
  private int batchSize = 10; i am going through this one.
 Please check whether we can use. WIll there be any compatibility or
 support issue?
 regards,
 Mahesh.


 [1]
 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-23 Thread Maheshakya Wijewardena
Also note that there is a calculation interval in the Siddhi time series
regression function [1]. You may be able to get some insight for this from that
as well.

[1] https://docs.wso2.com/display/CEP400/Regression

On Tue, May 24, 2016 at 11:03 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> As we discussed offline, we can use similar mechanism to train linear
> regression models, logistic regression models and k-means clustering models.
>
> It is very interesting that i have found that somethings that can make use
>> of our work. In the cep 4.0 documentation there is a Custom Stream
>> Processor Extention program [1]. There is a example of
>> LinearRegressionStreamProcessor [1].
>>
>
> As we have to train predictive models with Spark, you can write wrappers
> around regression/clustering models of Spark. Refer to Siddhi time series
> regression source codes[1][2]. You can write a streaming linear regression
> class for ML in a similar fashion by wrapping Spark mllib implementations.
> You can use the methods "addEvent", "removeEvent", etc. (may have to be
> changed according to requirements) for the similar purpose. You can
> introduce trainLinearRegression/LogisticRegression/Kmeans which does a
> similar thing as in createLinearRegression in those time series functions.
> In the processData method you can use Spark mllib classes to actually train
> models and return the model weights, evaluation metrics. So, converting
> streams into RDDs and retrieving information from the trained models shall
> happen in this method.
>
> In the stream processor extension example, you can retrieve those values
> then use them to train new models with new batches. Weights/cluster centers
> maybe passed as initialization parameters for the wrappers.
>
> Please note that we have to figure out the best siddhi extension type for
> this process. In the siddhi query, we define batch size, type of algorithm
> and number of features (there can be more). After batch size number of
> events received, train a model and save parameters, return evaluation
> metric. With the next batch, retrain the model initialized with previously
> learned parameters.
>
> We also may need to test the same scenario with a moving window, but I
> suspect that that approach may become so slow as a model is trained each
> time an event is received. So, we may have to change the number of slots
> the moving window moves at a time (eg: not one by one, but ten by ten).
>
> Once this is resolved, majority of the research part will be finished and
> all we will be left to do is implementing wrappers around the 3 learning
> algorithms we consider.
>
> Best regards.
>
> [1]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
> [2]
> https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java
>
>
> On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshkya,
>> shall we use [1] for our work? i am checking the possibility.
>> BR,
>> Mahesh.
>> [1]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>> [2]
>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>> [3]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>
>> On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> It is very interesting that i have found that somethings that can make
>>> use of our work. In the cep 4.0 documentation there is a Custom Stream
>>> Processor Extention program [1]. There is a example of
>>> LinearRegressionStreamProcessor [1] and also i saw
>>>  private int batchSize = 10; i am going through this one.
>>> Please check whether we can use. WIll there be any compatibility or
>>> support issue?
>>> regards,
>>> Mahesh.
>>>
>>>
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>>
>>> On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi maheshakya,
 anyway how can test any siddhi extention after write it without
 integrating it to cep.can you please explain me the procedure. i am
 referring to [1] [2] [3] [4].  thank you.
 BR,
 Mahesh.

 [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
 [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
 [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
 [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi

 On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> thank you for the feedback. I have add data-sets into repo.

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-23 Thread Maheshakya Wijewardena
Hi Mahesh,

As we discussed offline, we can use a similar mechanism to train linear
regression models, logistic regression models, and k-means clustering models.

It is very interesting that i have found that somethings that can make use
> of our work. In the cep 4.0 documentation there is a Custom Stream
> Processor Extention program [1]. There is a example of
> LinearRegressionStreamProcessor [1].
>

As we have to train predictive models with Spark, you can write wrappers
around the regression/clustering models of Spark. Refer to the Siddhi time series
regression source code [1][2]. You can write a streaming linear regression
class for ML in a similar fashion by wrapping the Spark MLlib implementations.
You can use the methods "addEvent", "removeEvent", etc. (which may have to be
changed according to requirements) for a similar purpose. You can
introduce trainLinearRegression/LogisticRegression/Kmeans methods which do a
similar thing to createLinearRegression in those time series functions.
In the processData method you can use Spark MLlib classes to actually train
models and return the model weights and evaluation metrics. So, converting
streams into RDDs and retrieving information from the trained models should
happen in this method.

In the stream processor extension example, you can retrieve those values and
then use them to train new models with new batches. Weights/cluster centers
may be passed as initialization parameters for the wrappers.

Please note that we have to figure out the best Siddhi extension type for
this process. In the Siddhi query, we define the batch size, the type of algorithm
and the number of features (there can be more). After a batch-size number of
events has been received, train a model, save its parameters and return the
evaluation metric. With the next batch, retrain the model, initialized with the
previously learned parameters (a rough sketch of this idea follows below).
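A minimal Java sketch of this warm-start idea, assuming the Spark MLlib 1.x API (the class, field and method names here, e.g. StreamingLinRegSketch and processBatch, are illustrative and not the carbon-ml API): each mini-batch is turned into an RDD of LabeledPoints and training is initialized from the previous batch's weights.

import java.util.List;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.regression.LinearRegressionModel;
import org.apache.spark.mllib.regression.LinearRegressionWithSGD;

public class StreamingLinRegSketch {
    private final JavaSparkContext sc;
    private final int numIterations;
    private final double stepSize;
    private Vector weights;   // carried over between mini-batches

    public StreamingLinRegSketch(JavaSparkContext sc, int numFeatures,
                                 int numIterations, double stepSize) {
        this.sc = sc;
        this.numIterations = numIterations;
        this.stepSize = stepSize;
        this.weights = Vectors.zeros(numFeatures);   // cold start for the first batch
    }

    /** Train on one mini-batch and return the MSE on that batch. */
    public double processBatch(List<LabeledPoint> batch) {
        JavaRDD<LabeledPoint> rdd = sc.parallelize(batch).cache();
        LinearRegressionModel model = LinearRegressionWithSGD.train(
                rdd.rdd(), numIterations, stepSize, 1.0, weights);
        weights = model.weights();                   // warm start for the next batch

        // evaluation metric returned to the stream processor extension
        double mse = rdd.mapToDouble(p -> {
            double err = model.predict(p.features()) - p.label();
            return err * err;
        }).mean();
        return mse;
    }

    public Vector currentWeights() {
        return weights;
    }
}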

We may also need to test the same scenario with a moving window, but I
suspect that approach may become too slow, as a model is trained each
time an event is received. So, we may have to change the number of slots
the moving window moves at a time (e.g. not one by one, but ten by ten).

Once this is resolved, the majority of the research part will be finished, and
all that will be left to do is implementing wrappers around the 3 learning
algorithms we are considering.

Best regards.

[1]
https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/RegressionCalculator.java
[2]
https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/timeseries/src/main/java/org/wso2/siddhi/extension/timeseries/linreg/SimpleLinearRegressionCalculator.java


On Sat, May 21, 2016 at 2:55 PM, Mahesh Dananjaya  wrote:

> Hi Maheshkya,
> shall we use [1] for our work? i am checking the possibility.
> BR,
> Mahesh.
> [1]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
> [2]
> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
> [3]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>
> On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> It is very interesting that i have found that somethings that can make
>> use of our work. In the cep 4.0 documentation there is a Custom Stream
>> Processor Extention program [1]. There is a example of
>> LinearRegressionStreamProcessor [1] and also i saw
>>  private int batchSize = 10; i am going through this one.
>> Please check whether we can use. WIll there be any compatibility or
>> support issue?
>> regards,
>> Mahesh.
>>
>>
>> [1]
>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>>
>> On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi maheshakya,
>>> anyway how can test any siddhi extention after write it without
>>> integrating it to cep.can you please explain me the procedure. i am
>>> referring to [1] [2] [3] [4].  thank you.
>>> BR,
>>> Mahesh.
>>>
>>> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
>>> [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
>>> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
>>> [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>>>
>>> On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 thank you for the feedback. I have add data-sets into repo.
 data-sets/lr. I am all right with next week.Now i am writing some examples
 to collect samples and build mini batches and run the algorithms on those
 mini-batches. thank you. will add those into repo soon.I am still working
 on that siddhi extention.i will let you know the progress.
 BR,
 mahesh.

 On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Hi 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-21 Thread Mahesh Dananjaya
Hi Maheshakya,
Shall we use [1] for our work? I am checking the possibility.
BR,
Mahesh.
[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
[2]
https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
[3]https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function

On Sat, May 21, 2016 at 2:44 PM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> It is very interesting that i have found that somethings that can make use
> of our work. In the cep 4.0 documentation there is a Custom Stream
> Processor Extention program [1]. There is a example of
> LinearRegressionStreamProcessor [1] and also i saw
>  private int batchSize = 10; i am going through this one.
> Please check whether we can use. WIll there be any compatibility or
> support issue?
> regards,
> Mahesh.
>
>
> [1]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension
>
> On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi maheshakya,
>> anyway how can test any siddhi extention after write it without
>> integrating it to cep.can you please explain me the procedure. i am
>> referring to [1] [2] [3] [4].  thank you.
>> BR,
>> Mahesh.
>>
>> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
>> [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
>> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
>> [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>>
>> On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> thank you for the feedback. I have add data-sets into repo.
>>> data-sets/lr. I am all right with next week.Now i am writing some examples
>>> to collect samples and build mini batches and run the algorithms on those
>>> mini-batches. thank you. will add those into repo soon.I am still working
>>> on that siddhi extention.i will let you know the progress.
>>> BR,
>>> mahesh.
>>>
>>> On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 I've look into your code sample of streaming linear regression. Looks
 good to me, apart from few issues in coding practices which we can improve
 when you're doing the implementations in carbon-ml and during the code
 reviews. You are using a set of files as mini-batches of data, right? Can
 you also send us the datasets you've been using. I'd like to run this.

 does that cep problem is now all right that we were trying to fix. I am
> still using those pre-build versions. If so i can merge with the latest 
> one.


 I'll check this and let you know.

 Can we arrange a meeting (preferably in WSO2 offices) in next week with
 ML team members as well. Coding period begins on next Monday, so it's
 better to get overall feedback from others and discuss more about the
 project. Let me know convenient time slots for you. I'll arrange a meeting
 with ML team.

 Best regards.

 On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> Ok. I will check it.you have sent me those relevant references and i
> am working on that thing.thank you. does that cep problem is now all right
> that we were trying to fix. I am still using those pre-build versions. If
> so i can merge with the latest one.thanks.
> BR,
> Mahesh.
>
> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> You don't actually have to implement anything in spark streaming. Try
>> to understand how streaming data is handled in and the specifics of the
>> underlying algorithms in streaming.
>> What we want to do is having the similar algorithms that support CEP
>> event streams with siddhi.
>>
>> Best regards.
>>
>> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Did you check the repo. I will add recent works today.And also i was
>>> going through the Java docs related to spark streaming work. It is with
>>> that scala API. thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 I have gone through the Java Docs and run some of the Spark
 examples on spark shell which are paramount improtant for our work. 
 Then i
 have been writing my codes to check the Linear regression, K means for
 streaming. please check my git repo [1]. I think now i have to ask on 
 dev
 regarding the capturing event streams for our work. I will update the
 recent 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-21 Thread Mahesh Dananjaya
Hi Maheshakya,
It is very interesting that I have found something we can make use of for
our work. In the CEP 4.0 documentation there is a custom Stream Processor
extension guide [1]. It includes an example LinearRegressionStreamProcessor
[1], and I also noticed private int batchSize = 10; I am going through this
one. Please check whether we can use it. Will there be any compatibility or
support issues?
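
Just to make the idea concrete, here is a rough plain-Java sketch of the
mini-batch buffering that such an extension could wrap (this is only my own
illustration, not the documented LinearRegressionStreamProcessor; the class
and method names are made up):

import java.util.ArrayList;
import java.util.List;

// Buffers individual events until 'batchSize' samples have arrived,
// then hands the whole mini-batch to a retraining callback.
public class MiniBatchBuffer {

    public interface BatchHandler {
        void onBatch(List<double[]> batch);  // each double[] = features plus label
    }

    private final int batchSize;
    private final BatchHandler handler;
    private final List<double[]> buffer = new ArrayList<>();

    public MiniBatchBuffer(int batchSize, BatchHandler handler) {
        this.batchSize = batchSize;
        this.handler = handler;
    }

    // Called once per incoming event (e.g. from a stream processor's process() method).
    public void add(double[] sample) {
        buffer.add(sample);
        if (buffer.size() >= batchSize) {
            handler.onBatch(new ArrayList<>(buffer));  // pass a copy out for retraining
            buffer.clear();                            // start collecting the next mini-batch
        }
    }
}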
regards,
Mahesh.


[1]
https://docs.wso2.com/display/CEP400/Writing+a+Custom+Stream+Processor+Extension

On Sat, May 21, 2016 at 11:52 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi maheshakya,
> anyway how can test any siddhi extention after write it without
> integrating it to cep.can you please explain me the procedure. i am
> referring to [1] [2] [3] [4].  thank you.
> BR,
> Mahesh.
>
> [1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
> [2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
> [3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
> [4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi
>
> On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> thank you for the feedback. I have add data-sets into repo. data-sets/lr.
>> I am all right with next week.Now i am writing some examples to collect
>> samples and build mini batches and run the algorithms on those
>> mini-batches. thank you. will add those into repo soon.I am still working
>> on that siddhi extention.i will let you know the progress.
>> BR,
>> mahesh.
>>
>> On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> I've look into your code sample of streaming linear regression. Looks
>>> good to me, apart from few issues in coding practices which we can improve
>>> when you're doing the implementations in carbon-ml and during the code
>>> reviews. You are using a set of files as mini-batches of data, right? Can
>>> you also send us the datasets you've been using. I'd like to run this.
>>>
>>> does that cep problem is now all right that we were trying to fix. I am
 still using those pre-build versions. If so i can merge with the latest 
 one.
>>>
>>>
>>> I'll check this and let you know.
>>>
>>> Can we arrange a meeting (preferably in WSO2 offices) in next week with
>>> ML team members as well. Coding period begins on next Monday, so it's
>>> better to get overall feedback from others and discuss more about the
>>> project. Let me know convenient time slots for you. I'll arrange a meeting
>>> with ML team.
>>>
>>> Best regards.
>>>
>>> On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 Ok. I will check it.you have sent me those relevant references and i am
 working on that thing.thank you. does that cep problem is now all right
 that we were trying to fix. I am still using those pre-build versions. If
 so i can merge with the latest one.thanks.
 BR,
 Mahesh.

 On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> You don't actually have to implement anything in spark streaming. Try
> to understand how streaming data is handled in and the specifics of the
> underlying algorithms in streaming.
> What we want to do is having the similar algorithms that support CEP
> event streams with siddhi.
>
> Best regards.
>
> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Did you check the repo. I will add recent works today.And also i was
>> going through the Java docs related to spark streaming work. It is with
>> that scala API. thank you.
>> regards,
>> Mahesh.
>>
>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have gone through the Java Docs and run some of the Spark examples
>>> on spark shell which are paramount improtant for our work. Then i have 
>>> been
>>> writing my codes to check the Linear regression, K means for streaming.
>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>> the capturing event streams for our work. I will update the recent 
>>> things
>>> on git. check the park-example directory for java. examples run on git
>>> shell is not included there. In my case i think i have to build mini
>>> batches from data streams that comes as individual samples. Now i am
>>> working on some coding to collect mini batches from data streams.thank 
>>> you.
>>> regards,
>>> Mahesh.
>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>
>>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-21 Thread Mahesh Dananjaya
Hi Maheshakya,
By the way, how can I test a Siddhi extension after writing it, without
integrating it into CEP? Can you please explain the procedure? I am
referring to [1] [2] [3] [4]. Thank you.
BR,
Mahesh.

[1] https://docs.wso2.com/display/CEP310/Writing+Extensions+to+Siddhi
[2] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Function
[3] https://docs.wso2.com/display/CEP310/Writing+a+Custom+Window
[4] https://docs.wso2.com/display/CEP400/Writing+Extensions+to+Siddhi

On Thu, May 19, 2016 at 12:08 PM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> thank you for the feedback. I have add data-sets into repo. data-sets/lr.
> I am all right with next week.Now i am writing some examples to collect
> samples and build mini batches and run the algorithms on those
> mini-batches. thank you. will add those into repo soon.I am still working
> on that siddhi extention.i will let you know the progress.
> BR,
> mahesh.
>
> On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> I've look into your code sample of streaming linear regression. Looks
>> good to me, apart from few issues in coding practices which we can improve
>> when you're doing the implementations in carbon-ml and during the code
>> reviews. You are using a set of files as mini-batches of data, right? Can
>> you also send us the datasets you've been using. I'd like to run this.
>>
>> does that cep problem is now all right that we were trying to fix. I am
>>> still using those pre-build versions. If so i can merge with the latest one.
>>
>>
>> I'll check this and let you know.
>>
>> Can we arrange a meeting (preferably in WSO2 offices) in next week with
>> ML team members as well. Coding period begins on next Monday, so it's
>> better to get overall feedback from others and discuss more about the
>> project. Let me know convenient time slots for you. I'll arrange a meeting
>> with ML team.
>>
>> Best regards.
>>
>> On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Ok. I will check it.you have sent me those relevant references and i am
>>> working on that thing.thank you. does that cep problem is now all right
>>> that we were trying to fix. I am still using those pre-build versions. If
>>> so i can merge with the latest one.thanks.
>>> BR,
>>> Mahesh.
>>>
>>> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 You don't actually have to implement anything in spark streaming. Try
 to understand how streaming data is handled in and the specifics of the
 underlying algorithms in streaming.
 What we want to do is having the similar algorithms that support CEP
 event streams with siddhi.

 Best regards.

 On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> Did you check the repo. I will add recent works today.And also i was
> going through the Java docs related to spark streaming work. It is with
> that scala API. thank you.
> regards,
> Mahesh.
>
> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java Docs and run some of the Spark examples
>> on spark shell which are paramount improtant for our work. Then i have 
>> been
>> writing my codes to check the Linear regression, K means for streaming.
>> please check my git repo [1]. I think now i have to ask on dev regarding
>> the capturing event streams for our work. I will update the recent things
>> on git. check the park-example directory for java. examples run on git
>> shell is not included there. In my case i think i have to build mini
>> batches from data streams that comes as individual samples. Now i am
>> working on some coding to collect mini batches from data streams.thank 
>> you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have gone through the Java Docs and run some of the Spark examples
>>> on spark shell which are paramount improtant for our work. Then i have 
>>> been
>>> writing my codes to check the Linear regression, K means for streaming.
>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>> the capturing event streams for our work. I will update the recent 
>>> things
>>> on git. check the park-example directory for java. examples run on git
>>> shell is not included there. In my case i think i have to build mini
>>> batches from data streams that comes as individual samples. Now i am
>>> working on some coding to collect mini batches from data streams.thank 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-19 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you for the feedback. I have added the data sets to the repo, under
data-sets/lr. Next week is fine for me. Now I am writing some examples to
collect samples, build mini-batches, and run the algorithms on those
mini-batches; I will add those to the repo soon. I am still working on the
Siddhi extension and will let you know the progress.
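
As a rough illustration of the per-batch step (my own plain-Java sketch, not
the final implementation; it assumes each sample is laid out as the features
followed by the label), one SGD pass over a collected mini-batch could look
like this:

import java.util.List;

// One SGD pass of linear regression (squared-error loss) over a collected mini-batch.
// Each sample is laid out as [x1, ..., xn, y] (features followed by the label).
public class MiniBatchSgd {

    public static double[] update(double[] weights, double intercept,
                                  List<double[]> batch, double learningRate) {
        double[] w = weights.clone();
        double b = intercept;
        for (double[] sample : batch) {
            int n = sample.length - 1;
            double y = sample[n];
            double prediction = b;
            for (int i = 0; i < n; i++) {
                prediction += w[i] * sample[i];
            }
            double error = prediction - y;
            // Gradient step for squared-error loss: grad w_i = error * x_i, grad b = error.
            for (int i = 0; i < n; i++) {
                w[i] -= learningRate * error * sample[i];
            }
            b -= learningRate * error;
        }
        // Return the updated weights with the intercept appended at the end.
        double[] result = new double[w.length + 1];
        System.arraycopy(w, 0, result, 0, w.length);
        result[w.length] = b;
        return result;
    }
}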
BR,
mahesh.

On Thu, May 19, 2016 at 11:10 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> I've look into your code sample of streaming linear regression. Looks good
> to me, apart from few issues in coding practices which we can improve when
> you're doing the implementations in carbon-ml and during the code reviews.
> You are using a set of files as mini-batches of data, right? Can you also
> send us the datasets you've been using. I'd like to run this.
>
> does that cep problem is now all right that we were trying to fix. I am
>> still using those pre-build versions. If so i can merge with the latest one.
>
>
> I'll check this and let you know.
>
> Can we arrange a meeting (preferably in WSO2 offices) in next week with ML
> team members as well. Coding period begins on next Monday, so it's better
> to get overall feedback from others and discuss more about the project. Let
> me know convenient time slots for you. I'll arrange a meeting with ML team.
>
> Best regards.
>
> On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Ok. I will check it.you have sent me those relevant references and i am
>> working on that thing.thank you. does that cep problem is now all right
>> that we were trying to fix. I am still using those pre-build versions. If
>> so i can merge with the latest one.thanks.
>> BR,
>> Mahesh.
>>
>> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> You don't actually have to implement anything in spark streaming. Try to
>>> understand how streaming data is handled in and the specifics of the
>>> underlying algorithms in streaming.
>>> What we want to do is having the similar algorithms that support CEP
>>> event streams with siddhi.
>>>
>>> Best regards.
>>>
>>> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 Did you check the repo. I will add recent works today.And also i was
 going through the Java docs related to spark streaming work. It is with
 that scala API. thank you.
 regards,
 Mahesh.

 On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java Docs and run some of the Spark examples
> on spark shell which are paramount improtant for our work. Then i have 
> been
> writing my codes to check the Linear regression, K means for streaming.
> please check my git repo [1]. I think now i have to ask on dev regarding
> the capturing event streams for our work. I will update the recent things
> on git. check the park-example directory for java. examples run on git
> shell is not included there. In my case i think i have to build mini
> batches from data streams that comes as individual samples. Now i am
> working on some coding to collect mini batches from data streams.thank 
> you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java Docs and run some of the Spark examples
>> on spark shell which are paramount improtant for our work. Then i have 
>> been
>> writing my codes to check the Linear regression, K means for streaming.
>> please check my git repo [1]. I think now i have to ask on dev regarding
>> the capturing event streams for our work. I will update the recent things
>> on git. check the park-example directory for java. examples run on git
>> shell is not included there. In my case i think i have to build mini
>> batches from data streams that comes as individual samples. Now i am
>> working on some coding to collect mini batches from data streams.thank 
>> you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> thank you. i will update the repo today.thank you.i changed the
>>> carbon ml siddhi extention and see how the changes are effecting. i will
>>> update the progress as soon as possible.thank you. i had some problem in
>>> spark mllib dependency. i was fixing that.
>>> regards,
>>> Mahesh.
>>> p.s: do i need to maintain a blog?
>>>
>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>> 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-18 Thread Maheshakya Wijewardena
Hi Mahesh,

I've looked into your code sample of streaming linear regression. It looks
good to me, apart from a few issues in coding practices, which we can improve
when you're doing the implementations in carbon-ml and during the code
reviews. You are using a set of files as mini-batches of data, right? Can you
also send us the datasets you've been using? I'd like to run this.

does that cep problem is now all right that we were trying to fix. I am
> still using those pre-build versions. If so i can merge with the latest one.


I'll check this and let you know.

Can we arrange a meeting (preferably in the WSO2 offices) next week with the
ML team members as well? The coding period begins next Monday, so it's better
to get overall feedback from others and discuss the project in more detail.
Let me know convenient time slots for you, and I'll arrange a meeting with
the ML team.

Best regards.

On Wed, May 18, 2016 at 9:53 AM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> Ok. I will check it.you have sent me those relevant references and i am
> working on that thing.thank you. does that cep problem is now all right
> that we were trying to fix. I am still using those pre-build versions. If
> so i can merge with the latest one.thanks.
> BR,
> Mahesh.
>
> On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> You don't actually have to implement anything in spark streaming. Try to
>> understand how streaming data is handled in and the specifics of the
>> underlying algorithms in streaming.
>> What we want to do is having the similar algorithms that support CEP
>> event streams with siddhi.
>>
>> Best regards.
>>
>> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Did you check the repo. I will add recent works today.And also i was
>>> going through the Java docs related to spark streaming work. It is with
>>> that scala API. thank you.
>>> regards,
>>> Mahesh.
>>>
>>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 I have gone through the Java Docs and run some of the Spark examples on
 spark shell which are paramount improtant for our work. Then i have been
 writing my codes to check the Linear regression, K means for streaming.
 please check my git repo [1]. I think now i have to ask on dev regarding
 the capturing event streams for our work. I will update the recent things
 on git. check the park-example directory for java. examples run on git
 shell is not included there. In my case i think i have to build mini
 batches from data streams that comes as individual samples. Now i am
 working on some coding to collect mini batches from data streams.thank you.
 regards,
 Mahesh.
 [1]https://github.com/dananjayamahesh/GSOC2016

 On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java Docs and run some of the Spark examples
> on spark shell which are paramount improtant for our work. Then i have 
> been
> writing my codes to check the Linear regression, K means for streaming.
> please check my git repo [1]. I think now i have to ask on dev regarding
> the capturing event streams for our work. I will update the recent things
> on git. check the park-example directory for java. examples run on git
> shell is not included there. In my case i think i have to build mini
> batches from data streams that comes as individual samples. Now i am
> working on some coding to collect mini batches from data streams.thank 
> you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> thank you. i will update the repo today.thank you.i changed the
>> carbon ml siddhi extention and see how the changes are effecting. i will
>> update the progress as soon as possible.thank you. i had some problem in
>> spark mllib dependency. i was fixing that.
>> regards,
>> Mahesh.
>> p.s: do i need to maintain a blog?
>>
>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Sorry for replying late.
>>>
>>> Thank you for the update. I believe you have done some
>>> implementations with with Spark MLLIb algorithms in streaming fashion 
>>> as we
>>> have discussed. If so, can you please share your code in a Github repo.
>>>
>>> Now i want to implements some machine learning algorithms with
 importing mllib and want to run within your code base

>>>
>>> For the moment you can try out editing the same class
>>> PredictStreamProcessor in the 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Mahesh Dananjaya
Hi Maheshakya,
OK, I will check it. You have sent me the relevant references and I am
working on that; thank you. Is the CEP problem we were trying to fix now
resolved? I am still using those pre-built versions; if it is fixed, I can
merge with the latest one. Thanks.
BR,
Mahesh.

On Wed, May 18, 2016 at 9:44 AM, Maheshakya Wijewardena  wrote:

> Hi Mahesh,
>
> You don't actually have to implement anything in spark streaming. Try to
> understand how streaming data is handled in and the specifics of the
> underlying algorithms in streaming.
> What we want to do is having the similar algorithms that support CEP event
> streams with siddhi.
>
> Best regards.
>
> On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Did you check the repo. I will add recent works today.And also i was
>> going through the Java docs related to spark streaming work. It is with
>> that scala API. thank you.
>> regards,
>> Mahesh.
>>
>> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have gone through the Java Docs and run some of the Spark examples on
>>> spark shell which are paramount improtant for our work. Then i have been
>>> writing my codes to check the Linear regression, K means for streaming.
>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>> the capturing event streams for our work. I will update the recent things
>>> on git. check the park-example directory for java. examples run on git
>>> shell is not included there. In my case i think i have to build mini
>>> batches from data streams that comes as individual samples. Now i am
>>> working on some coding to collect mini batches from data streams.thank you.
>>> regards,
>>> Mahesh.
>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>
>>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 I have gone through the Java Docs and run some of the Spark examples on
 spark shell which are paramount improtant for our work. Then i have been
 writing my codes to check the Linear regression, K means for streaming.
 please check my git repo [1]. I think now i have to ask on dev regarding
 the capturing event streams for our work. I will update the recent things
 on git. check the park-example directory for java. examples run on git
 shell is not included there. In my case i think i have to build mini
 batches from data streams that comes as individual samples. Now i am
 working on some coding to collect mini batches from data streams.thank you.
 regards,
 Mahesh.
 [1]https://github.com/dananjayamahesh/GSOC2016

 On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> thank you. i will update the repo today.thank you.i changed the carbon
> ml siddhi extention and see how the changes are effecting. i will update
> the progress as soon as possible.thank you. i had some problem in spark
> mllib dependency. i was fixing that.
> regards,
> Mahesh.
> p.s: do i need to maintain a blog?
>
> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Sorry for replying late.
>>
>> Thank you for the update. I believe you have done some
>> implementations with with Spark MLLIb algorithms in streaming fashion as 
>> we
>> have discussed. If so, can you please share your code in a Github repo.
>>
>> Now i want to implements some machine learning algorithms with
>>> importing mllib and want to run within your code base
>>>
>>
>> For the moment you can try out editing the same class
>> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we 
>> will
>> add this separately. You should be able to add org.apache.spark.mllib.
>> classes to there.
>>
>> And i want to see how event streams are coming from cep. As i think
>>> it is not in a RDD format since it is arriving as the individual 
>>> samples. I
>>> will send a email to dev asking about how to get the streams.
>>
>>
>> Please pay attention to length[1] and lengthbatch[1] inbuilt windows
>> in siddhi. What you need to write are functions similar to a custom
>> aggregate function[2].
>> When you send the email to dev list, explain your requirement. You
>> need to get a set of event with from a stream with a specified window 
>> size
>> (number of events). Then build a model within that function. You also 
>> need
>> to retain the data (learned weights, cluster centers, etc.) from the
>> previous window to use in the current window. Ask what can be the most
>> suitable option for this among the set of siddhi extensions given.
>>

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Maheshakya Wijewardena
Hi Mahesh,

You don't actually have to implement anything in Spark Streaming. Try to
understand how streaming data is handled there and the specifics of the
underlying algorithms in a streaming setting.
What we want to do is have similar algorithms that support CEP event
streams with Siddhi.

Best regards.

On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya  wrote:

> Hi Maheshakya,
> Did you check the repo. I will add recent works today.And also i was going
> through the Java docs related to spark streaming work. It is with that
> scala API. thank you.
> regards,
> Mahesh.
>
> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java Docs and run some of the Spark examples on
>> spark shell which are paramount improtant for our work. Then i have been
>> writing my codes to check the Linear regression, K means for streaming.
>> please check my git repo [1]. I think now i have to ask on dev regarding
>> the capturing event streams for our work. I will update the recent things
>> on git. check the park-example directory for java. examples run on git
>> shell is not included there. In my case i think i have to build mini
>> batches from data streams that comes as individual samples. Now i am
>> working on some coding to collect mini batches from data streams.thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have gone through the Java Docs and run some of the Spark examples on
>>> spark shell which are paramount improtant for our work. Then i have been
>>> writing my codes to check the Linear regression, K means for streaming.
>>> please check my git repo [1]. I think now i have to ask on dev regarding
>>> the capturing event streams for our work. I will update the recent things
>>> on git. check the park-example directory for java. examples run on git
>>> shell is not included there. In my case i think i have to build mini
>>> batches from data streams that comes as individual samples. Now i am
>>> working on some coding to collect mini batches from data streams.thank you.
>>> regards,
>>> Mahesh.
>>> [1]https://github.com/dananjayamahesh/GSOC2016
>>>
>>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 thank you. i will update the repo today.thank you.i changed the carbon
 ml siddhi extention and see how the changes are effecting. i will update
 the progress as soon as possible.thank you. i had some problem in spark
 mllib dependency. i was fixing that.
 regards,
 Mahesh.
 p.s: do i need to maintain a blog?

 On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> Sorry for replying late.
>
> Thank you for the update. I believe you have done some implementations
> with with Spark MLLIb algorithms in streaming fashion as we have 
> discussed.
> If so, can you please share your code in a Github repo.
>
> Now i want to implements some machine learning algorithms with
>> importing mllib and want to run within your code base
>>
>
> For the moment you can try out editing the same class
> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
> add this separately. You should be able to add org.apache.spark.mllib.
> classes to there.
>
> And i want to see how event streams are coming from cep. As i think it
>> is not in a RDD format since it is arriving as the individual samples. I
>> will send a email to dev asking about how to get the streams.
>
>
> Please pay attention to length[1] and lengthbatch[1] inbuilt windows
> in siddhi. What you need to write are functions similar to a custom
> aggregate function[2].
> When you send the email to dev list, explain your requirement. You
> need to get a set of event with from a stream with a specified window size
> (number of events). Then build a model within that function. You also need
> to retain the data (learned weights, cluster centers, etc.) from the
> previous window to use in the current window. Ask what can be the most
> suitable option for this among the set of siddhi extensions given.
>
> Best regards.
>
> [1]
> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
> [2]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>
> On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> -- Forwarded message --
>> From: Mahesh Dananjaya 
>> Date: Wed, May 11, 2016 at 1:43 PM
>> Subject: Re: [Dev] GSOC2016: 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Maheshakya Wijewardena
Hi Mahesh,

I'll review your code sample and give you our feedback as soon as possible.
In the meantime, please go through the documentation on writing Siddhi
extensions to get some idea of how they work. It's better if you can try
writing some simple Siddhi extensions yourself and test them to get a good
understanding.
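
For example, a standalone test could look roughly like this (a sketch assuming
the Siddhi 3.x API shipped with CEP 4.0.0, and that your extension class, here
called StreamLinRegStreamProcessor as an illustrative name, is registered
under the ml namespace):

import org.wso2.siddhi.core.ExecutionPlanRuntime;
import org.wso2.siddhi.core.SiddhiManager;
import org.wso2.siddhi.core.event.Event;
import org.wso2.siddhi.core.query.output.callback.QueryCallback;
import org.wso2.siddhi.core.stream.input.InputHandler;
import org.wso2.siddhi.core.util.EventPrinter;

public class ExtensionTest {
    public static void main(String[] args) throws InterruptedException {
        SiddhiManager siddhiManager = new SiddhiManager();
        // Register the custom extension under the "ml" namespace
        // (StreamLinRegStreamProcessor is an illustrative class name).
        siddhiManager.setExtension("ml:streamlinreg", StreamLinRegStreamProcessor.class);

        String executionPlan =
            "define stream InputStream (x double, y double); " +
            "@info(name = 'query1') " +
            "from InputStream#ml:streamlinreg(x, y) " +
            "select * insert into OutputStream;";

        ExecutionPlanRuntime runtime = siddhiManager.createExecutionPlanRuntime(executionPlan);
        runtime.addCallback("query1", new QueryCallback() {
            @Override
            public void receive(long timeStamp, Event[] inEvents, Event[] removeEvents) {
                EventPrinter.print(timeStamp, inEvents, removeEvents); // inspect the output
            }
        });

        InputHandler inputHandler = runtime.getInputHandler("InputStream");
        runtime.start();
        inputHandler.send(new Object[]{1.0, 2.0});   // feed a few test events
        inputHandler.send(new Object[]{2.0, 3.9});
        Thread.sleep(500);
        runtime.shutdown();
    }
}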

Best regards.

On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java Docs and run some of the Spark examples on
> spark shell which are paramount improtant for our work. Then i have been
> writing my codes to check the Linear regression, K means for streaming.
> please check my git repo [1]. I think now i have to ask on dev regarding
> the capturing event streams for our work. I will update the recent things
> on git. check the park-example directory for java. examples run on git
> shell is not included there. In my case i think i have to build mini
> batches from data streams that comes as individual samples. Now i am
> working on some coding to collect mini batches from data streams.thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java Docs and run some of the Spark examples on
>> spark shell which are paramount improtant for our work. Then i have been
>> writing my codes to check the Linear regression, K means for streaming.
>> please check my git repo [1]. I think now i have to ask on dev regarding
>> the capturing event streams for our work. I will update the recent things
>> on git. check the park-example directory for java. examples run on git
>> shell is not included there. In my case i think i have to build mini
>> batches from data streams that comes as individual samples. Now i am
>> working on some coding to collect mini batches from data streams.thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> thank you. i will update the repo today.thank you.i changed the carbon
>>> ml siddhi extention and see how the changes are effecting. i will update
>>> the progress as soon as possible.thank you. i had some problem in spark
>>> mllib dependency. i was fixing that.
>>> regards,
>>> Mahesh.
>>> p.s: do i need to maintain a blog?
>>>
>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 Sorry for replying late.

 Thank you for the update. I believe you have done some implementations
 with with Spark MLLIb algorithms in streaming fashion as we have discussed.
 If so, can you please share your code in a Github repo.

 Now i want to implements some machine learning algorithms with
> importing mllib and want to run within your code base
>

 For the moment you can try out editing the same class
 PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
 add this separately. You should be able to add org.apache.spark.mllib.
 classes to there.

 And i want to see how event streams are coming from cep. As i think it
> is not in a RDD format since it is arriving as the individual samples. I
> will send a email to dev asking about how to get the streams.


 Please pay attention to length[1] and lengthbatch[1] inbuilt windows in
 siddhi. What you need to write are functions similar to a custom aggregate
 function[2].
 When you send the email to dev list, explain your requirement. You need
 to get a set of event with from a stream with a specified window size
 (number of events). Then build a model within that function. You also need
 to retain the data (learned weights, cluster centers, etc.) from the
 previous window to use in the current window. Ask what can be the most
 suitable option for this among the set of siddhi extensions given.

 Best regards.

 [1]
 https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
 [2]
 https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function

 On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

>
> -- Forwarded message --
> From: Mahesh Dananjaya 
> Date: Wed, May 11, 2016 at 1:43 PM
> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
> data for WSO2 Machine Learner
> To: Maheshakya Wijewardena 
>
>
> Hi Maheshakya,
> sorry for not updating. I did what you wanted me to do. I checked the
> code base and train functions. I went through those java docs. I went
> through the carbon-ml current implementation of LG and K-Mean. And i had
> 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-17 Thread Mahesh Dananjaya
Hi Maheshakya,
Did you check the repo? I will add my recent work today. I was also going
through the Java docs related to the Spark Streaming work; it is built around
the Scala API. Thank you.
regards,
Mahesh.

On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java Docs and run some of the Spark examples on
> spark shell which are paramount improtant for our work. Then i have been
> writing my codes to check the Linear regression, K means for streaming.
> please check my git repo [1]. I think now i have to ask on dev regarding
> the capturing event streams for our work. I will update the recent things
> on git. check the park-example directory for java. examples run on git
> shell is not included there. In my case i think i have to build mini
> batches from data streams that comes as individual samples. Now i am
> working on some coding to collect mini batches from data streams.thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> I have gone through the Java Docs and run some of the Spark examples on
>> spark shell which are paramount improtant for our work. Then i have been
>> writing my codes to check the Linear regression, K means for streaming.
>> please check my git repo [1]. I think now i have to ask on dev regarding
>> the capturing event streams for our work. I will update the recent things
>> on git. check the park-example directory for java. examples run on git
>> shell is not included there. In my case i think i have to build mini
>> batches from data streams that comes as individual samples. Now i am
>> working on some coding to collect mini batches from data streams.thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>
>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> thank you. i will update the repo today.thank you.i changed the carbon
>>> ml siddhi extention and see how the changes are effecting. i will update
>>> the progress as soon as possible.thank you. i had some problem in spark
>>> mllib dependency. i was fixing that.
>>> regards,
>>> Mahesh.
>>> p.s: do i need to maintain a blog?
>>>
>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 Sorry for replying late.

 Thank you for the update. I believe you have done some implementations
 with with Spark MLLIb algorithms in streaming fashion as we have discussed.
 If so, can you please share your code in a Github repo.

 Now i want to implements some machine learning algorithms with
> importing mllib and want to run within your code base
>

 For the moment you can try out editing the same class
 PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
 add this separately. You should be able to add org.apache.spark.mllib.
 classes to there.

 And i want to see how event streams are coming from cep. As i think it
> is not in a RDD format since it is arriving as the individual samples. I
> will send a email to dev asking about how to get the streams.


 Please pay attention to length[1] and lengthbatch[1] inbuilt windows in
 siddhi. What you need to write are functions similar to a custom aggregate
 function[2].
 When you send the email to dev list, explain your requirement. You need
 to get a set of event with from a stream with a specified window size
 (number of events). Then build a model within that function. You also need
 to retain the data (learned weights, cluster centers, etc.) from the
 previous window to use in the current window. Ask what can be the most
 suitable option for this among the set of siddhi extensions given.

 Best regards.

 [1]
 https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
 [2]
 https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function

 On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

>
> -- Forwarded message --
> From: Mahesh Dananjaya 
> Date: Wed, May 11, 2016 at 1:43 PM
> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
> data for WSO2 Machine Learner
> To: Maheshakya Wijewardena 
>
>
> Hi Maheshakya,
> sorry for not updating. I did what you wanted me to do. I checked the
> code base and train functions. I went through those java docs. I went
> through the carbon-ml current implementation of LG and K-Mean. And i had
> Apache Spark and i tried with several examples. Now i want to implements
> some machine learning algorithms 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-16 Thread Mahesh Dananjaya
Hi Maheshakya,
I have gone through the Java docs and run some of the Spark examples on the
Spark shell, which are of paramount importance for our work. I have then been
writing my own code to check linear regression and k-means for streaming;
please check my git repo [1]. I think I now have to ask on dev about capturing
event streams for our work. I will push the recent changes to git; check the
spark-example directory for the Java code (the examples run on the Spark
shell are not included there). In my case I think I have to build mini-batches
from data streams that arrive as individual samples, so now I am working on
code to collect mini-batches from the data streams. Thank you.
regards,
Mahesh.
[1]https://github.com/dananjayamahesh/GSOC2016

On Tue, May 17, 2016 at 10:10 AM, Mahesh Dananjaya <
dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have gone through the Java Docs and run some of the Spark examples on
> spark shell which are paramount improtant for our work. Then i have been
> writing my codes to check the Linear regression, K means for streaming.
> please check my git repo [1]. I think now i have to ask on dev regarding
> the capturing event streams for our work. I will update the recent things
> on git. check the park-example directory for java. examples run on git
> shell is not included there. In my case i think i have to build mini
> batches from data streams that comes as individual samples. Now i am
> working on some coding to collect mini batches from data streams.thank you.
> regards,
> Mahesh.
> [1]https://github.com/dananjayamahesh/GSOC2016
>
> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> thank you. i will update the repo today.thank you.i changed the carbon ml
>> siddhi extention and see how the changes are effecting. i will update the
>> progress as soon as possible.thank you. i had some problem in spark mllib
>> dependency. i was fixing that.
>> regards,
>> Mahesh.
>> p.s: do i need to maintain a blog?
>>
>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Sorry for replying late.
>>>
>>> Thank you for the update. I believe you have done some implementations
>>> with with Spark MLLIb algorithms in streaming fashion as we have discussed.
>>> If so, can you please share your code in a Github repo.
>>>
>>> Now i want to implements some machine learning algorithms with importing
 mllib and want to run within your code base

>>>
>>> For the moment you can try out editing the same class
>>> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
>>> add this separately. You should be able to add org.apache.spark.mllib.
>>> classes to there.
>>>
>>> And i want to see how event streams are coming from cep. As i think it
 is not in a RDD format since it is arriving as the individual samples. I
 will send a email to dev asking about how to get the streams.
>>>
>>>
>>> Please pay attention to length[1] and lengthbatch[1] inbuilt windows in
>>> siddhi. What you need to write are functions similar to a custom aggregate
>>> function[2].
>>> When you send the email to dev list, explain your requirement. You need
>>> to get a set of event with from a stream with a specified window size
>>> (number of events). Then build a model within that function. You also need
>>> to retain the data (learned weights, cluster centers, etc.) from the
>>> previous window to use in the current window. Ask what can be the most
>>> suitable option for this among the set of siddhi extensions given.
>>>
>>> Best regards.
>>>
>>> [1]
>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>> [2]
>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>>
>>> On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>

 -- Forwarded message --
 From: Mahesh Dananjaya 
 Date: Wed, May 11, 2016 at 1:43 PM
 Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
 data for WSO2 Machine Learner
 To: Maheshakya Wijewardena 


 Hi Maheshakya,
 sorry for not updating. I did what you wanted me to do. I checked the
 code base and train functions. I went through those java docs. I went
 through the carbon-ml current implementation of LG and K-Mean. And i had
 Apache Spark and i tried with several examples. Now i want to implements
 some machine learning algorithms with importing mllib and want to run
 within your code base. Can you help me with that.
 And i want to see how event streams are coming from cep. As i think it
 is not in a RDD format since it is arriving as the individual samples. I
 will send a email to dev asking about how to get the streams. I debugged
 many of those functions in the code base. So need further instructions to
 

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-16 Thread Mahesh Dananjaya
Hi Maheshakya,
Thank you, I will update the repo today. I changed the carbon-ml Siddhi
extension to see how the changes take effect, and I will update you on my
progress as soon as possible. I had some problems with the Spark MLlib
dependency, which I was fixing.
regards,
Mahesh.
P.S.: Do I need to maintain a blog?

On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> Sorry for replying late.
>
> Thank you for the update. I believe you have done some implementations
> with with Spark MLLIb algorithms in streaming fashion as we have discussed.
> If so, can you please share your code in a Github repo.
>
> Now i want to implements some machine learning algorithms with importing
>> mllib and want to run within your code base
>>
>
> For the moment you can try out editing the same class
> PredictStreamProcessor in the siddhi extension in carbon-ml. Later we will
> add this separately. You should be able to add org.apache.spark.mllib.
> classes to there.
>
> And i want to see how event streams are coming from cep. As i think it is
>> not in a RDD format since it is arriving as the individual samples. I will
>> send a email to dev asking about how to get the streams.
>
>
> Please pay attention to length[1] and lengthbatch[1] inbuilt windows in
> siddhi. What you need to write are functions similar to a custom aggregate
> function[2].
> When you send the email to dev list, explain your requirement. You need to
> get a set of event with from a stream with a specified window size (number
> of events). Then build a model within that function. You also need to
> retain the data (learned weights, cluster centers, etc.) from the previous
> window to use in the current window. Ask what can be the most suitable
> option for this among the set of siddhi extensions given.
>
> Best regards.
>
> [1]
> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
> [2]
> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>
> On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>>
>> -- Forwarded message --
>> From: Mahesh Dananjaya 
>> Date: Wed, May 11, 2016 at 1:43 PM
>> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
>> data for WSO2 Machine Learner
>> To: Maheshakya Wijewardena 
>>
>>
>> Hi Maheshakya,
>> sorry for not updating. I did what you wanted me to do. I checked the
>> code base and train functions. I went through those java docs. I went
>> through the carbon-ml current implementation of LG and K-Mean. And i had
>> Apache Spark and i tried with several examples. Now i want to implements
>> some machine learning algorithms with importing mllib and want to run
>> within your code base. Can you help me with that.
>> And i want to see how event streams are coming from cep. As i think it is
>> not in a RDD format since it is arriving as the individual samples. I will
>> send a email to dev asking about how to get the streams. I debugged many of
>> those functions in the code base. So need further instructions to
>> proceed.thank you.
>> regards,
>> Mahesh.
>>
>> On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> Any update on your progress?
>>>
>>> Best regards.
>>>
>>> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <
>>> mahesha...@wso2.com> wrote:
>>>
 Hi Mahesh,

 is that "Put break points in train methods in Linear Regression class"
> means the spark/algorithms/ LinearRegrassion.java class in the
> org.wso2.carbon.ml.core? is that the correct file?


 Yes, this is the correct place.

 You can refer to spark programming guide[1][2] as well as our ML code
 base when you try those algorithms out. Please try to do rough
 implementations of the streaming versions of linear regression, logistic
 regression and k-means clustering as we have discussed in the proposal in
 plain Java. It's better if you can create a git repo and share your code
 once you have made some progress.

 Were you able debug and understand the flow of the ML siddhi extension?
 I hope you haven't encountered more errors after switching the released
 version of CEP.

 Is this Friday okay for you? Afternoon at 2:00 pm?

 Best regards.


 Best regards.

 [1] http://spark.apache.org/docs/latest/programming-guide.html
 [2] http://spark.apache.org/docs/latest/mllib-guide.html

 On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have been looking into some algorithms related to stochastic
> gradient descent based algorithms.anything i should focus please let me
> know.Ans also i will be available for calling this week and next 
> week.thank
> you.
> BR,

Re: [Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-15 Thread Maheshakya Wijewardena
Hi Mahesh,

Sorry for replying late.

Thank you for the update. I believe you have done some implementations with
Spark MLlib algorithms in a streaming fashion, as we have discussed. If so,
can you please share your code in a GitHub repo?

Now i want to implements some machine learning algorithms with importing
> mllib and want to run within your code base
>

For the moment you can try editing the same class, PredictStreamProcessor,
in the Siddhi extension in carbon-ml; later we will add this separately. You
should be able to add org.apache.spark.mllib.* classes there.
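
For instance, something along these lines (only a sketch; it assumes a Spark
context is available to the extension and that buffered events are Object[]
rows of doubles with the label last, which is not necessarily how the
existing carbon-ml code is structured):

import java.util.ArrayList;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;
import org.apache.spark.mllib.regression.LinearRegressionModel;
import org.apache.spark.mllib.regression.LinearRegressionWithSGD;

public class MllibBatchTrainer {

    // Turn buffered events (features..., label) into LabeledPoints and retrain.
    public static LinearRegressionModel retrain(JavaSparkContext sc,
                                                List<Object[]> bufferedEvents,
                                                int numIterations) {
        List<LabeledPoint> points = new ArrayList<>();
        for (Object[] event : bufferedEvents) {
            int n = event.length - 1;
            double[] features = new double[n];
            for (int i = 0; i < n; i++) {
                features[i] = (Double) event[i];
            }
            double label = (Double) event[n];
            points.add(new LabeledPoint(label, Vectors.dense(features)));
        }
        JavaRDD<LabeledPoint> rdd = sc.parallelize(points);
        // Train on this mini-batch using MLlib's SGD-based linear regression.
        return LinearRegressionWithSGD.train(rdd.rdd(), numIterations);
    }
}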

And i want to see how event streams are coming from cep. As i think it is
> not in a RDD format since it is arriving as the individual samples. I will
> send a email to dev asking about how to get the streams.


Please pay attention to the length[1] and lengthbatch[1] inbuilt windows in
Siddhi. What you need to write are functions similar to a custom aggregate
function[2].
When you send the email to the dev list, explain your requirement: you need
to get a set of events from a stream with a specified window size (a number
of events) and then build a model within that function. You also need to
retain the data (learned weights, cluster centers, etc.) from the previous
window to use in the current window. Ask which of the given Siddhi extension
types would be the most suitable option for this.
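
To make the state-retention point concrete, here is a rough plain-Java sketch
(my own illustration, not Siddhi or MLlib code) of how cluster centers could
be kept from one window to the next and refined with each new window of
events:

import java.util.List;

// Retains k cluster centers across windows and refines them with each new batch,
// in the spirit of mini-batch k-means.
public class StreamingKMeansState {

    private final double[][] centers;   // state kept between windows
    private final long[] counts;        // how many points each center has absorbed so far

    public StreamingKMeansState(double[][] initialCenters) {
        this.centers = initialCenters;
        this.counts = new long[initialCenters.length];
    }

    // Called once per window with the points collected in that window.
    public void updateWith(List<double[]> windowPoints) {
        for (double[] point : windowPoints) {
            int nearest = nearestCenter(point);
            counts[nearest]++;
            double eta = 1.0 / counts[nearest];          // per-center learning rate
            double[] c = centers[nearest];
            for (int d = 0; d < c.length; d++) {
                c[d] += eta * (point[d] - c[d]);         // move the center toward the point
            }
        }
    }

    private int nearestCenter(double[] point) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int k = 0; k < centers.length; k++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centers[k][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    public double[][] getCenters() {
        return centers;
    }
}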

Best regards.

[1]
https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
[2] https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function

On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya  wrote:

>
> -- Forwarded message --
> From: Mahesh Dananjaya 
> Date: Wed, May 11, 2016 at 1:43 PM
> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online
> data for WSO2 Machine Learner
> To: Maheshakya Wijewardena 
>
>
> Hi Maheshakya,
> sorry for not updating. I did what you wanted me to do. I checked the code
> base and train functions. I went through those java docs. I went through
> the carbon-ml current implementation of LG and K-Mean. And i had Apache
> Spark and i tried with several examples. Now i want to implements some
> machine learning algorithms with importing mllib and want to run within
> your code base. Can you help me with that.
> And i want to see how event streams are coming from cep. As i think it is
> not in a RDD format since it is arriving as the individual samples. I will
> send a email to dev asking about how to get the streams. I debugged many of
> those functions in the code base. So need further instructions to
> proceed.thank you.
> regards,
> Mahesh.
>
> On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> Any update on your progress?
>>
>> Best regards.
>>
>> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>> is that "Put break points in train methods in Linear Regression class"
 means the spark/algorithms/ LinearRegrassion.java class in the
 org.wso2.carbon.ml.core? is that the correct file?
>>>
>>>
>>> Yes, this is the correct place.
>>>
>>> You can refer to spark programming guide[1][2] as well as our ML code
>>> base when you try those algorithms out. Please try to do rough
>>> implementations of the streaming versions of linear regression, logistic
>>> regression and k-means clustering as we have discussed in the proposal in
>>> plain Java. It's better if you can create a git repo and share your code
>>> once you have made some progress.
>>>
>>> Were you able debug and understand the flow of the ML siddhi extension?
>>> I hope you haven't encountered more errors after switching the released
>>> version of CEP.
>>>
>>> Is this Friday okay for you? Afternoon at 2:00 pm?
>>>
>>> Best regards.
>>>
>>>
>>> Best regards.
>>>
>>> [1] http://spark.apache.org/docs/latest/programming-guide.html
>>> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>>>
>>> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 I have been looking into some algorithms related to stochastic gradient
 descent based algorithms.anything i should focus please let me know.Ans
 also i will be available for calling this week and next week.thank you.
 BR,
 Mahesh.

 On Tue, May 3, 2016 at 5:05 PM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> Thank you, that's good. I have been trying to fix that for a couple of
> days; please inform me when it will be fixed. Now I have been testing the ML
> algorithms and trying to identify the flow and the hierarchy. Does "Put
> break points in train methods in Linear Regression class" mean the
> spark/algorithms/LinearRegression.java class in
> org.wso2.carbon.ml.core? Is that the correct 

[Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-11 Thread Mahesh Dananjaya
-- Forwarded message --
From: Mahesh Dananjaya 
Date: Wed, May 11, 2016 at 1:43 PM
Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with online data
for WSO2 Machine Learner
To: Maheshakya Wijewardena 


Hi Maheshakya,
Sorry for not updating. I did what you wanted me to do. I checked the code
base and the train functions, went through the Java docs, and went through
the current carbon-ml implementations of LG and K-Means. I also had Apache
Spark set up and tried several examples. Now I want to implement some
machine learning algorithms by importing MLlib and run them within your
code base. Can you help me with that?
I also want to see how event streams arrive from CEP. I think they are not
in RDD format, since they arrive as individual samples. I will send an email
to dev asking about how to get the streams. I debugged many of those
functions in the code base, so I need further instructions on how to
proceed. Thank you.
regards,
Mahesh.

On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <
mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> Any update on your progress?
>
> Best regards.
>
> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>> is that "Put break points in train methods in Linear Regression class"
>>> means the spark/algorithms/ LinearRegrassion.java class in the
>>> org.wso2.carbon.ml.core? is that the correct file?
>>
>>
>> Yes, this is the correct place.
>>
>> You can refer to the Spark programming guide[1] and the MLlib guide[2] as
>> well as our ML code base when you try those algorithms out. Please try to
>> do rough implementations of the streaming versions of linear regression,
>> logistic regression and k-means clustering, as we have discussed in the
>> proposal, in plain Java. It's better if you can create a git repo and share
>> your code once you have made some progress.
>>
>> Were you able to debug and understand the flow of the ML siddhi extension? I
>> hope you haven't encountered more errors after switching to the released
>> version of CEP.
>>
>> Is this Friday okay for you? Afternoon at 2:00 pm?
>>
>> Best regards.
>>
>>
>>
>> [1] http://spark.apache.org/docs/latest/programming-guide.html
>> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>>
>> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> I have been looking into some stochastic gradient descent based algorithms.
>>> If there is anything I should focus on, please let me know. I will also be
>>> available for a call this week and next week. Thank you.
>>> BR,
>>> Mahesh.
>>>
>>> On Tue, May 3, 2016 at 5:05 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 Thank you, that's good. I have been trying to fix that for a couple of
 days; please inform me when it will be fixed. Now I have been testing the ML
 algorithms and trying to identify the flow and the hierarchy. Does "Put
 break points in train methods in Linear Regression class" mean the
 spark/algorithms/LinearRegression.java class in
 org.wso2.carbon.ml.core? Is that the correct file?
 I am also planning to write some programs that use the Apache Spark MLlib
 algorithms, and I am referring to [1] and some WSO2 documentation to get an
 idea of the ML structure. Thank you.

 BR,
 Mahesh.

 [1] nirmalfdo.blogspot.com

 On Tue, May 3, 2016 at 4:36 PM, Maheshakya Wijewardena <
 mahesha...@wso2.com> wrote:

> Hi Mahesh,
>
> I have checked. It seems the issue you have encountered is caused only
> in the current development branch of product-cep: it doesn't identify
> the ML siddhi extension as an extension. The ML siddhi extension works fine in
> the latest release of CEP (4.1.0) [1].
> Until we figure out the reason and come up with a solution, can you
> use the latest CEP release for your work? It's fine to use that since you
> haven't started actual development yet.
>
> Best regards.
>
> [1] http://wso2.com/products/complex-event-processor/
>
> On Tue, May 3, 2016 at 3:19 PM, Maheshakya Wijewardena <
> mahesha...@wso2.com> wrote:
>
>> Hi Mahesh,
>>
>>
>>> Is it vital to use those local repos in my upcoming implementation?
>>
>>
>> Yes. The remote p2-repo contains the p2-repos of released versions.
>> What you have to develop against is the current master of carbon-ml and
>> product-ml. You can try it out with the modification I have suggested. In the
>> meantime, I'll verify whether the current repos are working as expected.
>>
>> I am also trying to debug the carbon-ml org.wso2.carbon.ml.core module
>>> by putting some break points in the spark/algorithms/LinearRegression class
>>
>>
>> It's great that you have started looking at the implementation 

[Dev] Fwd: GSOC2016: [ML][CEP] Predictive analytic with online data for WSO2 Machine Learner

2016-05-02 Thread Mahesh Dananjaya
Hi Maheshakya,
I have installed them correctly. Now I am trying to debug the Siddhi
extension with CEP as [1] describes. I created an input
stream and a prediction stream (output stream), but when I tried to create a
new execution plan with the above streams, I got an error when I clicked
"Validate Query Expression". The error was:
No extension exist for StreamFunctionExtension{namespace='ml'} in execution
plan "ExecutionPlan"

My execution plan is as follows:

/* Enter a unique ExecutionPlan */
@Plan:name('ExecutionPlan')

/* Enter a unique description for ExecutionPlan */
-- @Plan:description('ExecutionPlan')

/* define streams/tables and write queries here ... */

@Import('InputStream:1.0.0')
define stream InputStream (NumPregnancies double, TSFT double, DPF double,
BMI double, DBP double, PG2 double, Age double, SI2 double);

@Export('PredictionStream:1.0.0')
define stream PredictionStream (NumPregnancies double, TSFT double, DPF
double, BMI double, DBP double, PG2 double, Age double, SI2 double, Class
double);

from
InputStream#ml:predict('file:///home/mahesh/GSOC/WSO2/data-set/pima-indian-diabetes.data','double')
select *
insert into PredictionStream;


I used a file instead of the registry. I also referred to [2], where the
suggested fix is to run CEP in distributed mode with an Apache Storm
cluster.

1. Does the CEP distribution I built run in distributed mode by default?
2. Is this caused by the current user not having sudo privileges when
installing the ML features onto CEP?
3. Is this the correct way to pass the file to CEP?

[1]
https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension

[2]https://wso2.org/jira/browse/CEP-1400

BR,
Mahesh.


On Mon, May 2, 2016 at 12:35 PM, Maheshakya Wijewardena  wrote:

> Hi Mahesh,
>
> If you have built product-ml, you can find the P2-repo at
> product-ml/modules/p2-profile/target/p2-repo
> Add this folder as a local repository.
> After that, you should be able to see the ML features.
>
> Best regards.
>
> On Mon, May 2, 2016 at 12:24 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Since I already have carbon-ml built on my PC, can I use my local
>> repository to install those features into CEP? Is that correct? Thank you.
>> regards,
>> Mahesh.
>>
>> On Mon, May 2, 2016 at 12:20 PM, Mahesh Dananjaya <
>> dananjayamah...@gmail.com> wrote:
>>
>>> Hi Maheshakya,
>>> Can you please tell me how to find the most recent p2 repository URL for
>>> adding Machine Learner Core, Machine Learner Commons, Machine Learner Database
>>> Service and the ML Siddhi extension as features in CEP, as described in
>>> [1]? When I use the
>>> http://product-dist.wso2.com/p2/carbon/releases/4.2.0/ URL, those
>>> features are not visible in CEP. Is that not the most recent one?
>>> BR,
>>> Mahesh.
>>>
>>> [1]
>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>
>>> On Mon, May 2, 2016 at 11:28 AM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>
 Hi Maheshakya,
 Sorry for the incomplete message. I have set up the dev environment and
 now I am trying to debug remotely. The following steps were done:
 1. Built product-cep, carbon-ml and product-ml from source.
 2. Went through their code bases, trying to understand the flow you
 developed.
 3. Set up break points in org.wso2.carbon.ml.siddhi.extension in
 carbon-ml.
 4. Started ./wso2server.sh debug 5005 in the SNAPSHOT directory of
 product-ml.
 5. Tried to trigger the break points following the [1] reference; the break
 points are placed in the PredictStreamProcessor.java file within the
 extension.

 This is the way I followed. I was also trying to remotely debug the ML core
 (org.wso2.carbon.ml.core) by putting break points in the Spark Java
 files. Is this the right way to do those things?

 [1]
 https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension

 On Mon, May 2, 2016 at 11:19 AM, Mahesh Dananjaya <
 dananjayamah...@gmail.com> wrote:

> Hi Maheshakya,
> I have set up the dev environment and now I am trying to debug remotely.
> The following steps were done:
> 1. Built product-cep, carbon-ml and product-ml from source.
> 2. Went through their code bases, trying to understand the flow you
> developed.
> 3. Set up break points in
>
>
> On Thu, Apr 28, 2016 at 7:05 PM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>
>> Hi Maheshakya,
>> Ok, I got it. Thank you.
>> regards,
>> Mahesh.
>>
>> On Thu, Apr 28, 2016 at 6:56 PM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi Mahesh,
>>>
>>>