Hi Mahesh,

You don't actually have to implement anything in Spark Streaming. Try to
understand how streaming data is handled in Spark and the specifics of the
underlying streaming algorithms.
What we want is to have similar algorithms that support CEP event
streams with Siddhi.
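
To make the idea concrete, here is a rough sketch in plain Java. It is only an
illustration under my own assumptions: the class StreamingLinearRegression and
its methods are made up for this email and are not part of Siddhi, carbon-ml or
Spark MLlib. It buffers events one at a time, runs one mini-batch gradient step
when a window of a fixed number of events is full, and keeps the learned
weights for the next window.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical class, not Siddhi, carbon-ml or Spark MLlib code.
final class StreamingLinearRegression {
    private final double[] weights;   // learned weights, kept across windows
    private double intercept;         // learned intercept, kept across windows
    private final int windowSize;     // number of events per mini-batch
    private final double stepSize;    // gradient descent step size
    private final List<double[]> xs = new ArrayList<>();
    private final List<Double> ys = new ArrayList<>();
    private int windowsSeen;

    StreamingLinearRegression(int numFeatures, int windowSize, double stepSize) {
        this.weights = new double[numFeatures];
        this.windowSize = windowSize;
        this.stepSize = stepSize;
    }

    // Buffers one event. Returns true if this event filled the window
    // and triggered a model update, false otherwise.
    boolean onEvent(double[] features, double label) {
        xs.add(features.clone());
        ys.add(label);
        if (xs.size() < windowSize) {
            return false;
        }
        updateFromWindow();
        xs.clear();
        ys.clear();
        return true;
    }

    // One gradient step on the mean squared error of the buffered window,
    // starting from the weights left by the previous window.
    private void updateFromWindow() {
        double[] gradient = new double[weights.length];
        double interceptGradient = 0.0;
        for (int i = 0; i < xs.size(); i++) {
            double[] x = xs.get(i);
            double error = predict(x) - ys.get(i);
            for (int j = 0; j < weights.length; j++) {
                gradient[j] += error * x[j];
            }
            interceptGradient += error;
        }
        int n = xs.size();
        for (int j = 0; j < weights.length; j++) {
            weights[j] -= stepSize * gradient[j] / n;
        }
        intercept -= stepSize * interceptGradient / n;
        windowsSeen++;
    }

    double predict(double[] features) {
        double sum = intercept;
        for (int j = 0; j < weights.length; j++) {
            sum += weights[j] * features[j];
        }
        return sum;
    }

    double[] getWeights() { return weights.clone(); }
    double getIntercept() { return intercept; }
    int getWindowsSeen() { return windowsSeen; }
}
```

In a Siddhi extension, something like onEvent would presumably be called
wherever the extension receives each event. How best to wire that up is still
the question to ask on the dev list.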

Best regards.

On Wed, May 18, 2016 at 9:38 AM, Mahesh Dananjaya <dananjayamah...@gmail.com
> wrote:

> Hi Maheshakya,
> Did you check the repo? I will add my recent work today. I was also going
> through the Java docs related to the Spark streaming work. It is with that
> Scala API. Thank you.
> regards,
> Mahesh.
> On Tue, May 17, 2016 at 10:11 AM, Mahesh Dananjaya <
> dananjayamah...@gmail.com> wrote:
>> Hi Maheshakya,
>> I have gone through the Java docs and run some of the Spark examples in the
>> Spark shell, which are of paramount importance for our work. Since then I have
>> been writing my own code to check linear regression and k-means for streaming.
>> Please check my git repo [1]. I think I now have to ask on dev about
>> capturing event streams for our work. I will update git with my recent work.
>> Check the park-example directory for Java. Examples run in the git
>> shell are not included there. In my case I think I have to build mini
>> batches from data streams that arrive as individual samples. I am now
>> working on some code to collect mini batches from data streams. Thank you.
>> regards,
>> Mahesh.
>> [1]https://github.com/dananjayamahesh/GSOC2016
>>> On Mon, May 16, 2016 at 1:19 PM, Mahesh Dananjaya <
>>> dananjayamah...@gmail.com> wrote:
>>>> Hi Maheshakya,
>>>> Thank you. I will update the repo today. I changed the carbon-ml
>>>> Siddhi extension to see what effect the changes have. I will report
>>>> my progress as soon as possible. I also had a problem with the Spark
>>>> MLlib dependency, which I was fixing.
>>>> regards,
>>>> Mahesh.
>>>> P.S. Do I need to maintain a blog?
>>>> On Mon, May 16, 2016 at 10:02 AM, Maheshakya Wijewardena <
>>>> mahesha...@wso2.com> wrote:
>>>>> Hi Mahesh,
>>>>> Sorry for replying late.
>>>>> Thank you for the update. I believe you have done some implementations
>>>>> with Spark MLlib algorithms in a streaming fashion, as we
>>>>> discussed.
>>>>> If so, can you please share your code in a GitHub repo?
>>>>>> Now i want to implement some machine learning algorithms by
>>>>>> importing mllib and want to run them within your code base
>>>>> For the moment you can try editing the same class,
>>>>> PredictStreamProcessor, in the Siddhi extension in carbon-ml. Later we will
>>>>> add this separately. You should be able to add org.apache.spark.mllib
>>>>> classes there.
>>>>>> And i want to see how event streams are coming from cep. As i think it
>>>>>> is not in a RDD format since it is arriving as the individual samples. I
>>>>>> will send a email to dev asking about how to get the streams.
>>>>> Please pay attention to the length[1] and lengthbatch[1] inbuilt windows
>>>>> in Siddhi. What you need to write are functions similar to a custom
>>>>> aggregate function[2].
>>>>> When you send the email to the dev list, explain your requirement. You
>>>>> need to get a set of events from a stream with a specified window size
>>>>> (number of events), then build a model within that function. You also need
>>>>> to retain the data (learned weights, cluster centers, etc.) from the
>>>>> previous window to use in the current window. Ask which of the
>>>>> available Siddhi extension types is the most suitable option for this.
>>>>> Best regards.
>>>>> [1]
>>>>> https://docs.wso2.com/display/CEP400/Inbuilt+Windows#InbuiltWindows-lengthlength
>>>>> [2]
>>>>> https://docs.wso2.com/display/CEP400/Writing+a+Custom+Aggregate+Function
>>>>> On Wed, May 11, 2016 at 1:43 PM, Mahesh Dananjaya <
>>>>> dananjayamah...@gmail.com> wrote:
>>>>>> ---------- Forwarded message ----------
>>>>>> From: Mahesh Dananjaya <dananjayamah...@gmail.com>
>>>>>> Date: Wed, May 11, 2016 at 1:43 PM
>>>>>> Subject: Re: [Dev] GSOC2016: [ML][CEP] Predictive analytic with
>>>>>> online data for WSO2 Machine Learner
>>>>>> To: Maheshakya Wijewardena <mahesha...@wso2.com>
>>>>>> Hi Maheshakya,
>>>>>> Sorry for not updating. I did what you wanted me to do. I checked the
>>>>>> code base and the train functions. I went through those Java docs. I went
>>>>>> through the current carbon-ml implementation of LG and K-Means. I also have
>>>>>> Apache Spark and tried several examples. Now I want to implement
>>>>>> some machine learning algorithms by importing mllib and run them
>>>>>> within your code base. Can you help me with that?
>>>>>> I also want to see how event streams arrive from CEP. I think they
>>>>>> are not in RDD format, since they arrive as individual samples. I
>>>>>> will send an email to dev asking how to get the streams. I debugged
>>>>>> many of those functions in the code base, so I need further instructions to
>>>>>> proceed. Thank you.
>>>>>> regards,
>>>>>> Mahesh.
>>>>>> On Wed, May 11, 2016 at 10:32 AM, Maheshakya Wijewardena <
>>>>>> mahesha...@wso2.com> wrote:
>>>>>>> Hi Mahesh,
>>>>>>> Any update on your progress?
>>>>>>> Best regards.
>>>>>>> On Wed, May 4, 2016 at 8:35 PM, Maheshakya Wijewardena <
>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>> Hi Mahesh,
>>>>>>>>> is that "Put break points in train methods in Linear Regression
>>>>>>>>> class" means the spark/algorithms/ LinearRegrassion.java class in the
>>>>>>>>> org.wso2.carbon.ml.core? is that the correct file?
>>>>>>>> Yes, this is the correct place.
>>>>>>>> You can refer to the Spark programming guide[1][2] as well as our ML
>>>>>>>> code base when you try those algorithms out. Please try to do rough
>>>>>>>> implementations, in plain Java, of the streaming versions of linear
>>>>>>>> regression, logistic regression and k-means clustering, as we
>>>>>>>> discussed in the proposal. It would be good if you could create a git
>>>>>>>> repo and share your code once you have made some progress.
>>>>>>>> Were you able to debug and understand the flow of the ML Siddhi
>>>>>>>> extension? I hope you haven't encountered more errors after switching to
>>>>>>>> the released version of CEP.
>>>>>>>> Is this Friday okay for you? Afternoon at 2:00 pm?
>>>>>>>> Best regards.
>>>>>>>> [1] http://spark.apache.org/docs/latest/programming-guide.html
>>>>>>>> [2] http://spark.apache.org/docs/latest/mllib-guide.html
>>>>>>>> On Wed, May 4, 2016 at 1:07 PM, Mahesh Dananjaya <
>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>> Hi Maheshakya,
>>>>>>>>> I have been looking into some algorithms based on stochastic
>>>>>>>>> gradient descent. If there is anything I should focus on, please let
>>>>>>>>> me know. I will also be available for a call this week and next
>>>>>>>>> week. Thank you.
>>>>>>>>> BR,
>>>>>>>>> Mahesh.
>>>>>>>>> On Tue, May 3, 2016 at 5:05 PM, Mahesh Dananjaya <
>>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>> Thank you, that's good. I have been trying to fix that for a couple
>>>>>>>>>> of days. Please let me know when it is fixed. I have now been
>>>>>>>>>> testing the ML algorithms and trying to identify the flow and the
>>>>>>>>>> hierarchy. Does "Put break points in train methods in Linear
>>>>>>>>>> Regression class" mean the spark/algorithms/ LinearRegrassion.java
>>>>>>>>>> class in org.wso2.carbon.ml.core? Is that the correct file?
>>>>>>>>>> I am also planning to write some programs that use Apache Spark
>>>>>>>>>> MLlib algorithms. I am referring to [1] and some WSO2 documentation to
>>>>>>>>>> get an idea of the ML structure. Thank you.
>>>>>>>>>> BR,
>>>>>>>>>> Mahesh.
>>>>>>>>>> [1]nirmalfdo.blogspot.com
>>>>>>>>>> On Tue, May 3, 2016 at 4:36 PM, Maheshakya Wijewardena <
>>>>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>> I have checked. It seems the issue you encountered occurs
>>>>>>>>>>> only in the current development branch of product-cep, which
>>>>>>>>>>> doesn't identify the ML Siddhi extension as an extension. The ML
>>>>>>>>>>> Siddhi extension works fine in the latest release of CEP (4.1.0) [1].
>>>>>>>>>>> Until we figure out the reason and come up with a solution, can
>>>>>>>>>>> you use the latest CEP release for your work? It's fine to use that,
>>>>>>>>>>> since you haven't started actual development yet.
>>>>>>>>>>> Best regards.
>>>>>>>>>>> [1] http://wso2.com/products/complex-event-processor/
>>>>>>>>>>> On Tue, May 3, 2016 at 3:19 PM, Maheshakya Wijewardena <
>>>>>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>> Is it vital to use those local repo in my upcoming
>>>>>>>>>>>>> implementation?
>>>>>>>>>>>> Yes. The remote p2-repo contains the p2-repos of released
>>>>>>>>>>>> versions. What you have to develop on is the current master of
>>>>>>>>>>>> carbon-ml and product-ml. You can try the modification I
>>>>>>>>>>>> suggested. In the meantime, I'll verify whether the current repos
>>>>>>>>>>>> are working as expected.
>>>>>>>>>>>>> And also i am trying to debug the carbon-ml
>>>>>>>>>>>>> org.wso2.carbon.ml.core by putting some break point in the
>>>>>>>>>>>>> spark/algorithms/Linear Regression
>>>>>>>>>>>> It's great that you have started looking at the implementation
>>>>>>>>>>>> of linear regression as well. Put break points in the train methods
>>>>>>>>>>>> in the LinearRegression class. This is what is used when you run
>>>>>>>>>>>> linear regression from the UI.
>>>>>>>>>>>>> I can see some comments left behind for streaming algo as
>>>>>>>>>>>>> well.thank you
>>>>>>>>>>>> You may be referring to the linear regression with SGD model.
>>>>>>>>>>>> There is no retraining with streaming data involved here. SGD with
>>>>>>>>>>>> mini-batches is used to train the model on the data set only once.
>>>>>>>>>>>> What you have to do is create a similar mechanism that takes in
>>>>>>>>>>>> streaming data and retrains models. We will get to that part once
>>>>>>>>>>>> you are comfortable with Siddhi extensions.
>>>>>>>>>>>> BTW, is it possible for you to join a call this Friday or
>>>>>>>>>>>> next week? We'll try to resolve your current issues and
>>>>>>>>>>>> discuss the project further.
>>>>>>>>>>>> Best regards.
>>>>>>>>>>>> On Tue, May 3, 2016 at 1:03 PM, Mahesh Dananjaya <
>>>>>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>>>>>> Hi maheshakya,
>>>>>>>>>>>>> Is it OK to go with the p2 repo at
>>>>>>>>>>>>> http://product-dist.wso2.com/p2/carbon/releases/wilkes/features/
>>>>>>>>>>>>> rather than the p2-repo at
>>>>>>>>>>>>> product-ml/modules/p2-profile/target/p2-repo in the
>>>>>>>>>>>>> local repo? What is the impact? Is it vital to use the local
>>>>>>>>>>>>> repo in my upcoming implementation? I tried giving the remote p2
>>>>>>>>>>>>> repo to the CEP built from source and debugging the CEP ML
>>>>>>>>>>>>> extension, and got the same error as yesterday. But the pre-built
>>>>>>>>>>>>> product is working fine. Therefore I am now
>>>>>>>>>>>>> trying what you described in the last email.
>>>>>>>>>>>>> I am also trying to debug the carbon-ml
>>>>>>>>>>>>> org.wso2.carbon.ml.core by putting some break points in
>>>>>>>>>>>>> spark/algorithms/Linear Regression. I am trying to trigger it from the
>>>>>>>>>>>>> product-ml project with a data set. Does linear regression
>>>>>>>>>>>>> in the UI also use those Spark algorithms, or is it in another
>>>>>>>>>>>>> place? I can see some comments left behind for streaming
>>>>>>>>>>>>> algorithms as well. Thank you.
>>>>>>>>>>>>> BR,
>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>> On Tue, May 3, 2016 at 9:35 AM, Maheshakya Wijewardena <
>>>>>>>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>> The earlier error you mentioned may be due to
>>>>>>>>>>>>>> incompatible Siddhi versions in the ML p2-repo and CEP, when you add
>>>>>>>>>>>>>> the p2-repo from the product-ml you built.
>>>>>>>>>>>>>> The current Siddhi version in product-cep is 3.0.6-SNAPSHOT[1],
>>>>>>>>>>>>>> but in ML it's 3.0.2.
>>>>>>>>>>>>>> Can you try changing the siddhi.version in carbon-ml/pom.xml
>>>>>>>>>>>>>> to 3.0.6-SNAPSHOT, build carbon-ml, then build product-ml again?
>>>>>>>>>>>>>> After this, add the p2-repo as a local repository again to a fresh
>>>>>>>>>>>>>> CEP pack and try it out.
>>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>> On Mon, May 2, 2016 at 7:02 PM, Mahesh Dananjaya <
>>>>>>>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>>>>>>>> Hi Maheshakya,
>>>>>>>>>>>>>>> I can now remotely debug the CEP extension for ML prediction.
>>>>>>>>>>>>>>> I have pre-built versions of both CEP and ML, so I
>>>>>>>>>>>>>>> used the pre-built CEP and did the same thing I had been doing
>>>>>>>>>>>>>>> with the source code. I think the only change I made was to install
>>>>>>>>>>>>>>> the packages from the remote p2 repo. This worked fine, and I
>>>>>>>>>>>>>>> debugged the carbon-ml org.wso2.carbon.ml.siddhi.extension as
>>>>>>>>>>>>>>> described in [1]. So now I have to try the same thing with the
>>>>>>>>>>>>>>> build from source. Thank you.
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>>>>>>>>>>>>> BR,
>>>>>>>>>>>>>>> Mahesh
>>>>>>>>>>>>>>> On Mon, Apr 25, 2016 at 5:49 PM, Maheshakya Wijewardena <
>>>>>>>>>>>>>>> mahesha...@wso2.com> wrote:
>>>>>>>>>>>>>>>> Hi Mahesh,
>>>>>>>>>>>>>>>> Congratulations and welcome to GSoC 2016. You did a great
>>>>>>>>>>>>>>>> job in preparing the proposal. Now it's time to dig deep and
>>>>>>>>>>>>>>>> get started with the project.
>>>>>>>>>>>>>>>> First of all, you need to familiarize yourself with the code base.
>>>>>>>>>>>>>>>> We have agreed to implement this with CEP event streams. We
>>>>>>>>>>>>>>>> already have a CEP extension for predictions [1][2]. Go through
>>>>>>>>>>>>>>>> this implementation and familiarize yourself with it. You need to
>>>>>>>>>>>>>>>> understand how:
>>>>>>>>>>>>>>>>    1. Event streams are consumed
>>>>>>>>>>>>>>>>    2. Predictions are made from individual events
>>>>>>>>>>>>>>>>    3. Results are sent back
>>>>>>>>>>>>>>>> Get the WSO2 ML and CEP sources (you may use the latest released
>>>>>>>>>>>>>>>> version of CEP) and build the products. Get both the carbon-ml[3]
>>>>>>>>>>>>>>>> and product-ml[4] masters and create new branches from them for
>>>>>>>>>>>>>>>> your work.
>>>>>>>>>>>>>>>> After you build the products, you may need to do remote
>>>>>>>>>>>>>>>> debugging[5] to understand the flow. So please follow an
>>>>>>>>>>>>>>>> example of real-time prediction with ML in the debugger to get
>>>>>>>>>>>>>>>> some idea. The component you need to debug is
>>>>>>>>>>>>>>>> org.wso2.carbon.ml.siddhi.extension.
>>>>>>>>>>>>>>>> The next tasks would be implementing online learning algorithms
>>>>>>>>>>>>>>>> in plain Java with Spark MLlib and integrating them into ML.
>>>>>>>>>>>>>>>> We also need to come up with a proper and detailed architecture
>>>>>>>>>>>>>>>> to employ those algorithms in ML. Getting familiar with the
>>>>>>>>>>>>>>>> aforementioned sections should give you some insight into how
>>>>>>>>>>>>>>>> this should be implemented.
>>>>>>>>>>>>>>>> So please try to get a quick grasp; then you can start the
>>>>>>>>>>>>>>>> implementation. Let us know if you have any questions or
>>>>>>>>>>>>>>>> get stuck somewhere.
>>>>>>>>>>>>>>>> Also, please always copy the WSO2 developers' list when
>>>>>>>>>>>>>>>> you communicate with us regarding the project, so that you can
>>>>>>>>>>>>>>>> get opinions and feedback from others as well.
>>>>>>>>>>>>>>>> Best regards.
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> https://docs.wso2.com/display/ML110/WSO2+CEP+Extension+for+ML+Predictions#WSO2CEPExtensionforMLPredictions-Siddhisyntaxfortheextension
>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>> https://github.com/wso2/carbon-ml/tree/master/components/extensions/org.wso2.carbon.ml.siddhi.extension
>>>>>>>>>>>>>>>> [3] https://github.com/wso2/carbon-ml
>>>>>>>>>>>>>>>> [4] https://github.com/wso2/product-ml
>>>>>>>>>>>>>>>> [5] https://dzone.com/articles/how-debug-wso2-carbon-kernel
>>>>>>>>>>>>>>>> On Mon, Apr 25, 2016 at 3:33 PM, Mahesh Dananjaya <
>>>>>>>>>>>>>>>> dananjayamah...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>> Thank you for accepting my GSoC 2016 proposal. I am
>>>>>>>>>>>>>>>>> looking forward to further instructions and to continuing the
>>>>>>>>>>>>>>>>> project. Thank you very much.
>>>>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>>>>> Mahesh.
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Pruthuvi Maheshakya Wijewardena
>>>>>>>>>>>>>>>> mahesha...@wso2.com
>>>>>>>>>>>>>>>> +94711228855
>>>>>> _______________________________________________
>>>>>> Dev mailing list
>>>>>> Dev@wso2.org
>>>>>> http://wso2.org/cgi-bin/mailman/listinfo/dev
