Hi Danula,

How is it coming along?

On Tue, Aug 11, 2015 at 1:51 AM, Danula Eranjith <[email protected]> wrote:

> Hi Supun,
>
> The following points were discussed in the meeting.
>
> *Integration to ML*
>
> We decided to add the wrangler interface as the first step, considering the
> current ML implementation.
>
> So the steps from a user's perspective would be as follows:
>
> - A sample from the dataset will be sent to the wrangler interface.
> - The user can apply the desired operations in the wrangler interface.
> - The user can return to ML by clicking a button in the interface.
> - Viewing the script will be optional for the user.
> - When returned to ML, Spark transformations are automatically generated
>   and applied to the dataset.
>
> *Spark Transformations*
>
> I have implemented all the wrangler transformations by extending a single
> abstract class. These operations are invoked by parsing the JavaScript code
> generated by wrangler. However, since ML Spark transformations are applied
> all together at the end of the process, I have to persist all the
> parameters and keep the operations as a list which can be invoked later.
>
> Nirmal pointed out that this could be achieved by using the chain of
> responsibility design pattern. I am currently changing the implementation
> accordingly.
>
> I will get back to you and Nirmal when the automation process is completed,
> to start the integration.
>
> Regards,
> Danula
>
> On Mon, Aug 10, 2015 at 9:29 PM, Supun Sethunga <[email protected]> wrote:
>
>> Any update?
>>
>> On Fri, Aug 7, 2015 at 10:13 AM, Supun Sethunga <[email protected]> wrote:
>>
>>> Hi Danula,
>>>
>>> Sorry I couldn't join the meeting. Can you please share the
>>> meeting/review notes? Also the progress on the suggestions and what is
>>> left to be done overall?
>>>
>>> Thanks,
>>> Supun
>>>
>>> On Wed, Aug 5, 2015 at 3:47 AM, Nirmal Fernando <[email protected]> wrote:
>>>
>>>> Hi Danula,
>>>>
>>>> It should be a JavaRDD<String[]>, where each row represents the feature
>>>> vector as a String[].
>>>>
>>>> On Tue, Aug 4, 2015 at 11:51 AM, Danula Eranjith <[email protected]> wrote:
>>>>
>>>>> In other words,
>>>>> What would be the preferred output type for a dataset which is
>>>>> pre-processed by wrangler?
>>>>> As I have observed different algorithms use different JavaRDD types as
>>>>> input ( JavaRDD<String>, JavaRDD<Vector> etc )
>>>>>
>>>>> On Tue, Aug 4, 2015 at 11:48 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>
>>>>>> Hi Danula,
>>>>>>
>>>>>> On Tue, Aug 4, 2015 at 11:47 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Nirmal,
>>>>>>>
>>>>>>> In ML, what is the preferred way of keeping data in a single row of
>>>>>>> JavaRDD?
>>>>>>>
>>>>>>
>>>>>> I didn't quite get your question. Can you elaborate please?
>>>>>>
>>>>>>>
>>>>>>> As I have figured it depends on the algorithm being used.
>>>>>>>
>>>>>>> Danula
>>>>>>>
>>>>>>> On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks Danula, I'll send an invite.
>>>>>>>>
>>>>>>>> On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Nirmal,
>>>>>>>>>
>>>>>>>>> I am available after 1.30pm on Tuesday, Wednesday and Thursday.
>>>>>>>>>
>>>>>>>>> Danula
>>>>>>>>>
>>>>>>>>> On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Danula,
>>>>>>>>>>
>>>>>>>>>> Can we arrange a demo/review somewhere next week? Please let me
>>>>>>>>>> know few time slots.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Danula.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You can find the source at [1]
>>>>>>>>>>>> <https://github.com/danula/wso2-ml-wrangler-integration>. I
>>>>>>>>>>>> have to do some refactoring when integrating to ML.
>>>>>>>>>>>>
>>>>>>>>>>>> [1] - https://github.com/danula/wso2-ml-wrangler-integration
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks Danula. Please share the current code, if possible.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have succeeded in parsing the operations from wrangler
>>>>>>>>>>>>>> javascript code to spark transformations I have written.
>>>>>>>>>>>>>> Working on automating the process.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last couple of steps would be changing the wrangler interface
>>>>>>>>>>>>>> and integrating it into ML Wizard.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Could you please summarize the current status of the project
>>>>>>>>>>>>>>> and also the things left to do?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>> Will use them. I already have some other kaggle datasets as
>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Nirmal,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Would it be possible to get some sample data sets which
>>>>>>>>>>>>>>>>>> are more likely to be pre-processed using wrangler. I am
>>>>>>>>>>>>>>>>>> currently testing my implementations against small and
>>>>>>>>>>>>>>>>>> more general data sets.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have checked datasets available at [1]
>>>>>>>>>>>>>>>>>> <https://github.com/wso2/product-ml/tree/master/modules/samples>
>>>>>>>>>>>>>>>>>> as well. But there is nothing much to be processed as they
>>>>>>>>>>>>>>>>>> are ready to be fed to ML.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] - https://github.com/wso2/product-ml/tree/master/modules/samples
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks Danula.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Sorry for not keeping you in the loop.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> After considering and experimenting with several
>>>>>>>>>>>>>>>>>>>> options, I am using the javascript code generated by
>>>>>>>>>>>>>>>>>>>> wrangler to implement them using spark. I have used
>>>>>>>>>>>>>>>>>>>> regular expressions to extract the operations, parameters
>>>>>>>>>>>>>>>>>>>> and values and mapped them to spark transformations I
>>>>>>>>>>>>>>>>>>>> previously developed.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The code generated by wrangler for certain functions
>>>>>>>>>>>>>>>>>>>> has nested operations.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> (1)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> /* Fill split3 with values from above */
>>>>>>>>>>>>>>>>>>>> w.add(dw.fill().column(["split3"])
>>>>>>>>>>>>>>>>>>>>     .table(0)
>>>>>>>>>>>>>>>>>>>>     .status("active")
>>>>>>>>>>>>>>>>>>>>     .drop(false)
>>>>>>>>>>>>>>>>>>>>     .direction("down")
>>>>>>>>>>>>>>>>>>>>     .method("copy")
>>>>>>>>>>>>>>>>>>>>     .row(undefined)
>>>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> (2)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> /* Delete rows where split1 is null */
>>>>>>>>>>>>>>>>>>>> w.add(dw.filter().column([])
>>>>>>>>>>>>>>>>>>>>     .table(0)
>>>>>>>>>>>>>>>>>>>>     .status("active")
>>>>>>>>>>>>>>>>>>>>     .drop(false)
>>>>>>>>>>>>>>>>>>>>     .row(dw.row().column([])
>>>>>>>>>>>>>>>>>>>>         .table(0)
>>>>>>>>>>>>>>>>>>>>         .status("active")
>>>>>>>>>>>>>>>>>>>>         .drop(false)
>>>>>>>>>>>>>>>>>>>>         .conditions([dw.is_null().column([])
>>>>>>>>>>>>>>>>>>>>             .table(0)
>>>>>>>>>>>>>>>>>>>>             .status("active")
>>>>>>>>>>>>>>>>>>>>             .drop(false)
>>>>>>>>>>>>>>>>>>>>             .lcol("split1")
>>>>>>>>>>>>>>>>>>>>             .value(undefined)
>>>>>>>>>>>>>>>>>>>>             .op_str("is null")
>>>>>>>>>>>>>>>>>>>>         ])
>>>>>>>>>>>>>>>>>>>>     )
>>>>>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have succeeded in parsing the operations similar to
>>>>>>>>>>>>>>>>>>>> (1) above and currently working on extending it to work
>>>>>>>>>>>>>>>>>>>> on operations similar to (2).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Next step would be automating the process of spark
>>>>>>>>>>>>>>>>>>>> transformation generation.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please send an update at least every week.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Any update on the progress? Did you manage to
>>>>>>>>>>>>>>>>>>>>>> integrate the transformations with the wrangler?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Update on the current progress of the project and
>>>>>>>>>>>>>>>>>>>>>>> future activities as we discussed at the recent meeting.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> *Current Progress*
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I have completed the phase of creating spark
>>>>>>>>>>>>>>>>>>>>>>> transformations relevant to operations available in
>>>>>>>>>>>>>>>>>>>>>>> wrangler.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Operations implemented
>>>>>>>>>>>>>>>>>>>>>>> - Fill
>>>>>>>>>>>>>>>>>>>>>>> - Split
>>>>>>>>>>>>>>>>>>>>>>> - Drop
>>>>>>>>>>>>>>>>>>>>>>> - Delete
>>>>>>>>>>>>>>>>>>>>>>> - Extract
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> *Future activities*
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> - Modify the wrangler interface to suit the current
>>>>>>>>>>>>>>>>>>>>>>>   implementation
>>>>>>>>>>>>>>>>>>>>>>> - Automate the process of generating Spark
>>>>>>>>>>>>>>>>>>>>>>>   transformations
>>>>>>>>>>>>>>>>>>>>>>> - Integrate wrangler into the ML workflow
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> No, we haven't done a review yet.
>>>>>>>>>>>>>>>>>>>>>>>> It would be great if we could have one so that I
>>>>>>>>>>>>>>>>>>>>>>>> can discuss with you all and clarify the next steps of
>>>>>>>>>>>>>>>>>>>>>>>> the implementation as you mentioned.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Did we have a review for the work done so far? If
>>>>>>>>>>>>>>>>>>>>>>>>> not, shall we have one? We can clear out any doubts
>>>>>>>>>>>>>>>>>>>>>>>>> and issues as well.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the update, keep them coming.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On a JavaRDD you can perform a collect() to get a
>>>>>>>>>>>>>>>>>>>>>>>>>> list, AFAIR. Yes, this is costly, since it would
>>>>>>>>>>>>>>>>>>>>>>>>>> load the whole dataset into memory. So, is this an
>>>>>>>>>>>>>>>>>>>>>>>>>> operation which involves multiple rows?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Supun,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I modified the "Fill" operation to add what you
>>>>>>>>>>>>>>>>>>>>>>>>>>> mentioned.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> I used a workaround to implement certain
>>>>>>>>>>>>>>>>>>>>>>>>>>> parts of the operations such as filling with values
>>>>>>>>>>>>>>>>>>>>>>>>>>> from rows above and below.
>>>>>>>>>>>>>>>>>>>>>>>>>>> I created a List implementation using the toArray()
>>>>>>>>>>>>>>>>>>>>>>>>>>> method in JavaRDD and then converted it back to a
>>>>>>>>>>>>>>>>>>>>>>>>>>> JavaRDD after the operation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> This will be inefficient (in terms of both
>>>>>>>>>>>>>>>>>>>>>>>>>>> memory and time) when working with very large data
>>>>>>>>>>>>>>>>>>>>>>>>>>> sets. But I think it's important to have these
>>>>>>>>>>>>>>>>>>>>>>>>>>> features included.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Otherwise a user would be left with a very limited
>>>>>>>>>>>>>>>>>>>>>>>>>>> set of operations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Please let me know if you have a different
>>>>>>>>>>>>>>>>>>>>>>>>>>> opinion on this.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Somehow there are issues in implementing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> certain wrangler functions due to limitations in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> JavaRDD used in spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values from
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Agree, since rows will get executed randomly
>>>>>>>>>>>>>>>>>>>>>>>>>>>> with spark, inter-row operations are not very
>>>>>>>>>>>>>>>>>>>>>>>>>>>> meaningful.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> But you can slightly modify the implementation
>>>>>>>>>>>>>>>>>>>>>>>>>>>> of the "Fill" operation, such as, to fill values
>>>>>>>>>>>>>>>>>>>>>>>>>>>> based on an expression/static-value/mean etc.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> (not depending on other rows).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry for the late reply.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Have you got the details you were looking for?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler operations are important for a user of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the ML
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Other than the ones you have mentioned in the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> proposal, I think it's better to have a "Translate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> operation as well (to create a new column based
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on an existing column).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am currently working on generating spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> transformations related to the operations
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> available in the data wrangler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Data wrangler provides sufficient parameters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to re-create these at spark. I have successfully
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> implemented delete and split operations of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrangler in spark.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Once this phase is completed, I can either
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> directly generate these scripts at wrangler or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> use the javascript output and convert it to spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> depending on the implementation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Somehow there are issues in implementing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> certain wrangler functions due to limitations in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> JavaRDD used in spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g. -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fill operation - when filling with values
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from rows above and below
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fold operation
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It would be great if I could get to know
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which wrangler operations are important for a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> user of the ML
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please send an update of your work thus far.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danula,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Welcome to GSoC 15' ! Can you do some
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> research on directly generating spark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> transformations using Wrangler and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> come up with a summary ?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for selecting my proposal [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for GSoC 2015. I am really looking forward to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> work with you all and contribute to WSO2.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have already completed my primary
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> research on wrangler and would like to meet
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> you to get feedback on the proposed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> architecture. I am planning to start working on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the project before 25th of May.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danula
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] - https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks & regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Nirmal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Mobile: +94715779733
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Blog: http://nirmalfdo.blogspot.com/

--

Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/

_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
