Hi all,

No, we haven't done a review yet. It would be great if we could have one, so that I can discuss the work with you all and clarify the next steps of the implementation, as you mentioned.

Thanks
Danula

On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga <[email protected]> wrote:

> Hi Danula,
>
> Did we have a review for the work done so far? If not, shall we have a
> one? We can clear out any doubts and issues as well..
>
> Thanks,
> Supun
>
> On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando <[email protected]> wrote:
>
>> Hi Danula,
>>
>> Thanks for the update, keep them coming.
>>
>> On a JavaRDD you can perform a collect() to get a list, AFAIR. Yes, this
>> is costly, since it would load whole dataset into memory. So, is this an
>> operation which involves multiple rows?
>>
>> On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith <[email protected]>
>> wrote:
>>
>>> Hi Supun,
>>>
>>> I modified the "Fill" operation to add what you mentioned.
>>>
>>> I used a workaround to to implement certain parts of the operations such
>>> as filling with values from rows above and below.
>>> I created a List Implementation using toArray() method in JavaRDD and
>>> then converted it back to a JavaRDD after the operation.
>>>
>>> This will be inefficient (in terms of both memory and time) when working
>>> with very large data sets. But I think its important to have these features
>>> included. Otherwise a user would be left with very limited set of
>>> operations.
>>>
>>> Please let me know if you have a different opinion on this.
>>>
>>> Thanks,
>>> Danula
>>>
>>> On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga <[email protected]> wrote:
>>>
>>>> Somehow there are issues in implementing certain wrangler functions due
>>>>> to limitations in JavaRDD used in spark
>>>>> e.g. -
>>>>> Fill operation - when filling with values from rows above and below
>>>>> Fold operation
>>>>
>>>>
>>>> Agree, since rows will get executed randomly with spark, inter-row
>>>> operations are not very meaningful.
>>>> But you can slightly modify the implementation of the "Fill" operation,
>>>> such as, to fill values based on an expression/static-value/mean etc. (not
>>>> depending on other rows)..
>>>>
>>>> Thanks,
>>>> Supun
>>>>
>>>> On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Danula,
>>>>>
>>>>> Sorry for the late reply. Have you got the details you were looking
>>>>> for?
>>>>>
>>>>> It would be great if I could get to know which wrangler operations are
>>>>>> important for a user of the ML
>>>>>
>>>>>
>>>>> Other than the ones you have mentioned in the proposal, think its
>>>>> better to have "Translate" operation as well (to create a new column
>>>>> based on an existing column).
>>>>>
>>>>> Thanks,
>>>>> Supun
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am currently working on generating spark transformations related to
>>>>>> the operations available in the data wrangler.
>>>>>>
>>>>>> Data wrangler provides sufficient parameters to re-create these at
>>>>>> spark.I have successfully implemented delete and split operations of
>>>>>> wrangler in spark.
>>>>>>
>>>>>> Once this phase is completed, I can either directly generate these
>>>>>> scripts at wrangler or use the javascript output and convert it to spark
>>>>>> depending on the implementation.
>>>>>>
>>>>>> Somehow there are issues in implementing certain wrangler functions
>>>>>> due to limitations in JavaRDD used in spark
>>>>>>
>>>>>> e.g. -
>>>>>> Fill operation - when filling with values from rows above and below
>>>>>> Fold operation
>>>>>>
>>>>>> It would be great if I could get to know which wrangler operations
>>>>>> are important for a user of the ML
>>>>>>
>>>>>> Thanks,
>>>>>> Danula
>>>>>>
>>>>>> On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Danula,
>>>>>>>
>>>>>>> Please send an update of your work thus far.
>>>>>>>
>>>>>>> On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Danula,
>>>>>>>>
>>>>>>>> Welcome to GSoC 15' ! Can you do some research on directly
>>>>>>>> generating spark transformations using Wrangler and come up with a
>>>>>>>> summary ?
>>>>>>>>
>>>>>>>> On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Thank you for selecting my proposal [1]
>>>>>>>>> <https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing>
>>>>>>>>> for GSoC 2015. I am really looking forward to work with you all and
>>>>>>>>> contribute to WSO2.
>>>>>>>>>
>>>>>>>>> I have already completed my primary research on wrangler and would
>>>>>>>>> like to meet you to get feedback on the proposed architecture. I am
>>>>>>>>> planning to start working on the project before 25th of May.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> Danula
>>>>>>>>>
>>>>>>>>> [1] -
>>>>>>>>> https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Thanks & regards,
>>>>>>>> Nirmal
>>>>>>>>
>>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>>> Mobile: +94715779733
>>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Thanks & regards,
>>>>>>> Nirmal
>>>>>>>
>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>> Mobile: +94715779733
>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Supun Sethunga*
>>>>> Software Engineer
>>>>> WSO2, Inc.
>>>>> http://wso2.com/
>>>>> lean | enterprise | middleware
>>>>> Mobile : +94 716546324
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Supun Sethunga*
>>>> Software Engineer
>>>> WSO2, Inc.
>>>> http://wso2.com/
>>>> lean | enterprise | middleware
>>>> Mobile : +94 716546324
>>>>
>>>
>>>
>>
>>
>> --
>>
>> Thanks & regards,
>> Nirmal
>>
>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>> Mobile: +94715779733
>> Blog: http://nirmalfdo.blogspot.com/
>>
>>
>>
>
>
> --
> *Supun Sethunga*
> Software Engineer
> WSO2, Inc.
> http://wso2.com/
> lean | enterprise | middleware
> Mobile : +94 716546324
>
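
For reference in the review, here is a rough sketch of the list-based "Fill" workaround discussed in the thread (filling a cell from the row above). It is only an illustration under some assumptions, not the code in the project: it assumes the rows have already been brought to the driver as a java.util.List (for example with JavaRDD.collect()), that each row is a String[], and that an empty or null cell counts as missing. The names RowFill and fillDown are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the wrangler or Spark APIs.
final class RowFill {

    // Fill-down: replace each null or empty cell in the given column with the
    // most recent non-empty value seen in the rows above it. Input rows are
    // copied, so the original list is left unchanged.
    static List<String[]> fillDown(List<String[]> rows, int column) {
        List<String[]> filled = new ArrayList<>(rows.size());
        String lastSeen = null;
        for (String[] row : rows) {
            String[] copy = Arrays.copyOf(row, row.length);
            boolean missing = copy[column] == null || copy[column].isEmpty();
            if (!missing) {
                lastSeen = copy[column];
            } else if (lastSeen != null) {
                copy[column] = lastSeen;
            }
            // A missing cell with nothing above it is left as it was.
            filled.add(copy);
        }
        return filled;
    }

    public static void main(String[] args) {
        // Assumption: in the real flow this list would come from the RDD.
        List<String[]> rows = Arrays.asList(
            new String[] {"a", "1"},
            new String[] {"", "2"},
            new String[] {"b", "3"},
            new String[] {null, "4"});
        for (String[] row : fillDown(rows, 0)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

The filled list could then be turned back into an RDD with JavaSparkContext.parallelize(...). As noted in the thread, this holds the whole data set in driver memory, so it is only reasonable for data that fits there; fills based on a static value or a precomputed mean do not need this step, since they do not depend on other rows.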
