Re: [Dev] [ML] Wrangler Integration

2015-09-11 Thread Danula Eranjith
Hi Nirmal,

Please find the document at [1] and I have already created the PR.

[1] -
https://docs.google.com/document/d/172MavBl2TuBNHVoyEPuRIPwmSW5lBljHFSNxrq4gclQ/edit?usp=sharing

Danula

On Sat, Sep 5, 2015 at 12:25 PM, Danula Eranjith <hmdanu...@gmail.com>
wrote:

> Sure. I'll send the document and the PR.
>
> Thanks,
> Danula
>
> On Fri, Sep 4, 2015 at 9:01 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> Also please send a PR to our repos.
>>
>> On Fri, Sep 4, 2015 at 8:59 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>>
>>> Thanks Danula.
>>>
>>> ML team like to do the integration of this, since there're few things we
>>> need to clear up in ML side.
>>>
>>> Can you please come up with a clearly explained document on the project
>>> work carried out during the summer?
>>>
>>> On Wed, Sep 2, 2015 at 12:41 AM, Danula Eranjith <hmdanu...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have made the changes you suggested at [1] and created a API to
>>>> access the wrangler service at [2].
>>>>
>>>> Also added a new step in the wizard at [3] with the wrangler interface.
>>>> Please have a look.
>>>>
>>>> How can we create a sample of the dataset to be passed to wrangler? I
>>>> couldn't find any current implementation capable of this.
>>>>
>>>> Another concern is if we are adding the feature selection step after
>>>> the cleaning step, we need to reflect the changes done in step one at step
>>>> two. But since we do not apply transformations to RDD initially, we need to
>>>> come up an alternative approach.
>>>>
>>>> [1] -
>>>> https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler/src/main/java/org/wso2/carbon/ml/wrangler
>>>>
>>>> [2] -
>>>> https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.rest.api/src/main/java/org/wso2/carbon/ml/rest/api/WranglerApiV10.java
>>>> <https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.rest.api/src/main/java/org/wso2/carbon/ml/rest/api/WranglerApiV10.java>
>>>>
>>>> [3] -
>>>> https://github.com/danula/carbon-ml/blob/master/apps/ml/site/clean/clean.jag
>>>>
>>>> Thanks,
>>>> Danula
>>>>
>>>>
>>>> On Thu, Aug 27, 2015 at 9:41 AM, Danula Eranjith <hmdanu...@gmail.com>
>>>> wrote:
>>>>
>>>>> Basically script exported from Wrangler tool has list of operations.
>>>>> Wrangler class parse that script and create WranglerOperation object
>>>>> for each operation with its parameters.
>>>>> Then when WranglerOperation.executeOperation() is invoked, it creates
>>>>> the respective SparkOperation object and then applies operations to the
>>>>> JavaRDD
>>>>>
>>>>> On Thu, Aug 27, 2015 at 9:35 AM, Nirmal Fernando <nir...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> What does WranglerOperation class do?
>>>>>>
>>>>>> On Thu, Aug 27, 2015 at 9:24 AM, Danula Eranjith <hmdanu...@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Currently Wrangler Operation is the class that holds details related
>>>>>>> to wrangler and SparkOperation contains the relevant Spark 
>>>>>>> transformation.
>>>>>>>
>>>>>>> If we are changing SparkOperation as WranglerOperation, we need to
>>>>>>> rename the current WranglerOperation into something else.
>>>>>>>
>>>>>>> On Thu, Aug 27, 2015 at 9:18 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> /s/SparkOpration/SparkOperation
>>>>>>>>
>>>>>>>> May be as Supun said, I too think we should call them as
>>>>>>>> 'WranglerOperation'.
>>>>>>>>
>>>>>>>> On Thu, Aug 27, 2015 at 7:02 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Also, avoid static methods in transformations.
>>>>>>>>>
>>>>>>>>> On Thu, Aug 27, 2015 at 2:48 AM, Supun Sethunga <s

Re: [Dev] [ML] Wrangler Integration

2015-09-05 Thread Danula Eranjith
Sure. I'll send the document and the PR.

Thanks,
Danula

On Fri, Sep 4, 2015 at 9:01 AM, Nirmal Fernando <nir...@wso2.com> wrote:

> Also please send a PR to our repos.
>
> On Fri, Sep 4, 2015 at 8:59 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> Thanks Danula.
>>
>> ML team like to do the integration of this, since there're few things we
>> need to clear up in ML side.
>>
>> Can you please come up with a clearly explained document on the project
>> work carried out during the summer?
>>
>> On Wed, Sep 2, 2015 at 12:41 AM, Danula Eranjith <hmdanu...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have made the changes you suggested at [1] and created a API to access
>>> the wrangler service at [2].
>>>
>>> Also added a new step in the wizard at [3] with the wrangler interface.
>>> Please have a look.
>>>
>>> How can we create a sample of the dataset to be passed to wrangler? I
>>> couldn't find any current implementation capable of this.
>>>
>>> Another concern is if we are adding the feature selection step after the
>>> cleaning step, we need to reflect the changes done in step one at step two.
>>> But since we do not apply transformations to RDD initially, we need to come
>>> up an alternative approach.
>>>
>>> [1] -
>>> https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler/src/main/java/org/wso2/carbon/ml/wrangler
>>>
>>> [2] -
>>> https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.rest.api/src/main/java/org/wso2/carbon/ml/rest/api/WranglerApiV10.java
>>> <https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.rest.api/src/main/java/org/wso2/carbon/ml/rest/api/WranglerApiV10.java>
>>>
>>> [3] -
>>> https://github.com/danula/carbon-ml/blob/master/apps/ml/site/clean/clean.jag
>>>
>>> Thanks,
>>> Danula
>>>
>>>
>>> On Thu, Aug 27, 2015 at 9:41 AM, Danula Eranjith <hmdanu...@gmail.com>
>>> wrote:
>>>
>>>> Basically script exported from Wrangler tool has list of operations.
>>>> Wrangler class parse that script and create WranglerOperation object
>>>> for each operation with its parameters.
>>>> Then when WranglerOperation.executeOperation() is invoked, it creates
>>>> the respective SparkOperation object and then applies operations to the
>>>> JavaRDD
>>>>
>>>> On Thu, Aug 27, 2015 at 9:35 AM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> What does WranglerOperation class do?
>>>>>
>>>>> On Thu, Aug 27, 2015 at 9:24 AM, Danula Eranjith <hmdanu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Currently Wrangler Operation is the class that holds details related
>>>>>> to wrangler and SparkOperation contains the relevant Spark 
>>>>>> transformation.
>>>>>>
>>>>>> If we are changing SparkOperation as WranglerOperation, we need to
>>>>>> rename the current WranglerOperation into something else.
>>>>>>
>>>>>> On Thu, Aug 27, 2015 at 9:18 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>> wrote:
>>>>>>
>>>>>>> /s/SparkOpration/SparkOperation
>>>>>>>
>>>>>>> May be as Supun said, I too think we should call them as
>>>>>>> 'WranglerOperation'.
>>>>>>>
>>>>>>> On Thu, Aug 27, 2015 at 7:02 AM, Nirmal Fernando <nir...@wso2.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Also, avoid static methods in transformations.
>>>>>>>>
>>>>>>>> On Thu, Aug 27, 2015 at 2:48 AM, Supun Sethunga <sup...@wso2.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Danula,
>>>>>>>>>
>>>>>>>>> Few comments:
>>>>>>>>>
>>>>>>>>>- You might have to register the component in the OSGI
>>>>>>>>>environment, to be able to call the services from a another 
>>>>>>>>> component.
>>>>>>>>>Refer [1] on how to do this.
>>>>>>>>>- Better to introduce an 

Re: [Dev] [ML] Wrangler Integration

2015-09-01 Thread Danula Eranjith
Hi,

I have made the changes you suggested at [1] and created an API to access
the wrangler service at [2].

Also added a new step in the wizard at [3] with the wrangler interface.
Please have a look.

How can we create a sample of the dataset to be passed to wrangler? I
couldn't find any current implementation capable of this.
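
A minimal sketch of one way such a sample could be drawn, under stated assumptions. If the data is already in a JavaRDD, Spark's own takeSample(withReplacement, num, seed) may be enough. The plain-Java version below only illustrates the idea (reservoir sampling over an iterator of lines) without depending on Spark; the class name ReservoirSampler and its methods are hypothetical and not part of carbon-ml.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper: keeps a uniform random sample of at most k lines
// from an input of unknown length, reading it only once.
class ReservoirSampler {
    private final Random random;

    ReservoirSampler(long seed) {
        this.random = new Random(seed); // fixed seed gives a repeatable sample
    }

    List<String> sample(Iterator<String> lines, int k) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<String> reservoir = new ArrayList<>();
        long seen = 0;
        while (lines.hasNext()) {
            String line = lines.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k lines.
                reservoir.add(line);
            } else {
                // Pick a slot uniformly in [0, seen); replace only if it falls in the reservoir.
                long slot = (long) (random.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, line);
                }
            }
        }
        return reservoir;
    }
}

class ReservoirSamplerDemo {
    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4", "r5", "r6");
        List<String> picked = new ReservoirSampler(7L).sample(rows.iterator(), 3);
        System.out.println(picked);
    }
}
```

The sample size is bounded regardless of the dataset size, which is what the wrangler UI would need; whether this or an RDD-level sample fits better depends on where the dataset lives at that point in the wizard.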

Another concern: if we add the feature selection step after the cleaning
step, the changes made in step one need to be reflected in step two. But
since we do not apply the transformations to the RDD initially, we need to
come up with an alternative approach.

[1] -
https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler/src/main/java/org/wso2/carbon/ml/wrangler

[2] -
https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.rest.api/src/main/java/org/wso2/carbon/ml/rest/api/WranglerApiV10.java
<https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.rest.api/src/main/java/org/wso2/carbon/ml/rest/api/WranglerApiV10.java>

[3] -
https://github.com/danula/carbon-ml/blob/master/apps/ml/site/clean/clean.jag

Thanks,
Danula


On Thu, Aug 27, 2015 at 9:41 AM, Danula Eranjith <hmdanu...@gmail.com>
wrote:

> Basically script exported from Wrangler tool has list of operations.
> Wrangler class parse that script and create WranglerOperation object for
> each operation with its parameters.
> Then when WranglerOperation.executeOperation() is invoked, it creates the
> respective SparkOperation object and then applies operations to the JavaRDD
>
> On Thu, Aug 27, 2015 at 9:35 AM, Nirmal Fernando <nir...@wso2.com> wrote:
>
>> What does WranglerOperation class do?
>>
>> On Thu, Aug 27, 2015 at 9:24 AM, Danula Eranjith <hmdanu...@gmail.com>
>> wrote:
>>
>>> Currently Wrangler Operation is the class that holds details related to
>>> wrangler and SparkOperation contains the relevant Spark transformation.
>>>
>>> If we are changing SparkOperation as WranglerOperation, we need to
>>> rename the current WranglerOperation into something else.
>>>
>>> On Thu, Aug 27, 2015 at 9:18 AM, Nirmal Fernando <nir...@wso2.com>
>>> wrote:
>>>
>>>> /s/SparkOpration/SparkOperation
>>>>
>>>> May be as Supun said, I too think we should call them as
>>>> 'WranglerOperation'.
>>>>
>>>> On Thu, Aug 27, 2015 at 7:02 AM, Nirmal Fernando <nir...@wso2.com>
>>>> wrote:
>>>>
>>>>> Also, avoid static methods in transformations.
>>>>>
>>>>> On Thu, Aug 27, 2015 at 2:48 AM, Supun Sethunga <sup...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Danula,
>>>>>>
>>>>>> Few comments:
>>>>>>
>>>>>>- You might have to register the component in the OSGI
>>>>>>environment, to be able to call the services from a another component.
>>>>>>Refer [1] on how to do this.
>>>>>>- Better to introduce an interface for WranglerOperation class.
>>>>>>- Add class level/ method level comments.
>>>>>>- Use logger in-place of System.out.println
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.database/src/main/java/org/wso2/carbon/ml/database/internal/ds/MLDatabaseServiceDS.java
>>>>>>
>>>>>> Thanks,
>>>>>> Supun
>>>>>>
>>>>>> On Wed, Aug 26, 2015 at 1:32 PM, Danula Eranjith <hmdanu...@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I added the component at [1]
>>>>>>> <https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler>
>>>>>>> Please have a look.
>>>>>>>
>>>>>>> [1] -
>>>>>>> https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler
>>>>>>>
>>>>>>> Danula
>>>>>>>
>>>>>>> On Tue, Aug 25, 2015 at 8:35 PM, Danula Eranjith <
>>>>>>> hmdanu...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks Supun
>>>>>>>>
>>>>>>>> On Tue, Aug 25, 2015 at 7:25 PM, Supun Sethunga <sup...@wso2.com>
>>>>>>>> wrote:
>>>>

Re: [Dev] [ML] Wrangler Integration

2015-08-26 Thread Danula Eranjith
Hi,

I added the component at [1]. Please have a look.

[1] -
https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler

Danula

On Tue, Aug 25, 2015 at 8:35 PM, Danula Eranjith hmdanu...@gmail.com
wrote:

 Thanks Supun

 On Tue, Aug 25, 2015 at 7:25 PM, Supun Sethunga sup...@wso2.com wrote:

 You can integrate it to [1], by adding a new component
 org.wso2.carbon.ml.wrangler. Each component is a carbon component.

 Please follow the naming conventions used in the other components, for
 package names and etc..

 [1] https://github.com/wso2/carbon-ml/tree/master/components/ml

 Thanks,
 Supun

 On Tue, Aug 25, 2015 at 7:33 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 Can you suggest where I should be ideally integrating these files[1]
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src
 in ML.

 [1] -
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src

 Thanks,
 Danula




 --
 *Supun Sethunga*
 Software Engineer
 WSO2, Inc.
 http://wso2.com/
 lean | enterprise | middleware
 Mobile : +94 716546324



___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] [ML] Wrangler Integration

2015-08-26 Thread Danula Eranjith
Basically, the script exported from the Wrangler tool contains a list of
operations. The Wrangler class parses that script and creates a
WranglerOperation object for each operation, with its parameters.
Then, when WranglerOperation.executeOperation() is invoked, it creates the
respective SparkOperation object and applies the operation to the JavaRDD.
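
The flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in the repository: a plain List<String[]> stands in for the JavaRDD so that the example runs without Spark, the one-operation-per-line script format is invented here (the real Wrangler export is JavaScript), and the names RowTransformation, DropColumn, UpperCaseColumn, ParsedOperation and ScriptRunner are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for SparkOperation: one transformation over all rows.
interface RowTransformation {
    List<String[]> apply(List<String[]> rows);
}

// Illustrative transformation: remove one column from every row.
class DropColumn implements RowTransformation {
    private final int index;

    DropColumn(int index) {
        this.index = index;
    }

    @Override
    public List<String[]> apply(List<String[]> rows) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            List<String> kept = new ArrayList<>();
            for (int i = 0; i < row.length; i++) {
                if (i != index) {
                    kept.add(row[i]);
                }
            }
            out.add(kept.toArray(new String[0]));
        }
        return out;
    }
}

// Illustrative transformation: upper-case the values of one column.
class UpperCaseColumn implements RowTransformation {
    private final int index;

    UpperCaseColumn(int index) {
        this.index = index;
    }

    @Override
    public List<String[]> apply(List<String[]> rows) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            String[] copy = row.clone(); // leave the input rows untouched
            copy[index] = copy[index].toUpperCase();
            out.add(copy);
        }
        return out;
    }
}

// Plays the role of WranglerOperation: holds the parsed name and parameters,
// and builds the matching transformation only when it is executed.
class ParsedOperation {
    private final String name;
    private final String[] params;

    ParsedOperation(String name, String[] params) {
        this.name = name;
        this.params = params;
    }

    List<String[]> executeOperation(List<String[]> rows) {
        RowTransformation transformation;
        switch (name) {
            case "drop":
                transformation = new DropColumn(Integer.parseInt(params[0]));
                break;
            case "upper":
                transformation = new UpperCaseColumn(Integer.parseInt(params[0]));
                break;
            default:
                throw new IllegalArgumentException("Unknown operation: " + name);
        }
        return transformation.apply(rows);
    }
}

// Plays the role of the Wrangler class: parses the script and runs each operation in order.
class ScriptRunner {
    List<ParsedOperation> parse(String script) {
        List<ParsedOperation> operations = new ArrayList<>();
        for (String line : script.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            operations.add(new ParsedOperation(parts[0], Arrays.copyOfRange(parts, 1, parts.length)));
        }
        return operations;
    }

    List<String[]> run(String script, List<String[]> rows) {
        List<String[]> current = rows;
        for (ParsedOperation operation : parse(script)) {
            current = operation.executeOperation(current);
        }
        return current;
    }
}
```

Keeping the parsed description (name and parameters) separate from the object that does the work is what allows the transformation to be created late, at execution time.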

On Thu, Aug 27, 2015 at 9:35 AM, Nirmal Fernando nir...@wso2.com wrote:

 What does WranglerOperation class do?

 On Thu, Aug 27, 2015 at 9:24 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Currently Wrangler Operation is the class that holds details related to
 wrangler and SparkOperation contains the relevant Spark transformation.

 If we are changing SparkOperation as WranglerOperation, we need to rename
 the current WranglerOperation into something else.

 On Thu, Aug 27, 2015 at 9:18 AM, Nirmal Fernando nir...@wso2.com wrote:

 /s/SparkOpration/SparkOperation

 May be as Supun said, I too think we should call them as
 'WranglerOperation'.

 On Thu, Aug 27, 2015 at 7:02 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Also, avoid static methods in transformations.

 On Thu, Aug 27, 2015 at 2:48 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Hi Danula,

 Few comments:

 - You might have to register the component in the OSGi environment to be
   able to call its services from another component. Refer to [1] on how
   to do this.
 - Better to introduce an interface for the WranglerOperation class.
 - Add class-level/method-level comments.
 - Use a logger in place of System.out.println.

 [1]
 https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.database/src/main/java/org/wso2/carbon/ml/database/internal/ds/MLDatabaseServiceDS.java

 Thanks,
 Supun

 On Wed, Aug 26, 2015 at 1:32 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi,

 I added the component at [1]
 https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler
 Please have a look.

 [1] -
 https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler

 Danula

 On Tue, Aug 25, 2015 at 8:35 PM, Danula Eranjith hmdanu...@gmail.com
  wrote:

 Thanks Supun

 On Tue, Aug 25, 2015 at 7:25 PM, Supun Sethunga sup...@wso2.com
 wrote:

 You can integrate it to [1], by adding a new component
 org.wso2.carbon.ml.wrangler. Each component is a carbon component.

 Please follow the naming conventions used in the other components,
 for package names and etc..

 [1] https://github.com/wso2/carbon-ml/tree/master/components/ml

 Thanks,
 Supun

 On Tue, Aug 25, 2015 at 7:33 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Can you suggest where I should be ideally integrating these files
 [1]
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src
 in ML.

 [1] -
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src

 Thanks,
 Danula








 --

 Thanks & regards,
 Nirmal

 Team Lead - WSO2 Machine Learner
 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/







Re: [Dev] [ML] Wrangler Integration

2015-08-26 Thread Danula Eranjith
Currently, WranglerOperation is the class that holds the details of a
wrangler operation, and SparkOperation contains the corresponding Spark
transformation.

If we rename SparkOperation to WranglerOperation, we need to rename the
current WranglerOperation to something else.

On Thu, Aug 27, 2015 at 9:18 AM, Nirmal Fernando nir...@wso2.com wrote:

 /s/SparkOpration/SparkOperation

 May be as Supun said, I too think we should call them as
 'WranglerOperation'.

 On Thu, Aug 27, 2015 at 7:02 AM, Nirmal Fernando nir...@wso2.com wrote:

 Also, avoid static methods in transformations.

 On Thu, Aug 27, 2015 at 2:48 AM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Few comments:

 - You might have to register the component in the OSGi environment to be
   able to call its services from another component. Refer to [1] on how
   to do this.
 - Better to introduce an interface for the WranglerOperation class.
 - Add class-level/method-level comments.
 - Use a logger in place of System.out.println.

 [1]
 https://github.com/danula/carbon-ml/blob/master/components/ml/org.wso2.carbon.ml.database/src/main/java/org/wso2/carbon/ml/database/internal/ds/MLDatabaseServiceDS.java

 Thanks,
 Supun

 On Wed, Aug 26, 2015 at 1:32 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi,

 I added the component at [1]
 https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler
 Please have a look.

 [1] -
 https://github.com/danula/carbon-ml/tree/master/components/ml/org.wso2.carbon.ml.wrangler

 Danula

 On Tue, Aug 25, 2015 at 8:35 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Thanks Supun

 On Tue, Aug 25, 2015 at 7:25 PM, Supun Sethunga sup...@wso2.com
 wrote:

 You can integrate it to [1], by adding a new component
 org.wso2.carbon.ml.wrangler. Each component is a carbon component.

 Please follow the naming conventions used in the other components,
 for package names and etc..

 [1] https://github.com/wso2/carbon-ml/tree/master/components/ml

 Thanks,
 Supun

 On Tue, Aug 25, 2015 at 7:33 AM, Danula Eranjith hmdanu...@gmail.com
  wrote:

 Hi all,

 Can you suggest where I should be ideally integrating these files[1]
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src
 in ML.

 [1] -
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src

 Thanks,
 Danula






Re: [Dev] [ML] Wrangler Integration

2015-08-25 Thread Danula Eranjith
Thanks Supun

On Tue, Aug 25, 2015 at 7:25 PM, Supun Sethunga sup...@wso2.com wrote:

 You can integrate it to [1], by adding a new component
 org.wso2.carbon.ml.wrangler. Each component is a carbon component.

 Please follow the naming conventions used in the other components, for
 package names and etc..

 [1] https://github.com/wso2/carbon-ml/tree/master/components/ml

 Thanks,
 Supun

 On Tue, Aug 25, 2015 at 7:33 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 Can you suggest where I should be ideally integrating these files[1]
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src
 in ML.

 [1] -
 https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src

 Thanks,
 Danula






Re: [Dev] [ML] jQuery conflict in Wrangler Integration

2015-08-25 Thread Danula Eranjith
We cannot use jquery-1.4.2 alone, as Bootstrap in ML requires jQuery 1.9 or
higher. noConflict() does not help either.
It seems the only option is to load both versions and then resolve the
issues with wrangler.

On Tue, Aug 25, 2015 at 10:36 AM, CD Athuraliya chathur...@wso2.com wrote:



 On Mon, Aug 24, 2015 at 6:49 PM, Nirmal Fernando nir...@wso2.com wrote:

 @Danula please try the suggested approaches.

 On Mon, Aug 24, 2015 at 5:35 PM, Supun Sethunga sup...@wso2.com wrote:

 Can't we use only the jquery-1.4.2 for that particular page  (wrangler
 UI page) as a temporary workaround? jquery-1.4.2 should support
 backward compatibility right?


 We might need to check for compatibility to make sure other components
 work properly.


 Thanks,
 Supun

 On Mon, Aug 24, 2015 at 7:37 AM, Tanya Madurapperuma ta...@wso2.com
 wrote:

 Ideally $.noConflict(); should resolve the issue. [1]

 [1] https://api.jquery.com/jquery.noconflict/

 On Mon, Aug 24, 2015 at 2:43 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Manu / Tanya,

 Any thoughts?

 On Sun, Aug 23, 2015 at 7:19 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi,

 Tried using jQuery Migrate, but it doesn't solve the issue.

 Regards,
 Danula

 On Sun, Aug 23, 2015 at 1:01 PM, CD Athuraliya chathur...@wso2.com
 wrote:

 Hi Danula,

 Can you try using jQuery Migrate [1] plugin? Please make sure you
 place them in correct order.

 [1] http://blog.jquery.com/2013/05/08/jquery-migrate-1-2-1-released/

 Regards,
 CD

 On Sun, Aug 23, 2015 at 11:59 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 @CD any thoughts?

 On Sun, Aug 23, 2015 at 10:34 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi,

 I am trying to integrate a data cleaning tool into ML and I have
 two jQuery versions as ML uses 1.11 and wrangler uses 1.4

 jquery-1.4.2.min.js
 jquery-1.11.1.min.js

 I have tried using .noConflict() method but could not resolve the
 issue.
 Any suggestions on how to implement this?

 Regards,
 Danula












 --
 *CD Athuraliya*
 Software Engineer
 WSO2, Inc.
 lean . enterprise . middleware
 Mobile: +94 716288847
 LinkedIn http://lk.linkedin.com/in/cdathuraliya | Twitter
 https://twitter.com/cdathuraliya | Blog
 http://cdathuraliya.tumblr.com/










 --
 Tanya Madurapperuma

 Senior Software Engineer,
 WSO2 Inc. : wso2.com
 Mobile : +94718184439
 Blog : http://tanyamadurapperuma.blogspot.com






[Dev] [ML] Wrangler Integration

2015-08-25 Thread Danula Eranjith
Hi all,

Can you suggest where I should ideally integrate these files [1] in ML?

[1] - https://github.com/danula/wso2-ml-wrangler-integration/tree/master/src

Thanks,
Danula


Re: [Dev] [ML] jQuery conflict in Wrangler Integration

2015-08-23 Thread Danula Eranjith
Hi,

Tried using jQuery Migrate, but it doesn't solve the issue.

Regards,
Danula

On Sun, Aug 23, 2015 at 1:01 PM, CD Athuraliya chathur...@wso2.com wrote:

 Hi Danula,

 Can you try using jQuery Migrate [1] plugin? Please make sure you place
 them in correct order.

 [1] http://blog.jquery.com/2013/05/08/jquery-migrate-1-2-1-released/

 Regards,
 CD

 On Sun, Aug 23, 2015 at 11:59 AM, Nirmal Fernando nir...@wso2.com wrote:

 @CD any thoughts?

 On Sun, Aug 23, 2015 at 10:34 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi,

 I am trying to integrate a data cleaning tool into ML and I have two
 jQuery versions as ML uses 1.11 and wrangler uses 1.4

 jquery-1.4.2.min.js
 jquery-1.11.1.min.js

 I have tried using .noConflict() method but could not resolve the issue.
 Any suggestions on how to implement this?

 Regards,
 Danula









[Dev] [ML] jQuery conflict in Wrangler Integration

2015-08-22 Thread Danula Eranjith
Hi,

I am trying to integrate a data cleaning tool into ML and I have two jQuery
versions as ML uses 1.11 and wrangler uses 1.4

jquery-1.4.2.min.js
jquery-1.11.1.min.js

I have tried using .noConflict() method but could not resolve the issue.
Any suggestions on how to implement this?

Regards,
Danula


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-08-14 Thread Danula Eranjith
Hi Nirmal,

I have changed the structure so that operations could be recorded at one
point using the javascript and then executed later by executing a method in
Wrangler class.

public void test(JavaRDD<String[]> data, String scriptPath)
public JavaRDD<String[]> executeOperations(JavaSparkContext jsc, JavaRDD<String[]> data)

Please check above mentioned functions in [1] and let me know if that is
fine.

I have some issues with saving the javascript into a file. Will send you
the details if I cannot figure it out.

[1] -
https://github.com/danula/wso2-ml-wrangler-integration/blob/master/src/Wrangler/Wrangler.java

Danula

On Fri, Aug 14, 2015 at 9:52 AM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 How is it coming along?

 On Tue, Aug 11, 2015 at 1:51 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi Supun,

 Following points were discussed in the meeting

 *Integration to ML*

 We decided to add the wrangler interface as the first step considering
 the current ML implementation.

 So the steps from a users perspective would be as follows

 - A sample from the dataset will be sent to wrangler interface.
 - User can apply desired operations in the wrangler interface
 - User can return to ML by clicking an button in the interface.
 - Viewing the script will be optional for the user.
 - When returned to ML, spark transformations are automatically generated
 and applied to the dataset.

 *Spark Transformations*

 I have implemented all the wrangler transformations by extending a single
 abstract class. These operations are invoked by parsing the javascript code
 generated by wrangler. However since ML spark transformations are applied
 all together at the end of the process, I have to persist all the
 parameters and keep operations as a list which can be invoked later.

 Nirmal pointed out that this could be achieved by using chain of
 responsibility design pattern. I am currently changing the implementation
 accordingly.

 I will get back to you and Nirmal when automation process is completed to
 start the integration.

 Regards,
 Danula

 On Mon, Aug 10, 2015 at 9:29 PM, Supun Sethunga sup...@wso2.com wrote:

 Any update?

 On Fri, Aug 7, 2015 at 10:13 AM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Sorry I couldn't join the meeting. Can you please share the
 meeting/review notes? Also the progress on the suggestions and what is left
 to be done in overall?

 Thanks,
 Supun

 On Wed, Aug 5, 2015 at 3:47 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 It should be a JavaRDD<String[]>, where each row represents the
 feature vector as a String[].

 On Tue, Aug 4, 2015 at 11:51 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 In other words,
 What would be the preferred output type for a dataset which is
 pre-processed by wrangler?
 As I have observed, different algorithms use different JavaRDD types
 as input (JavaRDD<String>, JavaRDD<Vector>, etc.).

 On Tue, Aug 4, 2015 at 11:48 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 On Tue, Aug 4, 2015 at 11:47 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Nirmal,

 In ML, what is the preferred way of keeping data in a single row of
 JavaRDD?


 I didn't quite get your question. Can you elaborate please?



 As I have figured it depends on the algorithm being used.

 Danula

 On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Thanks Danula, I'll send an invite.

 On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Nirmal,

 I am available after 1.30pm on Tuesday, Wednesday and Thursday.

 Danula

 On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Hi Danula,

 Can we arrange a demo/review somewhere next week? Please let me
 know few time slots.

 On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Thanks Danula.

 On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 You can find the source at [1]
 https://github.com/danula/wso2-ml-wrangler-integration. I
 have to do some refactoring when integrating to ML.

 [1] - https://github.com/danula/wso2-ml-wrangler-integration

 On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Thanks Danula. Please share the current code, if possible.

 On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 I have succeeded in parsing the operations from wrangler
 javascript code to spark transformations I have written. Working on
 automating the process.

 Last couple of steps would be changing the wrangler
 interface and integrating it into ML Wizard.

 Thanks
 Danula

 On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Hi Danula,

 Could you please summarize the current status of the
 project and also the things left to do?

 On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Thank you.
 Will use them. I already have some other kaggle datasets

Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-08-10 Thread Danula Eranjith
Hi Supun,

The following points were discussed in the meeting:

*Integration to ML*

We decided to add the wrangler interface as the first step considering the
current ML implementation.

So the steps from a user's perspective would be as follows:

- A sample from the dataset will be sent to the wrangler interface.
- The user can apply the desired operations in the wrangler interface.
- The user can return to ML by clicking a button in the interface.
- Viewing the script will be optional for the user.
- When the user returns to ML, Spark transformations are automatically
  generated and applied to the dataset.

*Spark Transformations*

I have implemented all the wrangler transformations by extending a single
abstract class. These operations are invoked by parsing the JavaScript code
generated by wrangler. However, since ML applies its Spark transformations
all together at the end of the process, I have to persist all the
parameters and keep the operations as a list that can be invoked later.

Nirmal pointed out that this could be achieved by using the chain of
responsibility design pattern. I am currently changing the implementation
accordingly.
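To make the idea concrete, here is a minimal sketch of how operations could be linked into a chain and only executed later. It is not the code in the repository: every class name is hypothetical, and a plain List<String[]> stands in for the JavaRDD so that the example runs without Spark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: all class names are illustrative, and List<String[]>
// stands in for the JavaRDD that the real implementation would use.
abstract class WranglerOperation {
    private WranglerOperation next; // next handler in the chain, null at the end

    // Appends an operation after this one and returns it so calls can be chained.
    WranglerOperation then(WranglerOperation nextOperation) {
        this.next = nextOperation;
        return nextOperation;
    }

    // Applies this operation, then passes the result down the chain.
    List<String[]> execute(List<String[]> rows) {
        List<String[]> result = apply(rows);
        return next == null ? result : next.execute(result);
    }

    // Each concrete operation implements only its own transformation.
    protected abstract List<String[]> apply(List<String[]> rows);
}

// Deletes rows whose value in the given column is null or empty.
class DeleteRowsWhereNull extends WranglerOperation {
    private final int column;

    DeleteRowsWhereNull(int column) {
        this.column = column;
    }

    @Override
    protected List<String[]> apply(List<String[]> rows) {
        List<String[]> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (row[column] != null && !row[column].isEmpty()) {
                kept.add(row);
            }
        }
        return kept;
    }
}

// Drops one column from every row; rows without that column are left unchanged.
class DropColumn extends WranglerOperation {
    private final int column;

    DropColumn(int column) {
        this.column = column;
    }

    @Override
    protected List<String[]> apply(List<String[]> rows) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            if (column < 0 || column >= row.length) {
                out.add(row);
                continue;
            }
            String[] kept = new String[row.length - 1];
            for (int i = 0, j = 0; i < row.length; i++) {
                if (i != column) {
                    kept[j++] = row[i];
                }
            }
            out.add(kept);
        }
        return out;
    }
}

class WranglerChainDemo {
    // Builds the chain first (nothing runs yet), then executes it in one pass.
    static List<String[]> run(List<String[]> rows) {
        WranglerOperation head = new DeleteRowsWhereNull(0);
        head.then(new DropColumn(1));
        return head.execute(rows);
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"a", "x", "1"},
            new String[] {null, "y", "2"},
            new String[] {"b", "z", "3"});
        for (String[] row : run(rows)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Each operation keeps its own parameters as fields, so the chain can be built while the user works in wrangler and executed only when ML is ready to transform the dataset.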

I will get back to you and Nirmal once the automation process is complete
so that we can start the integration.

Regards,
Danula

On Mon, Aug 10, 2015 at 9:29 PM, Supun Sethunga sup...@wso2.com wrote:

 Any update?

 On Fri, Aug 7, 2015 at 10:13 AM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Sorry I couldn't join the meeting. Can you please share the
 meeting/review notes? Also the progress on the suggestions and what is left
 to be done in overall?

 Thanks,
 Supun

 On Wed, Aug 5, 2015 at 3:47 AM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 It should be a JavaRDD<String[]>, where each row represents the feature
 vector as a String[].

 On Tue, Aug 4, 2015 at 11:51 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 In other words,
 What would be the preferred output type for a dataset which is
 pre-processed by wrangler?
 As I have observed different algorithms use different JavaRDD types as
 input ( JavaRDD<String>, JavaRDD<Vector> etc )

 On Tue, Aug 4, 2015 at 11:48 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 On Tue, Aug 4, 2015 at 11:47 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi Nirmal,

 In ML, what is the preferred way of keeping data in a single row of
 JavaRDD?


 I didn't quite get your question. Can you elaborate please?



 As I have figured it depends on the algorithm being used.

 Danula

 On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Thanks Danula, I'll send an invite.

 On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Nirmal,

 I am available after 1.30pm on Tuesday, Wednesday and Thursday.

 Danula

 On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Can we arrange a demo/review somewhere next week? Please let me
 know few time slots.

 On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando nir...@wso2.com
  wrote:

 Thanks Danula.

 On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 You can find the source at [1]
 https://github.com/danula/wso2-ml-wrangler-integration. I
 have to do some refactoring when integrating to ML.

 [1] - https://github.com/danula/wso2-ml-wrangler-integration

 On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Thanks Danula. Please share the current code, if possible.

 On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 I have succeeded in parsing the operations from wrangler
 javascript code to spark transformations I have written. Working on
 automating the process.

 Last couple of steps would be changing the wrangler interface
 and integrating it into ML Wizard.

 Thanks
 Danula

 On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Hi Danula,

 Could you please summarize the current status of the project
 and also the things left to do?

 On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Thank you.
 Will use them. I already have some other kaggle datasets as
 well.




 On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Nirmal,

 Would it be possible to get some sample data sets which
 are more likely to be pre-processed using wrangler. I am currently
 testing my implementations against small and more general data sets.

 I have checked datasets available at [1]
 https://github.com/wso2/product-ml/tree/master/modules/samples as
 well. But there is nothing much to be processed as they are ready
 to be fed to ML.

 [1] -
 https://github.com/wso2/product-ml/tree/master/modules/samples

 Thanks,
 Danula

 On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Thanks Danula.

 On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Sorry for not keeping you in the loop.

 After considering

Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-08-04 Thread Danula Eranjith
Hi Nirmal,

In ML, what is the preferred way of keeping data in a single row of JavaRDD?

As far as I can tell, it depends on the algorithm being used.

Danula

On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando nir...@wso2.com wrote:

 Thanks Danula, I'll send an invite.

 On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi Nirmal,

 I am available after 1.30pm on Tuesday, Wednesday and Thursday.

 Danula

 On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Can we arrange a demo/review somewhere next week? Please let me know few
 time slots.

 On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Thanks Danula.

 On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 You can find the source at [1]
 https://github.com/danula/wso2-ml-wrangler-integration. I have to
 do some refactoring when integrating to ML.

 [1] - https://github.com/danula/wso2-ml-wrangler-integration

 On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Thanks Danula. Please share the current code, if possible.

 On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith hmdanu...@gmail.com
  wrote:

 Hi all,

 I have succeeded in parsing the operations from wrangler javascript
 code to spark transformations I have written. Working on automating the
 process.

 Last couple of steps would be changing the wrangler interface and
 integrating it into ML Wizard.

 Thanks
 Danula

 On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Could you please summarize the current status of the project and
 also the things left to do?

 On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Thank you.
 Will use them. I already have some other kaggle datasets as well.




 On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Nirmal,

 Would it be possible to get some sample data sets which are more
 likely to be pre-processed using wrangler. I am currently testing my
 implementations against small and more general data sets.

 I have checked datasets available at [1]
 https://github.com/wso2/product-ml/tree/master/modules/samples as
 well. But there is nothing much to be processed as they are ready
 to be fed to ML.

 [1] -
 https://github.com/wso2/product-ml/tree/master/modules/samples

 Thanks,
 Danula

 On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Thanks Danula.

 On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Sorry for not keeping you in the loop.

 After considering and experimenting with several options. I am
 using the javascript code generated by wrangler to implement them
 using spark. I have used regular expressions to extract the operations,
 parameters and values and mapped them to spark transformations I
 previously developed.

 The code generated by wrangler for certain functions have
 nested operations.

 (1)

 /* Fill split3  with values from above */
 w.add(dw.fill().column([split3])
 .table(0)
 .status(active)
 .drop(false)
 .direction(down)
 .method(copy)
 .row(undefined)
 )

 (2)

 /* Delete  rows where split1 is null */
 w.add(dw.filter().column([])
 .table(0)
 .status(active)
 .drop(false)
 .row(dw.row().column([])
 .table(0)
 .status(active)
 .drop(false)
 .conditions([dw.is_null().column([])
 .table(0)
 .status(active)
 .drop(false)
 .lcol(split1)
 .value(undefined)
 .op_str(is null)
 ])
 )
 )

 I have succeeded in parsing the operations similar to (1)
 above and currently working on extending it to work on operations
 similar to (2).

 Next step would be automating the process of spark
 transformation generation.

 Thanks,
 Danula

 On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Hi Danula,

 Please send an update at least every week.

 On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga 
 sup...@wso2.com wrote:

 Hi Danula,

 Any update on the progress? Were you managed to integrate
 the transformations with the wrangler?

 Thanks,

 On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Update on the current progress of the project and future
 activities as we discussed at the recent meeting.

 *Current Progress*

 I have completed the phase of creating spark
 transformations relevant to operations available in wrangler.

 Operations implemented
 - Fill
 - Split
 - Drop
 - Delete
 - Extract

 *Future activities*

 - Modify the wrangler interface to suit the current
 implementation
 - Automate the process of generating Spark transformations
 - Integrating wrangler to the ML workflow

 Thanks,
 Danula

 On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 No, We haven't done a review yet.
 It would be great if we could have one so that I can
 discuss with you all and clarify the next steps of the
 implementation as you mentioned

Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-08-04 Thread Danula Eranjith
In other words, what would be the preferred output type for a dataset that is
pre-processed by wrangler?
As I have observed, different algorithms use different JavaRDD types as
input (JavaRDD<String>, JavaRDD<Vector>, etc.).

On Tue, Aug 4, 2015 at 11:48 AM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 On Tue, Aug 4, 2015 at 11:47 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi Nirmal,

 In ML, what is the preferred way of keeping data in a single row of
 JavaRDD?


 I didn't quite get your question. Can you elaborate please?



 As I have figured it depends on the algorithm being used.

 Danula

 On Thu, Jul 30, 2015 at 9:14 AM, Nirmal Fernando nir...@wso2.com wrote:

 Thanks Danula, I'll send an invite.

 On Wed, Jul 29, 2015 at 10:24 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi Nirmal,

 I am available after 1.30pm on Tuesday, Wednesday and Thursday.

 Danula

 On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Can we arrange a demo/review somewhere next week? Please let me know
 few time slots.

 On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Thanks Danula.

 On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 You can find the source at [1]
 https://github.com/danula/wso2-ml-wrangler-integration. I have to
 do some refactoring when integrating to ML.

 [1] - https://github.com/danula/wso2-ml-wrangler-integration

 On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Thanks Danula. Please share the current code, if possible.

 On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 I have succeeded in parsing the operations from wrangler
 javascript code to spark transformations I have written. Working on
 automating the process.

 Last couple of steps would be changing the wrangler interface and
 integrating it into ML Wizard.

 Thanks
 Danula

 On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Could you please summarize the current status of the project and
 also the things left to do?

 On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Thank you.
 Will use them. I already have some other kaggle datasets as
 well.




 On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Nirmal,

 Would it be possible to get some sample data sets which are
 more likely to be pre-processed using wrangler. I am currently
 testing my implementations against small and more general data sets.

 I have checked datasets available at [1]
 https://github.com/wso2/product-ml/tree/master/modules/samples as
 well. But there is nothing much to be processed as they are ready
 to be fed to ML.

 [1] -
 https://github.com/wso2/product-ml/tree/master/modules/samples

 Thanks,
 Danula

 On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Thanks Danula.

 On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Sorry for not keeping you in the loop.

 After considering and experimenting with several options. I
 am using the javascript code generated by wrangler to implement
 them using spark. I have used regular expressions to extract the
 operations, parameters and values and mapped them to spark
 transformations I previously developed.

 The code generated by wrangler for certain functions have
 nested operations.

 (1)

 /* Fill split3  with values from above */
 w.add(dw.fill().column([split3])
 .table(0)
 .status(active)
 .drop(false)
 .direction(down)
 .method(copy)
 .row(undefined)
 )

 (2)

 /* Delete  rows where split1 is null */
 w.add(dw.filter().column([])
 .table(0)
 .status(active)
 .drop(false)
 .row(dw.row().column([])
 .table(0)
 .status(active)
 .drop(false)
 .conditions([dw.is_null().column([])
 .table(0)
 .status(active)
 .drop(false)
 .lcol(split1)
 .value(undefined)
 .op_str(is null)
 ])
 )
 )

 I have succeeded in parsing the operations similar to (1)
 above and currently working on extending it to work on
 operations similar to (2).

 Next step would be automating the process of spark
 transformation generation.

 Thanks,
 Danula

 On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Hi Danula,

 Please send an update at least every week.

 On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga 
 sup...@wso2.com wrote:

 Hi Danula,

 Any update on the progress? Were you managed to integrate
 the transformations with the wrangler?

 Thanks,

 On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Update on the current progress of the project and future
 activities as we discussed at the recent meeting.

 *Current Progress*

 I have completed the phase of creating spark
 transformations relevant to operations available in wrangler.

 Operations implemented
 - Fill
 - Split
 - Drop
 - Delete
 - Extract

 *Future

Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-07-29 Thread Danula Eranjith
Hi Nirmal,

I am available after 1.30pm on Tuesday, Wednesday and Thursday.

Danula

On Wed, Jul 29, 2015 at 12:10 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 Can we arrange a demo/review somewhere next week? Please let me know few
 time slots.

 On Thu, Jul 23, 2015 at 11:47 AM, Nirmal Fernando nir...@wso2.com wrote:

 Thanks Danula.

 On Thu, Jul 23, 2015 at 11:41 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 You can find the source at [1]
 https://github.com/danula/wso2-ml-wrangler-integration. I have to do
 some refactoring when integrating to ML.

 [1] - https://github.com/danula/wso2-ml-wrangler-integration

 On Thu, Jul 23, 2015 at 11:31 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Thanks Danula. Please share the current code, if possible.

 On Thu, Jul 23, 2015 at 8:41 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 I have succeeded in parsing the operations from wrangler javascript
 code to spark transformations I have written. Working on automating the
 process.

 Last couple of steps would be changing the wrangler interface and
 integrating it into ML Wizard.

 Thanks
 Danula

 On Wed, Jul 22, 2015 at 9:31 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Could you please summarize the current status of the project and also
 the things left to do?

 On Sun, Jul 19, 2015 at 11:39 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Thank you.
 Will use them. I already have some other kaggle datasets as well.




 On Sun, Jul 19, 2015 at 11:30 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Nirmal,

 Would it be possible to get some sample data sets which are more
 likely to be pre-processed using wrangler. I am currently testing my
 implementations against small and more general data sets.

 I have checked datasets available at [1]
 https://github.com/wso2/product-ml/tree/master/modules/samples as
 well. But there is nothing much to be processed as they are ready to
 be fed to ML.

 [1] -
 https://github.com/wso2/product-ml/tree/master/modules/samples

 Thanks,
 Danula

 On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando nir...@wso2.com
  wrote:

 Thanks Danula.

 On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Sorry for not keeping you in the loop.

 After considering and experimenting with several options. I am
 using the javascript code generated by wrangler to implement them
 using spark. I have used regular expressions to extract the operations,
 parameters and values and mapped them to spark transformations I
 previously developed.

 The code generated by wrangler for certain functions have nested
 operations.

 (1)

 /* Fill split3  with values from above */
 w.add(dw.fill().column([split3])
 .table(0)
 .status(active)
 .drop(false)
 .direction(down)
 .method(copy)
 .row(undefined)
 )

 (2)

 /* Delete  rows where split1 is null */
 w.add(dw.filter().column([])
 .table(0)
 .status(active)
 .drop(false)
 .row(dw.row().column([])
 .table(0)
 .status(active)
 .drop(false)
 .conditions([dw.is_null().column([])
 .table(0)
 .status(active)
 .drop(false)
 .lcol(split1)
 .value(undefined)
 .op_str(is null)
 ])
 )
 )

 I have succeeded in parsing the operations similar to (1) above
 and currently working on extending it to work on operations similar 
 to (2).

 Next step would be automating the process of spark
 transformation generation.

 Thanks,
 Danula

 On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Hi Danula,

 Please send an update at least every week.

 On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga 
 sup...@wso2.com wrote:

 Hi Danula,

 Any update on the progress? Were you managed to integrate the
 transformations with the wrangler?

 Thanks,

 On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 Update on the current progress of the project and future
 activities as we discussed at the recent meeting.

 *Current Progress*

 I have completed the phase of creating spark transformations
 relevant to operations available in wrangler.

 Operations implemented
 - Fill
 - Split
 - Drop
 - Delete
 - Extract

 *Future activities*

 - Modify the wrangler interface to suit the current
 implementation
 - Automate the process of generating Spark transformations
 - Integrating wrangler to the ML workflow

 Thanks,
 Danula

 On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 No, We haven't done a review yet.
 It would be great if we could have one so that I can discuss
 with you all and clarify the next steps of the implementation
 as you mentioned.

 Thanks
 Danula

 On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga 
 sup...@wso2.com wrote:

 Hi Danula,

 Did we have a review for the work done so far? If not,
 shall we have a one? We can clear out any doubts and issues as 
 well..

 Thanks,
 Supun

 On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando 
 nir...@wso2.com wrote:

 Hi Danula,

 Thanks

Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-07-19 Thread Danula Eranjith
Hi Nirmal,

Would it be possible to get some sample data sets that are more likely to
need pre-processing with wrangler? I am currently testing my implementations
against small and more general data sets.

I have checked the datasets available at [1] as well, but there is not much
to process as they are ready to be fed to ML.

[1] - https://github.com/wso2/product-ml/tree/master/modules/samples

Thanks,
Danula

On Thu, Jul 16, 2015 at 10:15 PM, Nirmal Fernando nir...@wso2.com wrote:

 Thanks Danula.

 On Thu, Jul 16, 2015 at 10:07 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 Sorry for not keeping you in the loop.

 After considering and experimenting with several options. I am using the
 javascript code generated by wrangler to implement them using spark. I have
 used regular expressions to extract the operations, parameters and values
 and mapped them to spark transformations I previously developed.

 The code generated by wrangler for certain functions have nested
 operations.

 (1)

 /* Fill split3  with values from above */
 w.add(dw.fill().column([split3])
 .table(0)
 .status(active)
 .drop(false)
 .direction(down)
 .method(copy)
 .row(undefined)
 )

 (2)

 /* Delete  rows where split1 is null */
 w.add(dw.filter().column([])
 .table(0)
 .status(active)
 .drop(false)
 .row(dw.row().column([])
 .table(0)
 .status(active)
 .drop(false)
 .conditions([dw.is_null().column([])
 .table(0)
 .status(active)
 .drop(false)
 .lcol(split1)
 .value(undefined)
 .op_str(is null)
 ])
 )
 )

 I have succeeded in parsing the operations similar to (1) above and
 currently working on extending it to work on operations similar to (2).

 Next step would be automating the process of spark transformation
 generation.

 Thanks,
 Danula

 On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 Please send an update at least every week.

 On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Any update on the progress? Were you managed to integrate the
 transformations with the wrangler?

 Thanks,

 On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 Update on the current progress of the project and future activities as
 we discussed at the recent meeting.

 *Current Progress*

 I have completed the phase of creating spark transformations relevant
 to operations available in wrangler.

 Operations implemented
 - Fill
 - Split
 - Drop
 - Delete
 - Extract

 *Future activities*

 - Modify the wrangler interface to suit the current implementation
 - Automate the process of generating Spark transformations
 - Integrating wrangler to the ML workflow

 Thanks,
 Danula

 On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 No, We haven't done a review yet.
 It would be great if we could have one so that I can discuss with you
 all and clarify the next steps of the implementation as you mentioned.

 Thanks
 Danula

 On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Hi Danula,

 Did we have a review for the work done so far? If not, shall we have
 a one? We can clear out any doubts and issues as well..

 Thanks,
 Supun

 On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Thanks for the update, keep them coming.

 On a JavaRDD you can perform a collect() to get a list, AFAIR. Yes,
 this is costly, since it would load whole dataset into memory. So, is
 this an operation which involves multiple rows?

 On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi Supun,

 I modified the Fill operation to add what you mentioned.

 I used a workaround to to implement certain parts of the
 operations such as filling with values from rows above and below.
 I created a List Implementation using toArray() method in JavaRDD
 and then converted it back to a JavaRDD after the operation.

 This will be inefficient (in terms of both memory and time) when
 working with very large data sets. But I think its important to have
 these features included. Otherwise a user would be left with very
 limited set of operations.

 Please let me know if you have a different opinion on this.

 Thanks,
 Danula

 On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Somehow there are issues in implementing certain wrangler
 functions due to limitations in JavaRDD used in spark
 e.g. -
 Fill operation - when filling with values from rows above and
 below
 Fold operation


 Agree, since rows will get executed randomly with spark,
 inter-row operations are not very meaningful.
 But you can slightly modify the implementation of the Fill
 operation, such as, to fill values based on an
 expression/static-value/mean etc. (not depending on other rows)..

 Thanks,
 Supun

 On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Hi

Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-07-16 Thread Danula Eranjith
Hi all,

Sorry for not keeping you in the loop.

After considering and experimenting with several options, I am using the
JavaScript code generated by wrangler to implement the operations in Spark.
I have used regular expressions to extract the operations, parameters and
values, and mapped them to the Spark transformations I previously developed.

The code generated by wrangler for certain functions has nested operations,
as in (2) below.

(1)

/* Fill split3  with values from above */
w.add(dw.fill().column([split3])
    .table(0)
    .status(active)
    .drop(false)
    .direction(down)
    .method(copy)
    .row(undefined)
)

(2)

/* Delete  rows where split1 is null */
w.add(dw.filter().column([])
    .table(0)
    .status(active)
    .drop(false)
    .row(dw.row().column([])
        .table(0)
        .status(active)
        .drop(false)
        .conditions([dw.is_null().column([])
            .table(0)
            .status(active)
            .drop(false)
            .lcol(split1)
            .value(undefined)
            .op_str(is null)
        ])
    )
)

I have succeeded in parsing operations similar to (1) above and am
currently working on extending the parser to handle operations similar to (2).
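For reference, the flat case (1) can be handled roughly as in the sketch below. This is an illustration rather than the code in my repository: the class name, method names and the two patterns are assumptions, and the sample string is script (1) exactly as it appears in this mail. The parameter pattern rejects values that contain parentheses, so nested scripts such as (2) are deliberately out of scope here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of regex-based extraction for a flat wrangler operation.
class WranglerScriptParser {
    // Script (1) from this mail, copied as it appears here.
    static final String SAMPLE =
        "/* Fill split3  with values from above */\n"
        + "w.add(dw.fill().column([split3])\n"
        + ".table(0)\n"
        + ".status(active)\n"
        + ".drop(false)\n"
        + ".direction(down)\n"
        + ".method(copy)\n"
        + ".row(undefined)\n"
        + ")";

    // Captures the operation name, e.g. "fill" in "dw.fill()".
    private static final Pattern OPERATION = Pattern.compile("dw\\.(\\w+)\\(\\)");

    // Captures one ".name(value)" pair; the value may not contain parentheses,
    // which is why nested operations are not matched by this pattern.
    private static final Pattern PARAMETER = Pattern.compile("\\.(\\w+)\\(([^()]*)\\)");

    // Returns the first operation name in the script, or null if there is none.
    static String operationName(String script) {
        Matcher m = OPERATION.matcher(script);
        return m.find() ? m.group(1) : null;
    }

    // Returns the parameters that follow the operation, in script order.
    static Map<String, String> parameters(String script) {
        Map<String, String> params = new LinkedHashMap<>();
        Matcher operation = OPERATION.matcher(script);
        if (!operation.find()) {
            return params;
        }
        Matcher parameter = PARAMETER.matcher(script);
        // Start after "dw.<name>()" so the operation itself is not read as a parameter.
        int from = operation.end();
        while (parameter.find(from)) {
            params.put(parameter.group(1), parameter.group(2).trim());
            from = parameter.end();
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(operationName(SAMPLE) + " " + parameters(SAMPLE));
    }
}
```

The extracted name and parameter map are what get mapped onto the matching Spark transformation. Handling (2) needs more than this, since a parameter value can itself be a whole operation.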

The next step would be automating the generation of Spark
transformations.

Thanks,
Danula

On Wed, Jul 15, 2015 at 7:32 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 Please send an update at least every week.

 On Wed, Jul 15, 2015 at 5:51 PM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Any update on the progress? Were you managed to integrate the
 transformations with the wrangler?

 Thanks,

 On Thu, Jul 2, 2015 at 11:38 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 Update on the current progress of the project and future activities as
 we discussed at the recent meeting.

 *Current Progress*

 I have completed the phase of creating spark transformations relevant to
 operations available in wrangler.

 Operations implemented
 - Fill
 - Split
 - Drop
 - Delete
 - Extract

 *Future activities*

 - Modify the wrangler interface to suit the current implementation
 - Automate the process of generating Spark transformations
 - Integrating wrangler to the ML workflow

 Thanks,
 Danula

 On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 No, We haven't done a review yet.
 It would be great if we could have one so that I can discuss with you
 all and clarify the next steps of the implementation as you mentioned.

 Thanks
 Danula

 On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Hi Danula,

 Did we have a review for the work done so far? If not, shall we have a
 one? We can clear out any doubts and issues as well..

 Thanks,
 Supun

 On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Thanks for the update, keep them coming.

 On a JavaRDD you can perform a collect() to get a list, AFAIR. Yes,
 this is costly, since it would load whole dataset into memory. So, is
 this an operation which involves multiple rows?

 On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith hmdanu...@gmail.com
  wrote:

 Hi Supun,

 I modified the Fill operation to add what you mentioned.

 I used a workaround to to implement certain parts of the operations
 such as filling with values from rows above and below.
 I created a List Implementation using toArray() method in JavaRDD
 and then converted it back to a JavaRDD after the operation.

 This will be inefficient (in terms of both memory and time) when
 working with very large data sets. But I think its important to have
 these features included. Otherwise a user would be left with very
 limited set of operations.

 Please let me know if you have a different opinion on this.

 Thanks,
 Danula

 On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Somehow there are issues in implementing certain wrangler functions
 due to limitations in JavaRDD used in spark
 e.g. -
 Fill operation - when filling with values from rows above and below
 Fold operation


 Agree, since rows will get executed randomly with spark, inter-row
 operations are not very meaningful.
 But you can slightly modify the implementation of the Fill
 operation, such as, to fill values based on an
 expression/static-value/mean etc. (not depending on other rows)..

 Thanks,
 Supun

 On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Hi Danula,

 Sorry for the late reply. Have you got the details you were
 looking for?

 It would be great if I could get to know which wrangler operations
 are important for a user of the ML


 Other than the ones you have mentioned in the proposal, think its
 better to have Translate operation as well (to create a new
 column based on an existing column).

 Thanks,
 Supun



 On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith 
 hmdanu...@gmail.com wrote:

 Hi all,

 I am currently working on generating spark transformations
 related to the operations available in the data wrangler.

 Data wrangler provides sufficient parameters to re-create these
 at spark.I have successfully implemented delete

Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-07-02 Thread Danula Eranjith
Hi all,

Here is an update on the current progress of the project and the future
activities we discussed at the recent meeting.

*Current Progress*

I have completed the phase of creating Spark transformations for the
operations available in wrangler.

Operations implemented:
- Fill
- Split
- Drop
- Delete
- Extract
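
As an illustration of one of these operations, a fill that copies values from the row above could look roughly like the sketch below. It is a simplified stand-in, not the actual implementation: the class and method names are hypothetical, and it works on an ordered in-memory List<String[]>, which is the situation after the rows have been collected from the JavaRDD as discussed earlier in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of "fill with values from above" on an ordered,
// in-memory list of rows (the rows are assumed to be already collected).
class FillDown {
    // Returns new rows in which null or empty cells in the given column are
    // replaced by the most recent non-empty value seen above them.
    static List<String[]> fillDown(List<String[]> rows, int column) {
        List<String[]> out = new ArrayList<>(rows.size());
        String last = null; // last non-empty value seen so far
        for (String[] row : rows) {
            String[] copy = row.clone(); // leave the input rows untouched
            if (copy[column] == null || copy[column].isEmpty()) {
                if (last != null) {
                    copy[column] = last;
                }
            } else {
                last = copy[column];
            }
            out.add(copy);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"a", "1"},
            new String[] {"", "2"},
            new String[] {null, "3"},
            new String[] {"b", "4"});
        for (String[] row : fillDown(rows, 0)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Because each row depends on the rows before it, this only works when the row order is fixed, which is why it runs on a collected list and will be costly for very large data sets.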

*Future activities*

- Modify the wrangler interface to suit the current implementation
- Automate the process of generating Spark transformations
- Integrate wrangler into the ML workflow

Thanks,
Danula

On Sun, Jun 28, 2015 at 9:31 AM, Danula Eranjith hmdanu...@gmail.com
wrote:

 Hi all,

 No, We haven't done a review yet.
 It would be great if we could have one so that I can discuss with you all
 and clarify the next steps of the implementation as you mentioned.

 Thanks
 Danula

 On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Did we have a review for the work done so far? If not, shall we have a
 one? We can clear out any doubts and issues as well..

 Thanks,
 Supun

 On Wed, Jun 24, 2015 at 6:42 AM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 Thanks for the update, keep them coming.

 On a JavaRDD you can perform a collect() to get a list, AFAIR. Yes, this
 is costly, since it would load whole dataset into memory. So, is this an
 operation which involves multiple rows?

 On Tue, Jun 23, 2015 at 2:15 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi Supun,

 I modified the Fill operation to add what you mentioned.

 I used a workaround to implement certain parts of the operations,
 such as filling with values from rows above and below.
 I created a List implementation using the toArray() method in JavaRDD and
 then converted it back to a JavaRDD after the operation.

 This will be inefficient (in terms of both memory and time) when
 working with very large data sets. But I think it's important to have these
 features included. Otherwise a user would be left with a very limited set of
 operations.

 Please let me know if you have a different opinion on this.

 Thanks,
 Danula


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-27 Thread Danula Eranjith
correction

Actually I am free only from 12.15 pm to 1.15 pm. But can make myself
available from 10.15 am to 12.15 pm if the previous time is not feasible.

On Sun, Jun 28, 2015 at 11:28 AM, Danula Eranjith hmdanu...@gmail.com
wrote:

 Actually I am free only from 12.15 am to 1.15 am. But can make myself
 available from 10.15 am to 12.15 am if the previous time is not feasible.

 On Sun, Jun 28, 2015 at 11:21 AM, Nirmal Fernando nir...@wso2.com wrote:

 Let us know feasible time slots for tomorrow please.

 On Sun, Jun 28, 2015 at 11:20 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Cool, thanks. Will send an invite.

 On Sun, Jun 28, 2015 at 11:18 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Okay Sure.
 We can have a hangout

 On Sun, Jun 28, 2015 at 11:15 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 It'll be good if we can have it before mid evaluations. If you can't
 make it to Trace, we can have a hangout?

 On Sun, Jun 28, 2015 at 11:11 AM, Danula Eranjith hmdanu...@gmail.com
  wrote:

 It would be difficult for me to make it tomorrow.
 How about Thursday (02/07) at Trace? anytime after 11.30 am would be
 great.

 On Sun, Jun 28, 2015 at 10:09 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 +1 shall we have it tomorrow at Trace?

 On Sun, Jun 28, 2015 at 9:45 AM, Supun Sethunga sup...@wso2.com
 wrote:

 Can you arrange a time around this week? Please check with Nirmal
 too.


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-27 Thread Danula Eranjith
Hi all,

No, we haven't done a review yet.
It would be great if we could have one so that I can discuss with you all
and clarify the next steps of the implementation as you mentioned.

Thanks
Danula

On Sun, Jun 28, 2015 at 9:25 AM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Did we have a review for the work done so far? If not, shall we have
 one? We can clear up any doubts and issues as well.

 Thanks,
 Supun


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-27 Thread Danula Eranjith
Okay Sure.
We can have a hangout

On Sun, Jun 28, 2015 at 11:15 AM, Nirmal Fernando nir...@wso2.com wrote:

 It'll be good if we can have it before mid evaluations. If you can't make
 it to Trace, we can have a hangout?


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-27 Thread Danula Eranjith
It would be difficult for me to make it tomorrow.
How about Thursday (02/07) at Trace? anytime after 11.30 am would be great.

On Sun, Jun 28, 2015 at 10:09 AM, Nirmal Fernando nir...@wso2.com wrote:

 +1 shall we have it tomorrow at Trace?


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-27 Thread Danula Eranjith
Actually I am free only from 12.15 am to 1.15 am. But can make myself
available from 10.15 am to 12.15 am if the previous time is not feasible.

On Sun, Jun 28, 2015 at 11:21 AM, Nirmal Fernando nir...@wso2.com wrote:

 Let us know feasible time slots for tomorrow please.


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-23 Thread Danula Eranjith
Hi Supun,

I modified the Fill operation to add what you mentioned.

I used a workaround to implement certain parts of the operations, such
as filling with values from rows above and below.
I created a List implementation using the toArray() method in JavaRDD and then
converted it back to a JavaRDD after the operation.
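As a rough illustration of the local step in this workaround, here is a minimal sketch in plain Java. It is not the actual implementation: the class name FillDown and the String[] row type are assumptions for the example. In Spark, the input list would come from collecting the RDD on the driver (for example with JavaRDD.collect()), and the result could be turned back into an RDD with JavaSparkContext.parallelize().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; operates on rows already collected into driver memory.
class FillDown {
    // Fill empty cells in `col` with the nearest non-empty value from the rows above.
    static List<String[]> fillFromAbove(List<String[]> rows, int col) {
        List<String[]> out = new ArrayList<>(rows.size());
        String last = null; // most recent non-empty value seen in this column
        for (String[] row : rows) {
            String[] copy = row.clone();
            if (copy[col] == null || copy[col].isEmpty()) {
                if (last != null) {
                    copy[col] = last;
                }
            } else {
                last = copy[col];
            }
            out.add(copy);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"A", "1"});
        rows.add(new String[] {"", "2"});
        rows.add(new String[] {"B", "3"});
        rows.add(new String[] {"", "4"});
        for (String[] row : fillFromAbove(rows, 0)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The loop depends on row order, which is why this step needs the whole dataset as an ordered list rather than a per-row transformation.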

This will be inefficient (in terms of both memory and time) when working
with very large data sets. But I think it's important to have these features
included. Otherwise a user would be left with a very limited set of
operations.

Please let me know if you have a different opinion on this.

Thanks,
Danula

On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga sup...@wso2.com wrote:

 Somehow there are issues in implementing certain wrangler functions due to
 limitations in JavaRDD used in spark
 e.g. -
 Fill operation - when filling with values from rows above and below
 Fold operation


 Agreed. Since Spark does not process rows in any guaranteed order, inter-row
 operations are not very meaningful.
 But you can slightly modify the implementation of the Fill operation,
 for example to fill values based on an expression, a static value, the mean,
 etc. (not depending on other rows).
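 One way to read this suggestion, sketched in plain Java: the mean is a single aggregate computed once over the column, and each row is then filled using only its own cells plus that precomputed value. The class name MeanFill, the double[] row type, and the use of NaN to mark a missing cell are assumptions for the example, not part of the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; missing cells are encoded as NaN.
class MeanFill {
    // Mean of one column, skipping missing cells.
    static double columnMean(List<double[]> rows, int col) {
        double sum = 0.0;
        int count = 0;
        for (double[] row : rows) {
            if (!Double.isNaN(row[col])) {
                sum += row[col];
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    // Per-row fill: depends only on this row and the precomputed value.
    static double[] fillMissing(double[] row, int col, double value) {
        double[] out = row.clone();
        if (Double.isNaN(out[col])) {
            out[col] = value;
        }
        return out;
    }

    public static void main(String[] args) {
        List<double[]> rows = new ArrayList<>();
        rows.add(new double[] {1.0, 10.0});
        rows.add(new double[] {Double.NaN, 20.0});
        rows.add(new double[] {3.0, 30.0});
        double mean = columnMean(rows, 0);
        for (double[] row : rows) {
            System.out.println(Arrays.toString(fillMissing(row, 0, mean)));
        }
    }
}
```

 Because fillMissing never looks at neighbouring rows, the order in which rows are processed does not affect the result.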

 Thanks,
 Supun

 On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Sorry for the late reply. Have you got the details you were looking for?

 It would be great if I could get to know which wrangler operations are
 important for a user of the ML


 Other than the ones you have mentioned in the proposal, I think it's better
 to have a Translate operation as well (to create a new column based on
 an existing column).

 Thanks,
 Supun



 On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 I am currently working on generating spark transformations related to
 the operations available in the data wrangler.

 Data wrangler provides sufficient parameters to re-create these in
 Spark. I have successfully implemented the delete and split operations of
 Wrangler in Spark.

 Once this phase is completed, I can either generate these scripts directly
 in Wrangler or take its JavaScript output and convert it to Spark,
 depending on the implementation.

 Somehow there are issues in implementing certain wrangler functions due
 to limitations in JavaRDD used in spark

 e.g. -
 Fill operation - when filling with values from rows above and below
 Fold operation

 It would be great if I could get to know which wrangler operations are
 important for a user of the ML

 Thanks,
 Danula

 On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 Please send an update of your work thus far.

 On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Welcome to GSoC '15! Can you do some research on directly generating
 Spark transformations using Wrangler and come up with a summary?

 On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 Thank you for selecting my proposal [1] for GSoC 2015. I am really
 looking forward to working with you all and contributing to WSO2.

 I have already completed my primary research on wrangler and would
 like to meet you to get feedback on the proposed architecture. I am
 planning to start working on the project before 25th of May.

 Thank you,
 Danula

 [1] -
 https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing




 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/











 --
 *Supun Sethunga*
 Software Engineer
 WSO2, Inc.
 http://wso2.com/
 lean | enterprise | middleware
 Mobile : +94 716546324





___
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev


Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-23 Thread Danula Eranjith
Hi all,

I have completed implementing the wrangler operations as Spark
transformations.

I am currently working on linking these operations with wrangler.

Thanks,
Danula

On Wed, Jun 17, 2015 at 10:25 AM, Nirmal Fernando nir...@wso2.com wrote:

 Danula,

 Can you please send an update on the status of the project?

 On Tue, Jun 16, 2015 at 9:44 AM, Supun Sethunga sup...@wso2.com wrote:

 However, there are issues in implementing certain wrangler functions
 due to limitations of the JavaRDD used in Spark.
 e.g. -
 Fill operation - when filling with values from rows above and below
 Fold operation


 Agreed. Since Spark processes rows in no guaranteed order, inter-row
 operations are not very meaningful.
 But you can slightly modify the implementation of the Fill operation,
 for example to fill values based on an expression, a static value, or
 the mean (i.e. not depending on other rows).
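
 A minimal sketch of the row-independent Fill suggested above, in plain
 Java without any Spark dependency. The class and method names
 (FillSketch, columnMean, fillRow, fillColumn) are hypothetical and not
 taken from the project; the assumption is that the fill value (here the
 column mean) is computed once up front, so that each row can then be
 filled without looking at its neighbours.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: fill missing values in one column with a value that
// is computed up front, so every row can be processed independently.
class FillSketch {

    // Mean of the non-missing values in column `col` (null marks a missing value).
    static double columnMean(List<Double[]> rows, int col) {
        double sum = 0.0;
        int count = 0;
        for (Double[] row : rows) {
            if (row[col] != null) {
                sum += row[col];
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }

    // Row-local fill: uses only the row itself and the precomputed fill value.
    static Double[] fillRow(Double[] row, int col, double fillValue) {
        Double[] out = Arrays.copyOf(row, row.length);
        if (out[col] == null) {
            out[col] = fillValue;
        }
        return out;
    }

    // Applies fillRow to every row; the input rows are left unchanged.
    static List<Double[]> fillColumn(List<Double[]> rows, int col, double fillValue) {
        List<Double[]> result = new ArrayList<>();
        for (Double[] row : rows) {
            result.add(fillRow(row, col, fillValue));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double[]> rows = Arrays.asList(
                new Double[] {1.0, 10.0},
                new Double[] {2.0, null},
                new Double[] {3.0, 30.0});
        double mean = columnMean(rows, 1);
        for (Double[] row : fillColumn(rows, 1, mean)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

 Because fillRow depends only on its own row, the order in which rows
 are processed does not matter, which is the property the modified Fill
 needs.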

 Thanks,
 Supun

 On Tue, Jun 16, 2015 at 9:27 AM, Supun Sethunga sup...@wso2.com wrote:

 Hi Danula,

 Sorry for the late reply. Have you got the details you were looking for?

 It would be great if I could get to know which wrangler operations are
 important for a user of the ML


 Other than the ones you have mentioned in the proposal, I think it's
 better to have a Translate operation as well (to create a new column
 based on an existing column).

 Thanks,
 Supun



 On Thu, Jun 4, 2015 at 10:11 PM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 I am currently working on generating Spark transformations for the
 operations available in the data wrangler.

 Data wrangler provides sufficient parameters to re-create these in
 Spark. I have successfully implemented the delete and split operations
 of wrangler in Spark.

 Once this phase is completed, I can either generate these scripts
 directly in wrangler or take the JavaScript output and convert it to
 Spark, depending on the implementation.

 However, there are issues in implementing certain wrangler functions
 due to limitations of the JavaRDD used in Spark.

 e.g. -
 Fill operation - when filling with values from rows above and below
 Fold operation

 It would be great if I could get to know which wrangler operations are
 important for a user of ML.

 Thanks,
 Danula

 On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Please send an update of your work thus far.

 On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando nir...@wso2.com
 wrote:

 Hi Danula,

 Welcome to GSoC '15! Can you do some research on directly generating
 Spark transformations using Wrangler and come up with a summary?

 On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith hmdanu...@gmail.com
  wrote:

 Hi all,

 Thank you for selecting my proposal [1] for GSoC 2015. I am really
 looking forward to working with you all and contributing to WSO2.

 I have already completed my primary research on wrangler and would
 like to meet you to get feedback on the proposed architecture. I am
 planning to start working on the project before 25th of May.

 Thank you,
 Danula

 [1] -
 https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing




 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/











 --
 *Supun Sethunga*
 Software Engineer
 WSO2, Inc.
 http://wso2.com/
 lean | enterprise | middleware
 Mobile : +94 716546324








 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/





Re: [Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-06-04 Thread Danula Eranjith
Hi all,

I am currently working on generating Spark transformations for the
operations available in the data wrangler.

Data wrangler provides sufficient parameters to re-create these in Spark. I
have successfully implemented the delete and split operations of wrangler in
Spark.

Once this phase is completed, I can either generate these scripts directly
in wrangler or take the JavaScript output and convert it to Spark, depending
on the implementation.

However, there are issues in implementing certain wrangler functions due to
limitations of the JavaRDD used in Spark.

e.g. -
Fill operation - when filling with values from rows above and below
Fold operation
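
To illustrate why delete and split map cleanly onto Spark while a fill from
the rows above or below does not, here is a minimal sketch in plain Java.
It is not the project's code: the names (RowLocalOpsSketch, splitColumn,
keepRow) are hypothetical, and it assumes that split breaks one column in
two at the first delimiter and that delete drops rows whose chosen column is
empty. Both functions see a single row at a time, which is the shape that
JavaRDD.map and JavaRDD.filter expect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of two row-local wrangler-style operations.
class RowLocalOpsSketch {

    // Split: replaces column `col` with the two parts around the first
    // occurrence of `delimiter`. If the delimiter is absent, the second
    // part is an empty string.
    static String[] splitColumn(String[] row, int col, String delimiter) {
        String value = row[col];
        int idx = value.indexOf(delimiter);
        String left = idx < 0 ? value : value.substring(0, idx);
        String right = idx < 0 ? "" : value.substring(idx + delimiter.length());
        List<String> out = new ArrayList<>(Arrays.asList(row));
        out.set(col, left);
        out.add(col + 1, right);
        return out.toArray(new String[0]);
    }

    // Delete: a row is kept only if column `col` is non-null and non-blank.
    static boolean keepRow(String[] row, int col) {
        return row[col] != null && !row[col].trim().isEmpty();
    }

    // Applies delete first, then split, one row at a time.
    static List<String[]> apply(List<String[]> rows, int col, String delimiter) {
        List<String[]> result = new ArrayList<>();
        for (String[] row : rows) {
            if (keepRow(row, col)) {
                result.add(splitColumn(row, col, delimiter));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1", "Colombo, LK"},
                new String[] {"2", ""},
                new String[] {"3", "Kandy, LK"});
        for (String[] row : apply(rows, 1, ", ")) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

A fill that copies the value from the row above has no such per-row form,
because it needs to know which row comes before the current one.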

It would be great if I could get to know which wrangler operations are
important for a user of ML.

Thanks,
Danula

On Wed, Jun 3, 2015 at 8:30 AM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 Please send an update of your work thus far.

 On Sun, May 10, 2015 at 2:30 PM, Nirmal Fernando nir...@wso2.com wrote:

 Hi Danula,

 Welcome to GSoC '15! Can you do some research on directly generating
 Spark transformations using Wrangler and come up with a summary?

 On Fri, May 8, 2015 at 11:03 AM, Danula Eranjith hmdanu...@gmail.com
 wrote:

 Hi all,

 Thank you for selecting my proposal [1] for GSoC 2015. I am really
 looking forward to working with you all and contributing to WSO2.

 I have already completed my primary research on wrangler and would like
 to meet you to get feedback on the proposed architecture. I am planning to
 start working on the project before 25th of May.

 Thank you,
 Danula

 [1] -
 https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing




 --

 Thanks & regards,
 Nirmal

 Associate Technical Lead - Data Technologies Team, WSO2 Inc.
 Mobile: +94715779733
 Blog: http://nirmalfdo.blogspot.com/










[Dev] [GSoC-2015] Data Wrangler extension for WSO2 Machine Learner

2015-05-07 Thread Danula Eranjith
Hi all,

Thank you for selecting my proposal [1] for GSoC 2015. I am really looking
forward to working with you all and contributing to WSO2.

I have already completed my primary research on wrangler and would like to
meet you to get feedback on the proposed architecture. I am planning to
start working on the project before 25th of May.

Thank you,
Danula

[1] -
https://docs.google.com/document/d/18NFa23CrhXqnHrkl_AuRz3sQ3Axg7SEmiA7l66Hl9_0/edit?usp=sharing


[Dev] [GSoC-2015] Proposal 17 : Data Wrangler extension for WSO2 Machine Learner

2015-03-02 Thread Danula Eranjith
Hi,

I am Danula Eranjith, an undergraduate from the Department of Computer
Science and Engineering, University of Moratuwa.

I am interested in proposal [1] and look forward to contributing. I became
familiar with WSO2 ML while working on my training project at WSO2.

Please let me know if we could have a discussion on $subject.

[1] -
https://docs.wso2.com/display/GSoC/Project+Proposals+for+2015#ProjectProposalsfor2015-Proposal17:DataWranglerextensionforWSO2MachineLearner

Thanks,
-- 
*Danula Eranjith*
Software Engineering Intern
WSO2, Inc.
lean.enterprise.middleware

Mobile: +94719425232
LinkedIn http://lk.linkedin.com/in/danula | Twitter
https://twitter.com/danulaera


[Dev] Custom OutputAttributeAggregator for CEP

2015-02-15 Thread Danula Eranjith
Hi,

I am trying to implement a custom OutputAttributeAggregator for CEP. I have
already created the extension, added the jar files to
CEP_HOME/repository/components/lib, and updated the siddhi.extension file
as described in [1].

However, I keep getting the following error when creating the execution
plan:

*Exception: Invalid query specified, No extension exist for
OutputAttributeExtension{extensionName='bm', functionName='getProduct',
rename='product'} *

I have specified the namespace and function name accordingly in the Siddhi
annotation.

Any idea on this?

[1] -
https://docs.wso2.com/display/CEP300/Writing+a+Custom+OutputAttributeAggregator

Thanks,
-- 
*Danula Eranjith*
Software Engineering Intern
WSO2, Inc.
lean.enterprise.middleware

Mobile: +94719425232
LinkedIn http://lk.linkedin.com/in/danula | Twitter
https://twitter.com/danulaera


Re: [Dev] Custom OutputAttributeAggregator for CEP

2015-02-15 Thread Danula Eranjith
Hi,
Thanks a lot, solved the issue.

@Mohan
Yes, the issue was having the same package name in two extension
jars.

@Farasath
Yes, I have added that.

Thanks,

On Mon, Feb 16, 2015 at 11:06 AM, Mohanadarshan Vivekanandalingam 
mo...@wso2.com wrote:

 Follow the steps below to identify the issue.

 1) Are there multiple extension jars? Do any of them share the same
 package name? If so, that will not work in an OSGi environment: some of
 the jars with the shared package name will not be picked up by the OSGi
 class loader. Each jar needs a unique package name.

 2) Check how many OSGi bundles exist in the dropins folder. When you put
 a jar into the lib directory, it is converted into an OSGi bundle and
 copied to the dropins folder. So if you delete a jar from lib, you also
 need to delete it from dropins.
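
 The check in step 1 could be automated roughly as follows. This is a
 sketch under assumptions, not a WSO2 or Siddhi tool: the class and
 method names are hypothetical, and the input is a map from jar file
 name to the entry names inside that jar (which could be collected with
 java.util.jar.JarFile.entries(), for instance). It reports every
 package that contains classes in more than one jar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: find packages that appear in more than one jar.
class DuplicatePackageSketch {

    // Package of a class entry, e.g. "org/example/bm/Foo.class" -> "org.example.bm".
    static String packageOf(String classEntry) {
        int slash = classEntry.lastIndexOf('/');
        return slash < 0 ? "" : classEntry.substring(0, slash).replace('/', '.');
    }

    // Maps each package found in two or more jars to the jars that contain it.
    static Map<String, Set<String>> sharedPackages(Map<String, List<String>> entriesByJar) {
        Map<String, Set<String>> jarsByPackage = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : entriesByJar.entrySet()) {
            for (String entry : jar.getValue()) {
                // Only class files count; manifests and resources are ignored.
                if (entry.endsWith(".class")) {
                    jarsByPackage
                            .computeIfAbsent(packageOf(entry), k -> new TreeSet<>())
                            .add(jar.getKey());
                }
            }
        }
        // Keep only the packages that occur in more than one jar.
        jarsByPackage.values().removeIf(jars -> jars.size() < 2);
        return jarsByPackage;
    }

    public static void main(String[] args) {
        Map<String, List<String>> jars = new HashMap<>();
        jars.put("ext-a.jar", Arrays.asList("org/example/bm/Product.class"));
        jars.put("ext-b.jar", Arrays.asList("org/example/bm/Sum.class"));
        jars.put("ext-c.jar", new ArrayList<>(Arrays.asList("org/example/other/X.class")));
        System.out.println(sharedPackages(jars));
    }
}
```

 Any package listed in the result is a candidate for the conflict
 described in step 1 and would need to be renamed in one of the jars.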

 Thanks,
 Mohan


 On Mon, Feb 16, 2015 at 10:57 AM, Danula Eranjith danu...@wso2.com
 wrote:

 Hi,

 I am trying to implement a custom OutputAttributeAggregator for CEP. I
 have already created the extension, added the jar files to
 CEP_HOME/repository/components/lib, and updated the siddhi.extension file
 as described in [1].

 However, I keep getting the following error when creating the execution
 plan:

 *Exception: Invalid query specified, No extension exist for
 OutputAttributeExtension{extensionName='bm', functionName='getProduct',
 rename='product'} *

 I have specified the namespace and function name accordingly in the Siddhi
 annotation.

 Any idea on this?

 [1] -
 https://docs.wso2.com/display/CEP300/Writing+a+Custom+OutputAttributeAggregator

 Thanks,
 --
 *Danula Eranjith*
 Software Engineering Intern
 WSO2, Inc.
 lean.enterprise.middleware

 Mobile: +94719425232
 LinkedIn http://lk.linkedin.com/in/danula | Twitter
 https://twitter.com/danulaera





 --
 *V. Mohanadarshan*
 *Software Engineer,*
 *Data Technologies Team,*
 *WSO2, Inc. http://wso2.com*
 *lean.enterprise.middleware.*

 email: mo...@wso2.com
 phone:(+94) 771117673




-- 
*Danula Eranjith*
Software Engineering Intern
WSO2, Inc.
lean.enterprise.middleware

Mobile: +94719425232
LinkedIn http://lk.linkedin.com/in/danula | Twitter
https://twitter.com/danulaera