Re: [DISCUSS] R-Interface to SystemML

2017-09-22 Thread Deron Eriksson
Hi Brendan,

Thank you for the detailed description. At a high level that sounds
feasible. Also, offering to help maintain the R codebase is extremely
helpful. Please let us know if you have any questions so that we can assist
you and Alok in your efforts, since as I said I think an R interface to
SystemML makes a lot of sense.

Deron


On Thu, Sep 21, 2017 at 4:36 PM, Brendan Dwyer <brendan.dw...@ibm.com>
wrote:

> Sorry for not responding sooner. I had some issues with my email client.
>
>
>
> I will do my best to address as many of the points that have been raised
> as I can. Hopefully Alok will be able to jump in as well once he resolves
> his email issues.
>
>
>
> - I would be happy to help maintain R4ML in SystemML and I’m sure Alok
> would too.
>
> - R4ML does allow arbitrary DML scripts to be executed via the
> `sysml.execute()` function.
>
> - I think we would like to merge the entire R4ML github repository into
> SystemML. We could do this the same way SparkR was merged into Spark (
> https://github.com/apache/spark/tree/master/R)
>
> - Currently the code is not ready to be merged into SystemML because we
> are still on the old ML context. We have a PR in the works that will update
> to the newest ML context. Once that happens we won’t need to duplicate the
> DML scripts.
>
> - Documentation is generated automatically with the R package “roxygen”.
> We would need to discuss how to incorporate this into the SystemML
> documentation. Perhaps we could look to Spark/SparkR for ideas.
>
> - Tests are done using the R testthat package. I can work with Alan to get
> that integrated into the SystemML Jenkins server.

Re: [DISCUSS] R-Interface to SystemML

2017-09-21 Thread Matthias Boehm
I pretty much agree with Niketan and Deron. In general, it would be useful
to provide an R API as well. However, I'm a bit concerned for two reasons:

* Looking over the github repo, apparently R4ML is not under active
development/maintenance anymore (last commit Jul 20). So who would be
willing to maintain and extend it?

* Providing wrappers for our algorithm scripts would be just a start
because it hides our core value proposition of custom large-scale ML.
Hence, we would also need an MLContext equivalent that allows executing
arbitrary DML scripts or R functions. Is there already a tentative design
of such an API and if not, who would like to take it over?

Regards,
Matthias


On Thu, Sep 21, 2017 at 3:43 PM, Deron Eriksson <deroneriks...@gmail.com>
wrote:

> I agree with Niketan. An R interface definitely makes sense for SystemML.
> DML itself is based on R, so it's surprising we have Java/Scala/Python
> interfaces to SystemML but we don't have an R interface.
>
> Perhaps R4ML committers could supply a little more info? For instance:
> 1) Would they like to merge R4ML code into the main SystemML project
> itself? (Currently we have no modules.)
> 2) What would they like to merge?
> 3) If so, how do they propose to do so?
> 4) Who will do the majority of the work to add R4ML code to SystemML? Or
> who would like to volunteer to do this?
> 5) Who will maintain the contributed code? Or who would like to volunteer
> to do this?
> 6) Documentation is needed (fit in SystemML documentation framework).
> 7) Testing is needed (fit into SystemML testing framework).
> 8) How is this packaged?
>
> From a technology standpoint, I think an R interface totally makes sense.
> As for a minor criticism (which I apply to other parts of SystemML too), I
> see script wrappers at https://github.com/SparkTC/r4ml/tree/master/R4ML/R.
> This tightly binds the existing DML scripts to R, which means DML
> input/output modifications could potentially require modifications to R
> code.
>
> Deron
>
>
>
> On Thu, Sep 21, 2017 at 11:00 AM, Niketan Pansare <npan...@us.ibm.com>
> wrote:
>
> > Janardhan: I believe this is the R4ML repo:
> > https://github.com/SparkTC/r4ml . Arvind: please correct me if I am wrong.
> >
> > Overall, having an R interface for SystemML is an awesome idea. Since I am
> > not an R4ML expert, maybe R4ML committers can comment on how they envision
> > "two code streams to work together".
> >
> > Also, comparing the features of R4ML with that of our Python APIs will be
> > useful as it might make a stronger case for R4ML.
> >
> > As an FYI, here are different ways Python users can use SystemML:
> > - Using MLContext to invoke DML script
> >   (http://apache.github.io/systemml/beginners-guide-python#invoking-dmlpydml-scripts-using-mlcontext
> >   and http://apache.github.io/systemml/spark-mlcontext-programming-guide.html)
> > - Python algorithms wrappers
> >   (http://apache.github.io/systemml/beginners-guide-python#invoke-systemmls-algorithms)
> > - (not important for R4ML discussion): Python DSL
> >   (http://apache.github.io/systemml/beginners-guide-python#matrix-operations)
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> > From: Janardhan <j...@protonmail.com>
> > To: Arvind Surve <ac...@yahoo.com.INVALID>, "dev@systemml.apache.org" <
> > dev@systemml.apache.org>
> > Date: 09/21/2017 04:44 AM
> > Subject: Re: [DISCUSS] R-Interface to SystemML
> > --
> >
> >
> >
> > Hi Arvind,
> >
> > This is a great idea. One question: does R4ML generate any plan like
> > SystemML does with `DML`, or do we leverage this feature by providing some
> > interface? Is "community effort" meant in the sense of collaborative
> > algorithm implementation?
> >
> > Is this the Spark-R repo ( https://github.com/rstudio/sparklyr ) ?
> >
> > Thanks,
> > Jana

Re: [DISCUSS] R-Interface to SystemML

2017-09-21 Thread Deron Eriksson
I agree with Niketan. An R interface definitely makes sense for SystemML.
DML itself is based on R, so it's surprising we have Java/Scala/Python
interfaces to SystemML but we don't have an R interface.

Perhaps R4ML committers could supply a little more info? For instance:
1) Would they like to merge R4ML code into the main SystemML project
itself? (Currently we have no modules.)
2) What would they like to merge?
3) If so, how do they propose to do so?
4) Who will do the majority of the work to add R4ML code to SystemML? Or
who would like to volunteer to do this?
5) Who will maintain the contributed code? Or who would like to volunteer
to do this?
6) Documentation is needed (fit in SystemML documentation framework).
7) Testing is needed (fit into SystemML testing framework).
8) How is this packaged?

From a technology standpoint, I think an R interface totally makes sense.
As for a minor criticism (which I apply to other parts of SystemML too), I
see script wrappers at https://github.com/SparkTC/r4ml/tree/master/R4ML/R.
This tightly binds the existing DML scripts to R, which means DML
input/output modifications could potentially require modifications to R
code.

Deron



On Thu, Sep 21, 2017 at 11:00 AM, Niketan Pansare <npan...@us.ibm.com>
wrote:

> Janardhan: I believe this is the R4ML repo:
> https://github.com/SparkTC/r4ml . Arvind: please correct me if I am wrong.
>
> Overall, having an R interface for SystemML is an awesome idea. Since I am
> not an R4ML expert, maybe R4ML committers can comment on how they envision
> "two code streams to work together".
>
> Also, comparing the features of R4ML with that of our Python APIs will be
> useful as it might make a stronger case for R4ML.
>
> As an FYI, here are different ways Python users can use SystemML:
> - Using MLContext to invoke DML script
>   (http://apache.github.io/systemml/beginners-guide-python#invoking-dmlpydml-scripts-using-mlcontext
>   and http://apache.github.io/systemml/spark-mlcontext-programming-guide.html)
> - Python algorithms wrappers
>   (http://apache.github.io/systemml/beginners-guide-python#invoke-systemmls-algorithms)
> - (not important for R4ML discussion): Python DSL
>   (http://apache.github.io/systemml/beginners-guide-python#matrix-operations)
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> From: Janardhan <j...@protonmail.com>
> To: Arvind Surve <ac...@yahoo.com.INVALID>, "dev@systemml.apache.org" <
> dev@systemml.apache.org>
> Date: 09/21/2017 04:44 AM
> Subject: Re: [DISCUSS] R-Interface to SystemML
> --
>
>
>
> Hi Arvind,
>
> This is a great idea. One question: does R4ML generate any plan like
> SystemML does with `DML`, or do we leverage this feature by providing some
> interface? Is "community effort" meant in the sense of collaborative
> algorithm implementation?
>
> Is this the Spark-R repo ( https://github.com/rstudio/sparklyr ) ?
>
> Thanks,
> Janardhan
>
> Sent with [ProtonMail](https://protonmail.com) Secure Email.
>
> >  Original Message 
> > Subject: [DISCUSS] R-Interface to SystemML
> > Local Time: September 20, 2017 12:50 PM
> > UTC Time: September 20, 2017 4:50 PM
> > From: ac...@yahoo.com.INVALID
> > To: dev@systemml.apache.org <dev@systemml.apache.org>
> >
> > Hi,
> > R4ML is an open source project which provides an R interface to
> > SystemML. It's a bridge between SystemML and Spark-R.
> > Let's discuss here if and how we can get two code streams to work together
> > to benefit development/community effort.
> >
> > Arvind Surve | Spark Technology Center | http://www.spark.tc/
>
>
>