As I understand it, the current pipeline is the following:

{ PyDML, DML } --> ANTLR AST (org.apache.sysml.parser.dml, org.apache.sysml.parser.pydml) --> Legacy AST (DMLProgram, Expression, ForStatement…) --> HOPS --> LOPS --> Runtime
Niketan’s embedded Python DSL --> PyDML
Felix’s embedded Scala DSL --> DML

@Niketan, when you say “IR should be at abstraction to allow Python/R DSL to be a thin layer”, do you mean something different from what is already implemented?

> On Sep 28, 2016, at 12:37 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
>
> Hi Fred,
>
> I would consider DMLProgram as an internal AST, which could be created by IR
> (or IR could just create DML). According to me, IR should be at abstraction
> to allow Python/R DSL to be a thin layer. This would maximize code reuse and
> minimize bugs between DSLs. Something that Felix suggested (i.e. Matrix
> class) would work best.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> Frederick R Reiss---09/28/2016 12:02:01 PM---Maybe I'm missing a subtle point
> here, but why not refactor the existing class org.apache.sysml.pars
>
> From: Frederick R Reiss/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 09/28/2016 12:02 PM
> Subject: Re: Proof of Concept: Embedded Scala DSL
>
> Maybe I'm missing a subtle point here, but why not refactor the existing
> class org.apache.sysml.parser.DMLProgram into our common internal
> representation across DSLs? This class is already sufficiently expressive to
> represent any DML or PyDML program.
>
> Fred
>
> Niketan Pansare---09/28/2016 11:20:11 AM---Thanks Felix for the response. +1
>
> From: Niketan Pansare/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 09/28/2016 11:20 AM
> Subject: Re: Proof of Concept: Embedded Scala DSL
>
> Thanks Felix for the response.
>
> +1
>
>> For the future design I will probably make the Matrix and Vector classes
>> abstract which allows for different concrete implementations.
>> We could then have one that is backed directly by SystemML and works
>> similarly to the Python DSL in that it just uses mock operators and builds
>> the DML string that is then executed using SystemML. That way the deep
>> embedding would reuse the shallow embedding and we could offer the user to
>> either use the lazy MatrixType on the REPL or write code inside the macro.
>
> Also, I agree that we can postpone the IR and integration of different DSLs
> until the work on parallelize is completed.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> fschueler---09/28/2016 10:54:37 AM---Hi Niketan, thanks for your suggestions!
> I thought about it a bit and here are my
>
> From: fschue...@posteo.de
> To: dev@systemml.incubator.apache.org
> Date: 09/28/2016 10:54 AM
> Subject: Re: Proof of Concept: Embedded Scala DSL
>
> Hi Niketan,
>
> thanks for your suggestions! I thought about it a bit and here are my
> ideas on it:
>
> The IR you are describing is basically already my user-facing API. I am
> not sure how much sense it makes to have an IR that looks exactly like
> the API but with control structures renamed. A common IR for all DSLs
> definitely makes sense in general but I am not sure if it should be part
> of one particular DSL. For maintainability it might be better to have
> that IR somewhere on the SystemML side.
>
> Apart from that and to what Matthias suggested, I thought about how to
> make the DSL more suitable for using on the REPL and I think we can find
> a good compromise. Currently my API is backed by breeze for rapid
> prototyping where breeze just forces evaluation of every statement. For
> the future design I will probably make the Matrix and Vector classes
> abstract which allows for different concrete implementations.
> We could then have one that is backed directly by SystemML and works
> similarly to the Python DSL in that it just uses mock operators and builds
> the DML string that is then executed using SystemML. That way the deep
> embedding would reuse the shallow embedding and we could offer the user to
> either use the lazy MatrixType on the REPL or write code inside the macro.
>
> I haven't started playing around with this idea but let me know what you
> think of it. The lazy, shallow DSL would basically do what you would
> want from a separate IR, but I don't know if you want to call that from
> the Python DSL.
>
> Felix
>
> On 24.09.2016 19:39, Niketan Pansare wrote:
> > Hi Felix,
> >
> > Thanks for the summary. The document is extremely useful. I
> > particularly like the idea of parallelizing the code with the 'breeze'
> > library. I would like to pitch in a few ideas which would enable your
> > code to be reused by other DSLs:
> > 1. Scala DSL/parallelize macro remains the same as described in your
> > documentation, but instead of generating DML directly, we call an
> > intermediate representation (IR). This IR then generates DML (instead
> > of generating DML directly by parallelize). This IR will then be
> > reused by Python DSL and R DSL.
> > 2. As an example, IR could be a lazy Matrix class (which would be part
> > of SystemML). It could have awkward syntax/mechanism for pushing down
> > control structures, for example: beginWhile and endWhile. Since IR will
> > not be exposed to the end-user, it should be fine.
> >
> > Example:
> > https://github.com/apache/incubator-systemml/blob/master/src/main/python/systemml/defmatrix.py#L537 [1]
> > will call IR's add() method. At the end of parallelize or when the
> > user wants the result (i.e. eval() ), IR could generate DML code and
> > execute it.
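
To make the lazy Matrix IR idea above a bit more concrete, here is a minimal Python sketch of how mock operators and begin/end markers could record DML text instead of computing anything. It is only an illustration under assumptions: `IRBuilder`, `LazyMatrix`, `begin_while`, `end_while`, `assign` and `to_dml` are hypothetical names, not part of SystemML or of the existing Python DSL, and actually executing the generated script (for example through MLContext) is left out.

```python
import itertools


class IRBuilder:
    """Hypothetical IR: records DML statements instead of running them."""

    def __init__(self):
        self._lines = []
        self._depth = 0                # current block nesting level
        self._ids = itertools.count()  # source of fresh temporary names

    def fresh(self):
        return f"t{next(self._ids)}"

    def emit(self, stmt):
        self._lines.append("  " * self._depth + stmt)

    def begin_while(self, cond):
        # Counterpart of the "beginWhile" marker mentioned in the thread.
        self.emit(f"while ({cond}) {{")
        self._depth += 1

    def end_while(self):
        if self._depth == 0:
            raise ValueError("end_while without begin_while")
        self._depth -= 1
        self.emit("}")

    def to_dml(self):
        if self._depth != 0:
            raise ValueError("unclosed while block")
        return "\n".join(self._lines)


class LazyMatrix:
    """Mock matrix: operators append DML text, nothing is computed."""

    def __init__(self, ir, name):
        self.ir = ir
        self.name = name

    def _binary(self, op, other):
        rhs = other.name if isinstance(other, LazyMatrix) else repr(other)
        out = LazyMatrix(self.ir, self.ir.fresh())
        self.ir.emit(f"{out.name} = {self.name} {op} {rhs}")
        return out

    def __add__(self, other):
        return self._binary("+", other)

    def dot(self, other):
        return self._binary("%*%", other)  # DML matrix multiplication

    def assign(self, other):
        # Overwrite this variable; needed for loop-carried state.
        self.ir.emit(f"{self.name} = {other.name}")


# Usage: a front end (Scala, Python or R DSL) would drive the IR like this.
ir = IRBuilder()
ir.emit("X = rand(rows=3, cols=3)")
ir.emit("i = 0")
X = LazyMatrix(ir, "X")
ir.begin_while("i < 10")
X.assign(X.dot(X) + 1)
ir.emit("i = i + 1")
ir.end_while()
script = ir.to_dml()
print(script)
```

In this sketch each DSL front end would only translate its own syntax into calls on the shared builder, which is roughly the "thin layer" argued for in the thread; whether such a class should live on the SystemML side is exactly the open question being discussed.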
> >
> > Again, this is just a proposal and I am fine dropping the idea of
> > integrating different DSLs if it makes the implementation of Scala DSL
> > complicated. Also, please feel free to correct me if I am missing
> > anything.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar [2]
> >
> > Matthias Boehm---09/24/2016 01:11:36 AM---thanks for sharing the
> > summary - this is very nice. While looking over the example, I had the
> > follow
> >
> > From: Matthias Boehm/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 09/24/2016 01:11 AM
> > Subject: Re: Proof of Concept: Embedded Scala DSL
> >
> > -------------------------
> >
> > thanks for sharing the summary - this is very nice. While looking over
> > the example, I had the following questions:
> >
> > 1) Output handling: It would be great to see an example of how the
> > results of Algorithm.execute() are consumed. Do you intend to hand out
> > our binary matrix representation or MLContext's Matrix from which the
> > user then requests specific output formats? Also if there are multiple
> > Algorithm instances, how is the MLContext (with its internal state of
> > lazily evaluated intermediates) reused?
> >
> > 2) Scala-breeze prototyping: How do you intend to support operations
> > that are not supported in breeze? Examples are removeEmpty, table,
> > aggregate, rowIndexMax, quantile/centralmoment, cummin/cummax, and DNN
> > operations.
> >
> > 3) Frame data type and operations: Do you also intend to add a frame
> > type and its operations? I think for this initial prototype it is not
> > necessarily required but please make the scope explicit.
> >
> > Regards,
> > Matthias
> >
> > fschueler---09/23/2016 04:36:14 PM---As discussed in the related Jira
> > (SYSTEMML-451) I have started to implement a prototype/proof of co
> >
> > From: fschue...@posteo.de
> > To: dev@systemml.incubator.apache.org
> > Date: 09/23/2016 04:36 PM
> > Subject: Proof of Concept: Embedded Scala DSL
> >
> > -------------------------
> >
> > As discussed in the related Jira (SYSTEMML-451), I have started to
> > implement a prototype/proof of concept for an embedded DSL in Scala.
> >
> > I have summarized the current approach in a short document that you can
> > find on GitHub together with the code:
> > https://github.com/fschueler/emma/blob/sysml-dsl/emma-sysml-dsl/README.md [3]
> >
> > Please note that current development happens in the Emma project but
> > will move to an independent module in the SystemML project once the
> > necessary additions to Emma are merged. By having the DSL in a separate
> > module, we can include Scala and Emma dependencies only for the users
> > that actually want to use the Scala DSL.
> >
> > The current code serves as a proof of concept to discuss further
> > development with the SystemML community. I especially welcome input from
> > SystemML Scala users on the usability of the API design.
> >
> > Next steps will include the translation from Scala code to DML with
> > support of all features currently supported in DML, including control
> > flow structures.
> > Also, a coherent way of executing the generated scripts from Scala and
> > the interaction with outside data formats (such as Spark DataFrames)
> > will be integrated.
> >
> > I am happy to answer your questions and discuss the described approach
> > here!
> >
> > Felix
> >
> > Links:
> > ------
> > [1] https://github.com/apache/incubator-systemml/blob/master/src/main/python/systemml/defmatrix.py#L537
> > [2] http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> > [3] https://github.com/fschueler/emma/blob/sysml-dsl/emma-sysml-dsl/README.md