I was thinking of RTL as a target (after optimizations) rather than anything else. Also, my experience with APL compilers involved creating routines for use within the APL environment. I know it's popular to build compilers that are invoked from the OS command line, but what I'd like to do is focus attention on the small fraction of interpreted code which eats the bulk of the time.
That said, targeting SAC sounds like it might be far less work than targeting RTL. But I am somewhat worried about portability and deployment of SAC (RTL is a bit bulky, but it has a compatible copyright, which is a good thing, and it allows support for many kinds of computing devices). Generally speaking, copyright issues get mismanaged. To really take advantage of computing resources you need to scale, and you need to tie your resources back to something else that people value (and there are a lot of such things). There are other ways of making money, of course, but that's a topic for another discussion.

Thanks,

--
Raul

On Sun, Mar 2, 2014 at 5:43 PM, Robert Bernecky <[email protected]> wrote:

> and more
>
> -------- Original Message --------
> Subject: Re: [Jchat] J subset compilation
> Date: Fri, 28 Feb 2014 15:08:03 -0500
> From: Robert Bernecky <[email protected]>
> To: [email protected]
>
> Again, RTL is far too low level to be a good basis for writing
> array-based optimizations, IMO.
>
> Compilation to SAC would be fairly straightforward, and would
> offer a number of other benefits. E.g.:
>
> - SAC has array-based semantics.
>
> - You can define all the scalar functions in terms of the SAC
>   data-parallel "with loop".
>
> - SAC preserves the data-parallel and array natures of APL/J
>   code, and optimizes those structures.
>
> - SAC does the grunt work of emitting serial and parallel
>   back-end C code for single-thread, multi-thread, and
>   CUDA run-time environments.
>
> - SAC handles all of the memory management for you, just
>   like APL/J. It also does some very nice work on optimizing
>   that, as well as optimizing reference counts on same.
>
> These are all wheels that you need not waste your time
> reinventing. Especially in RTL.
>
> I suggest that a more productive approach would be this:
>
> 1. Write a J parser in J. This will end up being quicker than
>    repeatedly hand-carving various toy J examples into your IL
>    for trying out.
>
> 2. Take the parser output and run it through a Static Single
>    Assignment (SSA) transformation. If you do this, MANY things
>    become simpler. If you do not do this, you will spend a lot
>    of time doing analyses of the sort that were being done by
>    computer scientists in the previous millennium. That will
>    make you crazy.
>
> 3. Take the SSA output and run it through a data-flow analyzer
>    (DFA) that establishes, for every verb/derived verb/defined
>    verb, the rank and type of its arguments and results. Shapes,
>    values, and array predicates are, at this stage, frills. If
>    you do not know the rank and type of everything, then you
>    still have an interpreter, and most optimizers won't be able
>    to optimize anything.
>
> 4. Take the DFA output and feed it into a code generator. If you
>    do it right, this is little more than a fancy macro-expander,
>    at least for starters. Here is a trivial example of the code
>    template for the APEX SAC back end for first-axis (J-like)
>    reverse:
>
>    %Fragment rot1 x** x bidc bidc .
>    inline $ZTYPE[+] $FNAME($YTYPE[+] y)
>    { /* First axis reverse on anything */
>      frameshape = take([1],shape(y));
>      cell = genarray(drop([1],shape(y)), $OTFILL);
>      z = with {
>            ( . <= iv <= .)
>            : y[(frameshape-[1])-iv];
>          } : genarray(frameshape, cell);
>      return(z);
>    }
>
> The %Fragment line identifies a new code fragment in the file of
> monadic structural verbs. rot1 is the verb name. The remaining
> items on that line define the argument and result types, and the
> specialization cases for this fragment. This allows the compiler
> to statically generate, for example, linear-time upgrade code
> when the argument is a permutation vector, or a subset thereof.
>
> $YTYPE and $ZTYPE are right-argument and result macro parameters
> that become the SAC type and rank designators, e.g., "int[.,.]"
> for arbitrary-shape rank-2 arrays.
>
> $FNAME becomes the SAC function name; $OTFILL is the fill element
> value, required so that SAC knows the result cell type and shape.
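[Aside: the "fancy macro-expander" of step 4 is essentially parameter substitution into fragment templates. A minimal sketch in Python, assuming a hypothetical driver; the fragment body follows the rot1 example above, and the binding values are illustrative, not the actual APEX machinery:]

```python
# Sketch of the "fancy macro-expander" idea from step 4:
# substitute $-parameters into a code-fragment template, once per
# type/rank specialization.  The fragment body follows the rot1
# example; the driver and bindings here are hypothetical.

ROT1_TEMPLATE = """\
inline $ZTYPE[+] $FNAME($YTYPE[+] y)
{ /* First axis reverse on anything */
  frameshape = take([1],shape(y));
  cell = genarray(drop([1],shape(y)), $OTFILL);
  z = with {
        ( . <= iv <= .)
        : y[(frameshape-[1])-iv];
      } : genarray(frameshape, cell);
  return(z);
}
"""

def expand(template, bindings):
    """Replace each $NAME macro parameter with its bound value."""
    out = template
    for name, value in bindings.items():
        out = out.replace("$" + name, value)
    return out

# One illustrative specialization: integer argument and result, fill of 0.
print(expand(ROT1_TEMPLATE, {
    "FNAME": "rot1_int",
    "YTYPE": "int",
    "ZTYPE": "int",
    "OTFILL": "0",
}))
```

One such expansion is emitted per statically known type/rank combination, which is why step 3's DFA matters: without ranks and types, there is nothing to bind the parameters to.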
>
> Once you have all this in place, adding array-based
> optimizations at the APL/J level can be performed by
> adding new abstract syntax tree traversals after DFA.
> This can be fun, particularly because the SAC array optimizations,
> as lovely as they are, still don't do everything you need.
>
> Bob
>
> On 14-02-28 01:38 PM, Raul Miller wrote:
>
>> I had been thinking gcc rtl - I've built gcc a number of times, and
>> while it can be complicated, most of the complications are irrelevant
>> during the early development of a project. In particular, there are a
>> number of subtleties which matter for cross compilation - where the
>> compiler cannot know what it is compiling for just by looking at
>> itself - which should not matter for the first few drafts of a
>> compiler. Those abstractions can come later.
>>
>> Still, that might mean that llvm is better for us than rtl, simply
>> for its focus on native compilation (as opposed to cross compilation).
>>
>> Of course, a third path involves something like CUDA (which, in its
>> current state of development, can only be supported through cross
>> compilation, and which also needs a certain amount of native support
>> to get anything useful done).
>>
>> I imagine that an LLVM subset would focus on rank zero operations
>> with a small amount of rank 1 and rank _ operations. CUDA would
>> probably focus on rank 1 and rank 2 (I'm not certain whether CUDA
>> could efficiently implement dyadic transpose, or boxed arrays -
>> there are better things to start on).
>>
>> Actually... probably the best place to start with a language subset
>> would be:
>>
>> First draft:
>> rank 0 only
>> + - * dyads, - monad, only one data type (floating point, probably)
>>
>> Second draft:
>> rank 1 arrays and indexing
>>
>> Third draft:
>> whatever you need to implement ;: (mostly this means arbitrarily
>> ranked arrays - basically a pair of rank 1 arrays, and integer
>> division and modulo).
>>
>> Fourth draft:
>> First pass at J parsing (but no names yet, no adverbs yet, no
>> conjunctions yet)
>> http://www.jsoftware.com/help/dictionary/dicte.htm
>>
>> Mostly these passes would be practice. But once you had the parser
>> you could start using the tests which come packaged in the J sources.
>> You would not so much use those in order (and some would never be
>> done successfully - for a J subset you'd identify the fraction of the
>> tests that your subset could handle), but you would use them to help
>> you find things which can be done (relatively) easily and which fit
>> what you already have.
>>
>> Basically, to manage a large body of code, you wind up using errors
>> of some sort (compilation errors, testing errors, or whatever else)
>> to help you focus on your next steps.
>>
>> Thanks,
>
> --
> Robert Bernecky
> Snake Island Research Inc
> 18 Fifth Street
> Ward's Island
> Toronto, Ontario M5J 2B9
>
> [email protected]
> tel: +1 416 203 0854
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
> ----------------------------------------------------------------------
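[Aside: the "first draft" subset above (rank 0 only; dyadic + - *; monadic -; a single floating-point type) is small enough to sketch end to end. The following Python toy evaluator is purely illustrative - not any of the implementations discussed in the thread - and assumes whitespace-separated tokens, J's leading-underscore negative literals, and J's right-to-left evaluation with no operator precedence:]

```python
# Hypothetical sketch of the rank-0 "first draft" subset:
# scalars only, dyads + - *, monadic -, one float type.
# J has no operator precedence; verbs apply right to left.

DYADS = {"+": lambda x, y: x + y,
         "-": lambda x, y: x - y,
         "*": lambda x, y: x * y}

def to_number(tok):
    # J writes negative literals with a leading underscore, e.g. _3.5
    return float("-" + tok[1:]) if tok.startswith("_") else float(tok)

def evaluate(src):
    """Evaluate a flat rank-0 expression right to left."""
    toks = src.split()
    i = len(toks) - 1
    value = to_number(toks[i])          # rightmost token is a noun
    i -= 1
    while i >= 0:
        verb = toks[i]
        # No noun to the left => the verb is applied monadically.
        if i == 0 or toks[i - 1] in DYADS:
            if verb != "-":
                raise ValueError("only monadic - in this subset")
            value = -value
            i -= 1
        else:                           # noun verb value => dyadic
            value = DYADS[verb](to_number(toks[i - 1]), value)
            i -= 2
    return value
```

For example, `evaluate("2 * 3 + 4")` gives 14.0, not 10.0, because a J verb takes everything to its right as its right argument; `evaluate("- 3 + 4")` gives -7.0 for the same reason.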
