[Chicken-users] macro systems and chicken (long)

Alex Shinn Fri, 04 Apr 2008 04:59:24 -0700

There seems to be a lot of confusion in the Chicken
community, and the Lisp community in general, about the
different macro systems, so I thought I'd provide some
background information and discussion of the eggs available
in Chicken and their uses.


--- Background ---

There are two completely orthogonal aspects of macro systems
- whether they are hygienic or unhygienic, and whether they
are low-level or high-level.

Low-level means direct manipulation of sexps to produce
sexps - you're generating code expressions by hand.
High-level means you use some higher abstraction like
templating - the underlying processing may or may not make
use of sexps at all.  Low-level of course offers the most
control.  High-level has nice benefits such as providing a
location in source code for line-number debug info, and
easier analysis by other tools like analysers and editors.
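
As a rough illustration of the low-level style, a transformer
is conceptually just a procedure from one sexp to another.
The sketch below uses a made-up MY-UNLESS form and a plain
procedure EXPAND-MY-UNLESS (both names are invented for this
example and belong to no macro system), so it can be tried at
a REPL without defining a macro at all:

  ;; Hypothetical low-level "expander": takes the whole form
  ;; (my-unless test body ...) as a list and builds the
  ;; replacement code by hand.
  (define (expand-my-unless form)
    (let ((test (cadr form))
          (body (cddr form)))
      `(if ,test #f (begin ,@body))))

  (expand-my-unless '(my-unless (ready? x) (wait) (retry)))
  ;; => (if (ready? x) #f (begin (wait) (retry)))

A high-level system would state the same rewrite as a pattern
and a template, with no explicit CADR or quasiquote.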

Neither of these has anything to do with hygiene.  Hygiene
is a relatively newer concept, so all the old macro systems
were either unhygienic + low-level or unhygienic +
high-level.  defmacro is the former - it's low-level
manipulation of sexps.  The C preprocessor can be thought of
as the weakest, most poorly designed instance of unhygienic
high-level macros.  It's a templating system without any
kind of destructuring, conditionals or polymorphism.  Other
alternatives like the m4 macro preprocessor and pretty much
every assembly preprocessor are more powerful instances of
high-level macro systems.

Anyway, in the Lisp community we had defmacro, and it was
good.  You had to be careful to use gensyms, and never to
shadow or redefine any core procedures anywhere in your
program, but if you stuck to those rules there weren't many
problems.

Then Scheme came along, and nicely unified the CL namespace
mess into a single consistent namespace.  The problem was
this made conflicts much more likely.  It became much more
important to be able to automatically avoid problems,
without burdening the programmer with mentally keeping track
of everything in all lexical scopes.  Thus hygiene was born.
A good description of why hygiene is necessary can be found
at http://community.schemewiki.org/?hygiene-versus-gensym.
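
To make the two kinds of conflict concrete, here is a small
sketch using SYNTAX-RULES, which is hygienic.  MY-OR and the
two DEMO procedures are names invented for this example:

  (define-syntax my-or
    (syntax-rules ()
      ((_ a b)
       (let ((tmp a))
         (if tmp tmp b)))))

  ;; The user's TMP is not captured by the macro's TMP.
  (define (demo-capture)
    (let ((tmp 5))
      (my-or #f tmp)))       ; => 5

  ;; The macro's IF still means the core IF, even though the
  ;; caller has shadowed IF locally.
  (define (demo-shadow)
    (let ((if list))
      (my-or #f 7)))         ; => 7

A naive textual expansion would instead give #f in the first
case (the macro's TMP shadows the user's) and the list
(#f #f 7) in the second (the expansion picks up the caller's
IF).  The first is what gensyms guard against; the second is
what the "never shadow core forms" convention guards against.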

A very brief time-line:

  1986: Kohlbecker - introduced the idea of hygiene, low-level,
          used an O(n^2) coloring algorithm
  1987: Kohlbecker - introduced declare-syntax, high-level,
          the precursor to syntax-rules
  1988: Bawden & Rees - "Syntactic closures," low-level,
          faster than Kohlbecker's algorithm
  1991: Clinger & Rees - Explicit renaming, low-level, based
          on syntactic-closures but also supports syntax-rules
  1992: Dybvig - Syntax-case, primary motivation to remove the
          distinction between low-level and high-level

You can find the papers for these at library.readscheme.org.

--- Using The Low-level Systems ---

I'm assuming everyone is familiar with syntax-rules; if not,
there are good tutorials available elsewhere.  I'm also
going to skip Kohlbecker's original system since it isn't
used anywhere.

The syntactic closures idea is very simple.  Instead of the
macro just being passed the expression to transform, it's
passed the expression plus environment information.  You can
think of it like

  (define-syntax foo
    (lambda (form usage-environment macro-environment)
      ...))

which is indeed how it's implemented, but you never use that
directly, you use one of the transformer abstractions.  The
most basic is sc-macro-transformer ("sc" is for syntactic
closures).  A good discussion can be found at

  http://community.schemewiki.org/?syntactic-closures

or in the MIT Scheme reference manual

  http://www.gnu.org/software/mit-scheme/documentation/mit-scheme-ref/SC-Transformer-Definition.html

but basically the idea is you write macros like

  (define-syntax foo
    (sc-macro-transformer
      (lambda (form usage-environment)
        ...)))

You can then manipulate FORM as a normal sexp just like in
defmacro.  The resulting sexp is then interpreted in the
macro's syntactic environment.  To make parts of FORM refer
to their bindings in the calling environment, you need to
wrap them in syntactic-closures with the USAGE-ENVIRONMENT
parameter.  As an example,

  (define-syntax swap! 
    (sc-macro-transformer 
     (lambda (form env) 
       (let ((a (make-syntactic-closure env '() (cadr form))) 
             (b (make-syntactic-closure env '() (caddr form)))) 
         `(let ((value ,a)) 
            (set! ,a ,b) 
            (set! ,b value)))))) 

FORM is the full form (swap! var1 var2), so we're binding A
to var1 and B to var2, in the context of the usage
environment.  The other identifiers in the returned sexp
(LET, VALUE and SET!) all refer to the original macro
environment, so even if they had been locally shadowed in
the usage environment, this will still work.

The second argument to make-syntactic-closure (just '()
above) is used when you want to deliberately break hygiene.
See the other links for details.
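
As a rough sketch of that second argument, assuming an
implementation that provides SC-MACRO-TRANSFORMER and
MAKE-SYNTACTIC-CLOSURE as described in the MIT Scheme manual:
the identifiers in the list are left free in the closed form,
so a binding the macro introduces for them is visible inside
it.  LET1 is a hypothetical macro made up for this example.

  (define-syntax let1
    (sc-macro-transformer
     (lambda (form env)
       (let ((name (cadr form))
             (init (make-syntactic-closure env '() (caddr form)))
             ;; NAME is listed as free, so BODY sees the LET
             ;; binding created below rather than any outer one.
             (body (make-syntactic-closure
                    env (list name) (cadddr form))))
         `(let ((,name ,init))
            ,body)))))

  (let1 x 5 (+ x 1))   ; => 6

Everything else in BODY (here +) is still resolved in the
usage environment.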

The next transformer is rsc-macro-transformer, which is
essentially the reverse - the env parameter is the macro
environment, and bare identifiers in the result are
implicitly handled in the usage environment.

  (define-syntax swap! 
    (rsc-macro-transformer 
     (lambda (form env) 
       (let ((a (cadr form))
             (b (caddr form))
             (value (make-syntactic-closure env '() 'value))
             (let-r (make-syntactic-closure env '() 'let))
             (set!-r (make-syntactic-closure env '() 'set!)))
         `(,let-r ((,value ,a)) 
            (,set!-r ,a ,b) 
            (,set!-r ,b ,value)))))) 

Here A and B are just passed as-is, and the normal Scheme
constructs (LET and SET!) need to explicitly refer to the
macro environment.  It looks a little busier, since most of
what you write in a macro expansion will be new code rather
than rearranged old code.

However, if you look at that, the reason we make VALUE a
syntactic-closure is so that it won't conflict with any
instances of VALUE in A or B.  An alternate way of achieving
the same result would be to use gensym.  Now, if you're
programming by the old defmacro conventions of never
redefining or shadowing core forms and functions like LET
and SET!, they would have the same meaning in both
environments.  So, a safe-only-by-convention way of writing
this is:

  (define-syntax swap! 
    (rsc-macro-transformer 
     (lambda (form env) 
       (let ((a (cadr form))
             (b (caddr form))
             (value (gensym)))
         `(let ((,value ,a)) 
            (set! ,a ,b) 
            (set! ,b ,value)))))) 

But that's exactly the way you write this in defmacro!

People who argue against hygiene saying "you can have
defmacro when you take it from my cold, dead hands" are
simply unaware that you can do *exactly* the same style of
programming with hygiene.  The only extra code above is the
rsc-macro-transformer line.
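
For comparison, here is roughly what the define-macro version
looks like, assuming Chicken's core define-macro form with its
(define-macro (name args ...) body ...) shape:

  (define-macro (swap! a b)
    (let ((value (gensym)))
      `(let ((,value ,a))
         (set! ,a ,b)
         (set! ,b ,value))))

The body is the same LET/GENSYM/quasiquote code.  The only
other difference is that define-macro destructures the
arguments for you, while the transformer receives the whole
FORM and picks it apart with CADR and CADDR.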

  ============================================================
  =  IF YOU FIND HYGIENE CONFUSING, WRITE EVERYTHING WITH    =
  =  RSC-MACRO-TRANSFORMER AS THOUGH IT WERE DEFMACRO.  YOU  =
  =  CAN ADD IN HYGIENE SEAMLESSLY IF AND WHEN ANY PROBLEMS  =
  =  ARISE.                                                  =
  ============================================================

The next transformer is er-macro-transformer, where "er"
stands for "Explicit Renaming."

  (define-syntax swap! 
    (er-macro-transformer
      (lambda (form rename compare) 
        (let ((a (cadr form))
              (b (caddr form)))
          `(,(rename 'let) ((,(rename 'value) ,a))
             (,(rename 'set!) ,a ,b)
             (,(rename 'set!) ,b ,(rename 'value)))))))

The result is handled just like in rsc-macro-transformer -
raw identifiers are handled in the usage environment.
Instead of a reference to the macro environment, we're
given a RENAME procedure which explicitly makes a syntactic
closure for the macro environment.  RENAME is referentially
transparent, so even though it's called twice on VALUE above
the results are the same, where sameness is by comparison
with the COMPARE procedure.  I.e.

   (compare (rename 'foo) (rename 'foo)) => #t

Though you can of course rename everything you need once
outside to preserve readability of the expression.
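
A minimal sketch of that, assuming an implementation that
provides ER-MACRO-TRANSFORMER.  The %-prefixed variable names
are just a convention chosen for this example:

  (define-syntax swap!
    (er-macro-transformer
      (lambda (form rename compare)
        (let ((a (cadr form))
              (b (caddr form))
              ;; Rename each macro-environment identifier once.
              (%let   (rename 'let))
              (%set!  (rename 'set!))
              (%value (rename 'value)))
          `(,%let ((,%value ,a))
             (,%set! ,a ,b)
             (,%set! ,b ,%value))))))

  ;; Works even when the caller's variable is called VALUE.
  (let ((value 1) (other 2))
    (swap! value other)
    (list value other))    ; => (2 1)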

--- The Hybrid System ---

The syntax-case macro system is sort of a hybrid, combining
high-level and low-level features.  Our example would
become:

  (define-syntax swap!
    (lambda (stx)
      (syntax-case stx ()
        ((swap! a b)
         (syntax
           (let ((value a))
             (set! a b)
             (set! b value)))))))

The SYNTAX-CASE form destructures the STX input just like
SYNTAX-RULES does.  However, the body isn't a template, but
rather is evaluated normally.  SYNTAX is another special
form that works to instantiate a template.  Because this use
of SYNTAX occurs lexically inside the (swap! a b) pattern,
the instances of A and B in the syntax template hygienically
refer to those parameters of the macro.  If you moved the
SYNTAX to a helper function it would break.  So you can
think of SYNTAX-CASE as unhygienically inserting some
lexical environment information that SYNTAX refers to.

The nice thing is that this example actually does more than
any of our previous SWAP! definitions in that it checks
syntax and will signal a syntax error if not given two
arguments.  On the other hand, for this example SYNTAX-RULES
beats everybody:

  (define-syntax swap!
    (syntax-rules ()
      ((swap! a b)
       (let ((value a))
         (set! a b)
         (set! b value)))))

The advantage of SYNTAX-CASE over SYNTAX-RULES is that you
don't have to just use SYNTAX, you can perform some
arbitrary computation on sexps and then convert it to
syntax.  The basic pattern here would be:

  (define-syntax swap!
    (lambda (stx)
      (syntax-case stx ()
        ((swap! a b)
         (let ((a (syntax-object->datum (syntax a)))
               (b (syntax-object->datum (syntax b))))
           (datum->syntax-object
             (syntax swap!)
             `(let ((value ,a))
                (set! ,a ,b)
                (set! ,b value))))))))

That is, you destructure with SYNTAX-CASE, access the
destructured info with SYNTAX, convert these to sexps with
syntax-object->datum, perform arbitrary defmacro-style
computations, and then convert it back to syntax with
datum->syntax-object.

Got it?

There are also utilities to streamline this somewhat like
quasisyntax (which even gets its own new read syntax) and
with-syntax, and a whole huge library of stuff.  It's a very
large and baroque system.  For pure template-style syntax,
SYNTAX-RULES wins, and for pure low-level handling the other
systems win because they don't get in your way as much.
SYNTAX-CASE has a niche in medium-level complexity macros
that benefit from destructuring plus a small amount of
computation.  On the other hand, if you don't tightly bind
one specific destructuring idiom to your macro system, you
can take your pick of any external matching or
syntax-verifying libraries you want (e.g. use explicit
renaming macros with the MATCH macro for destructuring).
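
To give a flavour of that medium-level niche, here is a rough
sketch assuming a syntax-case implementation that provides
WITH-SYNTAX, IDENTIFIER? and GENERATE-TEMPORARIES (standard
parts of syntax-case systems, though I haven't checked every
egg).  It adds a fender that rejects non-identifier arguments
and computes a fresh temporary at expansion time:

  (define-syntax swap!
    (lambda (stx)
      (syntax-case stx ()
        ((swap! a b)
         ;; Fender: only match when both arguments are
         ;; identifiers.
         (and (identifier? (syntax a))
              (identifier? (syntax b)))
         ;; Compute a fresh temporary and bind it as a pattern
         ;; variable for use in the template.
         (with-syntax (((tmp) (generate-temporaries
                               (syntax (a)))))
           (syntax
             (let ((tmp a))
               (set! a b)
               (set! b tmp))))))))

The temporary isn't strictly needed here, since VALUE in the
earlier template was already hygienic; it just shows where
computed pieces of syntax plug in.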

Oh, there's another really serious problem here.  Remember
how all the syntactic closures based transformers were
wrapped in a macro like sc-macro-transformer or
er-macro-transformer?  Well, the syntax case macro system
doesn't have that:

  (define-syntax foo
    (lambda (stx)
      ...))

i.e. a macro is _always_ _explicitly_ a procedure of one
argument, which is a syntax object (and contains all the
syntax case semantics thereof).  So if you implemented your
system such that macros take three arguments (the form and
the two environments), you're screwed.  You have to resort
to very ugly hacks to get these to work together.

So to summarize, SYNTAX-CASE does let you write both high and
low level macros and preserve hygiene, and has some nice
ideas, but I really dislike it and discourage its use for
the following reasons:

  1) very, very large and baroque API and reader extensions

  2) forces a single destructuring idiom tightly integrated
     with the macro system, when this should be a purely
     orthogonal concept

  3) makes it very difficult to play along with alternate
     macro systems

  4) implicit unhygienic interaction between SYNTAX-CASE and
     SYNTAX, and in general confusing semantics

  5) identifier syntax (another huge, ugly can of worms I
     won't even get into here)

--- Macros in Chicken ---

OK, so now what macro systems are available in Chicken and
how do we use them?

Core chicken by itself has define-macro, which is
unhygienic.  All of the alternative systems hook into
Chicken by registering themselves as the macro expander,
thus effectively throwing away any existing macros.  They
then reload their own versions of the standard Chicken
macros (use, cond-expand, when, unless, define-macro, etc.).
Thereafter (yes, load order matters here) any new macros are
defined in terms of the new macro system.  The hygienic
systems mostly do provide a define-macro definition, but it
should be avoided, as it becomes even more fragile than
usual when combined with hygienic macros (interleaving them
basically just doesn't work).

The alternative hygienic macro systems are (O = yes, X = no):

                       syntax-rules?  low-level?  compiled-macros?
alexpander                   O            X              X
syntax-case                  O            O              X
simple-macros                O            O              X
syntactic-closures           O            O              X
riaxpander                   O            O              O

alexpander is a simple, lightweight implementation of
syntax-rules only, with a few extensions, written by Al*
Petrofsky.  It's the only option here that doesn't have any
low-level macros.

syntax-case is as described (and thoroughly bashed) above.  If you use it I
reserve the right to taunt you endlessly :P

simple-macros is a more recent system by Andre van Tonder
which contains a full implementation of syntax-case.  So
it's an even bigger system, and I believe more semantically
complex, but I'm not too familiar with it.

The syntactic-closures egg is the original implementation by
Bawden, modified heavily by Chris Hanson, and is the macro
system currently used in MIT Scheme.  It provides all three
of the transformers (sc-, rsc- and er-) described above, as
well as SYNTAX-RULES.  It's the most light-weight of the
low-level hygienic macro systems.

The riaxpander egg is a recent, clean implementation of
syntactic-closures by Taylor Campbell, and is fully
compatible with the syntactic-closures egg.  I also recently
added support for compiled macros so it loads an order of
magnitude faster than any of the above systems.

Which should you use?  If you're defining your own local
macros for compile-time it's not really a big deal - use
whatever is most convenient.  If you're _exporting_ macros,
then this becomes an important decision.  Exporting
unhygienic macros is a bad idea.  If at all possible,
exported macros should be written with SYNTAX-RULES because
that's universally supported by the alternatives.

If you really need to use low-level macros, then you have to
choose between the syntax-case API and the
syntactic-closures API.  Obviously I prefer the latter :)
You *don't* need to specify an implementation in either case
- the user can choose whichever s/he prefers.  For example,
if we made a swap egg that exported our swap macro as an
explicit renaming macro, then someone who wanted to use it
would write

  (use syntactic-closures swap)

or

  (use riaxpander swap)

If you want to be very friendly, you can actually support
all of the systems with judicious use of COND-EXPAND.  If
you look at the source to the matchable egg, it's 95%
SYNTAX-RULES with a couple of COND-EXPANDed definitions for
either syntax-case, syntactic-closures, or pure syntax-rules
(the alexpander case).  If you look at the test egg, it just
provides a few macros which are fully COND-EXPANDed to
support any system - including Chicken's core define-macro.
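
The overall shape of such a definition is sketched below.
The feature identifiers SYNTAX-CASE and SYNTACTIC-CLOSURES
are assumptions made for illustration; check what each egg
actually registers (and how the eggs above spell it) before
relying on them.

  (cond-expand
    (syntax-case              ; assumed feature name
     (define-syntax swap!
       (lambda (stx)
         (syntax-case stx ()
           ((swap! a b)
            (syntax
              (let ((value a))
                (set! a b)
                (set! b value))))))))
    (syntactic-closures       ; assumed feature name
     (define-syntax swap!
       (er-macro-transformer
         (lambda (form rename compare)
           (let ((a (cadr form))
                 (b (caddr form)))
             `(,(rename 'let) ((,(rename 'value) ,a))
                (,(rename 'set!) ,a ,b)
                (,(rename 'set!) ,b ,(rename 'value))))))))
    (else                     ; core Chicken, unhygienic
     (define-macro (swap! a b)
       (let ((value (gensym)))
         `(let ((,value ,a))
            (set! ,a ,b)
            (set! ,b ,value))))))

SWAP! itself could of course be plain SYNTAX-RULES; the point
is only the layout, with one clause per macro system and the
unhygienic fallback last.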

-- 
Alex

