Re: [racket-users] syntax/parse is not hygienic

2018-03-05 Thread Yucheng Zhang
On Monday, March 5, 2018 at 6:36:09 PM UTC, Alexis King wrote:
> I will say this, however: while I have developed over the years a
> reasonably strong intuition for how Racket macros operate, when I was
> learning the macro system for the first time, I did not find some parts
> of the hygiene algorithm terribly intuitive.

I am new to both Racket and Scheme in general. When I learned about Scheme's
hygienic macros, my (incorrect) understanding was close to André van Tonder’s
system. I have been programming with this mental model for months, and only
upon reading this discussion did I realize that both the R6RS and Racket macro
systems disagree with it.

I find the existing macro system easier to understand and use, but I am
concerned that it may introduce unintentional symbol collisions. From a
programmer's perspective, care must be taken when using phase-1 functions,
and gensym or a similar mechanism must be used to avoid unintended
collisions. The situation seems similar to that of using unhygienic macros.
The problem seems to be a fundamental one, and syntax/parse is only one
instance of it.

I recently started switching my projects to Racket because I find it powerful
and elegant. I sincerely hope Racket moves in a good direction -- whatever
that means.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [racket-users] syntax/parse is not hygienic

2018-03-05 Thread Alexis King
For those interested, it turns out you can get a loose approximation of
the van Tonder system in Racket in just a few dozen lines of code.
Namely, you can write a helper that undoes the macro-introduction scope
added by the Racket macro system:

(begin-for-syntax
  (define ((make-unscoped-transformer proc) stx)
    (syntax-local-introduce (proc (syntax-local-introduce stx)))))

Then you can write some functions and forms for keeping track of which
scopes to add when users write quote-syntax:

(begin-for-syntax
  (define current-syntax-introducer (make-parameter #f))
  (define (current-syntax-introduce stx)
    ((or (current-syntax-introducer)
         (make-syntax-introducer))
     stx))

  (define (call-with-shared-syntax-introducer proc)
    (if (current-syntax-introducer)
        (proc)
        (parameterize ([current-syntax-introducer
                        (make-syntax-introducer)])
          (proc))))
  (define (call-with-masked-syntax-introducer proc)
    (parameterize ([current-syntax-introducer #f])
      (proc)))

  (define-simple-macro (with-shared-syntax-introducer
                         body:expr ...+)
    (call-with-shared-syntax-introducer (λ () body ...)))
  (define-simple-macro (with-masked-syntax-introducer
                         body:expr ...+)
    (call-with-masked-syntax-introducer (λ () body ...))))

You can define an introducing variant of quote-syntax in terms of
Racket’s quote-syntax:

(begin-for-syntax
  (define-simple-macro (quote-syntax form)
    (current-syntax-introduce (quote-syntax/no-introduce form))))

And finally, you can implement syntax and quasisyntax in terms of these
other forms and functions. That part is the most work, so I haven’t
implemented full versions of either, but I implemented simplified
versions that don’t handle ellipses and generate less optimal code. The
only interesting thing in their implementations is the placement of
with-shared-syntax-introducer and with-masked-syntax-introducer. Both
syntax and quasisyntax expand into uses of
with-shared-syntax-introducer, which is wrapped around the entire
expansion, and unsyntax must wrap its expression in
with-masked-syntax-introducer in its expansion. This produces a system
that seems to have the properties of van Tonder’s system in simple
situations.

Experimentation leads to some interesting behavior. For example, the
following macro is completely uninteresting in a system that uses scoped
expansion, but it’s quite interesting in one that uses scoped quotation:

(define x 'module)

(define-syntax mac
  (make-unscoped-transformer
   (syntax-parser
     [(_)
      #`(let ([x 'local])
          (list x #,#'x))])))

(mac)

Under scoped-expansion, the program produces the boring result '(local
local), but under scoped-quotation, it produces the much more
interesting result '(local module)! Maybe some people would find this
confusing, but I think it’s a little neat.
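
To check the scoped-expansion half of this in stock Racket, one option is to drop make-unscoped-transformer and use syntax-parser directly. The following is a minimal sketch that assumes only racket and syntax/parse, not the gist:

```racket
#lang racket
(require (for-syntax syntax/parse))

(define x 'module)

;; Ordinary Racket transformer: the expander applies one macro-introduction
;; scope to the whole result, so both x's in the template are captured by
;; the let binding.
(define-syntax mac
  (syntax-parser
    [(_)
     #`(let ([x 'local])
         (list x #,#'x))]))

(mac) ; => '(local local)
```

Wrapping the same syntax-parser in make-unscoped-transformer and using the introducing quote-syntax described above is what should change the second element to 'module.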

If anyone is interested in playing with my hacky, incomplete, and
probably buggy embedding of this system in Racket, I’ve posted it here:

  https://gist.github.com/lexi-lambda/a32aab1bb3eccd416764ef90cbd55b67

As a testament to the power of Racket’s macro system and its
macro-writing facilities, the whole thing is only 80 lines of code.


Re: [racket-users] syntax/parse is not hygienic

2018-03-05 Thread Alexis King
Thank you to both of you for your detailed responses! I think this is
all fascinating.

> On Mar 5, 2018, at 05:18, Ryan Culpepper wrote:
> 
> 1. Yes. To me, at least :) That aspect of hygiene is triggered by a
> macro expansion step, and the macro expansion step also defines the
> boundary of its effect. In other words, the expansion of a macro
> introduces a scope, but syntax classes do not. Compare with the
> following examples:

Yes, this is what I meant when I wrote that “Syntax classes behave like
phase 1 functions, not macros.” It is worth pointing out, however, that
not all things that behave like macros are strictly within the realm of
the macroexpander — things like match expanders and syntax/parse pattern
expanders manually emulate hygienic expansion despite not ever actually
yielding to “the” macroexpander.

> 2. I think the main technical challenge is finding all of the syntax
> objects to flip the scope on, given that syntax classes allow
> attributes to have arbitrary values (like opaque structures and
> closures), not just syntax objects. We have a similar problem with
> syntax properties, which are invisible to the hygiene algorithm.
> 
> It might be easier in a macro system like Andre van Tonder's system,
> as Matthew and Sam mentioned.

The parallel with syntax properties is a good one, and it came to mind
for me, too. It seems hard to solve automatically in general, but some
cooperation from the user might be enough (via some generic protocol
like structure type properties). That’s a bit ugly, though.

> 3. Maybe. Half-baked musings follow:

If these are your half-baked musings, I would like to see what your
fully baked ones look like. :)

Your points are good ones, and I agree that I have virtually no
intuition for where the boundaries make sense. I will say this,
however: while I have developed over the years a reasonably strong
intuition for how Racket macros operate, when I was learning the macro
system for the first time, I did not find some parts of the hygiene
algorithm terribly intuitive. The fact that quoted syntax could be in
wildly different lexical contexts but capture and bind the same
identifiers because they were in the same dynamic context seemed
antithetical to hygiene to me (which I heard described as “respecting
the lexical structure of the program as-written”).

I find André van Tonder’s system compelling, but I also agree I can’t
really evaluate it without trying to write some programs with it. Maybe,
with Racket 7, it’d be easier to implement such a macro system in a fork
of Racket for experimentation... but the result might be too
incompatible with existing code to serve any purpose. It would be an
interesting experiment.

> On Mar 5, 2018, at 06:45, Matthew Flatt wrote:
> 
> Adding to Ryan's answer, I note that Andre van Tonder's SRFI-72 system
> has `quasisyntax` as a primitive. That is,
> 
>  #`(x #,y x)
> 
> is not like
> 
> (datum->syntax #'here (list #'x y #'x))
> 
> because the scope introduced by a `quasisyntax` spans the whole
> `quasisyntax` form and causes nested `syntax` forms to not introduce
> a fresh scope.

Yes, I noticed that, too. My assumption was that a form like
with-fresh-renaming-scope would be necessary, but it would be a little
bit different from the version originally described. Rather than
requiring it in order to produce distinct scopes from distinct uses of
quotation, one would keep the behavior of the final version of SRFI 72
but allow with-fresh-renaming-scope (or, most likely, something similar
with a more appropriate name) to *weaken*, not strengthen, the hygiene
rules.

To illustrate, this would lead to the following behavior:

(bound-identifier=? #'x #'x) ; ==> #f

(with-fresh-renaming-scope
  (bound-identifier=? #'x #'x)) ; ==> #t

If uses of with-fresh-renaming-scope are nested, the outermost use
“wins”, yielding the following behavior:

(with-fresh-renaming-scope
  (bound-identifier=? (with-fresh-renaming-scope #'x)
                      (with-fresh-renaming-scope #'x)))
; ==> #t

This would allow forms like quasisyntax to introduce
with-fresh-renaming-scope in their expansions to avoid distinct uses of
syntax from generating distinct identifiers while simultaneously
allowing new abstractions to be defined in terms of quasisyntax just as
quasisyntax is defined in terms of syntax.
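
For contrast with the hypothetical behavior above, here is a minimal sketch of what stock Racket does today, where quotation adds no fresh scope and a distinct identifier requires an explicit introducer. The names a, b, and intro are illustrative only:

```racket
#lang racket

;; Two separate quotations of x carry the same scopes in Racket today,
;; so they would bind each other.
(define a #'x)
(define b #'x)
(bound-identifier=? a b) ; => #t

;; A fresh scope has to be applied explicitly to get an identifier that
;; is distinct in the bound-identifier=? sense.
(define intro (make-syntax-introducer))
(bound-identifier=? a (intro a)) ; => #f
```

In other words, the default in Racket is the opposite of the one sketched above: sharing is automatic and freshness is opt-in.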

Some parts of this are still a little unsatisfying, however. The draft
of SRFI 72 you link defines with-fresh-renaming-scope as applying
lexically, not dynamically, but it isn’t immediately obvious to me which
is the correct behavior in this case. Furthermore, if it applies
lexically, what does it mean when with-fresh-renaming-scope is
introduced by a macro? If it applies to all identifiers inside its
expansion, that feels unhygienic enough to cause trouble... so my own
intuition here is nonexistent.

As for looking at the discussion, it doesn’t appear to be particularly
illustrative. As far as the archiv

Re: [racket-users] syntax/parse is not hygienic

2018-03-05 Thread Matthew Flatt
At Sun, 4 Mar 2018 20:01:56 -0800, Alexis King wrote:
> While it’s a bit of a tangent, I’d be quite interested in finding more
> information on this alternate model of hygiene from anyone familiar with
> the tradeoffs (the SRFI that describes it does not include much in the
> way of comparisons). Are there strong reasons to prefer Racket’s model
> aside from backwards compatibility and mild convenience when
> procedurally assembling pieces of syntax?

Adding to Ryan's answer, I note that Andre van Tonder's SRFI-72 system
has `quasisyntax` as a primitive. That is,

  #`(x #,y x)

is not like

 (datum->syntax #'here (list #'x y #'x))

because the scope introduced by a `quasisyntax` spans the whole
`quasisyntax` form and causes nested `syntax` forms to not introduce a
fresh scope.

Is `quasisyntax` special enough to be built in? What about other
syntactic forms that would be naturally implemented with multiple
`syntax` (or `quote-syntax`) forms?


Looking back, I see that SRFI-72 at one point included a
`with-fresh-renaming-scope` operation that is very close to (or maybe
exactly) what I had in mind by "applying a fresh scope to a textual
region of syntax literals":

  https://srfi.schemers.org/srfi-72/srfi-72-1.3.html

To me, that seems like a way to pull the specialness out of `syntax`
and `quasisyntax`, although it also seems inconvenient to have to write
`with-fresh-renaming-scope` explicitly (which connects to Ryan's
comments on `syntax-protect`).

But I haven't looked in detail, and I may be mixing things up. You may
find something in the SRFI-72 discussion on why that direction was
abandoned.


Re: [racket-users] syntax/parse is not hygienic

2018-03-05 Thread Ryan Culpepper

On 03/04/2018 09:40 PM, Alexis King wrote:

[... context ...]

Still, with all this context out of the way, my questions are
comparatively short:

   1. Is this lack of hygiene well-known? I did not find anything in
      Ryan’s dissertation that explicitly dealt with the question, but I
      did not look very hard, and even if it isn’t explicitly mentioned
      there, I imagine people have thought about it before.

   2. Are there some fundamental, theoretical obstacles to making a
      syntax class-like thing hygienic that I have not foreseen? Or would
      it really be as simple as performing the usual scope-flipping that
      macroexpansion already performs?

   3. If it is possible, is the unhygienic nature of syntax classes
      desirable frequently enough that it outweighs the benefits of
      respecting hygiene? That seems unlikely to me, but maybe I have not
      fully considered the problem. The semantics of syntax classes
      cannot be changed now, of course, for backwards compatibility
      reasons, but were that not a problem, would it make sense to make
      them hygienic?  If not, why not?


1. Yes. To me, at least :) That aspect of hygiene is triggered by a 
macro expansion step, and the macro expansion step also defines the 
boundary of its effect. In other words, the expansion of a macro 
introduces a scope, but syntax classes do not. Compare with the 
following examples:


   (define x 'old)

   (begin-for-syntax
     (define (get-def)
       #'(define x 'new))
     (define (get-use)
       #'x))
   (define-syntax (m1 stx)
     #`(begin #,(get-def) #,(get-use)))
   (m1) ;; => 'new

   (define-syntax (m2 stx)
     (let ([expr #'x])
       #`(let ([x 'new]) #,expr)))
   (m2) ;; => 'new

Contrast with systems like MetaML, which strictly enforce lexical 
scoping but don't actually give you macros.


2. I think the main technical challenge is finding all of the syntax 
objects to flip the scope on, given that syntax classes allow attributes 
to have arbitrary values (like opaque structures and closures), not just 
syntax objects. We have a similar problem with syntax properties, which 
are invisible to the hygiene algorithm.


It might be easier in a macro system like Andre van Tonder's system, as 
Matthew and Sam mentioned.


3. Maybe. Half-baked musings follow:

There are two good ideas in opposition here. One is hygiene. The other 
is the availability of pure abstractions.


As a new grad student, I spent some time playing with plain R5RS-style 
syntax-rules macros. My first presentation in grad school was on an 
EOPL-style interpreter I wrote using nothing but syntax-rules macros. It 
did integer arithmetic (using a unary encoding of integers as 
parentheses), it did closures, it did continuations... all at compile 
time. But I discovered that there were things I couldn't express the way 
I wanted because the only available abstraction mechanism (the macro 
definition) was tangled up with the hygiene effect. I don't remember the 
exact situations, but they had the following form: I needed 
macro-generated-macro X to produce exactly identifier Y, but it could 
only produce Y with a mark. Of course, most of these "macros" were 
really implementation fragments of something else; I was forced to break 
them out into separate pieces because of the limitations of the language 
I chose to work in.


Hygienic macros are impure abstractions. (Other examples of impure 
abstractions: JavaScript functions, if you do reflection on the current 
activation record, and Prolog predicates, which delimit the effect of a 
cut.) They still win, because despite being impure at the syntax object 
level, they approximate this nice lexical scoping property one level up, 
the level of interpreting the syntax objects as expressions, 
definitions, etc. (I say "approximate" because of the examples I gave in 
part 1.) But I think the win depends on the placement of the hygiene 
boundaries. My interpreter experience makes me think that too many 
boundaries within the implementation of a macro can be bad.


So another framing of the question is where should the boundaries go?[*] 
The one that corresponds to a macro expansion step is nice because macro 
expansion occurs at expression (or more precisely, "form") positions, 
and that connects the hygiene boundary with the interpretation level 
where lexical scoping is defined. Are there other "meaningful" places to 
put hygiene boundaries? Do syntax classes necessarily correspond with 
meaningful boundaries? Or are meaningful boundaries not actually that 
important?


([*] Matthew and I had a similar problem regarding syntax certificates, 
which evolved into the current dye pack and tainting system. We couldn't 
find a good way to identify anchors in the source code that represented 
discrete macro implementations that could be automatically protected, so 
we left it to the programmer to call syntax-protect explicitly. (Or use 
syntax-rules.) The problem is th

Re: [racket-users] syntax/parse is not hygienic

2018-03-04 Thread Alexis King
Actually, what I wrote was wrong. The key piece of information I
overlooked was the following rule:

> A binding for an identifier can only capture a reference to another
> if both were present in the source or introduced during a single
> evaluation of a syntax or quasisyntax form, with the understanding
> that the evaluation of any nested, unquoted syntax or quasisyntax
> forms counts as part of the evaluation of an enclosing quasisyntax.

The key phrase is “single evaluation”, so quote-syntax becomes
generative: multiple evaluations of the same quote-syntax form use
distinct scopes.

This is interesting to me. It’s stricter than Racket’s model for
hygiene, since Racket makes it legal to do things like this:

(with-syntax ([def #'(define x 42)])
  #'(begin def x))

...which produces a piece of syntax that will evaluate to 42, unlike in
van Tonder’s model, in which it would produce an unbound identifier
error. Of course, this problem is not difficult to solve; it just
requires lifting #'x into a separate binding:

(with-syntax* ([id #'x]
               [def #'(define id 42)])
  #'(begin def id))

This model... makes sense to me. I like it. It seems, on the surface,
more intuitive than Racket’s model of introducing fresh scopes in the
expander itself. That said, it’s still quite different from Racket’s
model, so some of what I said in my last message still applies, I think.
I also wouldn’t be surprised if there were some infelicities in the
alternative approach I’m not immediately seeing (corner cases,
perhaps?).

While it’s a bit of a tangent, I’d be quite interested in finding more
information on this alternate model of hygiene from anyone familiar with
the tradeoffs (the SRFI that describes it does not include much in the
way of comparisons). Are there strong reasons to prefer Racket’s model
aside from backwards compatibility and mild convenience when
procedurally assembling pieces of syntax?

> On Mar 4, 2018, at 19:28, Alexis King wrote:
> 
> Sam suggested I take a look at van Tonder’s work as well on Slack, and
> it’s interesting, though it isn’t what I originally had in mind. I
> think it would solve the first example of mine, but it would not solve
> the second. In the second example, all uses of tmp come from the same
> quote-syntax form, merely multiplied via ellipsis. My first mental
> model was to treat syntax classes under ellipses like distinct macro
> invocations, which would require a dynamic, not lexical, treatment of
> scope to be consistent with Racket’s model of hygiene.
> 
> If I’m understanding correctly, attaching fresh scopes at quotation
> rather than expansion treats the source text of the program as the
> ground truth for all scoping information — if two identifiers come
> from the same location in the user’s source code, they can bind each
> other.  This seems like a good model for most things, but it seems
> radically different from Racket’s model when internal definitions are
> involved, since such an interpretation would imply that this program
> should produce a duplicate definition error:
> 
>   (define-syntax-rule (def-x)
>     (define x 42))
> 
>   (def-x)
>   (def-x)
> 
> That seems to me like an enormous break from Racket’s model of
> hygiene, but it doesn’t seem wrong, just different. I could picture a
> different programming language with a different macroexpander using
> such a model successfully. Still, unless I’m misunderstanding the
> implications here, it seems like attaching the scopes at expansion
> (even if “expansion” is really “parsing with syntax classes”) rather
> than quoting would be more consistent with the rest of Racket?


Re: [racket-users] syntax/parse is not hygienic

2018-03-04 Thread Alexis King
> On Mar 4, 2018, at 15:11, Matthew Flatt wrote:
> 
> I think scope-flipping would work, but FWIW, I thought you were going
> a different direction here. The scope-flipping approach is a way to
> infer an intended scope dynamically. It sounds to me like you want
> something more static --- a way of applying a fresh scope to a textual
> region of syntax literals.
> 
> An extreme end of that approach would be applying a fresh scope on the
> evaluation of each `quote-syntax` form (essentially as van Tonder
> explored), but it's possible that larger regions would work better.

Sam suggested I take a look at van Tonder’s work as well on Slack, and
it’s interesting, though it isn’t what I originally had in mind. I think
it would solve the first example of mine, but it would not solve the
second. In the second example, all uses of tmp come from the same
quote-syntax form, merely multiplied via ellipsis. My first mental model
was to treat syntax classes under ellipses like distinct macro
invocations, which would require a dynamic, not lexical, treatment of
scope to be consistent with Racket’s model of hygiene.

If I’m understanding correctly, attaching fresh scopes at quotation
rather than expansion treats the source text of the program as the
ground truth for all scoping information — if two identifiers come from
the same location in the user’s source code, they can bind each other.
This seems like a good model for most things, but it seems radically
different from Racket’s model when internal definitions are involved,
since such an interpretation would imply that this program should
produce a duplicate definition error:

(define-syntax-rule (def-x)
  (define x 42))

(def-x)
(def-x)

That seems to me like an enormous break from Racket’s model of hygiene,
but it doesn’t seem wrong, just different. I could picture a different
programming language with a different macroexpander using such a model
successfully. Still, unless I’m misunderstanding the implications here,
it seems like attaching the scopes at expansion (even if “expansion” is
really “parsing with syntax classes”) rather than quoting would be more
consistent with the rest of Racket?


Re: [racket-users] syntax/parse is not hygienic

2018-03-04 Thread Matthew Flatt
At Sun, 4 Mar 2018 12:40:43 -0800, Alexis King wrote:
>   2. Are there some fundamental, theoretical obstacles to making a
>      syntax class-like thing hygienic that I have not foreseen? Or would
>      it really be as simple as performing the usual scope-flipping that
>      macroexpansion already performs?

I think scope-flipping would work, but FWIW, I thought you were going a
different direction here. The scope-flipping approach is a way to infer
an intended scope dynamically. It sounds to me like you want something
more static --- a way of applying a fresh scope to a textual region of
syntax literals.

An extreme end of that approach would be applying a fresh scope on the
evaluation of each `quote-syntax` form (essentially as van Tonder
explored), but it's possible that larger regions would work better.


Re: [racket-users] syntax/parse is not hygienic

2018-03-04 Thread Matthias Felleisen


> On Mar 4, 2018, at 3:40 PM, Alexis King wrote:
> 
> Apologies in advance for both the inflammatory subject and yet another
> overly long email to this list.


I wouldn’t call this inflammatory. It might be considered a bug report. 
Thanks for the thorough analysis. 

Would you be in a position to add syntax classes to Michael Adams’s 
model of hygiene? (I know that it doesn’t support define, but I think one 
can think of define-syntax-class as (let-for-syntax ((a (syntax-class …))) …).) 
One could also add it to Matthew’s model and see what is doable there. 
If the two agree that syntax-generating attributes can be made hygienic, 
good. If not, one might wish to consider telling people (forcing them?) to 
produce values that are then fed into auxiliary macros. 

I am looking forward to Ryan’s response — Matthias



> 
> I think anyone who knows me knows that I love syntax/parse — I think
> it’s far and away one of Racket’s most wonderful features — but I’ve
> long suspected it does not respect hygiene. Consider:
> 
>   #lang racket
>   (require syntax/parse/define)
> 
>   (define x #f)
> 
>   (begin-for-syntax
>     (define-syntax-class a
>       [pattern _ #:attr def #'(define x #t)])
>     (define-syntax-class b
>       [pattern _ #:attr use #'x]))
> 
>   (define-simple-macro (m a:a b:b)
>     (begin a.def b.use))
> 
>   (m 0 0) ; => #t
> 
> This program produces #t from the reference to x on line 10. Considering
> the natural lexical scope of the program, as it appears to a human
> reader, there is no local definition of x in scope where #'x is written
> on line 10, so it logically ought to refer to the top-level definition
> of x on line 4, which would make the program produce #f. However, it
> does not. Instead, it produces #t because it actually refers to the
> definition of x written on line 8, which is assembled alongside the use
> on line 13.
> 
> While this behavior makes sense from the perspective of someone familiar
> with the semantics of procedural macros, if taken from the point of view
> of pattern-based systems, it seems to violate one of the essential
> properties of a hygienic macro system. Namely, the macro system should
> respect program scope. The above program does not.
> 
> For those unfamiliar with the details, this behavior occurs because
> syntax class uses are not treated like macro transformations. When a
> macro is expanded, a fresh scope is attached to its expansion, but when
> a syntax class is used, its syntax objects have no additional
> introduction scope. One could argue this behavior is useful — sometimes
> it is helpful to be able to assemble larger pieces of syntax from the
> outputs of different syntax classes without needing to pass shared
> identifiers as input to the classes — but it also causes problems. I
> think I first ran into it when I was using a syntax class to generate a
> series of definitions:
> 
>   #lang racket
>   (require syntax/parse/define)
> 
>   (begin-for-syntax
>     (define-syntax-class def-and-use
>       [pattern val:expr
>                #:attr x #'(begin
>                             (define tmp (+ val 1))
>                             (displayln tmp))]))
> 
>   (define-simple-macro (m a:def-and-use ...)
>     (begin a.x ...))
> 
>   (m 1 2 3)
> 
> I would expect this program to print "2\n3\n4\n", but instead, it fails
> to compile with an error:
> 
>   module: identifier already defined
>     in: tmp
> 
> The multiple definitions of tmp are assembled alongside each other, and
> since they all have the same scopes, they collide. A solution is to use
> generate-temporary, but that is a little ugly. A solution that uses a
> helper macro in place of the syntax class has no such problem:
> 
>   #lang racket
>   (require syntax/parse/define)
> 
>   (define-simple-macro (def-and-use val:expr)
>     (begin (define tmp (+ val 1))
>            (displayln tmp)))
> 
>   (define-simple-macro (m a:expr ...)
>     (begin (def-and-use a) ...))
> 
>   (m 1 2 3)
> 
> There are arguments to be made that the existing behavior is not
> unreasonable. Syntax classes behave like phase 1 functions, not macros.
> If one desires macro-like behavior, it’s often possible to use a helper
> macro instead of a syntax class. This is not always true, however;
> sometimes syntax classes are used to generate syntax that will be
> inserted into places where the macroexpander will not run (such as
> binding positions), but one still needs to use generate-temporaries to
> avoid duplicate bindings.
> 
> There are some minor questions as to what the semantics of “hygienic”
> syntax classes would be, since they accept arbitrary values as inputs
> (in the case of parameterized syntax classes), not exclusively syntax
> objects. They also have multiple outputs, some of which may not be
> syntax-valued, so it’s not immediately obvious to me if performing the
> same scope flipping that works for macros would produce the appropriate
> result for syntax classes

[racket-users] syntax/parse is not hygienic

2018-03-04 Thread Alexis King
Apologies in advance for both the inflammatory subject and yet another
overly long email to this list.

I think anyone who knows me knows that I love syntax/parse — I think
it’s far and away one of Racket’s most wonderful features — but I’ve
long suspected it does not respect hygiene. Consider:

#lang racket
(require syntax/parse/define)

(define x #f)

(begin-for-syntax
  (define-syntax-class a
    [pattern _ #:attr def #'(define x #t)])
  (define-syntax-class b
    [pattern _ #:attr use #'x]))

(define-simple-macro (m a:a b:b)
  (begin a.def b.use))

(m 0 0) ; => #t

This program produces #t from the reference to x on line 10. Considering
the natural lexical scope of the program, as it appears to a human
reader, there is no local definition of x in scope where #'x is written
on line 10, so it logically ought to refer to the top-level definition
of x on line 4, which would make the program produce #f. However, it
does not. Instead, it produces #t because it actually refers to the
definition of x written on line 8, which is assembled alongside the use
on line 13.

While this behavior makes sense from the perspective of someone familiar
with the semantics of procedural macros, if taken from the point of view
of pattern-based systems, it seems to violate one of the essential
properties of a hygienic macro system. Namely, the macro system should
respect program scope. The above program does not.
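
As a point of comparison, here is a minimal sketch of the same program with helper macros standing in for the two syntax classes. The names def-x and use-x are hypothetical, chosen only for illustration. Because each macro expansion receives its own introduction scope, the use does not see the macro-introduced definition:

```racket
#lang racket
(require syntax/parse/define)

(define x #f)

;; Hypothetical helper macros standing in for syntax classes a and b.
(define-simple-macro (def-x) (define x #t))
(define-simple-macro (use-x) x)

(define-simple-macro (m)
  (begin (def-x) (use-x)))

(m) ; => #f
```

Here the reference to x resolves to the top-level definition, which is the result a reader would predict from the lexical structure of the source.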

For those unfamiliar with the details, this behavior occurs because
syntax class uses are not treated like macro transformations. When a
macro is expanded, a fresh scope is attached to its expansion, but when
a syntax class is used, its syntax objects have no additional
introduction scope. One could argue this behavior is useful — sometimes
it is helpful to be able to assemble larger pieces of syntax from the
outputs of different syntax classes without needing to pass shared
identifiers as input to the classes — but it also causes problems. I
think I first ran into it when I was using a syntax class to generate a
series of definitions:

#lang racket
(require syntax/parse/define)

(begin-for-syntax
  (define-syntax-class def-and-use
    [pattern val:expr
             #:attr x #'(begin
                          (define tmp (+ val 1))
                          (displayln tmp))]))

(define-simple-macro (m a:def-and-use ...)
  (begin a.x ...))

(m 1 2 3)

I would expect this program to print "2\n3\n4\n", but instead, it fails
to compile with an error:

module: identifier already defined
  in: tmp

The multiple definitions of tmp are assembled alongside each other, and
since they all have the same scopes, they collide. A solution is to use
generate-temporary, but that is a little ugly. A solution that uses a
helper macro in place of the syntax class has no such problem:

#lang racket
(require syntax/parse/define)

(define-simple-macro (def-and-use val:expr)
  (begin (define tmp (+ val 1))
         (displayln tmp)))

(define-simple-macro (m a:expr ...)
  (begin (def-and-use a) ...))

(m 1 2 3)
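
For comparison, the generate-temporary workaround mentioned above might look like the following. This is a minimal sketch rather than code from the thread: it assumes racket/syntax is required for-syntax, and it uses #:with to bind tmp to a fresh identifier on each match of the syntax class.

```racket
#lang racket
(require syntax/parse/define
         (for-syntax racket/syntax)) ; provides generate-temporary at phase 1

(begin-for-syntax
  (define-syntax-class def-and-use
    [pattern val:expr
             ;; Bind tmp as a pattern variable holding a fresh identifier,
             ;; generated once per match of the syntax class.
             #:with tmp (generate-temporary 'tmp)
             #:attr x #'(begin
                          (define tmp (+ val 1))
                          (displayln tmp))]))

(define-simple-macro (m a:def-and-use ...)
  (begin a.x ...))

(m 1 2 3)
```

Since each match now defines a different identifier, the three definitions no longer collide and the program should print 2, 3, and 4. The cost is that the freshness has to be requested by hand, which is the ugliness referred to above.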

There are arguments to be made that the existing behavior is not
unreasonable. Syntax classes behave like phase 1 functions, not macros.
If one desires macro-like behavior, it’s often possible to use a helper
macro instead of a syntax class. This is not always true, however;
sometimes syntax classes are used to generate syntax that will be
inserted into places where the macroexpander will not run (such as
binding positions), but one still needs to use generate-temporaries to
avoid duplicate bindings.

There are some minor questions as to what the semantics of “hygienic”
syntax classes would be, since they accept arbitrary values as inputs
(in the case of parameterized syntax classes), not exclusively syntax
objects. They also have multiple outputs, some of which may not be
syntax-valued, so it’s not immediately obvious to me if performing the
same scope flipping that works for macros would produce the appropriate
result for syntax classes.

Still, with all this context out of the way, my questions are
comparatively short:

  1. Is this lack of hygiene well-known? I did not find anything in
     Ryan’s dissertation that explicitly dealt with the question, but I
     did not look very hard, and even if it isn’t explicitly mentioned
     there, I imagine people have thought about it before.

  2. Are there some fundamental, theoretical obstacles to making a
     syntax class-like thing hygienic that I have not foreseen? Or would
     it really be as simple as performing the usual scope-flipping that
     macroexpansion already performs?

  3. If it is possible, is the unhygienic nature of syntax classes
     desirable frequently enough that it outweighs the benefits of
     respecting hygiene? That seems unlikely to me, but maybe I have not
     fully considered the problem. The semantics of syntax classes