Re: Macros?

Larry Wall Mon, 06 Feb 2006 14:05:09 -0800

On Sun, Feb 05, 2006 at 02:32:08AM +0100, Brad Bowman wrote:
: 
: Hi,
: 
: I've read and reread the macro explanation but I'm still not entirely
: clear on a number of things.  The questions and thoughts below are based
: on my (mis)understanding.
: 
: On 03/02/06 02:05, Larry Wall wrote:
: >    Macros are functions or operators that are called by the compiler as
: >    soon as their arguments are parsed (if not sooner).  The syntactic
: >    effect of a macro declaration or importation is always lexically
: >    scoped, even if the name of the macro is visible elsewhere.  
: 
: And presumably they can be lexically unimported, or whatever the verb is
: for what "no" does.


Presumably.  At least its grammatical effect must be unimportable, even
if the name isn't.  Which we could do, since we've divorced the grammatical
effect from name visibility.  Nevertheless, the easiest thing might just
be to hide the name, or rather the lexical alias of the name, if the
existence of the lexical alias is what controls the lexical scoping of
the grammatical effect.

: >    As with
: >    ordinary operators, macros may be classified by their grammatical
: >    category.  For a given grammatical category, a default parsing rule or
: >    set of rules is used, but those rules that have not yet been "used"
: >    by the time the macro keyword or token is seen can be replaced by
: >    use of "is parsed" trait.  (This means, for instance, that an infix
: >    operator can change the parse rules for its right operand but not
: >    its left operand.)
: >
: >    In the absence of a signature to the contrary, a macro is called as
: >    if it were a method on the current match object returned from the
: >    grammar rule being reduced; that is, all the current parse information
: >    is available by treating C<self> as if it were a C<$/> object.
: 
: Is this a :keepall match object?  
: Or is the Perl6 grammar conserving by default?  
: (The "Syntax trees [...] are reversible" suggests so)
: Or is this one of the "signature to the contrary" possibilities?

It feels to me like something that wants to be controlled by a very
large context, such as which debugger/IDE you're working under, if any.
Maybe that's one of those "signature to the contrary" things.  I dunno.

: >    [Conjecture: alternate representations may be available if arguments
: >    are declared with particular AST types.]
: >
: >    Macros may return either a string to be reparsed, or a syntax tree
: >    that needs no further parsing.  The textual form is handy, but the
: >    syntax tree form is generally preferred because it allows the parser
: >    and debugger to give better error messages.  Textual substitution
: >    on the other hand tends to yield error messages that are opaque to
: >    the user.  Syntax trees are also better in general because they are
: >    reversible, so things like syntax highlighters can get back to the
: >    original language and know which parts of the derived program come
: >    from which parts of the user's view of the program.
: >
: >    In aid of returning syntax tree, Perl provides a "quasiquoting"
: >    mechanism using the keyword "CODE", followed by a block intended to
: >    represent an AST:
: >
: >     return CODE { say $a };
: 
: I guess the string form is C<eval "CODE { $str }">

Seems like that would bind variables differently, unless we took steps
for it not to.  I was thinking that string macros would have no binding
to the macro's definition's lexical scope.  But then I'm not sure what
that could desugar to.
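
A rough sketch of the binding difference, in Python rather than Perl 6
since the CODE syntax is still conjectural.  Everything here is a
hypothetical stand-in: the "string macro" is just a function returning
source text, and the "tree macro" is a closure playing the role of a
quasiquote.  It only assumes that reparsed text sees the call scope
while a quasiquote keeps its definition scope.

```python
# Hypothetical illustration; none of these names come from Perl 6 or S06.

def make_macros():
    x = "definition scope"      # visible only where the "macros" are defined

    def string_macro():
        # Returns text to be reparsed; it carries no link to `x` above.
        return "x"

    def tree_macro():
        # Stand-in for a quasiquote: a closure that already captured `x`.
        return lambda: x

    return string_macro, tree_macro


string_macro, tree_macro = make_macros()


def call_site():
    scope = {"x": "call scope"}                     # where the macro is expanded
    from_string = eval(string_macro(), {}, scope)   # reparsed against call scope
    from_tree = tree_macro()()                      # still bound at definition
    return from_string, from_tree


result = call_site()
print(result)
```

The string form picks up whatever "x" means at the call site, which is
the different binding mentioned above.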

: If CODE may enclose arbitrary source text of whatever DSL people invent,
: alternate braces would probably be useful.  Either q()-like, HERE-doc
: or pod's C<< >> nesting style.

Any CODE-like macro could choose its own delimiter policy.  Arguably we
could go with q:code or some such instead, and I considered this,
but it seemed to me that if you're parsing something that the user
is thinking of primarily as generic Perl code, it ought to look more
like a code block and less like a string.

: >    [Conjecture: Other keywords are possible if we have more than one
: >    AST type.]
: 
: Ocaml and camlp4 are probably a good source of ideas for quasiquoting.
: I've only perused the documentation, has anyone actually used Ocaml here?

Not this one.

: See: http://caml.inria.fr/pub/docs/tutorial-camlp4/tutorial004.html

In my copious free time...  :-)

: Rather than misrepresenting Ocaml with my sketchy understanding,
: I'll just mention some possibly interesting features:
: 
: Specific expander rules from the grammar can be used, <:rulename< ... >>

Our rules are all just subs in disguise, so I'm sure we could do something
similar, modulo syntactic sugar.

: They have a C -> AST expander.  I can imagine a SQL -> AST expander
: would find some use in Perl.  I don't think the same AST type is used but 
: that's just a guess.

At this point I'm not so interested in specific mappings, but I'm sure
everyone will have their favorites.

: Two of the "p"s in p4 stand for pretty-printer, which is the AST->source
: conversion.  In addition to aiding debugging and reformatting, it allows
: interconversion between different syntaxes (sp?).  Ocaml comes with two
: grammars, one is backwards compatible and the other has jettisoned
: the baggage.  

Pugs would like to perform similar tricks.

: >    Within a quasiquote, variable and function names resolve first of
: >    all according to the lexical scope of the macro definition, and if
: >    unrecognized in that scope, are assumed to be bound from the scope
: >    of the macro call each time it is called.  If they cannot be bound
: >    from the scope of the macro call, a compile-time exception is thrown.
: >
: >    Variables that resolve from the lexical scope of the macro definition
: >    will be inserted appropriately depending on the type of the variable,
: >    which may be either a syntax tree or a string.  (Again, syntax tree
: >    is preferred.)  The case is similar to that of a macro called from
: >    within the quasiquote, insofar as reparsing only happens with the
: >    string version of interpolation, except that such a reparse happens
: >    at macro call time rather than macro definition time, so its result
: >    cannot change the parser's expectations about what follows the
: >    interpolated variable.
: 
: Is there any cpp-like protection against self-referential expansions
: when using the string returning form?

Not currently.
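
For reference, the cpp-style protection being asked about could be
sketched roughly like this.  It is Python, not Perl 6, and the MACROS
table and expand function are made up for illustration; nothing like
this is specified in S06.  The idea is only that a name is left alone
while its own expansion is in progress.

```python
import re

# Hypothetical string macros: name -> replacement text.  `foo` refers to itself.
MACROS = {"foo": "foo + bar", "bar": "1"}


def expand(text, active=frozenset()):
    """Expand macro names in text, skipping any name currently being expanded."""
    def replace(match):
        name = match.group(0)
        if name not in MACROS or name in active:
            return name  # not a macro, or already mid-expansion: leave as is
        # Re-scan the replacement text with this name marked as active.
        return expand(MACROS[name], active | {name})

    return re.sub(r"[A-Za-z_]\w*", replace, text)


expanded = expand("foo * 2")
print(expanded)  # foo + 1 * 2
```

The inner "foo" survives unexpanded, so the rescan terminates instead
of looping forever.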

: The last S06 sentence above overflowed my mental stack, so I'm unsure 
: whether self-referential expansions are somehow impossible.

That wasn't the intent.  I was merely trying to straighten out what
grammatical category the parser would be looking for after an interpolation.
Basically, an auto-unquoted variable always expects an operator after it,
whereas an ordinary string macro can leave the parser in any state it
likes at the end of the string.  For example, a string macro can
be defined that returns "$x +", in which case a term is expected afterwards.
You can't do that with autounquoted variables--you have to use something
that looks more like a function.

But I admit that I'm handwaving here.
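
To make the distinction a little less handwavy, here is a small sketch
using Python's ast module as a stand-in for the Perl 6 parser (an
assumption for illustration only).  A dangling "x +" is legal as text
to be continued, but it is not a complete expression, so it can never
be a tree on its own; a tree is always a finished term.

```python
import ast


def parses_as_expression(source):
    """Return True if source is a complete expression on its own."""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False


fragment = "x +"                                   # what a string macro might return
alone = parses_as_expression(fragment)             # parser is left wanting a term
completed = parses_as_expression(fragment + " 1")  # following text supplies the term

tree = ast.parse("x", mode="eval").body            # a tree is a complete term
print(alone, completed, type(tree).__name__)
```

After the tree form the only sensible continuation is an operator,
whereas the text form can stop anywhere it likes.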

: >    Hence, while the quasiquote itself is being parsed, the syntactic
: >    interpolation of a variable into the quasiquote always results in
: >    the expectation of an operator following the variable.  (You must
: >    use a call to a submacro if you want to expect something else.)
: >    Of course, the macro definition as a whole can expect whatever it
: >    likes afterwards, according to its syntactic category.  (Generally,
: >    a term expects a following postfix or infix operator, and an operator
: >    expects a following term or prefix operator.)
: 
: Do @arrays of ASTs interpolate/splice?

I would think the right thing to do depends on the type of each element.

: Lisp needs ,@ (comma-at) to do splatty interpolation, that is remove the
: outer pair of parens.  Depending on what the ASTs look like and how they
: splice together, such a form may or may not be necessary.

Insert more handwaving here.  For features like this I'll be relying
heavily on feedback from the lambdacamel combinators.  My main goal
in participating in the design is to make sure Perl 6 remains usable
by mere mortals for most of the things mere mortals might want to do.
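
The two behaviors in question can be sketched with plain Python lists
standing in for ASTs.  The HOLE marker and both function names are
hypothetical, and whether Perl 6 needs the distinction at all is still
open.

```python
HOLE = "HOLE"  # hypothetical placeholder marking the interpolation point


def unquote(template, value):
    """Lisp-style `,`: insert value as a single node."""
    return [value if item == HOLE else item for item in template]


def unquote_splicing(template, values):
    """Lisp-style `,@`: splice the elements of values into the parent."""
    out = []
    for item in template:
        if item == HOLE:
            out.extend(values)   # drops the outer pair of parens
        else:
            out.append(item)
    return out


template = ["say", HOLE]
args = ["$a", "$b"]

nested = unquote(template, args)
spliced = unquote_splicing(template, args)
print(nested)
print(spliced)
```

The first keeps the array as one child node; the second flattens it
into the surrounding argument list.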

: >    In case of name ambiguity, prefix with C<COMPILING::> to indicate a
: >    name in the compiling scope, and anything else (such as C<OUTER::>)
: >    to indicate a name in the macro definition's scope, since that's the
: >    default.  In particular, any variable declared within the quasiquote
: >    block is assumed to scope to the quasiquote; to scope the declaration
: >    to the macro call's scope, you must say
: >
: >     my COMPILING::<$foo> = 123;
: >     env COMPILING::<@bar> = ();
: >     our COMPILING::<%baz>;
: >
: >    or some such if you wish to force the compiler to install the variable
: >    into the symbol table being constructed by the macro call.
: 
: "COMPILING" here means the scope in which the macro is being expanded,
: rather than the scope in which the macro itself is being compiled, is
: that correct?

Yes.
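
As a sketch of the lookup order the quoted text describes (definition
scope first, then the scope of the macro call, else an exception), with
dictionaries standing in for lexical scopes.  This is Python, the
resolve function and MacroBindingError are invented for illustration,
and "COMPILING::$a" is a simplified spelling of the COMPILING::<$a>
form.

```python
class MacroBindingError(Exception):
    """Stand-in for the compile-time exception mentioned in the quoted text."""


def resolve(name, definition_scope, compiling_scope):
    """Return (scope_label, value) for a name used inside a quasiquote."""
    prefix = "COMPILING::"
    if name.startswith(prefix):
        # Explicit request for the scope where the macro is being expanded.
        bare = name[len(prefix):]
        if bare in compiling_scope:
            return ("compiling", compiling_scope[bare])
        raise MacroBindingError(name)
    if name in definition_scope:      # default: the macro definition's scope
        return ("definition", definition_scope[name])
    if name in compiling_scope:       # fallback: the scope of the macro call
        return ("compiling", compiling_scope[name])
    raise MacroBindingError(name)


definition_scope = {"$a": 1}
compiling_scope = {"$a": 2, "$b": 3}

default_a = resolve("$a", definition_scope, compiling_scope)
forced_a = resolve("COMPILING::$a", definition_scope, compiling_scope)
fallback_b = resolve("$b", definition_scope, compiling_scope)
print(default_a, forced_a, fallback_b)
```

A name found in neither scope raises the exception, which corresponds
to the compile-time failure at the macro call.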

: Perhaps a twigil would be clearer?  Such huffmanization is probably
: undeserved and would be seen as encouraging promiscuous lexical 
: intercourse...

I used to have a twigil for it, and dehuffmanized it.  (Used to be the +
twigil, which I've since reused for env vars.)

: What are the variable visibility rules when interpolating in quasiquotes?
: Does a variable unbound in a spliced AST bind to one in the enclosing 
: quasiquote?  

Good question.  I don't profess to know the right answer.  My guess
is that, if the AST was passed in as an argument, it has already
been matched against the COMPILING scope, so anything unbound would
be an error if we don't bind it into the macro body (assuming the
user is in "use strict" land, but if not, the parser could already
have bound an unrecognized variable into the current package, so an
unbound variable could still be an unambiguous indication of desire
to bind to the macro in that case).  But I think we'll just have to
play with it and see what makes the most sense.

: The consequences of this when inserting an AST from a parsed parameter
: need to be considered.  If the enclosing quote's variables are visible
: then an unintended binding may occur.

Yes, depends on when variables are bound to the AST snippet.

: >    [Conjecture: Due to these dwimmy scoping rules, there is no need of
: >    a special "unquote" construct as in Scheme et al.]
: 
: No gensym shenanigans either.  The scoping rules seem to be hygienic,
: no unintended variable leaking.  Unintended variable capture seems unlikely
: too, only if you forget to declare a variable with the macro declaration
: and coincidentally declare the same variable in the macro use scope will
: everything go haywire.

Yes, my intent is to trade a few unlikely "Don't do thats" for an
increase in naturalness.  But again, I'm just doing this by the seat
of my pants.  It's not like I really know all the ins and outs of
what I'm doing.  I'm not smart enough to do an exhaustive search--I'm
more like one of those neural net chess programs that is too organic
to tell you why it made any particular move...

Larry
