Hi,

I've read and reread the macro explanation but I'm still not entirely
clear on a number of things.  The questions and thoughts below are based
on my (mis)understanding.

On 03/02/06 02:05, Larry Wall wrote:
    Macros are functions or operators that are called by the compiler as
    soon as their arguments are parsed (if not sooner).  The syntactic
    effect of a macro declaration or importation is always lexically
    scoped, even if the name of the macro is visible elsewhere.

And presumably they can be lexically unimported, or whatever the verb is
for what "no" does.

    As with
    ordinary operators, macros may be classified by their grammatical
    category.  For a given grammatical category, a default parsing rule or
    set of rules is used, but those rules that have not yet been "used"
    by the time the macro keyword or token is seen can be replaced by
    use of "is parsed" trait.  (This means, for instance, that an infix
    operator can change the parse rules for its right operand but not
    its left operand.)

    In the absence of a signature to the contrary, a macro is called as
    if it were a method on the current match object returned from the
    grammar rule being reduced; that is, all the current parse information
    is available by treating C<self> as if it were a C<$/> object.

Is this a :keepall match object?  Or does the Perl 6 grammar keep
everything by default?  (The remark that "Syntax trees [...] are
reversible" suggests so.)  Or is this one of the "signature to the
contrary" possibilities?

    [Conjecture: alternate representations may be available if arguments
    are declared with particular AST types.]

    Macros may return either a string to be reparsed, or a syntax tree
    that needs no further parsing.  The textual form is handy, but the
    syntax tree form is generally preferred because it allows the parser
    and debugger to give better error messages.  Textual substitution
    on the other hand tends to yield error messages that are opaque to
    the user.  Syntax trees are also better in general because they are
    reversible, so things like syntax highlighters can get back to the
    original language and know which parts of the derived program come
    from which parts of the user's view of the program.

    In aid of returning syntax tree, Perl provides a "quasiquoting"
    mechanism using the keyword "CODE", followed by a block intended to
    represent an AST:

        return CODE { say $a };

I guess the string form is equivalent to C<eval "CODE { $str }">.

If CODE may enclose arbitrary source text of whatever DSL people invent,
alternate braces would probably be useful.  Either q()-like, HERE-doc
or pod's C<< >> nesting style.

    [Conjecture: Other keywords are possible if we have more than one
    AST type.]

Ocaml and camlp4 are probably a good source of ideas for quasiquoting.
I've only perused the documentation; has anyone here actually used Ocaml?
See: http://caml.inria.fr/pub/docs/tutorial-camlp4/tutorial004.html

Rather than misrepresenting Ocaml with my sketchy understanding,
I'll just mention some possibly interesting features:

Specific expander rules from the grammar can be selected with
<:rulename< ... >>.

They have a C -> AST expander.  I can imagine a SQL -> AST expander
would find some use in Perl.  I don't think the same AST type is used
but that's just a guess.

Two of the "p"s in p4 stand for pretty-printer, which is the AST->source
conversion.  In addition to aiding debugging and reformatting, it allows
interconversion between different syntaxes (sp?).  Ocaml comes with two
grammars, one is backwards compatible and the other has jettisoned
the baggage.

    Within a quasiquote, variable and function names resolve first of
    all according to the lexical scope of the macro definition, and if
    unrecognized in that scope, are assumed to be bound from the scope
    of the macro call each time it is called.  If they cannot be bound
    from the scope of the macro call, a compile-time exception is thrown.

    Variables that resolve from the lexical scope of the macro definition
    will be inserted appropriately depending on the type of the variable,
    which may be either a syntax tree or a string.  (Again, syntax tree
    is preferred.)  The case is similar to that of a macro called from
    within the quasiquote, insofar as reparsing only happens with the
    string version of interpolation, except that such a reparse happens
    at macro call time rather than macro definition time, so its result
    cannot change the parser's expectations about what follows the
    interpolated variable.

Is there any cpp-like protection against self-referential expansions
when using the string returning form?

The last S06 sentence above overflowed my mental stack, so I'm unsure
whether self-referential expansions are somehow impossible.
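
For reference, the cpp rule I have in mind can be sketched like this.
It is Python rather than Perl 6, and the MACROS table and expand() are
made up for illustration, not anything S06 specifies: a name that is
already being expanded is simply left alone when it turns up again in
the reparsed text.

```python
import re

# Hypothetical string-returning macros: name -> replacement text.
MACROS = {
    "foo": "foo + 1",      # directly self-referential
    "ping": "pong * 2",    # mutually recursive pair
    "pong": "ping - 3",
}

WORD = re.compile(r"[A-Za-z_]\w*")


def expand(text, active=frozenset()):
    """Expand macro names in text, reparsing each replacement.

    A name that is currently being expanded (it is in `active`) is
    left untouched, which is roughly how cpp avoids infinite recursion.
    """
    def replace(match):
        name = match.group(0)
        if name in MACROS and name not in active:
            return expand(MACROS[name], active | {name})
        return name

    return WORD.sub(replace, text)
```

With this guard, expand("foo") stops after one step instead of looping
forever, and the ping/pong pair stops as soon as "ping" reappears.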

    Hence, while the quasiquote itself is being parsed, the syntactic
    interpolation of a variable into the quasiquote always results in
    the expectation of an operator following the variable.  (You must
    use a call to a submacro if you want to expect something else.)
    Of course, the macro definition as a whole can expect whatever it
    likes afterwards, according to its syntactic category.  (Generally,
    a term expects a following postfix or infix operator, and an operator
    expects a following term or prefix operator.)

Do @arrays of ASTs interpolate/splice?

Lisp needs ,@ (comma-at) to do splatty interpolation, that is, to remove
the outer pair of parens.  Depending on what the ASTs look like and how
they splice together, such a form may or may not be necessary.
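
To show the difference I mean, here is a toy sketch in Python with
nested lists standing in for ASTs.  Unquote, Splice and fill() are names
I invented for the example; real Perl 6 ASTs may look nothing like this.

```python
class Unquote:
    """Placeholder that inserts a bound value as a single node (Lisp's ,)."""
    def __init__(self, name):
        self.name = name


class Splice:
    """Placeholder that inserts the elements of a bound list (Lisp's ,@)."""
    def __init__(self, name):
        self.name = name


def fill(template, bindings):
    """Return a copy of the nested-list template with placeholders filled."""
    out = []
    for node in template:
        if isinstance(node, Splice):
            out.extend(bindings[node.name])       # outer list is dropped
        elif isinstance(node, Unquote):
            out.append(bindings[node.name])       # outer list is kept
        elif isinstance(node, list):
            out.append(fill(node, bindings))      # recurse into subtrees
        else:
            out.append(node)
    return out


args = ["a", "b"]
nested = fill(["call", "f", Unquote("args")], {"args": args})
flat = fill(["call", "f", Splice("args")], {"args": args})
```

Here nested keeps the argument list as one child node, while flat has
"a" and "b" as direct children of the call.  Whether an @array of ASTs
should behave like the first or the second is the question.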

    In case of name ambiguity, prefix with C<COMPILING::> to indicate a
    name in the compiling scope, and anything else (such as C<OUTER::>)
    to indicate a name in the macro definition's scope, since that's the
    default.  In particular, any variable declared within the quasiquote
    block is assumed to scope to the quasiquote; to scope the declaration
    to the macro call's scope, you must say

        my COMPILING::<$foo> = 123;
        env COMPILING::<@bar> = ();
        our COMPILING::<%baz>;

    or some such if you wish to force the compiler to install the variable
    into the symbol table being constructed by the macro call.

"COMPILING" here means the scope in which the macro is being expanded, rather
than the scope in which the macro itself is being compiled, is that correct?
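
If so, my reading of the lookup order is roughly the following.  This is
a Python sketch with plain dicts standing in for lexical scopes;
resolve() and MacroScopeError are hypothetical names, and the real rules
surely have more cases than this.

```python
class MacroScopeError(Exception):
    """Stands in for the compile-time exception S06 mentions."""


def resolve(name, definition_scope, call_scope):
    """Return (where, value) for a name used inside a quasiquote."""
    prefix = "COMPILING::"
    if name.startswith(prefix):
        # Explicitly forced into the scope where the macro is expanded.
        key = name[len(prefix):]
        if key in call_scope:
            return ("call", call_scope[key])
        raise MacroScopeError(f"{key} not found in the macro call scope")
    if name in definition_scope:
        # Default: the macro definition's lexical scope wins.
        return ("definition", definition_scope[name])
    if name in call_scope:
        # Unrecognized there, so bind from the macro call's scope.
        return ("call", call_scope[name])
    raise MacroScopeError(f"{name} cannot be bound in either scope")
```

So with $a declared in both places, a bare $a would come from the macro
definition and COMPILING::$a from the place where the macro is called.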

Perhaps a twigil would be clearer?  Such huffmanization is probably
undeserved and would be seen as encouraging promiscuous lexical intercourse...


What are the variable visibility rules when interpolating in quasiquotes?
Does a variable that is unbound in a spliced AST bind to one in the
enclosing quasiquote?  The consequences of this when inserting an AST
from a parsed parameter need to be considered.  If the enclosing
quasiquote's variables are visible, an unintended binding may occur.


    [Conjecture: Due to these dwimmy scoping rules, there is no need of
    a special "unquote" construct as in Scheme et al.]

No gensym shenanigans either.  The scoping rules seem to be hygienic:
no unintended variable leaking.  Unintended variable capture seems
unlikely too; everything goes haywire only if you forget to declare a
variable within the macro definition and coincidentally declare the same
variable in the scope where the macro is used.
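
For comparison, this is the kind of gensym dance I mean, sketched in
Python with string templates and exec().  All the names here (gensym,
swap_unhygienic, swap_gensym, run) are invented for the example; it only
shows the capture problem that textual macros have and that the S06
scoping rules appear to avoid.

```python
import itertools

_counter = itertools.count()


def gensym(prefix="tmp"):
    """Return a fresh name that user code is unlikely to contain."""
    return f"{prefix}__{next(_counter)}"


def swap_unhygienic(a, b):
    # The template's own 'tmp' can collide with a user variable 'tmp'.
    return f"tmp = {a}; {a} = {b}; {b} = tmp"


def swap_gensym(a, b):
    # Same template, but the temporary gets a fresh generated name.
    t = gensym()
    return f"{t} = {a}; {a} = {b}; {b} = {t}"


def run(code, env):
    """Execute expanded code against a copy of env and return the copy."""
    env = dict(env)
    exec(code, {}, env)
    return env


start = {"tmp": 1, "y": 2}
broken = run(swap_unhygienic("tmp", "y"), start)
fixed = run(swap_gensym("tmp", "y"), start)
```

In broken the user's tmp is captured by the template and both variables
end up with the same value; in fixed the two values are swapped.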


Brad

--
That one's own district is unsophisticated and unpolished is a great
treasure. Imitating another style is simply a sham. -- Hagakure http://bereft.net/hagakure/
