Re: [Templates] PROPOSAL: Add macro operator ->(){}

Andy Wardley Fri, 11 Apr 2008 04:12:06 -0700

Paul Seamons wrote:
 > MACRO's are useful, but they cannot be passed to sort, map or grep, or to any
 > other function.


Hi Paul,

Yes they are, and no they can't. :-(

 > I'm proposing grabbing the "pointy-sub" syntax from Perl 6 for creating
 > anonymous macros, and allowing for return values.

Yes, it's an interesting idea. I've been keeping a close eye on Perl 6 to see
what we can beg, borrow or steal from them. I guess you've been doing a
similar thing. I think it would be good to keep TT3's syntax and semantics as
close as possible to those of Perl 6 (without being too close). However, we
also need to bear in mind that TT is probably moving away from a close
association with Perl (e.g. the Python port) rather than moving toward it.

The pointy subs are certainly something I've been considering for TT3, but I'm
not sure I've settled on a particular Way To Do It. What follows are my
current thoughts on the matter, but it's still open for discussion.

I particularly like it for creating inline anonymous subs, typically for
sort, map, etc. I wanted to make the simple cases as simple as possible,
so I started off thinking along these lines:

    things.map( item -> item.name )   # get the name of each item

Here we create an anonymous sub which takes a single 'item' parameter and
returns 'item.name'.  When fed into the map(), it returns the .name of each
item in the list of things.

Using 'this' as the default argument name is fine, although 'item' might
be more in keeping with other TT use:

    things.map( -> this.name )   # either
    things.map( -> item.name )   # or

For sort(), it could be used to sort by a particular field of the items in a
list. This is currently implemented in TT2 like this:

    things.sort('name')

But it would be better to support a macro/sub which gives you more
flexibility:

    things.sort( item -> item.name.lower )  # sort by lower case name

(it would be nice to simply accept a dotted item and automagically promote it
to the above, e.g. things.sort(.name) but I'll save that for another post).

The two-argument form would look something like this.  Here we want to
provide a subroutine that compares each pair of items.

    things.sort( (a, b) -> a.name.lower cmp b.name.lower )

<sidenote>
Perl doesn't give you any way of knowing how many arguments a
subroutine is expecting so the sort() method, on being passed a subroutine
reference, has no way of knowing it should call:

either   sort { $sub->($a) cmp $sub->($b) }    # one argument
     or   sort { $sub->($a, $b) }               # two arguments

However, I think we could fudge that by implementing macros/subs as
objects which are blessed references to the subroutine (so that the
object still is a sub ref and can be called directly if necessary)
and have some inside-out metadata defining the parameter list.  There's
also the possibility of using syntactic analysis to grok if it's a one
or two argument subroutine.  Long story short - it might be tricky and
I've got some ideas about how to make it Just Work[tm], but it's probably
better to use named parameters in this particular case anyway... more on
that below.
</sidenote>
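
As a rough illustration of the metadata idea in the sidenote, here is a
minimal sketch in Python rather than Perl or TT.  The language choice and
the names positional_arity and smart_sort are assumptions made for this
example only.  Python callables carry their parameter list with them, so a
sort() can inspect the arity and decide whether it was handed a
one-argument key extractor or a two-argument comparator:

```python
import inspect
from functools import cmp_to_key


def positional_arity(func):
    """Count the positional parameters that a callable declares."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    params = inspect.signature(func).parameters.values()
    return sum(1 for p in params if p.kind in positional_kinds)


def smart_sort(items, func):
    """Hypothetical sort(): dispatch on the arity of the callable."""
    if positional_arity(func) == 1:
        # One argument: func extracts the value to compare.
        return sorted(items, key=func)
    # Two arguments: func compares a pair and returns <0, 0 or >0.
    return sorted(items, key=cmp_to_key(func))


names = ["pear", "Apple", "fig"]
by_key = smart_sort(names, lambda a: a.lower())
by_cmp = smart_sort(
    names,
    lambda a, b: (a.lower() > b.lower()) - (a.lower() < b.lower()),
)
```

Both calls give the same case-insensitive ordering.  The Perl equivalent
would need the blessed-sub-plus-metadata trick described above, because a
plain code ref does not expose its parameter list.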

 >  The syntax looks like this:
 >
 >     [% foo = ->{ ... } %]
 >     [% foo = ->(max){ ... } %]

In these cases, I don't particularly like the syntax.  As I understand it,
Perl 6 treats '->' as a simple replacement for the word 'sub'.

     foo = sub { ... }
     foo = sub(max) { ... }

Personally, I think the word 'sub' (or something similar) makes more sense
at first glance.  But hey, if '->' were just an alias for 'sub' then we could
keep everyone happy.

But I prefer to use the -> to separate the arguments from the body.  Maybe
I've been spending too much time at Lambda the Ultimate, but I find it more
intuitive to have the arrow mean "the stuff on the left is transmogrified
to the stuff on the right":

     a -> a + 1              # one argument
     (a) -> a + 1            # one arg in explicit parens
     (a, b) -> a + b         # two arguments

That covers anonymous subs, but we could also use it for named subs like so:

    inc(a) -> a + 1
    add(a, b) -> a + b

The only downside to putting the arrow after the name/args is that it becomes
harder to parse.  You don't know that you're defining a subroutine until you
reach the '->' which means you have to defer full analysis of the LHS.  It's
much easier to have a 'sub' keyword or special symbol up front, but I laugh
in the face of "easy to implement" and blow my nose in the general direction
of LL parsers.  (yes python, I'm looking at you)

Anyway, having figured out how to structure the parser to make all these
things possible (well, I think so... still working on it), it occurred to
me that the '->' was probably superfluous and could be served equally well
by '=' (except in a few edge cases perhaps).

So I think it should be possible to define a macro like this:

   inc(a) = a + 1
   add(a,b) = a + b

Putting a parameter list on the final element of the LHS should be
enough for us to recognise that we're defining a macro/sub.  More on
the RHS to follow below...

You could use '->' if you prefer to be explicit, or for those simple
cases when you're creating an anonymous sub with a single argument that
you can't be bothered to put in parens, e.g.

   things.sort( a -> a.name )

So '->' would effectively be a special kind of '=' for creating subroutines.
It would be right-associative (like '=') and have the appropriate precedence
so that you could write this:

   inc = a -> a + 1     # "inc is something that transforms a to a + 1"

as a more explicit form of:

   inc(a) -> a + 1

which the parser will treat the same as:

   inc(a) = a + 1

That might be a little more intuitive when it comes to passing named
parameters.  For example, the sort() method could take named parameters
'map' and/or 'reduce' to handle the two different cases.  The 'map'
extracts the comparison value from each item and the 'reduce' performs
the comparison.  You could provide the named parameter subs like so:

   things.sort(
       map(a)       -> a.name.lower;    # use lower case name
       reduce(a, b) -> b cmp a;         # reverse alpha sort
   )

The above is syntactic sugar for the following (or is the following sugar
for the above?)

   things.sort(
       map    = (a) -> a.name.lower;    # use lower case name
       reduce = (a, b) -> b cmp a;      # reverse alpha sort
   )

Or, as per TT2, you can use '=>' instead of '='

   things.sort(
       map    => (a) -> a.name.lower;    # use lower case name
       reduce => (a, b) -> b cmp a;      # reverse alpha sort
   )

In pseudo-TT-Perl, they all end up as something like this:

   things.sort({
      map    => sub { my $a = shift; $a.name.lower },
      reduce => sub { my ($a, $b) = @_; $b cmp $a },
   });
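
To make the division of labour between 'map' and 'reduce' concrete, here is
a minimal sketch in Python (not TT, and not a proposed implementation).  The
function name sort_with and the sample data are assumptions for this
example.  The parameter names follow the post, so the 'map' argument shadows
Python's builtin of the same name inside the function:

```python
from functools import cmp_to_key


def sort_with(items, map=None, reduce=None):
    """Hypothetical sort() taking optional 'map' and 'reduce' callables."""
    # 'map' extracts the comparison value from each item.
    extract = map or (lambda item: item)
    # 'reduce' compares two extracted values, like Perl's cmp or <=>.
    compare = reduce or (lambda a, b: (a > b) - (a < b))
    return sorted(
        items,
        key=cmp_to_key(lambda x, y: compare(extract(x), extract(y))),
    )


things = [{"name": "banana"}, {"name": "Cherry"}, {"name": "apple"}]
result = sort_with(
    things,
    map=lambda a: a["name"].lower(),          # use lower case name
    reduce=lambda a, b: (b > a) - (b < a),    # reverse alpha sort
)
sorted_names = [thing["name"] for thing in result]
```

Either parameter can be left out: with only 'map' you get an ordinary sort
on the extracted value, and with only 'reduce' the items themselves are
compared.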

So in summary, I prefer to think of '->' as a sub-making syntactic sugary-
coated alias for '=', rather than an alias for 'sub'. I think there's a nice
symmetry between '=', '=>' and '->' when that is the case.

Now, to the right hand side.  Just like '=', the '->' will expect a
single expression.  e.g.

   inc(a) -> a + 1

The '->' and/or parens on the LHS will tell the parser/compiler that the
expression on the RHS is lazy. In implementation terms, that means creating a
macro/subroutine, but we'll just call it being lazy for now.

The expression on the RHS can be anything you like as long as it's a single
expression.  Some examples:

   bold(text)   = "<b>$text</b>"
   odd(number)  = number % 2
   even(number) = ! (number % 2)

You could also use it to define complex data structures:

   # lazy list generator
   multiples(n) = [n, n * 2, n * 3, n * 4, ..., n * 12]

   # easy way of doing the above
   multiples(n) = [n * i for i in 1 to 12]

   # lazy hash generator
   user(name, email) = {
      name  = name
      email = email
      id    = name.lower.replace(/\W/, '_')
   }

The implication of allowing any expression on the RHS is that you can't
use { } to delimit code blocks because it looks just like a hash. But then
again, perhaps that's a Good Thing.  I think the block...end approach is
more in keeping with TT's language.  For example, to define a regular
template block you could do this:

   wibble(x, y, z) = block;
      ...do something...
   end

As well as things like this:

   wibble(x, y, z) =
     if x + y < z;
       x;
     else;
       y;
     end

In this case, the entire if..else..end construct is treated as a single
expression, but that's kinda how it works anyway.  In fact, there's a
subtle but important shift in the TT3 parser towards a more homogeneous
language.  That's a fancy way of saying you can put directive keywords
anywhere, not just at the start of a "sentence".  For example, in TT2,
you can do this:

   a = INCLUDE b

but that's really just a filthy, dirty hack.  It's a special case that
breaks things like [% a = b IF c %] and doesn't work consistently in
other places.  For example, you can't write:

   foo(INCLUDE bar)                # doesn't work in TT2
   foo = {                         # nor this
     bar = INCLUDE baz
   }

The shift away from the table-based parser of TT2 means that we can make
things work more like a "real" language.  That means that you will be
able to treat 'include' more like a function that returns some value
(in this case, the result of processing a template).

   foo(include bar)
   foo = {
     bar = include baz
   }

On the other hand, I'm still keen to keep simple things simple (or should
I say BASIC?), so the 'include' keyword will still expect to be followed by
a template name (quoting optional) and a list of zero or more parameters.

   include header title='Hello World'

So to make that work in all places, some ( ) scoping might be required:

   foo = {
     bar = (include bar x=10)
     baz = (include baz y=20)
   }

Now the nice thing is that once you've made the parser treat ( ) as a simple
code grouping construct, you can have it recognise multiple expressions
inside.  e.g.

   foo = (include header; 'Hello World'; include footer)

So in effect, ( ) become a means of creating a sub-template block for
simple cases.  That'll work in macros too.

   message(m) = (include header; "Hello $m"; include footer)

This all depends on the parser being smart enough to recognise when a
set of parens contain a single item which is returned:

    a = (b + c) * d        # (b + c) yields a single value

Or a list of values which are concatenated together in template style:

    a = (b; c; d)          # sequence of values

Note that it becomes quite acceptable to put semi-colons inside parens.  In
this case, they're synonymous with commas or spaces:

    a = (b; c; d)    # these all do the same thing
    a = (b, c, d)
    a = (b c d)
    a = "$b$c$d"

But commas and semi-colons aren't always the same thing of course:

    a = (include foo a=10, b=20; include bar c=30, d=40)

Here the 'include' keywords gobble up lists of params (a and b in the first
case, c and d in the second), so we use a ';' to mark the end of the first
parameter list and the start of the next expression.
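
The single-value versus sequence behaviour of parens described above can be
modelled in a few lines.  This is only a sketch in Python under the
assumption that a group with one expression returns its value unchanged and
a group with several expressions concatenates their string forms.  The
helper name group is hypothetical:

```python
def group(*values):
    """Hypothetical model of a parenthesised group of expressions."""
    if len(values) == 1:
        # a = (b + c) * d : a single expression yields its own value.
        return values[0]
    # a = (b; c; d) : several expressions are concatenated template-style.
    return "".join(str(value) for value in values)


b, c, d = 2, 3, 4
single = group(b + c) * d     # arithmetic on the single value
sequence = group(b, c, d)     # same text as "$b$c$d" in the post
```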

 > The RETURN directive and return item, list, and hash vmethods can be
 > used to return more interesting values from a MACRO.

Yes, we need to make RETURN/return be able to return a value.

   wibble(x, y, z) = block;
      ... do something...
      return z;
   end;

I'm not sure I understand what the purpose is of the .return vmethods
in your examples?

 > The Schwartzian transform is now possible in Template::Alloy (somebody
 > somewhere is rolling over in their grave).
 >
 >   [%- qw(Z a b D y M)
 >         .map(->{ [this.lc, this].return })
 >         .sort(->(a,b){a.0 cmp b.0})
 >         .map(->{this.1})
 >         .join %]          => a b D M y Z

You get this week's Mad Badger award.  :-)
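
For anyone who hasn't met the Schwartzian transform, the three steps in the
quoted example are decorate, sort and undecorate.  A minimal sketch of the
same pipeline in Python (used here purely for illustration, with
hypothetical variable names):

```python
words = "Z a b D y M".split()

# Decorate: pair each word with its lower-case sort key.
decorated = [(word.lower(), word) for word in words]

# Sort: compare on the precomputed key only.
decorated.sort(key=lambda pair: pair[0])

# Undecorate: throw the keys away and keep the original words.
result = " ".join(original for _, original in decorated)
```

Here result is "a b D M y Z", matching the output in the quoted example.
The point of the transform is that the key is computed once per item rather
than once per comparison.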

One final thing relating to macros: a syntax for gobbling positional
arguments and named parameters.  I'm currently favouring this:

   example(foo, *bar, %baz) = ...

foo is a regular parameter, *bar gobbles all other positional arguments,
%baz receives all other named parameters.

So calling it as:

   example('Hello', 'World', 'Badger', x = 10)

is the same as:

   example('Hello', bar=['World', 'Badger'], baz = { x = 10 })

If you don't specify them explicitly then the default names for additional
arguments/params will be *args and %opts.  e.g.

   foo() = (
      'args:' args.join(',')
      '  '
      'opts:' opts.join('=>', ',')
   )

   foo(10, 20, a=30, b=40)  # args:10,20  opts:a=>30,b=>40
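
Python's own gobbling syntax is a close analogue and may help as a reference
point.  The sketch below is an illustration only, not part of the proposal:
*bar collects the remaining positional arguments and **baz collects the
remaining named parameters (Python spells the latter with '**' rather than
'%'):

```python
def example(foo, *bar, **baz):
    """foo is a regular parameter; bar and baz gobble the rest."""
    return foo, list(bar), baz


def foo(*args, **opts):
    """Report the gobbled arguments using the default names from the post."""
    args_part = "args:" + ",".join(str(arg) for arg in args)
    opts_part = "opts:" + ",".join(f"{key}=>{value}" for key, value in opts.items())
    return args_part + "  " + opts_part


gobbled = example("Hello", "World", "Badger", x=10)
# gobbled == ("Hello", ["World", "Badger"], {"x": 10})

report = foo(10, 20, a=30, b=40)
# report == "args:10,20  opts:a=>30,b=>40"
```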

Phew, I think I'm done for now.  Comments welcome as always.

A

