Re: Selective String Interpolation

2006-02-17 Thread Damian Conway

Brad Bowman asked:


When building code strings in Perl 5 I usually write the code,
then wrap it in double quotes, then "\" escape everything light blue
under syntax highlighting.  I was wondering if there'll be a better
way in Perl 6.

I thought it would be nice to define the variables you wish to
interpolate individually, perhaps as extensions to the :s, :a,
etc quote adverbs, perhaps using a signature object.


There is already a mechanism for this. You simply turn off all variable 
interpolation, and interpolate any variables you wish to interpolate via 
block interpolations. Or, more simply, only turn on block interpolation in a 
non-interpolating string:


my $code = q:c{
package {$package_name};

sub {$sub_name} \{
   return {$return_val}
\}
};
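
For readers who want to experiment with the idea outside Perl 6, the behaviour of q:c can be roughly modelled in Python: sigils such as $ and @ are left alone, only {...} blocks are interpolated, and \{ and \} stand for literal braces. This is a minimal sketch under stated assumptions. The helper name interpolate_blocks and its dictionary argument are hypothetical, and it only handles blocks of the form {$name}, not arbitrary expressions.

```python
import re

# Hypothetical helper modelling q:c: only {$name} blocks are interpolated;
# everything else, including bare $ and @ sigils, is left untouched.
_TOKEN = re.compile(r"\\([{}])|\{\$(\w+)\}")

def interpolate_blocks(template, variables):
    def replace(match):
        if match.group(1) is not None:
            return match.group(1)              # \{ or \} becomes a literal brace
        return str(variables[match.group(2)])  # {$name} becomes its value
    return _TOKEN.sub(replace, template)

template = r"""package {$package_name};

sub {$sub_name} \{
   return {$return_val}
\}
"""

# The values below are made up for illustration.
code = interpolate_blocks(template, {
    "package_name": "My::Package",
    "sub_name": "answer",
    "return_val": 42,
})
print(code)
```

Anything that is not a {$name} block or an escaped brace passes through unchanged, which is the property that removes the need to backslash every sigil.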


Damian





Re: CODE {...} mentioning variables without interpolation

2006-02-17 Thread Larry Wall
On Sat, Feb 18, 2006 at 01:57:18AM +0200, Brad Bowman wrote:
: Hi again,
: 
: L
: 
: Is it possible to refer to a variable in a CODE quotation without
: splicing it in as an AST or string?  I can't see how this would
: be possible under S06, unless using OUTER:: is intended to be 
: a non-splicing variable mention.  
: 
: The sample snippet in S06 seems simple but got me confused.
: I'll explain my interpretation so the confusion can be removed
: from the rules or my understanding, wherever it's found to be.
: The snippet:
: 
:  return CODE { say $a };
: 
: The snippet will probably be found inside a macro and will be run
: during the macro's expansion elsewhere.  When it is run,
: an AST for "say $a" is produced that searches for $a in
: the lexical scope containing the CODE block; otherwise
: the macro call scope is searched, or a compile-time error is emitted.
: 
: $a is spliced into the say as either a string or AST, not
: as a runtime use of $a.  If the snippet was:
: 
: $a = '$a';
: return CODE { say $a };
: 
: Then we'd (eventually) get a non-splicing mention of $a, one that
: would refer to the $a in scope at the macro call, I think.
: Is this correct?  

No.  If bare $a is not found in the CODE's scope, it must *bind* to
an existing $a in the macro caller's scope as a runtime use of $a,
or the macro fails.  If the calling code wants to supply arguments
to the macro body, they must come in as ordinary arguments, or use
some modifier that chases up the dynamic stack, such as one of

CALLER::<$a>
ENV::<$a>
COMPILING::<$a>
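
The distinction between splicing a value in and binding to a variable can be illustrated with a loose Python analogy. It is not Perl 6 macro semantics, only a sketch of the two behaviours: a spliced value is copied into the generated code when that code is built, while a bound name is looked up when the code runs. All names here are hypothetical.

```python
# Python analogy only: "splicing" copies a value into generated code,
# while "binding" leaves a reference that is resolved when the code runs.

def build():
    a = "old"

    # Splice: the current value of a is pasted into the code text.
    spliced_source = "result = " + repr(a)

    # Bind: the closure refers to the variable a itself.
    def bound():
        return a

    a = "new"  # change the variable after both forms were created
    return spliced_source, bound

spliced_source, bound = build()

namespace = {}
exec(spliced_source, namespace)      # runs the text: result = 'old'
spliced_value = namespace["result"]
bound_value = bound()

print(spliced_value)  # old
print(bound_value)    # new
```

The spliced form keeps the value it saw at construction time, whereas the bound form sees the later assignment, which is the "runtime use of $a" behaviour described above.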

: Perhaps signatures on CODE forms can be used to specify the variables
: which are to be spliced, and their scope of origin.  I'm posting
: some hypothetical syntax because the post made even less sense without it.
: It's obviously in need of refinement:
: 
:  # non-splicing $a from this scope
:  return CODE () { say $a }; 
: 
:  # non-splicing $a in the scope of the macro call
:  return CODE () { say COMPILING::<$a> };
: 
:  # ast-splicing with dwiminess
:  return CODE ($a) { say $a };
: 
:  # ast-splicing requiring a lexical $a here
:  return CODE (OUTER::<$a>) { say $a };
: 
: Traits could be used in the signatures instead of the pseudo packages.
: Sugar to taste.
: 
: This would probably be overloading the meaning of signatures since
: there's no explicit application of the code object to a set of runtime
: arguments.  idea--

Seems to be trying to duplicate the function of the macro's signature,
which is already presumably declaring parameters with various type
signatures.

It's possible that the interpretation of a macro's $a could depend on
the declared type of the variable it is eventually bound to, but we
can't readily extend that idea to dynamically scoped values or run-time types.

: It seems like I'm currently obsessed with signatures as silver bullets.
: Is there any hope for my peculiar quest?

Dunno.  But then I don't know if there's any hope for my particular
quest either.   :-)

: A minor related query, can the CODE { ... } form appear outside
: of macro returns?  Can we put the AST in a variable, pass it between
: subroutines and do the usual runtime things with it?

I don't see why not.  The behavior is defined in terms of the current
lexical scope, so it's not required that that particular lexical
scope be supplied by a macro.

: This sig seems particularly apt here,

Signatures are overrated.  :-)

: Brad
: 
: -- 
: When one is not capable of true intelligence, it is good to consult with
: someone of good sense. -- Hagakure http://bereft.net/hagakure/

It's not entirely clear to me that we should trust the advice of someone
who was prevented from committing seppuku only by edict of Tokugawa.  :-)

But the scary part about that quote is that it seems to be saying that
if you have true intelligence you don't need good sense.

Larry


CODE {...} mentioning variables without interpolation

2006-02-17 Thread Brad Bowman

Hi again,

L

Is it possible to refer to a variable in a CODE quotation without
splicing it in as an AST or string?  I can't see how this would
be possible under S06, unless using OUTER:: is intended to be 
a non-splicing variable mention.  


The sample snippet in S06 seems simple but got me confused.
I'll explain my interpretation so the confusion can be removed
from the rules or my understanding, wherever it's found to be.
The snippet:

 return CODE { say $a };

The snippet will probably be found inside a macro and will be run
during the macro's expansion elsewhere.  When it is run,
an AST for "say $a" is produced that searches for $a in
the lexical scope containing the CODE block; otherwise
the macro call scope is searched, or a compile-time error is emitted.

$a is spliced into the say as either a string or AST, not
as a runtime use of $a.  If the snippet was:

$a = '$a';
return CODE { say $a };

Then we'd (eventually) get a non-splicing mention of $a, one that
would refer to the $a in scope at the macro call, I think.
Is this correct?  



Perhaps signatures on CODE forms can be used to specify the variables
which are to be spliced, and their scope of origin.  I'm posting
some hypothetical syntax because the post made even less sense without it.
It's obviously in need of refinement:

 # non-splicing $a from this scope
 return CODE () { say $a }; 


 # non-splicing $a in the scope of the macro call
 return CODE () { say COMPILING::<$a> };

 # ast-splicing with dwiminess
 return CODE ($a) { say $a };

 # ast-splicing requiring a lexical $a here
 return CODE (OUTER::<$a>) { say $a };

Traits could be used in the signatures instead of the pseudo packages.
Sugar to taste.

This would probably be overloading the meaning of signatures since
there's no explicit application of the code object to a set of runtime
arguments.  idea--

It seems like I'm currently obsessed with signatures as silver bullets.
Is there any hope for my peculiar quest?


A minor related query, can the CODE { ... } form appear outside
of macro returns?  Can we put the AST in a variable, pass it between
subroutines and do the usual runtime things with it?


This sig seems particularly apt here,

Brad

--
When one is not capable of true intelligence, it is good to consult with
someone of good sense. -- Hagakure http://bereft.net/hagakure/


Selective String Interpolation

2006-02-17 Thread Brad Bowman

Hello,

When building code strings in Perl 5 I usually write the code,
then wrap it in double quotes, then "\" escape everything light blue
under syntax highlighting.  I was wondering if there'll be a better
way in Perl 6.  


I thought it would be nice to define the variables you wish to
interpolate individually, perhaps as extensions to the :s, :a,
etc quote adverbs, perhaps using a signature object.  


Since user-defined quotes are possible it shouldn't be too hard
to add in any case, so I might just leave off the waffling there.

One more waffle:

Closure interpolation seems largely incompatible with "" strings.
Interpolation restricted by variable doesn't help this anyway
so perhaps there's a more general solution covering all cases.
(or just q:c(0))

Another possibility is that code string support may be less 
important given the macro and grammar tools available.
On the other hand, the code is not always being generated for 
immediate consumption and is not always Perl.


That probably counts as at least two waffles,

Brad

...Maybe CODE's dwimmy binding could be abused: (CODE { ... }).perl

--
The occurrence of mysteries is always by word of mouth.   -- Hagakure


Re: some newbie questions about synopsis 5

2006-02-17 Thread Larry Wall
On Fri, Feb 17, 2006 at 08:32:18AM -0600, Patrick R. Michaud wrote:
: > The synopsis says:
: > 
: > * If a subrule appears two (or more) times in the same lexical scope
: >   (i.e. twice within the same subpattern and alternation), or if the
: >   subrule is quantified anywhere within the entire rule, then its
: >   corresponding hash entry is always assigned a reference to an array
: >   of Match objects, rather than a single Match object.
: > 
: > Maybe you're not the right person to ask, but is there a particular
: > reason for the "entire rule" bit?
: > 
: > / (|None)  () /
: > 
: > Here we get three Matches $0 (possibly undefined), $, and
: > $1. At least, I think so.
: > 
: > / (?)  () /
: > 
: > Now, we suddenly get three more or less unrelated arrays with lengths
: > 0..1, 1, and 1. Of course, I admit this example is a bit artificial.
: 
: Oh, I hadn't caught that particular clause (or hadn't read it as
: you just did).  PGE certainly doesn't implement things that way.
: I think the "entire rule" clause was intended to cover cases like
: 
: / [  ]* /
: 
: where  is indirectly quantified and therefore is an array of
: match objects.  We should probably reword it, or get a clarification
: of what is intended.  (Damian, @Larry:  can you confirm or clarify
: this for us?)

I believe that was the intent, but I'll defer to Damian on the wordsmithing
because I'm a bit out of sorts at the moment and it'd probably come out
all sideways.

Larry


Re: some newbie questions about synopsis 5

2006-02-17 Thread Patrick R. Michaud
On Fri, Feb 17, 2006 at 02:33:12PM +0100, H. Stelling wrote:
> Patrick R. Michaud wrote:
> >>In the following,
> >>
> >>/ (a) [ (b) (c) | $5 := (d) $0 := (e) ] (f) /
> >>
> >>does the first alias have any effect on where the f's will go
> >>(probably not)?
> >
> >I'll defer to @Larry on this one, but my initial impression is
> >that the (f) capture would go into $6.
> 
> I think that sequences should behave exactly as single branch
> alternations (only that there is no such thing, although we
> can write "[foo|]"). So I would rather opt for $1.

The current implementation is that a capturing subpattern
is indexed based on the largest index in all of the alternation
branches.  I'm not sure it makes sense to base it on aliases of 
the last alternation branch.  

Here are some examples we can chew on:

/ (a) [ (b) (c) | (d) ] (f) / # (f) is $3 or $2?  (currently $3)

/ (a) [ (b) (c) | $1 := (d) ] (f) /   # (f) is $3 or $2?

Since the second example is essentially saying the same as the first,
the (f) capture ought to go to the same place in each case.  If we
say that the existence of the $1 causes the (f) to go into $2, it
also becomes the case that $2 is an array of match objects, which
isn't technically problematic but it might be a bit surprising for
many.

Some other examples to consider:

/ (a) [ (b) (c) | $0 := (d) ] (f) /   # (f) is $3 or $1?  

/ (a) [ (b) (c) | $0 := (d) (3) ] (f) /   # (f) is $3 or $2? 

At any rate, I find that having a subpattern capture base its
index on the highest index of all of the previous alternation
branches is easy to understand and works well in practice.  It can
also be easily changed with another alias if needed.
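
The numbering rule described here can be sketched as a small Python model. It is an illustration of the rule as stated in this message, not PGE's actual code; the tuple-based pattern representation and the function name number_captures are assumptions made for the example.

```python
# Hypothetical model of the numbering rule described above; not PGE code.
# A pattern is a list of items:
#   ("cap", label)            a capturing subpattern such as (a)
#   ("cap", label, alias)     an aliased one such as $5 := (d)
#   ("alt", [branch, ...])    an alternation; each branch is a pattern

def number_captures(pattern, start=0, table=None):
    """Return (index for the next capture after this pattern, table)."""
    if table is None:
        table = {}
    index = start          # index the next unaliased capture receives
    highest = start - 1    # largest index handed out so far
    for item in pattern:
        if item[0] == "cap":
            if len(item) == 3:
                index = item[2]        # an alias resets the counter
            table[item[1]] = index
            highest = max(highest, index)
            index += 1
        elif item[0] == "alt":
            # Every branch starts at the same index; numbering resumes
            # after the largest index used in any branch.
            index = max(number_captures(branch, index, table)[0]
                        for branch in item[1])
            highest = max(highest, index - 1)
        else:
            raise ValueError("unknown item: %r" % (item,))
    return highest + 1, table

# / (a) [ (b) (c) | (d) ] (f) /
plain = [("cap", "a"),
         ("alt", [[("cap", "b"), ("cap", "c")], [("cap", "d")]]),
         ("cap", "f")]

# / (a) [ (b) (c) | $5 := (d) $0 := (e) ] (f) /
aliased = [("cap", "a"),
           ("alt", [[("cap", "b"), ("cap", "c")],
                    [("cap", "d", 5), ("cap", "e", 0)]]),
           ("cap", "f")]

plain_table = number_captures(plain)[1]
aliased_table = number_captures(aliased)[1]
print(plain_table["f"], aliased_table["f"])
```

Under this model (f) lands in $3 for the unaliased pattern and in $6 for the aliased one, which agrees with the "(currently $3)" note and with the earlier impression that the capture would go into $6.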

> But wouldn't it be nice if the same rules applied to aliases and
> subrule invocations, that is, recursion put aside, to think of
> 
> / <foo> /
> 
> simply as a shorter way to say
> 
> / $<foo> := ([definition of foo]:) /?

First, is that colon following "[definition of foo]" intentional or
a typo?  Currently we can backtrack into subrules -- there's no "cut"
assumed after them.

But secondly, I'm not sure we can casually toss recursion
aside when thinking about this, since it's really a driving force 
behind having named subrules.  :-)  There's also a difference in
that subrules can take arguments, as in , or can come
from another grammar, as in , which seems to argue that 
 is really something other than an alias shorthand.

> The synopsis says:
> 
> * If a subrule appears two (or more) times in the same lexical scope
>   (i.e. twice within the same subpattern and alternation), or if the
>   subrule is quantified anywhere within the entire rule, then its
>   corresponding hash entry is always assigned a reference to an array
>   of Match objects, rather than a single Match object.
> 
> Maybe you're not the right person to ask, but is there a particular
> reason for the "entire rule" bit?
> 
> / (|None)  () /
> 
> Here we get three Matches $0 (possibly undefined), $, and
> $1. At least, I think so.
> 
> / (?)  () /
> 
> Now, we suddenly get three more or less unrelated arrays with lengths
> 0..1, 1, and 1. Of course, I admit this example is a bit artificial.

Oh, I hadn't caught that particular clause (or hadn't read it as
you just did).  PGE certainly doesn't implement things that way.
I think the "entire rule" clause was intended to cover cases like

/ [  ]* /

where  is indirectly quantified and therefore is an array of
match objects.  We should probably reword it, or get a clarification
of what is intended.  (Damian, @Larry:  can you confirm or clarify
this for us?)

> Furthermore, I think "within the same subpattern and alternation" is
> not quite correct, at least it wouldn't apply to something like
> 
> / ( [  | ... ]) /
>
> unless we consider the (...) sequence as a kind of single branch
> alternation. And why are alternation branches considered to be
> lexical scopes, anyway? 

In the example you give, $0 is indeed an array of match objects.
The "same alternation" in this case is the subpattern... compare to

   / ( [  | ... ]) |  /

$0 is an array, $ is a single match object.

Alternation branches don't create new lexical scopes, they just
affect quantification and subpattern numbering.  In both of the 
following examples

/ abc  def  /

/ ghi  | jkl  /

each  has the same lexical scope ($), but in the "abc"
example $ is an array of match objects, while in the "ghi"
example $ is a single match object.
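
The difference between the two examples can be modelled in a few lines of Python. This is only a sketch of the rule as explained here, with a subrule given the hypothetical name foo; it ignores quantifiers, and the pattern representation and function names are assumptions made for the example.

```python
# Hypothetical model, not PGE code. A pattern is a list of items:
#   ("lit", text)             literal text
#   ("rule", name)            a subrule call
#   ("alt", [branch, ...])    an alternation; each branch is a pattern

def max_occurrences(pattern, name):
    """Largest number of times `name` can match along one path."""
    total = 0
    for item in pattern:
        if item[0] == "rule" and item[1] == name:
            total += 1
        elif item[0] == "alt":
            # Only one branch matches, so take the busiest branch.
            total += max(max_occurrences(branch, name) for branch in item[1])
    return total

def capture_shape(pattern, name):
    """'array' if the subrule can match more than once, else 'single'."""
    return "array" if max_occurrences(pattern, name) > 1 else "single"

# Same subrule twice in one sequence, as in the "abc" example.
sequence = [("lit", "abc"), ("rule", "foo"),
            ("lit", "def"), ("rule", "foo")]

# Same subrule once in each branch, as in the "ghi" example.
alternation = [("alt", [[("lit", "ghi"), ("rule", "foo")],
                        [("lit", "jkl"), ("rule", "foo")]])]

sequence_shape = capture_shape(sequence, "foo")
alternation_shape = capture_shape(alternation, "foo")
print(sequence_shape, alternation_shape)
```

Because only one alternation branch can match, the two calls in the "ghi" pattern never coexist in a single match, so the model reports a single match object there and an array for the "abc" pattern.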

> My second question is why adding a "?" or "??" to an unquantified
> subrule which would otherwise result in a single Match object should
> result in an array, rather than a single (possibly undefined) Match.

The specification was originally this way but was later changed
to the current definition.  I think people found the idea of
"?" producing a single match object confusing, so for consistency
we ended up with "all quantifiers produce arrays of match objects".

(Note also that even if "?" produced a single Match obj

Re: some newbie questions about synopsis 5

2006-02-17 Thread H. Stelling

Patrick R. Michaud wrote:


In the following,

/ (a) [ (b) (c) | $5 := (d) $0 := (e) ] (f) /

does the first alias have any effect on where the f's will go
(probably not)?
   



I'll defer to @Larry on this one, but my initial impression is
that the (f) capture would go into $6.


I think that sequences should behave exactly as single branch
alternations (only that there is no such thing, although we
can write "[foo|]"). So I would rather opt for $1.


- Which rules do apply to repeated captures with the same alias? For
example,
the second array aliasing example

m:w/ Mr?s? @ :=  W\. @ := 
  | Mr?s? @ := 
  /;

seems to suggest that by using $, the lower branch would have
resulted in a single Match object instead of an array (like the array we
would have gotten if we hadn't used the aliases in the first place). Is
this right? 
   



Yes, that's correct.


But wouldn't it be nice if the same rules applied to aliases and
subrule invocations, that is, recursion put aside, to think of

/ <foo> /

simply as a shorter way to say

/ $<foo> := ([definition of foo]:) /?

And I've got two more somewhat related questions:

The synopsis says:

* If a subrule appears two (or more) times in the same lexical scope
  (i.e. twice within the same subpattern and alternation), or if the
  subrule is quantified anywhere within the entire rule, then its
  corresponding hash entry is always assigned a reference to an array
  of Match objects, rather than a single Match object.

Maybe you're not the right person to ask, but is there a particular
reason for the "entire rule" bit?

/ (|None)  () /

Here we get three Matches $0 (possibly undefined), $, and
$1. At least, I think so.

/ (?)  () /

Now, we suddenly get three more or less unrelated arrays with lengths
0..1, 1, and 1. Of course, I admit this example is a bit artificial.

Furthermore, I think "within the same subpattern and alternation" is
not quite correct, at least it wouldn't apply to something like

/ ( [  | ... ]) /

unless we consider the (...) sequence as a kind of single branch
alternation. And why are alternation branches considered to be
lexical scopes, anyway? Just because of subpattern numbering?

My second question is why adding a "?" or "??" to an unquantified
subrule which would otherwise result in a single Match object should
result in an array, rather than a single (possibly undefined) Match.
That is, why doesn't "?" rather behave like "[|]"?
This would save us the trouble of creating all these tiny arrays, or
of having to write "[...|]" all the time. Or maybe one could
define one's own quantifiers?