On Mon, Oct 27, 2008 at 04:07:39PM -0500, [EMAIL PROTECTED] wrote:
> > On Sun, Oct 26, 2008 at 10:45 PM, Chris Dolan <[EMAIL PROTECTED]> wrote:
> >> S05 always uses single curlies for closures, but throughout Parrot, code
> >> seems to use double curlies in PGE regexps. Why is that?
> >>
> >> That is, why this:
> >> m/ foo {{ say "found foo" }} /
> >> and not this:
> >> m/ foo { say "found foo" } /
> >>
> >> The latter complains about "Statement not terminated properly".
> >>
> > this is old PGE syntax that hasn't yet been updated to match S05. it's a
> > bug.

Sometimes working with Parrot is a lesson in archaeology -- by
digging deep enough you see artifacts of a different age.

In order to properly parse curlies in regexes one has to have
a full-fledged (Perl 6 / Python / PHP / whatever) parser available
to be able to find the closing curly. Long ago, before we had
any such parsers, people wanted to be able to embed executable
code inside of regexes -- indeed, the parsers themselves make use
of this capability. So, PGE used the double-curly notation as
a stop-gap -- it can easily "scan ahead" to the closing double-curly
to find the end of the embedded code and squirrel it away for
execution.
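
As a rough illustration of the difference (a Python sketch only, not
PGE's actual code; the function names are invented for the example),
a plain substring search is enough to find the end of a double-curly
block, while the same trick on single curlies stops at the first "}"
it sees, even one inside a string literal:

```python
def find_double_curly_block(src, start=0):
    """Return (code, end_index) for the first {{ ... }} block, or None.

    No knowledge of the embedded language is needed: we only look
    for the literal closing delimiter.
    """
    open_at = src.find("{{", start)
    if open_at == -1:
        return None
    close_at = src.find("}}", open_at + 2)
    if close_at == -1:
        raise ValueError("unterminated {{ ... }} block")
    return src[open_at + 2:close_at].strip(), close_at + 2


def naive_single_curly(src):
    """Same scan-ahead trick with single curlies (hypothetical, and wrong).

    Without a real parser for the embedded language we cannot tell a
    closing curly from one that appears inside a string or nested block.
    """
    open_at = src.find("{")
    close_at = src.find("}", open_at + 1)
    return src[open_at + 1:close_at].strip()


# Double curlies: the "}" inside the string literal is skipped correctly.
print(find_double_curly_block('foo {{ say "}" }}'))

# Single curlies: the scan stops too early, inside the string literal.
print(naive_single_curly('foo { say "}" }'))
```

The sketch would itself be fooled by a literal "}}" inside the embedded
code, which is part of why the notation is only a stop-gap.
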
This was all done before S05 defined the :lang modifier for
embedded closures in regexes, and at the moment it only works
for embedding PIR.

Returning to the present day, the next step will be to enable the
:lang modifier in regexes so that it can attach to any compiler.
However, since Perl 6 regular expression parsing is about to
receive a huge makeover anyway, I was waiting until then to
work out those details.

> Thanks for the info. I'll try to learn enough to write a PGE patch to
> support {PIR} notation.

Note that {PIR} is not valid by itself; it would need to be
    :lang('PIR') {PIR}
or something like that. See S05 for the latest syntax. The default
interpretation of curlies with no :lang modifier will undoubtedly
be Perl 6 syntax (although we may make it default to "whatever
language the regex is embedded in" if we can do that).

Pm