Author: larry
Date: Fri Feb 24 16:04:13 2006
New Revision: 7860
Modified:
doc/trunk/design/syn/S05.pod
Log:
* Added $() access to "result" object.
* Added <( pat )> matcher to capture simple result object.
* Changed old <(...)> assertion to <?{...}> and <!{...}>,
which is more consistent with other callouts to code.
Modified: doc/trunk/design/syn/S05.pod
==============================================================================
--- doc/trunk/design/syn/S05.pod (original)
+++ doc/trunk/design/syn/S05.pod Fri Feb 24 16:04:13 2006
@@ -13,9 +13,9 @@
Maintainer: Patrick Michaud <[EMAIL PROTECTED]>
Date: 24 Jun 2002
- Last Modified: 24 Feb 2006
+ Last Modified: 25 Feb 2006
Number: 5
- Version: 10
+ Version: 11
This document summarizes Apocalypse 5, which is about the new regex
syntax. We now try to call them "rules" because they haven't been
@@ -613,24 +613,40 @@
=item *
-A leading C<(> indicates a code assertion:
+A leading C<?{> or C<!{> indicates a code assertion:
- / (\d**{1..3}) <( $0 < 256 )> /
+ / (\d**{1..3}) <?{ $0 < 256 }> /
+ / (\d**{1..3}) <!{ $0 < 256 }> /
Similar to:
/ (\d**{1..3}) { $0 < 256 or fail } /
+ / (\d**{1..3}) { $0 < 256 and fail } /
Unlike closures, code assertions are not guaranteed to be run at the
canonical time if the optimizer can prove something later can't match.
So you can sneak in a call to a non-canonical closure that way:
- /^foo .* <( do { say "Got here!" } or 1 )> .* bar$/
+ /^foo .* <?{ do { say "Got here!" } or 1 }> .* bar$/
The C<do> block is unlikely to run unless the string ends with "C<bar>".
=item *
+A leading C<(> indicates the start of a result capture:
+
+ / foo <( \d+ )> bar /
+
+is equivalent to:
+
+ / <after foo> \d+ <before bar> /
+
+except that the scan for "foo" can be done in the forward direction,
+whereas a lookbehind assertion would scan for \d+ and then match "foo"
+backwards.
+
+=item *
+
A leading C<[> or C<+> indicates an enumerated character class. Ranges
in enumerated character classes are indicated with C<..>.
@@ -1041,14 +1057,19 @@
=item *
-A match always returns a "match object", which is also available as
-(lexical) C<$/> (except within a closure lexically embedded in a rule,
-where C<$/> always refers to the current match, not any submatch done
-within the closure).
+A match always returns a "match object", which is also available
+as C<$/>, an environmental lexical declared in the outer
+subroutine that is calling the rule. (A closure lexically embedded
+in a rule does not redeclare C<$/>, so C<$/> always refers to the
+current match, not any prior submatch done within the closure.)
=item *
-The match object evaluates differently in different contexts:
+Notionally, a match object contains (among other things) a boolean
+success value, a scalar "result object", an array of ordered submatch
+objects, and a hash of named submatch objects. To provide convenient
+access to these various values, the match object evaluates differently
+in different contexts:
=over
@@ -1083,7 +1104,7 @@
=item *
-When used as a closure, a Match object evaluates to its underlying
+When called as a closure, a Match object evaluates to its underlying
result object. Usually this is just the entire match string, but
you can override that by calling C<return> inside a rule:
@@ -1093,6 +1114,18 @@
# match succeeds -- ignore the rest of the rule
}.();
+C<$()> is a shorthand for C<$/.()> or C<$/()>. The result object
+may contain any object, not just a string.
+
+You may also capture a subset of the match as the result object using
+the C<< <(...)> >> construct:
+
+ "foo123bar" ~~ / foo <( \d+ )> bar /
+ say $(); # says 123
+
+In this case the result object is always a string when doing string
+matching, and a list of one or more elements when doing array matching.
+
=item *
When used as an array, a Match object pretends to be an array of all
@@ -1175,9 +1208,19 @@
incomplete C<Match> object (which can be modified via the internal C<$/>).
For example:
- $str ~~ / foo # Match 'foo'
+ $str ~~ / foo # Match 'foo'
{ $/ = 'bar' } # But pretend we matched 'bar'
/;
+ say $/; # says 'bar'
+
+This is slightly dangerous, insofar as you might return something that
+does not behave like a C<Match> object to some context that requires
+one. Fortunately, you normally just want to return a result object instead:
+
+ $str ~~ / foo # Match 'foo'
+ { return 'bar' } # But pretend we matched 'bar'
+ /;
+ say $(); # says 'bar'
=back
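
For readers who know Perl 5 style regexes better than rules, the result capture added in this revision can be approximated outside Perl 6. The sketch below is only an analogy written with Python's standard C<re> module, not the S05 semantics themselves: the helper name C<result_capture> and the sample strings are assumptions made for illustration. It shows that matching "foo", the digits, and "bar" in the forward direction and keeping only the digits yields the same text as the lookbehind/lookahead form.

```python
import re

text = "foo123bar"

# Lookaround form: "foo" and "bar" are asserted but not consumed,
# so the overall match is just the digits.
lookaround = re.search(r"(?<=foo)\d+(?=bar)", text)

# Forward-scanning form: consume "foo", the digits, and "bar",
# then keep only the captured digits as the "result".
def result_capture(s):
    """Hypothetical helper: return the digits between foo and bar, or None."""
    m = re.search(r"foo(\d+)bar", s)
    return m.group(1) if m else None

print(lookaround.group())
print(result_capture(text))
```

Both calls print the digits between "foo" and "bar"; the second form never has to match "foo" backwards, which is the advantage the patch text attributes to the result capture.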