Author: larry
Date: Fri Mar 9 11:23:09 2007
New Revision: 14323
Modified:
doc/trunk/design/syn/S05.pod
Log:
Add :b/:basechar modifier as suggested by ruoso++.
Modified: doc/trunk/design/syn/S05.pod
==============================================================================
--- doc/trunk/design/syn/S05.pod (original)
+++ doc/trunk/design/syn/S05.pod Fri Mar 9 11:23:09 2007
@@ -14,9 +14,9 @@
Maintainer: Patrick Michaud <[EMAIL PROTECTED]> and
Larry Wall <[EMAIL PROTECTED]>
Date: 24 Jun 2002
- Last Modified: 28 Feb 2007
+ Last Modified: 9 Mar 2007
Number: 5
- Version: 53
+ Version: 54
This document summarizes Apocalypse 5, which is about the new regex
syntax. We now try to call them I<regex> rather than "regular
@@ -126,10 +126,26 @@
The single-character modifiers also have longer versions:
:i :ignorecase
+ :b :basechar
:g :global
=item *
+The C<:i> (or C<:ignorecase>) modifier causes case distinctions to be
+ignored in its lexical scope, but not in its dynamic scope. That is,
+subrules always use their own case settings.
+
+=item *
+
+The C<:b> (or C<:basechar>) modifier scopes exactly like C<:ignorecase>
+except that it ignores accents instead of case. It is equivalent
+to taking each grapheme (in both target and pattern), converting
+both to NFD (maximally decomposed) and then comparing the two base
+characters (Unicode non-mark characters) while ignoring any trailing
+mark characters.
+
+=item *
+
The C<:c> (or C<:continue>) modifier causes the pattern to continue
scanning from the string's current C<.pos>:
@@ -630,8 +646,9 @@
As with a scalar variable, each element is matched as a literal
unless it happens to be a C<Regex> object, in which case it is matched
as a subrule. As with scalar subrules, a tainted subrule always fails.
-All string values pay attention to the current C<:ignorecase> setting,
-while C<Regex> values use their own C<:ignorecase> settings.
+All string values pay attention to the current C<:ignorecase>
+and C<:basechar> settings, while C<Regex> values use their own
+C<:ignorecase> and C<:basechar> settings.
When you get tired of writing:
@@ -733,7 +750,8 @@
=back
All hash keys, and values that are strings, pay attention to the
-C<:ignorecase> setting. (Subrules maintain their own case settings.)
+C<:ignorecase> and C<:basechar> settings. (Subrules maintain their
+own case and basechar settings.)
You may combine multiple hashes under the same longest-token
consideration by using declarative alternation:
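
The C<:basechar> comparison described in the diff above (decompose to NFD, then compare base characters while ignoring trailing marks) can be sketched outside of Perl 6 as follows. This is only an illustrative approximation in Python, not how any Perl 6 implementation does it: the helper names base_chars and basechar_equal are hypothetical, "mark character" is assumed to mean Unicode general category M*, and the whole string is stripped at once rather than grapheme by grapheme, which should give the same answer for plain literal comparisons.

```python
import unicodedata


def base_chars(text: str) -> str:
    """Return text in NFD with all mark characters removed.

    Assumption: "mark character" means Unicode general category
    Mn, Mc, or Me (every category starting with "M").
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


def basechar_equal(target: str, pattern: str) -> bool:
    """Hypothetical helper: compare two literals the way :basechar
    is described, ignoring accents but still respecting case."""
    return base_chars(target) == base_chars(pattern)


if __name__ == "__main__":
    print(basechar_equal("caf\u00e9", "cafe"))   # precomposed e-acute vs plain e
    print(basechar_equal("cafe\u0301", "cafe"))  # e + combining acute vs plain e
    print(basechar_equal("Cafe", "cafe"))        # case is not ignored here
```

Note that case is left untouched in this sketch, matching the description of C<:basechar> as a setting separate from C<:ignorecase>; combining the two would additionally require case folding.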