pcre pattern.syntax.xml

Aidan Lister Mon, 06 Dec 2004 19:29:24 -0800

aidan           Mon Dec  6 22:29:17 2004 EDT


  Modified files:              
    /phpdoc/en/reference/pcre   pattern.syntax.xml 
  Log:
  whitespace fixes

http://cvs.php.net/diff.php/phpdoc/en/reference/pcre/pattern.syntax.xml?r1=1.4&r2=1.5&ty=u
Index: phpdoc/en/reference/pcre/pattern.syntax.xml
diff -u phpdoc/en/reference/pcre/pattern.syntax.xml:1.4 phpdoc/en/reference/pcre/pattern.syntax.xml:1.5
--- phpdoc/en/reference/pcre/pattern.syntax.xml:1.4     Wed Aug 11 16:15:29 2004
+++ phpdoc/en/reference/pcre/pattern.syntax.xml Mon Dec  6 22:29:16 2004
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.4 $ -->
+<!-- $Revision: 1.5 $ -->
 <!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
   <refentry id="reference.pcre.pattern.syntax">
    <refnamediv>
@@ -38,109 +38,105 @@
      </listitem>
      <listitem>
       <simpara>
-     PCRE does not allow repeat quantifiers on lookahead
-     assertions. Perl permits them, but they do not mean what you
-     might think. For example, (?!a){3} does not assert that  the
-     next three characters are not "a". It just asserts that the
-     next character is not "a" three times.
+       PCRE does not allow repeat quantifiers on lookahead
+       assertions. Perl permits them, but they do not mean what you
+       might think. For example, (?!a){3} does not assert that  the
+       next three characters are not "a". It just asserts that the
+       next character is not "a" three times.
       </simpara>
      </listitem>
      <listitem>
       <simpara>
-     Capturing subpatterns that occur inside negative
-     lookahead assertions are counted, but their entries in the
-     offsets vector are never set. Perl sets its numerical
-     variables from any such patterns that are matched before the
-     assertion fails to match something (thereby succeeding), but
-     only  if  the negative lookahead assertion contains just one
-     branch.
+       Capturing subpatterns that occur inside negative
+       lookahead assertions are counted, but their entries in the
+       offsets vector are never set. Perl sets its numerical
+       variables from any such patterns that are matched before the
+       assertion fails to match something (thereby succeeding), but
+       only  if  the negative lookahead assertion contains just one
+       branch.
       </simpara>
      </listitem>
      <listitem>
       <simpara>
-     Though binary zero characters are supported in  the  subject  string,
-     they are not allowed in a pattern string because it is passed as a
-     normal C string, terminated  by zero. The escape sequence "\\x00" can
-     be used in the pattern to represent a binary zero.
+       Though binary zero characters are supported in  the  subject  string,
+       they are not allowed in a pattern string because it is passed as a
+       normal C string, terminated  by zero. The escape sequence "\\x00" can
+       be used in the pattern to represent a binary zero.
       </simpara>
       </listitem>
       <listitem>
       <simpara>
-     The following Perl escape sequences  are  not  supported:
-     \l,  \u,  \L,  \U,  \E, \Q. In fact these are implemented by
-     Perl's general string-handling and are not part of its
-     pattern matching engine.
+       The following Perl escape sequences  are  not  supported:
+       \l,  \u,  \L,  \U,  \E, \Q. In fact these are implemented by
+       Perl's general string-handling and are not part of its
+       pattern matching engine.
       </simpara>
       </listitem>
       <listitem>
       <simpara>
-     The Perl \G assertion is  not  supported  as  it  is  not
-     relevant to single pattern matches.
+       The Perl \G assertion is  not  supported  as  it  is  not
+       relevant to single pattern matches.
       </simpara>
       </listitem>
       <listitem>
       <simpara>
-     Fairly obviously, PCRE does not support the (?{code})
-     construction.
+       Fairly obviously, PCRE does not support the (?{code})
+       construction.
       </simpara>
       </listitem>
       <listitem>
       <simpara>
-     There are at the time of writing some  oddities  in  Perl
-     5.005_02  concerned  with  the  settings of captured strings
-     when part of a pattern is repeated.  For  example,  matching
-     "aba"  against the pattern /^(a(b)?)+$/ sets $2 to the value
-     "b", but matching "aabbaa" against /^(aa(bb)?)+$/ leaves  $2
-     unset.    However,    if   the   pattern   is   changed   to
-     /^(aa(b(b))?)+$/ then $2 (and $3) get set.
-     In Perl 5.004 $2 is set in both cases, and that is also &true;
-     of PCRE. If in the future Perl changes to a consistent state
-     that is different, PCRE may change to follow.
+       There are at the time of writing some  oddities  in  Perl
+       5.005_02  concerned  with  the  settings of captured strings
+       when part of a pattern is repeated.  For  example,  matching
+       "aba"  against the pattern /^(a(b)?)+$/ sets $2 to the value
+       "b", but matching "aabbaa" against /^(aa(bb)?)+$/ leaves  $2
+       unset.    However,    if   the   pattern   is   changed   to
+       /^(aa(b(b))?)+$/ then $2 (and $3) get set.
+       In Perl 5.004 $2 is set in both cases, and that is also &true;
+       of PCRE. If in the future Perl changes to a consistent state
+       that is different, PCRE may change to follow.
       </simpara>
       </listitem>
       <listitem>
       <simpara>
-     Another as yet unresolved discrepancy  is  that  in  Perl
-     5.005_02  the  pattern /^(a)?(?(1)a|b)+$/ matches the string
-     "a", whereas in PCRE it does not.  However, in both Perl and
-     PCRE /^(a)?a/ matched against "a" leaves $1 unset.
+       Another as yet unresolved discrepancy  is  that  in  Perl
+       5.005_02  the  pattern /^(a)?(?(1)a|b)+$/ matches the string
+       "a", whereas in PCRE it does not.  However, in both Perl and
+       PCRE /^(a)?a/ matched against "a" leaves $1 unset.
       </simpara>
       </listitem>
       <listitem>
       <para>
-     PCRE  provides  some  extensions  to  the  Perl  regular
-     expression facilities:
+       PCRE  provides  some  extensions  to  the  Perl  regular
+       expression facilities:
         <orderedlist>
          <listitem>
           <simpara>
-     Although lookbehind assertions must match  fixed  length
-     strings,  each  alternative branch of a lookbehind assertion
-     can match a different length of string. Perl 5.005  requires
-     them all to have the same length.
+           Although lookbehind assertions must match  fixed  length
+           strings,  each  alternative branch of a lookbehind assertion
+           can match a different length of string. Perl 5.005  requires
+           them all to have the same length.
          </simpara>
         </listitem>
         <listitem>
          <simpara>
-          If <link
-           linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
-          is set and <link
-           linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> is
+          If <link linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+          is set and <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> is
           not set, the $ meta-character matches only at the very end of the
           string.
          </simpara>
         </listitem>
         <listitem>
          <simpara>
-          If <link
-           linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link> is
+          If <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link> is
           set, a backslash followed by a letter with no special meaning is
           faulted.
          </simpara>
         </listitem>
         <listitem>
          <simpara>
-          If <link
-           linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link> is
+          If <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link> is
           set, the greediness of the repetition  quantifiers  is inverted,
           that is, by default they are not greedy, but if followed by a
           question mark they are.
@@ -155,307 +151,202 @@
 
    <refsect1 id="regexp.reference">
     <title>Regular Expression Details</title>
-     <refsect2 id="regexp.introduction">
-      <title>Introduction</title>
-      <para>
-     The syntax and semantics of  the  regular  expressions
-     supported  by PCRE are described below. Regular expressions are
-     also described in the Perl documentation and in a number  of
-     other  books,  some  of which have copious examples. Jeffrey
-     Friedl's  "Mastering  Regular  Expressions",  published   by
-     O'Reilly  (ISBN 1-56592-257-3), covers them in great detail.
-     The description here is intended as reference documentation.
-    </para>
-    <para>
-     A regular expression is a pattern that is matched against  a
-     subject string from left to right. Most characters stand for
-     themselves in a pattern, and match the corresponding
-     characters in the subject. As a trivial example, the pattern
-       <literal>The quick brown fox</literal>
-     matches a portion of a subject string that is  identical  to
-     itself.  
-    </para>
+    <refsect2 id="regexp.introduction">
+     <title>Introduction</title>
+     <para>
+      The syntax and semantics of  the  regular  expressions
+      supported  by PCRE are described below. Regular expressions are
+      also described in the Perl documentation and in a number  of
+      other  books,  some  of which have copious examples. Jeffrey
+      Friedl's  "Mastering  Regular  Expressions",  published   by
+      O'Reilly  (ISBN 1-56592-257-3), covers them in great detail.
+      The description here is intended as reference documentation.
+     </para>
+     <para>
+      A regular expression is a pattern that is matched against  a
+      subject string from left to right. Most characters stand for
+      themselves in a pattern, and match the corresponding
+      characters in the subject. As a trivial example, the pattern
+      <literal>The quick brown fox</literal>
+      matches a portion of a subject string that is  identical  to
+      itself.  
+     </para>
     </refsect2>
     <refsect2 id="regexp.reference.meta">
      <title>Meta-characters</title>
      <para>     
-     The  power  of  regular  expressions comes from the
-     ability to include alternatives and repetitions in the
-     pattern.  These  are encoded in the pattern by the use of 
-     <emphasis>meta-characters</emphasis>, which do not stand for  themselves  but  instead
-     are interpreted in some special way.
-    </para>
-    <para>
-     There are two different sets of meta-characters: those  that
-     are  recognized anywhere in the pattern except within square
-     brackets, and those that are recognized in square brackets.
-     Outside square brackets, the meta-characters are as follows:
+      The  power  of  regular  expressions comes from the
+      ability to include alternatives and repetitions in the
+      pattern.  These  are encoded in the pattern by the use of 
+      <emphasis>meta-characters</emphasis>, which do not stand for  themselves  but  instead
+      are interpreted in some special way.
+     </para>
+     <para>
+      There are two different sets of meta-characters: those  that
+      are  recognized anywhere in the pattern except within square
+      brackets, and those that are recognized in square brackets.
+      Outside square brackets, the meta-characters are as follows:
       <variablelist>
        <varlistentry>
         <term><emphasis>\</emphasis></term>
-        <listitem>
-         <simpara>
-          general escape character with several uses
-         </simpara>
-        </listitem>
+        <listitem><simpara>general escape character with several uses</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>^</emphasis></term>
-           <listitem>
-         <simpara>
-          assert start of subject (or line, in multiline mode)
-         </simpara>
-        </listitem>
+        <listitem><simpara>assert start of subject (or line, in multiline mode)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>$</emphasis></term>
-           <listitem>
-         <simpara>
-          assert end of subject (or line, in multiline mode)
-         </simpara>
-        </listitem>
+        <listitem><simpara>assert end of subject (or line, in multiline mode)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>.</emphasis></term>
-           <listitem>
-         <simpara>
-          match any character except newline (by default)
-         </simpara>
-        </listitem>
+        <listitem><simpara>match any character except newline (by default)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>[</emphasis></term>
-           <listitem>
-         <simpara>
-           start character class definition
-         </simpara>
-        </listitem>
+        <listitem><simpara>start character class definition</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>]</emphasis></term>
-           <listitem>
-         <simpara>
-          end character class definition
-         </simpara>
-        </listitem>
+        <listitem><simpara>end character class definition</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>|</emphasis></term>
-           <listitem>
-         <simpara>
-           start of alternative branch
-         </simpara>
-        </listitem>
+        <listitem><simpara>start of alternative branch</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>(</emphasis></term>
-           <listitem>
-         <simpara>
-           start subpattern
-         </simpara>
-        </listitem>
+        <listitem><simpara>start subpattern</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>)</emphasis></term>
-           <listitem>
-         <simpara>
-          end subpattern
-         </simpara>
-        </listitem>
+        <listitem><simpara>end subpattern</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>?</emphasis></term>
-           <listitem>
-         <simpara>
-          extends the meaning of (, also 0 or 1 quantifier, also quantifier minimizer
-         </simpara>
-        </listitem>
+        <listitem><simpara>extends the meaning of (, also 0 or 1 quantifier, also quantifier minimizer</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>*</emphasis></term>
-           <listitem>
-         <simpara>
-          0 or more quantifier
-         </simpara>
-        </listitem>
+        <listitem><simpara>0 or more quantifier</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>+</emphasis></term>
-           <listitem>
-         <simpara>
-          1 or more quantifier
-         </simpara>
-        </listitem>
+        <listitem><simpara>1 or more quantifier</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>{</emphasis></term>
-           <listitem>
-         <simpara>
-          start min/max quantifier
-         </simpara>
-        </listitem>
+        <listitem><simpara>start min/max quantifier</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>}</emphasis></term>
-           <listitem>
-         <simpara>
-          end min/max quantifier
-         </simpara>
-        </listitem>
+        <listitem><simpara>end min/max quantifier</simpara></listitem>
        </varlistentry>
       </variablelist>
 
-     Part of a pattern that is in square  brackets is called a
-     "character  class". In a character class the only
-     meta-characters are:
+      Part of a pattern that is in square brackets is called a
+      "character class". In a character class the only
+      meta-characters are:
+
       <variablelist>
        <varlistentry>
         <term><emphasis>\</emphasis></term>
-           <listitem>
-         <simpara>
-          general escape character
-         </simpara>
-        </listitem>
+        <listitem><simpara>general escape character</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>^</emphasis></term>
-           <listitem>
-         <simpara>
-          negate the class, but only if the first character
-         </simpara>
-        </listitem>
+        <listitem><simpara>negate the class, but only if the first character</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>-</emphasis></term>
-           <listitem>
-         <simpara>
-          indicates character range
-         </simpara>
-        </listitem>
+        <listitem><simpara>indicates character range</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>]</emphasis></term>
-           <listitem>
-         <simpara>
-          terminates the character class
-         </simpara>
-        </listitem>
+        <listitem><simpara>terminates the character class</simpara></listitem>
        </varlistentry>
       </variablelist>
-     The following sections describe the use of each of the
-     meta-characters.
-    </para>
+
+      The following sections describe the use of each of the
+      meta-characters.
+     </para>
     </refsect2>
-   <refsect2 id="regexp.reference.backslash">
-    <title>backslash</title>
+
+    <refsect2 id="regexp.reference.backslash">
+     <title>backslash</title>
+     <para>
+      The backslash character has several uses. Firstly, if it  is
+      followed by a non-alphanumeric character, it takes away any
+      special  meaning that character may have. This use of
+      backslash as an escape character applies both inside and
+      outside character classes.
+     </para>
+     <para>
+      For example, if you want to match a "*" character, you write
+      "\*" in the pattern. This applies whether or not the
+      following character would otherwise be interpreted as a
+      meta-character, so it is always safe to precede a non-alphanumeric
+      with "\" to specify that it stands for itself.  In
+      particular, if you want to match a backslash, you write "\\".
+     </para>
+     <para>
+      If a pattern is compiled with the
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> option,
+      whitespace in the pattern (other than in a character class) and
+      characters between a "#" outside a character class and the next newline
+      character are ignored. An escaping backslash can be used to include a
+      whitespace or "#" character as part of the pattern.
+     </para>
+     <para>
+      A second use of backslash provides a way of encoding
+      non-printing characters in patterns in a visible manner. There
+      is no restriction on the appearance of non-printing  characters,
+      apart from the binary zero that terminates a pattern,
+      but when a pattern is being prepared by text editing, it is
+      usually  easier to use one of the following escape sequences
+      than the binary character it represents:
+     </para>
      <para>
-     The backslash character has several uses. Firstly, if it  is
-     followed by a non-alphanumeric character, it takes away any
-     special  meaning that character may have. This use of
-     backslash as an escape character applies both inside and
-     outside character classes.
-    </para>
-    <para>
-     For example, if you want to match a "*" character, you write
-     "\*" in the pattern. This applies whether or not the
-     following character would otherwise be interpreted as a
-     meta-character, so it is always safe to precede a non-alphanumeric
-     with "\" to specify that it stands for itself.  In
-     particular, if you want to match a backslash, you write "\\".
-    </para>
-    <para>
-     If a pattern is compiled with the <link
-      linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link> option,
-     whitespace in the pattern (other than in a character class) and
-     characters between a "#" outside a character class and the next newline
-     character are ignored. An escaping backslash can be used to include a
-     whitespace or "#" character as part of the pattern.
-    </para>
-    <para>
-     A second use of backslash provides a way of encoding
-     non-printing characters in patterns in a visible manner. There
-     is no restriction on the appearance of non-printing  characters,
-     apart from the binary zero that terminates a pattern,
-     but when a pattern is being prepared by text editing, it is
-     usually  easier to use one of the following escape sequences
-     than the binary character it represents:
-    </para>
-    <para>
       <variablelist>
        <varlistentry>
         <term><emphasis>\a</emphasis></term>
-           <listitem>
-         <simpara>
-          alarm, that is, the BEL character (hex 07)
-         </simpara>
-        </listitem>
+        <listitem><simpara>alarm, that is, the BEL character (hex 07)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\cx</emphasis></term>
-           <listitem>
-         <simpara>
-           "control-x", where x is any character
-         </simpara>
-        </listitem>
+        <listitem><simpara>"control-x", where x is any character</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\e</emphasis></term>
-           <listitem>
-         <simpara>
-          escape (hex 1B)
-         </simpara>
-        </listitem>
+        <listitem><simpara>escape (hex 1B)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\f</emphasis></term>
-           <listitem>
-         <simpara>
-          formfeed (hex 0C)
-         </simpara>
-        </listitem>
+        <listitem><simpara>formfeed (hex 0C)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\n</emphasis></term>
-           <listitem>
-         <simpara>
-          newline (hex 0A)
-         </simpara>
-        </listitem>
+        <listitem><simpara>newline (hex 0A)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\r</emphasis></term>
-           <listitem>
-         <simpara>
-          carriage return (hex 0D)
-         </simpara>
-        </listitem>
+        <listitem><simpara>carriage return (hex 0D)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\t</emphasis></term>
-           <listitem>
-         <simpara>
-          tab (hex 09)
-         </simpara>
-        </listitem>
+        <listitem><simpara>tab (hex 09)</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\xhh</emphasis></term>
-           <listitem>
-         <simpara>
-           character with hex code hh
-         </simpara>
-        </listitem>
+        <listitem><simpara>character with hex code hh</simpara></listitem>
        </varlistentry>
        <varlistentry>
         <term><emphasis>\ddd</emphasis></term>
-           <listitem>
-         <simpara>
-          character with octal code ddd, or backreference
-         </simpara>
-        </listitem>
+        <listitem><simpara>character with octal code ddd, or backreference</simpara></listitem>
        </varlistentry>
       </variablelist>
-    </para>
+     </para>
     <para>
      The precise effect of "<literal>\cx</literal>" is as follows: 
      if "<literal>x</literal>" is a lower case  letter, it is converted
@@ -496,83 +387,63 @@
      stand for themselves.  For example:
     </para>
     <para>
-      <variablelist>
-       <varlistentry>
-        <term><emphasis>\040</emphasis></term>
-           <listitem>
-         <simpara>
-          is another way of writing a space
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\40</emphasis></term>
-           <listitem>
-         <simpara>
-          is the same, provided there are fewer than 40
-          previous capturing subpatterns
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\7</emphasis></term>
-           <listitem>
-         <simpara>
-          is always a back reference
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\11</emphasis></term>
-           <listitem>
-         <simpara>
-          might be a back reference, or another way of
-          writing a tab
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\011</emphasis></term>
-           <listitem>
-         <simpara>
-          is always a tab
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\0113</emphasis></term>
-           <listitem>
-         <simpara>
-          is a tab followed by the character "3"
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\113</emphasis></term>
-           <listitem>
-         <simpara>
-          is the character with octal code 113 (since there
-          can be no more than 99 back references)
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\377</emphasis></term>
-           <listitem>
-         <simpara>
-           is a byte consisting entirely of 1 bits
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\81</emphasis></term>
-           <listitem>
-         <simpara>
-          is either a back reference, or a binary zero
-          followed by the two characters "8" and "1"
-         </simpara>
-        </listitem>
-       </varlistentry>
+     <variablelist>
+      <varlistentry>
+       <term><emphasis>\040</emphasis></term>
+       <listitem><simpara>is another way of writing a space</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\40</emphasis></term>
+       <listitem>
+        <simpara>
+         is the same, provided there are fewer than 40
+         previous capturing subpatterns
+        </simpara>
+       </listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\7</emphasis></term>
+       <listitem><simpara>is always a back reference</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\11</emphasis></term>
+       <listitem>
+        <simpara>
+         might be a back reference, or another way of
+         writing a tab
+        </simpara>
+       </listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\011</emphasis></term>
+       <listitem><simpara>is always a tab</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\0113</emphasis></term>
+       <listitem><simpara>is a tab followed by the character "3"</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\113</emphasis></term>
+       <listitem>
+        <simpara>
+         is the character with octal code 113 (since there
+         can be no more than 99 back references)
+        </simpara>
+       </listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\377</emphasis></term>
+       <listitem><simpara>is a byte consisting entirely of 1 bits</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\81</emphasis></term>
+       <listitem>
+        <simpara>
+         is either a back reference, or a binary zero
+         followed by the two characters "8" and "1"
+        </simpara>
+       </listitem>
+      </varlistentry>
      </variablelist>
     </para>
     <para>
@@ -592,56 +463,32 @@
      character types:
     </para>
     <para>
-      <variablelist>
-       <varlistentry>
-        <term><emphasis>\d</emphasis></term>
-           <listitem>
-         <simpara>
-          any decimal digit
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\D</emphasis></term>
-           <listitem>
-         <simpara>
-          any character that is not a decimal digit
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\s</emphasis></term>
-           <listitem>
-         <simpara>
-          any whitespace character
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\S</emphasis></term>
-           <listitem>
-         <simpara>
-          any character that is not a whitespace character
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\w</emphasis></term>
-           <listitem>
-         <simpara>
-          any "word" character
-         </simpara>
-        </listitem>
-       </varlistentry>
-       <varlistentry>
-        <term><emphasis>\W</emphasis></term>
-           <listitem>
-         <simpara>
-          any "non-word" character
-         </simpara>
-        </listitem>
-       </varlistentry>
-      </variablelist>
+     <variablelist>
+      <varlistentry>
+       <term><emphasis>\d</emphasis></term>
+       <listitem><simpara>any decimal digit</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\D</emphasis></term>
+       <listitem><simpara>any character that is not a decimal digit</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\s</emphasis></term>
+       <listitem><simpara>any whitespace character</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\S</emphasis></term>
+       <listitem><simpara>any character that is not a whitespace character</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\w</emphasis></term>
+       <listitem><simpara>any "word" character</simpara></listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><emphasis>\W</emphasis></term>
+       <listitem><simpara>any "non-word" character</simpara></listitem>
+      </varlistentry>
+     </variablelist>
     </para>
     <para>
      Each pair of escape sequences partitions the complete set of
@@ -677,44 +524,28 @@
      <variablelist>
       <varlistentry>
        <term><emphasis>\b</emphasis></term>
-          <listitem>
-        <simpara>
-         word boundary
-        </simpara>
-       </listitem>
+       <listitem><simpara>word boundary</simpara></listitem>
       </varlistentry>
       <varlistentry>
        <term><emphasis>\B</emphasis></term>
-          <listitem>
-        <simpara>
-          not a word boundary
-        </simpara>
-       </listitem>
+       <listitem><simpara>not a word boundary</simpara></listitem>
       </varlistentry>
       <varlistentry>
        <term><emphasis>\A</emphasis></term>
-          <listitem>
-        <simpara>
-         start of subject (independent of multiline mode)
-        </simpara>
-       </listitem>
+       <listitem><simpara>start of subject (independent of multiline mode)</simpara></listitem>
       </varlistentry>
       <varlistentry>
        <term><emphasis>\Z</emphasis></term>
-          <listitem>
+       <listitem>
         <simpara>
-        end of subject or newline at end (independent of
-        multiline mode)
+         end of subject or newline at end (independent of
+         multiline mode)
         </simpara>
        </listitem>
       </varlistentry>
       <varlistentry>
        <term><emphasis>\z</emphasis></term>
-          <listitem>
-        <simpara>
-         end of subject(independent of multiline mode)
-        </simpara>
-       </listitem>
+       <listitem><simpara>end of subject(independent of multiline mode)</simpara></listitem>
       </varlistentry>
      </variablelist>
     </para>
@@ -738,8 +569,7 @@
      ever match at the very start and end of the subject  string,
      whatever  options  are  set.  They  are  not affected by the
      <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> or
-     <link
-      linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+     <link linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
      options. The  difference  between <literal>\Z</literal> and
      <literal>\z</literal>  is that <literal>\Z</literal> matches before a
      newline that is the last character of the string as well as at the end of
@@ -750,60 +580,59 @@
     <refsect2 id="regexp.reference.circudollar">
      <title>Circumflex and dollar</title>
      <para>
-     Outside a character class, in the default matching mode, the
-     circumflex  character  is an assertion which is true only if
-     the current matching point is at the start  of  the  subject
-     string. Inside a character class, circumflex has an entirely
-     different meaning (see below).
-    </para>
-    <para>
-     Circumflex need not be the first character of the pattern if
-     a number of alternatives are involved, but it should be the
-     first thing in each alternative in which it appears  if  the
-     pattern is ever to match that branch. If all possible
-     alternatives start with a circumflex, that is, if the pattern  is
-     constrained to match only at the start of the subject, it is
-     said to be an "anchored" pattern. (There are also other
-     constructs that can cause a pattern to be anchored.)
-    </para>
-    <para>
-     A dollar character is an assertion which is &true; only if the
-     current  matching point is at the end of the subject string,
-     or immediately before a newline character that is  the  last
-     character in the string (by default). Dollar need not be the
-     last character of the pattern if a  number  of  alternatives
-     are  involved,  but it should be the last item in any branch
-     in which it appears.  Dollar has no  special  meaning  in  a
-     character class.
-    </para>
-    <para>
-     The meaning of dollar can be changed so that it matches only
-     at   the   very   end   of   the   string,  by  setting  the
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
-     option at compile or matching time. This
-     does not affect the \Z assertion.
-    </para>
-    <para>
-     The meanings of the circumflex  and  dollar  characters  are
-     changed if the <link
-      linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> option
-     is set. When this is the case, they match immediately after and
-     immediately before an internal "\n" character, respectively, in addition
-     to matching at the start and end of the subject string. For example, the
-     pattern /^abc$/ matches the subject string "def\nabc" in multiline mode,
-     but not otherwise. Consequently, patterns that are anchored in single
-     line mode because all branches start with "^" are not anchored in
-     multiline mode. The  <link
-      linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
-     option is ignored if <link
-      linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> is
-     set.
-    </para>
-    <para>
-     Note that the sequences \A, \Z, and \z can be used to  match
-     the  start  and end of the subject in both modes, and if all
-     branches of a pattern start with \A is it  always  anchored,
-     whether <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>  is set or not.
+      Outside a character class, in the default matching mode, the
+      circumflex  character  is an assertion which is true only if
+      the current matching point is at the start  of  the  subject
+      string. Inside a character class, circumflex has an entirely
+      different meaning (see below).
+     </para>
+     <para>
+      Circumflex need not be the first character of the pattern if
+      a number of alternatives are involved, but it should be the
+      first thing in each alternative in which it appears  if  the
+      pattern is ever to match that branch. If all possible
+      alternatives start with a circumflex, that is, if the pattern is
+      constrained to match only at the start of the subject, it is
+      said to be an "anchored" pattern. (There are also other
+      constructs that can cause a pattern to be anchored.)
+     </para>
+     <para>
+      A dollar character is an assertion which is &true; only if the
+      current  matching point is at the end of the subject string,
+      or immediately before a newline character that is  the  last
+      character in the string (by default). Dollar need not be the
+      last character of the pattern if a  number  of  alternatives
+      are  involved,  but it should be the last item in any branch
+      in which it appears.  Dollar has no  special  meaning  in  a
+      character class.
+     </para>
+     <para>
+      The meaning of dollar can be changed so that it matches only
+      at the very end of the string, by setting the
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+      option at compile or matching time. This does not affect the \Z assertion.
+     </para>
+     <para>
+      The meanings of the circumflex and dollar characters are
+      changed if the
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> option
+      is set. When this is the case, they match immediately after and
+      immediately before an internal "\n" character, respectively, in addition
+      to matching at the start and end of the subject string. For example, the
+      pattern /^abc$/ matches the subject string "def\nabc" in multiline mode,
+      but not otherwise. Consequently, patterns that are anchored in single
+      line mode because all branches start with "^" are not anchored in
+      multiline mode. The
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_DOLLAR_ENDONLY</link>
+      option is ignored if
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link> is
+      set.
+     </para>
+     <para>
+      Note that the sequences \A, \Z, and \z can be used to  match
+      the  start  and end of the subject in both modes, and if all
+      branches of a pattern start with \A is it  always  anchored,
+      whether <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>  is set or not.
      </para>
     </refsect2>
 
@@ -812,8 +641,8 @@
      <para>
      Outside a character class, a dot in the pattern matches  any
      one  character  in  the  subject,  including  a non-printing
-     character, but not (by default) newline.  If the <link
-      linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> 
+     character, but not (by default) newline.  If the
+     <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> 
      option  is  set,  then dots match newlines as well. The
      handling of dot is entirely independent of the handling of
      circumflex  and  dollar,  the only relationship being that they
@@ -825,90 +654,90 @@
     <refsect2 id="regexp.reference.squarebrackets">
      <title>Square brackets</title>
      <para>
-     An opening square bracket introduces a character class,
-     terminated  by  a  closing  square  bracket.  A  closing square
-     bracket on its own is  not  special.  If  a  closing  square
-     bracket  is  required as a member of the class, it should be
-     the first data character in the class (after an initial
-     circumflex, if present) or escaped with a backslash.
-    </para>
-    <para>
-     A character class matches a single character in the subject;
-     the  character  must  be in the set of characters defined by
-     the class, unless the first character in the class is a
-     circumflex,  in which case the subject character must not be in
-     the set defined by the class. If a  circumflex  is  actually
-     required  as  a  member  of  the class, ensure it is not the
-     first character, or escape it with a backslash.
-    </para>
-    <para>
-     For example, the character class [aeiou] matches  any  lower
-     case vowel, while [^aeiou] matches any character that is not
-     a lower case vowel. Note that a circumflex is  just  a
-     convenient  notation for specifying the characters which are in
-     the class by enumerating those that are not. It  is  not  an
-     assertion:  it  still  consumes a character from the subject
-     string, and fails if the current pointer is at  the  end  of
-     the string.
-    </para>
-    <para>
-     When caseless matching  is  set,  any  letters  in  a  class
-     represent  both their upper case and lower case versions, so
-     for example, a caseless [aeiou] matches "A" as well as  "a",
-     and  a caseless [^aeiou] does not match "A", whereas a
-     caseful version would.
-    </para>
-    <para>
-     The newline character is never treated in any special way in
-     character  classes,  whatever the setting of the <link
-      linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> 
-     or <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>
-     options is. A class such as [^a] will always match a newline.
-    </para>
-    <para>
-     The minus (hyphen) character can be used to specify a  range
-     of  characters  in  a  character  class.  For example, [d-m]
-     matches any letter between d and m, inclusive.  If  a  minus
-     character  is required in a class, it must be escaped with a
-     backslash or appear in a position where it cannot be
-     interpreted as indicating a range, typically as the first or last
-     character in the class.
-    </para>
-    <para>
-     It is not possible to have the literal character "]" as  the
-     end  character  of  a  range.  A  pattern such as [W-]46] is
-     interpreted as a class of two characters ("W" and "-")
-     followed by a literal string "46]", so it would match "W46]" or
-     "-46]". However, if the "]" is escaped with a  backslash  it
-     is  interpreted  as  the end of range, so [W-\]46] is
-     interpreted as a single class containing a range followed by  two
-     separate characters. The octal or hexadecimal representation
-     of "]" can also be used to end a range.
-    </para>
-    <para>
-     Ranges operate in ASCII collating sequence. They can also be
-     used  for  characters  specified  numerically,  for  example
-     [\000-\037]. If a range that includes letters is  used  when
-     caseless  matching  is set, it matches the letters in either
-     case. For example, [W-c] is equivalent  to  [][\^_`wxyzabc],
-     matched  caselessly,  and  if  character tables for the "fr"
-     locale are in use, [\xc8-\xcb] matches accented E characters
-     in both cases.
-    </para>
-    <para>
-     The character types \d, \D, \s, \S,  \w,  and  \W  may  also
-     appear  in  a  character  class, and add the characters that
-     they match to the class. For example, [\dABCDEF] matches any
-     hexadecimal  digit.  A  circumflex  can conveniently be used
-     with the upper case character types to specify a  more
-     restricted set of characters than the matching lower case type.
-     For example, the class [^\W_] matches any letter  or  digit,
-     but not underscore.
-    </para>
-    <para>
-     All non-alphanumeric characters other than \,  -,  ^  (at  the
-     start)  and  the  terminating ] are non-special in character
-     classes, but it does no harm if they are escaped.
+      An opening square bracket introduces a character class,
+      terminated  by  a  closing  square  bracket.  A  closing square
+      bracket on its own is  not  special.  If  a  closing  square
+      bracket  is  required as a member of the class, it should be
+      the first data character in the class (after an initial
+      circumflex, if present) or escaped with a backslash.
+     </para>
+     <para>
+      A character class matches a single character in the subject;
+      the  character  must  be in the set of characters defined by
+      the class, unless the first character in the class is a
+      circumflex,  in which case the subject character must not be in
+      the set defined by the class. If a  circumflex  is  actually
+      required  as  a  member  of  the class, ensure it is not the
+      first character, or escape it with a backslash.
+     </para>
+     <para>
+      For example, the character class [aeiou] matches  any  lower
+      case vowel, while [^aeiou] matches any character that is not
+      a lower case vowel. Note that a circumflex is  just  a
+      convenient  notation for specifying the characters which are in
+      the class by enumerating those that are not. It  is  not  an
+      assertion:  it  still  consumes a character from the subject
+      string, and fails if the current pointer is at  the  end  of
+      the string.
+     </para>
+     <para>
+      When caseless matching  is  set,  any  letters  in  a  class
+      represent  both their upper case and lower case versions, so
+      for example, a caseless [aeiou] matches "A" as well as  "a",
+      and  a caseless [^aeiou] does not match "A", whereas a
+      caseful version would.
+     </para>
+     <para>
+      The newline character is never treated in any special way in
+      character  classes,  whatever the setting of the <link
+       linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link> 
+      or <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>
+      options is. A class such as [^a] will always match a newline.
+     </para>
+     <para>
+      The minus (hyphen) character can be used to specify a  range
+      of  characters  in  a  character  class.  For example, [d-m]
+      matches any letter between d and m, inclusive.  If  a  minus
+      character  is required in a class, it must be escaped with a
+      backslash or appear in a position where it cannot be
+      interpreted as indicating a range, typically as the first or last
+      character in the class.
+     </para>
+     <para>
+      It is not possible to have the literal character "]" as  the
+      end  character  of  a  range.  A  pattern such as [W-]46] is
+      interpreted as a class of two characters ("W" and "-")
+      followed by a literal string "46]", so it would match "W46]" or
+      "-46]". However, if the "]" is escaped with a  backslash  it
+      is  interpreted  as  the end of range, so [W-\]46] is
+      interpreted as a single class containing a range followed by  two
+      separate characters. The octal or hexadecimal representation
+      of "]" can also be used to end a range.
+     </para>
+     <para>
+      Ranges operate in ASCII collating sequence. They can also be
+      used  for  characters  specified  numerically,  for  example
+      [\000-\037]. If a range that includes letters is  used  when
+      caseless  matching  is set, it matches the letters in either
+      case. For example, [W-c] is equivalent  to  [][\^_`wxyzabc],
+      matched  caselessly,  and  if  character tables for the "fr"
+      locale are in use, [\xc8-\xcb] matches accented E characters
+      in both cases.
+     </para>
+     <para>
+      The character types \d, \D, \s, \S,  \w,  and  \W  may  also
+      appear  in  a  character  class, and add the characters that
+      they match to the class. For example, [\dABCDEF] matches any
+      hexadecimal  digit.  A  circumflex  can conveniently be used
+      with the upper case character types to specify a  more
+      restricted set of characters than the matching lower case type.
+      For example, the class [^\W_] matches any letter  or  digit,
+      but not underscore.
+     </para>
+     <para>
+      All non-alphanumeric characters other than \,  -,  ^  (at  the
+      start)  and  the  terminating ] are non-special in character
+      classes, but it does no harm if they are escaped.
      </para>
     </refsect2>
 
@@ -917,9 +746,7 @@
      <para>
      Vertical bar characters are  used  to  separate  alternative
      patterns. For example, the pattern
-
-       <literal>gilbert|sullivan</literal>
-
+      <literal>gilbert|sullivan</literal>
      matches either "gilbert" or "sullivan". Any number of alternatives
      may  appear,  and an empty alternative is permitted
      (matching the empty string).   The  matching  process  tries
@@ -934,104 +761,105 @@
     <refsect2 id="regexp.reference.internal-options">
      <title>Internal option setting</title>
      <para>
-     The settings of <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link>, 
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>,  
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>,
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>,
-     and  <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>  can be changed from within the pattern by
-     a sequence of Perl option letters enclosed between "(?"  and
-     ")". The option letters are
+      The settings of <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link>, 
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>,  
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>,
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>,
+      and  <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>
+      can be changed from within the pattern by
+      a sequence of Perl option letters enclosed between "(?"  and
+      ")". The option letters are:
+
+      <table>
+       <title>Internal option letters</title>
+       <tgroup cols="2">
+        <tbody>
+         <row>
+          <entry><literal>i</literal></entry>
+          <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link></entry>
+         </row>
+         <row>
+          <entry><literal>m</literal></entry>
+          <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link></entry>
+         </row>
+         <row>
+          <entry><literal>s</literal></entry>
+          <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link></entry>
+         </row>
+         <row>
+          <entry><literal>x</literal></entry>
+          <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link></entry>
+         </row>
+         <row>
+          <entry><literal>U</literal></entry>
+          <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link></entry>
+         </row>
+        </tbody>
+       </tgroup>
+      </table>
+     </para>
+     <para>
+      For example, (?im) sets caseless, multiline matching. It  is
+      also possible to unset these options by preceding the letter
+      with a hyphen, and a combined setting and unsetting such  as
+      (?im-sx),  which sets <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link>  and <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>  while
+      unsetting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>  and <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>, is also  permitted.
+      If  a  letter  appears both before and after the hyphen, the
+      option is unset.
+     </para>
+     <para>
+      The scope of these option changes depends on  where  in  the
+      pattern  the  setting  occurs. For settings that are outside
+      any subpattern (defined below), the effect is the same as if
+      the  options were set or unset at the start of matching. The
+      following patterns all behave in exactly the same way:
+     </para>
 
-     <table>
-      <title>Internal option letters</title>
-      <tgroup cols="2">
-       <tbody>
-        <row>
-         <entry><literal>i</literal></entry>
-         <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link></entry>
-        </row>
-        <row>
-         <entry><literal>m</literal></entry>
-         <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link></entry>
-        </row>
-        <row>
-         <entry><literal>s</literal></entry>
-         <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link></entry>
-        </row>
-        <row>
-         <entry><literal>x</literal></entry>
-         <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link></entry>
-        </row>
-        <row>
-         <entry><literal>U</literal></entry>
-         <entry>for <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link></entry>
-        </row>
-       </tbody>
-      </tgroup>
-     </table>
-    </para>
-    <para>
-     For example, (?im) sets caseless, multiline matching. It  is
-     also possible to unset these options by preceding the letter
-     with a hyphen, and a combined setting and unsetting such  as
-     (?im-sx),  which sets <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link>  and <link linkend="reference.pcre.pattern.modifiers">PCRE_MULTILINE</link>  while
-     unsetting <link linkend="reference.pcre.pattern.modifiers">PCRE_DOTALL</link>  and <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTENDED</link>, is also  permitted.
-     If  a  letter  appears both before and after the hyphen, the
-     option is unset.
-    </para>
-    <para>
-     The scope of these option changes depends on  where  in  the
-     pattern  the  setting  occurs. For settings that are outside
-     any subpattern (defined below), the effect is the same as if
-     the  options were set or unset at the start of matching. The
-     following patterns all behave in exactly the same way:
-    </para>
-
-     <literallayout>
-       (?i)abc
-       a(?i)bc
-       ab(?i)c
-       abc(?i)
-     </literallayout>
-
-    <para>
-     which in turn is the same as compiling the pattern abc  with
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link> set.
-     In  other words, such "top level" settings apply to the whole
-     pattern  (unless  there  are  other changes  inside subpatterns).
-     If there is more than one setting of the same option at top level,
-     the rightmost  setting is used.
-    </para>
-    <para>
-     If an option change occurs inside a subpattern,  the  effect
-     is  different.  This is a change of behaviour in Perl 5.005.
-     An option change inside a subpattern affects only that  part
-     of the subpattern that follows it, so
-
-       <literal>(a(?i)b)c</literal>
-
-     matches  abc  and  aBc  and  no  other   strings   (assuming
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link>   is  not used).  By this means, options can be
-     made to have different settings in different  parts  of  the
-     pattern.  Any  changes  made  in one alternative do carry on
-     into subsequent branches within  the  same  subpattern.  For
-     example,
-
-       <literal>(a(?i)b|c)</literal>
-
-     matches "ab", "aB", "c", and "C", even though when  matching
-     "C" the first branch is abandoned before the option setting.
-     This is because the effects of  option  settings  happen  at
-     compile  time. There would be some very weird behaviour otherwise.
-    </para>
-    <para>
-     The PCRE-specific options <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>  and  
-     <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link>   can
-     be changed in the same way as the Perl-compatible options by
-     using the characters U and X  respectively.  The  (?X)  flag
-     setting  is  special in that it must always occur earlier in
-     the pattern than any of the additional features it turns on,
-     even when it is at top level. It is best put at the start.
+      <literallayout>
+        (?i)abc
+        a(?i)bc
+        ab(?i)c
+        abc(?i)
+      </literallayout>
+
+     <para>
+      which in turn is the same as compiling the pattern abc  with
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link> set.
+      In  other words, such "top level" settings apply to the whole
+      pattern  (unless  there  are  other changes  inside subpatterns).
+      If there is more than one setting of the same option at top level,
+      the rightmost  setting is used.
+     </para>
+     <para>
+      If an option change occurs inside a subpattern,  the  effect
+      is  different.  This is a change of behaviour in Perl 5.005.
+      An option change inside a subpattern affects only that  part
+      of the subpattern that follows it, so
+
+        <literal>(a(?i)b)c</literal>
+
+      matches  abc  and  aBc  and  no  other   strings   (assuming
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_CASELESS</link>   is  not used).  By this means, options can be
+      made to have different settings in different  parts  of  the
+      pattern.  Any  changes  made  in one alternative do carry on
+      into subsequent branches within  the  same  subpattern.  For
+      example,
+
+        <literal>(a(?i)b|c)</literal>
+
+      matches "ab", "aB", "c", and "C", even though when  matching
+      "C" the first branch is abandoned before the option setting.
+      This is because the effects of  option  settings  happen  at
+      compile  time. There would be some very weird behaviour otherwise.
+     </para>
+     <para>
+      The PCRE-specific options <link linkend="reference.pcre.pattern.modifiers">PCRE_UNGREEDY</link>  and  
+      <link linkend="reference.pcre.pattern.modifiers">PCRE_EXTRA</link>   can
+      be changed in the same way as the Perl-compatible options by
+      using the characters U and X  respectively.  The  (?X)  flag
+      setting  is  special in that it must always occur earlier in
+      the pattern than any of the additional features it turns on,
+      even when it is at top level. It is best put at the start.
      </para>
     </refsect2>
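
For readers who want to try a few of the behaviours described in the documentation text touched by this diff, here is a minimal sketch. It is not part of the commit. It uses Python's re module as a stand-in for PCRE, which is an assumption: the two engines are different implementations, and only cases where they are known to agree are shown (circumflex and dollar, dot, and character classes). Internal option settings such as a(?i)bc and the \Z and \z assertions are deliberately left out because Python treats them differently. The helper name `matches` is made up for this illustration.

```python
import re

def matches(pattern, subject, flags=0):
    """Return True if pattern is found anywhere in subject (hypothetical helper)."""
    return re.search(pattern, subject, flags) is not None

# Circumflex and dollar: tied to the whole subject unless multiline mode is set.
single_line = matches(r"^abc$", "def\nabc")
multi_line = matches(r"^abc$", "def\nabc", re.MULTILINE)

# By default, dollar also matches just before a newline that ends the subject.
before_final_newline = matches(r"abc$", "abc\n")

# Dot does not match newline by default; the dot-all option lifts that limit.
dot_default = matches(r"a.c", "a\nc")
dot_dotall = matches(r"a.c", "a\nc", re.DOTALL)

# Newline gets no special treatment inside a character class.
negated_matches_newline = re.fullmatch(r"[^a]", "\n") is not None

# [^\W_] keeps letters and digits but drops underscore (and other non-word characters).
alnum_only = [c for c in "a7_-" if re.fullmatch(r"[^\W_]", c)]

# [W-]46] is the two-character class [W-] followed by the literal string "46]".
two_char_class = [s for s in ("W46]", "-46]", "X46]") if re.fullmatch(r"[W-]46]", s)]

# With the "]" escaped, [W-\]46] is one class: the range W..] plus "4" and "6".
escaped_range = [c for c in "WZ]46-" if re.fullmatch(r"[W-\]46]", c)]
```

The same checks could be repeated with preg_match() in PHP to confirm them against PCRE itself. The Python version only suggests what to expect.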
