vrana Mon Jun 13 12:26:28 2005 EDT
Modified files:
  /phpdoc/en/reference/pcre pattern.syntax.xml
Log:
  Document PCRE 4.3
  # However, PCRE 5.0 is in the sources now

http://cvs.php.net/diff.php/phpdoc/en/reference/pcre/pattern.syntax.xml?r1=1.6&r2=1.7&ty=u
Index: phpdoc/en/reference/pcre/pattern.syntax.xml
diff -u phpdoc/en/reference/pcre/pattern.syntax.xml:1.6 phpdoc/en/reference/pcre/pattern.syntax.xml:1.7
--- phpdoc/en/reference/pcre/pattern.syntax.xml:1.6 Sun Dec 19 08:35:34 2004
+++ phpdoc/en/reference/pcre/pattern.syntax.xml Mon Jun 13 12:26:27 2005
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="iso-8859-1"?>
-<!-- $Revision: 1.6 $ -->
+<!-- $Revision: 1.7 $ -->
 <!-- splitted from ./en/functions/pcre.xml, last change in rev 1.2 -->
 <refentry id="reference.pcre.pattern.syntax">
 <refnamediv>
@@ -67,7 +67,7 @@
 <listitem>
 <simpara>
 The following Perl escape sequences are not supported:
- \l, \u, \L, \U, \E, \Q. In fact these are implemented by
+ \l, \u, \L, \U. In fact these are implemented by
 Perl's general string-handling and are not part of its
 pattern matching engine.
 </simpara>
@@ -575,6 +575,15 @@
 newline that is the last character of the string as well as at the end of
 the string, whereas <literal>\z</literal> matches only at the end.
 </para>
+
+ <para>
+ <literal>\Q</literal> and <literal>\E</literal> can be used to ignore
+ regexp metacharacters in the pattern. For example:
+ <literal>\w+\Q.$.\E$</literal> will match one or more word characters,
+ followed by literals <literal>.$.</literal> and anchored at the end of
+ the string.
+ </para>
+
 </refsect2>
 
 <refsect2 id="regexp.reference.circudollar">
@@ -924,6 +933,13 @@
 setting in one branch does affect subsequent branches, so the above
 patterns match "SUNDAY" as well as "Saturday".
 </para>
+
+ <para>
+ It is possible to name the subpattern with
+ <literal>(?P<name>pattern)</literal>. Array with matches will
+ contain the match indexed by the string alongside the match indexed by
+ a number, then.
+ </para>
 </refsect2>
 
 <refsect2 id="regexp.reference.repetition">
@@ -1058,6 +1074,13 @@
 default behaviour.
 </para>
 <para>
+ Quantifiers followed by <literal>+</literal> are "possessive". They eat
+ as many characters as possible and don't return to match the rest of the
+ pattern. Thus <literal>.*abc</literal> matches "aabc" but
+ <literal>.*+abc</literal> doesn't because <literal>.*+</literal> eats the
+ whole string. Possessive quantifiers can be used to speed up processing.
+ </para>
+ <para>
 When a parenthesized subpattern is quantified with a minimum repeat
 count that is greater than 1 or with a limited maximum, more store is
 required for the compiled pattern, in
@@ -1556,6 +1579,13 @@
 there is no way to give an out-of-memory error from within a
 recursion.
 </para>
+
+ <para>
+ <literal>(?1)</literal>, <literal>(?2)</literal> and so on can be used
+ for recursive subpatterns too. It is also possible to use named
+ subpatterns: <literal>(?P>foo)</literal>.
+ </para>
+
 </refsect2>
 
 <refsect2 id="regexp.reference.performances">
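
Illustration (not part of the commit): the named-subpattern paragraph added above says a match becomes available under its name as well as its number. The sketch below shows that behaviour in Python rather than PHP, on the assumption that Python's re module accepts the same (?P<name>pattern) syntax, which PCRE adopted from Python. The pattern and the input string are invented for the example.

```python
import re

# Hypothetical pattern: two named subpatterns, "year" and "month".
pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
match = pattern.search("released 2005-06")

# The same captured text is reachable by name and by position.
by_name = match.group("year")
by_number = match.group(1)

# groupdict() collects only the named subpatterns.
named = match.groupdict()

print(by_name, by_number, named)
```

In PHP the equivalent lookup would go through the matches array filled by the preg functions, with the string key sitting alongside the numeric one as the added paragraph describes.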