In perl.git, the branch blead has been updated <http://perl5.git.perl.org/perl.git/commitdiff/4044502721ac7b89c6d21cf1099a3a518717eeba?hp=eb7cfb310cbb07e1385c0580a4ee80816e35043b>
- Log -----------------------------------------------------------------
commit 4044502721ac7b89c6d21cf1099a3a518717eeba
Author: David Mitchell <[email protected]>
Date:   Wed Jul 24 15:20:22 2013 +0100

    perlvar.pod: add a separate section on $& et al

    Add a new separate section explaining the performance issues of
    $`, $& and $'; plus descriptions of the various workarounds like
    @-, /p and COW, and which perl version they were each introduced in.

    Then in the entries for each individual var, strip out any commentary
    about performance, and just include a link to the new performance
    section.

M       pod/perlvar.pod

commit 142a37fdb385bb222232b286abdedf9b1daaa746
Author: David Mitchell <[email protected]>
Date:   Wed Jul 24 14:18:22 2013 +0100

    English.pm: update perl version where perf fixed

    It still said that the performance of $`, $&, $' was fixed in 5.18.
    Update that to 5.20, since COW wasn't enabled by default in 5.18.

M       lib/English.pm
-----------------------------------------------------------------------

Summary of changes:
 lib/English.pm  |    6 ++--
 pod/perlvar.pod |   86 ++++++++++++++++++++++++++++++++++++++-------------------
 2 files changed, 61 insertions(+), 31 deletions(-)

diff --git a/lib/English.pm b/lib/English.pm
index e4ee10a..6560f5f 100644
--- a/lib/English.pm
+++ b/lib/English.pm
@@ -1,6 +1,6 @@
 package English;
 
-our $VERSION = '1.07';
+our $VERSION = '1.08';
 
 require Exporter;
 @ISA = qw(Exporter);
@@ -34,9 +34,9 @@ See L<perlvar> for a complete list of these.
 
 =head1 PERFORMANCE
 
-NOTE: This was fixed in perl 5.18. Mentioning these three variables no
+NOTE: This was fixed in perl 5.20. Mentioning these three variables no
 longer makes a speed difference. This section still applies if your code
-is to run on perl 5.16 or earlier.
+is to run on perl 5.18 or earlier.
 
 This module can provoke sizeable inefficiencies for regular expressions,
 due to unfortunate implementation details. If performance matters in
diff --git a/pod/perlvar.pod b/pod/perlvar.pod
index a278d10..4d869f1 100644
--- a/pod/perlvar.pod
+++ b/pod/perlvar.pod
@@ -801,16 +801,51 @@ we have not made another match:
     $1 is Mutt; $2 is Jeff
     $1 is Wallace; $2 is Grommit
 
-The C<Devel::NYTProf> and C<Devel::FindAmpersand>
-modules can help you find uses of these
-problematic match variables in your code.
+=head3 Performance issues
 
-Since Perl v5.10.0, you can use the C</p> match operator flag and the
-C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> variables instead
-so you only suffer the performance penalties.
+Traditionally in Perl, any use of any of the three variables C<$`>, C<$&>
+or C<$'> (or their C<use English> equivalents) anywhere in the code, caused
+all subsequent successful pattern matches to make a copy of the matched
+string, in case the code might subsequently access one of those variables.
+This imposed a considerable performance penalty across the whole program,
+so generally the use of these variables has been discouraged.
 
-If you are using Perl v5.20.0 or higher, you do not need to worry about
-this, as the three naughty variables are no longer naughty.
+In Perl 5.6.0 the C<@-> and C<@+> dynamic arrays were introduced that
+supply the indices of successful matches. So you could for example do
+this:
+
+    $str =~ /pattern/;
+
+    print $`, $&, $'; # bad: perfomance hit
+
+    print # good: no perfomance hit
+        substr($str, 0, $-[0]),
+        substr($str, $-[0], $+[0]-$-[0]),
+        substr($str, $+[0]);
+
+In Perl 5.10.0 the C</p> match operator flag and the C<${^PREMATCH}>,
+C<${^MATCH}>, and C<${^POSTMATCH}> variables were introduced, that allowed
+you to suffer the penalties only on patterns marked with C</p>.
+
+In Perl 5.18.0 onwards, perl started noting the presence of each of the
+three variables separately, and only copied that part of the string
+required; so in
+
+    $`; $&; "abcdefgh" =~ /d/
+
+perl would only copy the "abcd" part of the string. That could make a big
+difference in something like
+
+    $str = 'x' x 1_000_000;
+    $&; # whoops
+    $str =~ /x/g # one char copied a million times, not a million chars
+
+In Perl 5.20.0 a new copy-on-write system was enabled by default, which
+finally fixes all performance issues with these three variables, and makes
+them safe to use anywhere.
+
+The C<Devel::NYTProf> and C<Devel::FindAmpersand> modules can help you
+find uses of these problematic match variables in your code.
 
 =over 8
 
@@ -834,12 +869,8 @@ The string matched by the last successful pattern match (not counting
 any matches hidden within a BLOCK or C<eval()> enclosed by the current
 BLOCK).
 
-In Perl v5.18 and earlier, the use of this variable
-anywhere in a program imposes a considerable
-performance penalty on all regular expression matches. To avoid this
-penalty, you can extract the same substring by using L</@->. Starting
-with Perl v5.10.0, you can use the C</p> match flag and the C<${^MATCH}>
-variable to do the same thing for particular match operations.
+See L</Performance issues> above for the serious performance implications
+of using this variable (even once) in your code.
 
 This variable is read-only and dynamically-scoped.
 
@@ -850,6 +881,9 @@ X<${^MATCH}>
 
 This is similar to C<$&> (C<$MATCH>) except that it does not incur the
 performance penalty associated with that variable.
+
+See L</Performance issues> above.
+
 In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
 the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
@@ -868,13 +902,8 @@ The string preceding whatever was matched by the last successful
 pattern match, not counting any matches hidden within a BLOCK or C<eval>
 enclosed by the current BLOCK.
 
-In Perl v5.18 and earlier, the use of this variable
-anywhere in a program imposes a considerable
-performance penalty on all regular expression matches. To avoid this
-penalty, you can extract the same substring by using L</@->. Starting
-with Perl v5.10.0, you can use the C</p> match flag and the
-C<${^PREMATCH}> variable to do the same thing for particular match
-operations.
+See L</Performance issues> above for the serious performance implications
+of using this variable (even once) in your code.
 
 This variable is read-only and dynamically-scoped.
 
@@ -885,6 +914,9 @@ X<$`> X<${^PREMATCH}>
 
 This is similar to C<$`> ($PREMATCH) except that it does not incur the
 performance penalty associated with that variable.
+
+See L</Performance issues> above.
+
 In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
 the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so
@@ -907,13 +939,8 @@ enclosed by the current BLOCK). Example:
     /def/;
     print "$`:$&:$'\n"; # prints abc:def:ghi
 
-In Perl v5.18 and earlier, the use of this variable
-anywhere in a program imposes a considerable
-performance penalty on all regular expression matches.
-To avoid this penalty, you can extract the same substring by
-using L</@->. Starting with Perl v5.10.0, you can use the C</p> match flag
-and the C<${^POSTMATCH}> variable to do the same thing for particular
-match operations.
+See L</Performance issues> above for the serious performance implications
+of using this variable (even once) in your code.
 
 This variable is read-only and dynamically-scoped.
 
@@ -924,6 +951,9 @@ X<${^POSTMATCH}> X<$'> X<$POSTMATCH>
 
 This is similar to C<$'> (C<$POSTMATCH>) except that it does not incur the
 performance penalty associated with that variable.
+
+See L</Performance issues> above.
+
 In Perl v5.18 and earlier, it is only guaranteed
 to return a defined value when the pattern was compiled or executed with
 the C</p> modifier. In Perl v5.20, the C</p> modifier does nothing, so

-- 
Perl5 Master Repository
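
For readers who want to try the workarounds described in the new "Performance issues" section, here is a minimal sketch, not part of the commit. It assumes a sample string 'abcdefghi' and a helper named split_match; both are hypothetical and chosen only for illustration. It rebuilds the prematch, match and postmatch pieces from @- and @+, then does the same with /p and the ${^PREMATCH} family, which the section says arrived in Perl 5.10.0.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration only.
my $str = 'abcdefghi';

# Returns (prematch, match, postmatch) using only the @- and @+ offset
# arrays, so the three punctuation match variables are never mentioned.
# split_match is an illustrative name, not an existing API.
sub split_match {
    my ($s, $re) = @_;
    return unless $s =~ $re;
    return (
        substr($s, 0, $-[0]),                # text before the match
        substr($s, $-[0], $+[0] - $-[0]),    # the matched text
        substr($s, $+[0]),                   # text after the match
    );
}

my ($pre, $match, $post) = split_match($str, qr/def/);
print "$pre:$match:$post\n";

# The /p route: only patterns carrying /p are asked to preserve the
# pieces needed for these three variables.
if ($str =~ /def/p) {
    print "${^PREMATCH}:${^MATCH}:${^POSTMATCH}\n";
}
```

Both print statements should produce the same three pieces. On Perl 5.20.0 and later, per the section above, plain $`, $& and $' no longer carry the penalty, so either form is mainly of interest for code that must also run on older perls.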
