In perl.git, the branch smoke-me/yves-revert_skipwhite has been created
<http://perl5.git.perl.org/perl.git/commitdiff/f5c29cdd38b1a8046b274f3f0471813b781f545e?hp=0000000000000000000000000000000000000000>
at f5c29cdd38b1a8046b274f3f0471813b781f545e (commit)
- Log -----------------------------------------------------------------
commit f5c29cdd38b1a8046b274f3f0471813b781f545e
Author: Yves Orton <[email protected]>
Date: Tue Mar 26 08:37:36 2013 +0100
improve flag check
M regcomp.c
commit f7dfe693f4479ec9dfcdfe3bc35de0f77137d2bf
Author: Yves Orton <[email protected]>
Date: Mon Mar 25 23:29:17 2013 +0100
rework split() special case interaction with regex engine
This patch resolves several issues at once. The parts are
sufficiently interconnected that it is hard to break it down
into smaller commits. The tickets open for these issues are:
RT #94490 - split and constant folding
RT #116086 - split "\x20" doesn't work as documented
It effectively reverts 5255171e6cd0accee6f76ea2980e32b3b5b8e171
and cccd1425414e6518c1fc8b7bcaccfb119320c513.
Prior to this patch the special RXf_SKIPWHITE behavior of
split(" ", $thing)
was only available if Perl could resolve the first argument to
split at compile time, so it was missing in various arcane situations.
This manifested as oddities like
my $delim = $cond ? " " : qr/\s+/;
split $delim, $string;
and
split $cond ? " " : qr/\s+/, $string
not behaving the same as:
($cond ? split(" ", $string) : split(/\s+/, $string))
which isn't very convenient.
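To make the difference concrete, here is a small C model of how many fields each form returns with the default limit. It is only a sketch and not Perl's implementation: the function names are invented for illustration, isspace() stands in for \s, and it assumes the documented split behavior that split " " skips leading whitespace, that split /\s+/ yields one empty leading field when the string starts with whitespace, and that empty trailing fields are dropped.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the whitespace-delimited words in s. */
size_t sketch_count_words(const char *s)
{
    size_t n = 0;
    int in_word = 0;

    for (; *s; s++) {
        if (isspace((unsigned char)*s)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            n++;
        }
    }
    return n;
}

/* Model of the number of fields split would return.
 * skipwhite != 0 models split " ", $s  (leading whitespace skipped).
 * skipwhite == 0 models split /\s+/, $s (leading whitespace gives one
 * empty leading field, unless every field is empty). */
size_t sketch_split_count(const char *s, int skipwhite)
{
    size_t n = sketch_count_words(s);

    if (!skipwhite && n > 0 && isspace((unsigned char)s[0]))
        n++;               /* the empty leading field */
    return n;
}
```

For a string such as "  a b" the model gives two fields with skipwhite set and three without it, which is the kind of discrepancy the examples above run into when the special case is lost.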
This patch changes this by adding a new flag to the op_pmflags,
PMf_SPLIT which enables pp_regcomp() to know whether it was called
as part of split, which allows the RXf_SPLIT to be passed into run
time regex compilation.
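The flow described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: the SKETCH_ names, the bit positions and both helper functions are hypothetical, and the string comparison is a stand-in for whatever test the regex engine really performs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical bit positions; the real definitions are in perl's headers. */
#define SKETCH_PMf_SPLIT     (1u << 0)  /* op flag: this match op belongs to split */
#define SKETCH_RXf_SPLIT     (1u << 0)  /* regex flag: compiling on behalf of split */
#define SKETCH_RXf_SKIPWHITE (1u << 1)  /* regex flag: whitespace special case */

/* Run-time compile step: translate the op flag into a regex
 * compilation flag. */
uint32_t sketch_rx_flags_from_op(uint32_t op_pmflags)
{
    return (op_pmflags & SKETCH_PMf_SPLIT) ? SKETCH_RXf_SPLIT : 0;
}

/* Regex engine: decide for itself whether this pattern is the
 * split special case, using only the pattern and the flags it got. */
uint32_t sketch_compile_flags(const char *pat, uint32_t rx_flags)
{
    if ((rx_flags & SKETCH_RXf_SPLIT) && strcmp(pat, " ") == 0)
        rx_flags |= SKETCH_RXf_SKIPWHITE;
    return rx_flags;
}
```

The point of the sketch is that the decision is made at compile time of the pattern, wherever that happens, so a pattern that only becomes " " at run time still ends up with the whitespace flag set.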
Note that this is essentially the opposite of the fix applied
originally to fix #94490 in 5255171e6cd0accee6f76ea2980e32b3b5b8e171.
The reverted patch was meant to make:
split( 0 || " ", $thing ) #1
consistent with
my $x=0; split( $x || " ", $thing ) #2
and not with
split( " ", $thing ) #3
This was reverted because it broke C<split("\x{20}", $thing)>, and
because one might argue that it is not #1 that does the wrong thing,
but rather that the behavior of #2 is wrong. In other words
we might expect that all three should behave the same as #3, and
that instead of "fixing" the behavior of #1 to be like #2, we should
really fix the behavior of #2 to behave like #3. (Which is what we did.)
Also, it doesn't make sense to move the special case detection logic
further from the regex engine. We really want the regex engine to decide
this stuff itself, otherwise split " ", ... wouldn't work properly with
an alternate engine. (Imagine we add a special regexp meta pattern that
behaves the same as " " does in a split /.../. For instance we might make
split /(*SPLITWHITE)/ trigger the same behavior as split " ".)
The other major change as a result of this patch is that it effectively
reverts commit cccd1425414e6518c1fc8b7bcaccfb119320c513, which
was intended to get rid of RXf_SPLIT and RXf_SKIPWHITE and
free up bits in the regex flags structure.
But we don't want to get rid of these vars, and it turns out that
RXf_SEEN_LOOKBEHIND is used only in the same situation as the new
RXf_MODIFIES_VARS. So I have renamed RXf_SEEN_LOOKBEHIND to
RXf_NO_INPLACE_SUBST, and then instead of using two vars we use
only the one. Which in turn allows RXf_SPLIT and RXf_SKIPWHITE to
have their bits back.
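As a rough illustration of the flag merge, the sketch below has both conditions set a single bit that one check then consults. Everything here is hypothetical: the SKETCH_ name, the bit position and the helper functions are invented, and the reading that the bit guards in-place substitution is inferred from the flag's name.

```c
#include <stdint.h>

/* Hypothetical bit position for the merged flag. */
#define SKETCH_RXf_NO_INPLACE_SUBST (1u << 2)

/* Previously two separate flags; in this sketch both situations
 * set the same bit. */
uint32_t sketch_note_lookbehind(uint32_t rx_flags)
{
    return rx_flags | SKETCH_RXf_NO_INPLACE_SUBST;
}

uint32_t sketch_note_modifies_vars(uint32_t rx_flags)
{
    return rx_flags | SKETCH_RXf_NO_INPLACE_SUBST;
}

/* A consumer tests one bit instead of two. */
int sketch_may_subst_in_place(uint32_t rx_flags)
{
    return (rx_flags & SKETCH_RXf_NO_INPLACE_SUBST) == 0;
}
```

Because the two conditions are only ever consulted together, collapsing them loses no information and leaves a bit free for other use.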
M dist/B-Deparse/Deparse.pm
M dump.c
M op.c
M op.h
M op_reg_common.h
M pod/perlreapi.pod
M pp.c
M pp_ctl.c
M pp_hot.c
M regcomp.c
M regexp.h
M regnodes.h
M t/op/split.t
commit d7484bc9ef5fdcc1be431c7e7eb0e9d4500696ff
Author: Yves Orton <[email protected]>
Date: Mon Mar 25 23:23:40 2013 +0100
We have to check flags when we do a pattern compare
The pattern might be identical but with differing flags.
This needs better tests, some will come in a later patch
as a side effect of testing split.
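A minimal sketch of the idea, assuming a simplified pattern record: the struct and function below are hypothetical and are not the data structures used in regcomp.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a compiled pattern. */
struct sketch_pattern {
    const char *src;    /* pattern source text */
    size_t      len;    /* length of src in bytes */
    uint32_t    flags;  /* compilation flags */
};

/* Two patterns are interchangeable only if both the source text
 * and the flags match; identical text alone is not enough. */
int sketch_same_pattern(const struct sketch_pattern *a,
                        const struct sketch_pattern *b)
{
    return a->len == b->len
        && a->flags == b->flags
        && memcmp(a->src, b->src, a->len) == 0;
}
```

With a check like this, a " " pattern compiled for split and the same text compiled for an ordinary match compare as different, so one is not reused in place of the other.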
M regcomp.c
commit cf993999de04c8e65ef4a1fbffb4b5dfe030fdb0
Author: Yves Orton <[email protected]>
Date: Mon Mar 25 23:15:00 2013 +0100
reindent and add parentheses for clarity
I had to stare at this expression and make sure there wasn't
anything tricky for too long, so I added parens, and reindented
it.
M dist/B-Deparse/Deparse.pm
commit 5d600aa7f3d1ecba81d71a7379281a0cb13322d6
Author: Yves Orton <[email protected]>
Date: Mon Mar 25 23:06:22 2013 +0100
simplify regcomp.c by using vars to avoid repeated macros
Use two temporary variables to simplify the logic, and maybe
speed up a nanosecond or two.
Also chainsaw some long dead logic. (I #ifdef'ed it out years ago)
M regcomp.c
commit 335f58f72309c760770d4a21cd5220510210eb4d
Author: Yves Orton <[email protected]>
Date: Mon Mar 25 20:08:56 2013 +0100
Improve how regcomp.pl handles multibits
In preparation for future changes.
M regen/regcomp.pl
M regnodes.h
-----------------------------------------------------------------------
--
Perl5 Master Repository