In perl.git, the branch maint-5.22 has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/252ab0bb8fa8a2ec1f266cc4ef62c4afb520b30f?hp=8b0897613193634594a3bc37314e614c6550eb08>

- Log -----------------------------------------------------------------
commit 252ab0bb8fa8a2ec1f266cc4ef62c4afb520b30f
Author: Karl Williamson <[email protected]>
Date:   Wed Mar 23 09:17:05 2016 -0600

    PATCH: [perl #127537] /\W/ regression with UTF-8
    
    This bug is apparently uncommon in the field, as I was the one who
    discovered it.  It requires a UTF-8 pattern containing a complemented
    posix class, like \W or \S, in an inverted character class, like
    [^\Wfoo], in a pattern that also has a synthetic start class generated
    by the regex optimizer.
    
    The fix is trivial.
    
    (modified from commit ac33c516140ee213a8a20ada506f97b3a7776ae4 so that
    it would apply to 5.22.)
-----------------------------------------------------------------------

Summary of changes:
 regcomp.c     | 8 ++++++--
 t/re/re_tests | 2 ++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/regcomp.c b/regcomp.c
index b8d7d38..e7b82a8 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -1184,8 +1184,12 @@ S_get_ANYOF_cp_list_for_ssc(pTHX_ const RExC_state_t *pRExC_state,
     }
 
     /* If this can match all upper Latin1 code points, have to add them
-     * as well */
-    if (ANYOF_FLAGS(node) & ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII) {
+     * as well.  But don't add them if inverting, as when that gets done below,
+     * it would exclude all these characters, including the ones it shouldn't
+     * that were added just above */
+    if (ANYOF_FLAGS(node) & (ANYOF_INVERT|ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII)
+           == ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII)
+    {
         _invlist_union(invlist, PL_UpperLatin1, &invlist);
     }
 
diff --git a/t/re/re_tests b/t/re/re_tests
index 663307f..85ce7f4 100644
--- a/t/re/re_tests
+++ b/t/re/re_tests
@@ -1613,6 +1613,8 @@ a(.)\4294967298   ab\o{42}94967298        ya      $1      b       \d not converted to native; \o{} is
 ^m?(\d)(.*)\1$ 5b5     y       $1      5
 ^m?(\d)(.*)\1$ aba     n       -       -
 
+^_?[^\W_0-9]\w\z       \xAA\x{100}     y       $&      \xAA\x{100}             [perl #127537]
+
 # 17F is 'Long s';  This makes sure the a's in /aa can be separate
 /s/ai  \x{17F} y       $&      \x{17F}
 /s/aia \x{17F} n       -       -
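
The new condition in regcomp.c is meant to hold only when the "matches all upper Latin1" flag is set and the class is not inverted. The following is a minimal standalone sketch of that mask-and-compare test, not the code from regcomp.c. The bit values and the helper name should_add_upper_latin1 are hypothetical and chosen for illustration; the real flag definitions live in perl's headers and may differ. Because `==` binds more tightly than `&` in C, the sketch parenthesizes the masking step explicitly.

```c
#include <assert.h>

/* Hypothetical bit values, for illustration only. */
#define ANYOF_INVERT                          0x01u
#define ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII  0x02u

/* Returns 1 when the "matches all upper Latin1" flag is set and the
 * class is NOT inverted, i.e. when upper Latin1 may be added to the
 * inversion list before any later inversion step. Returns 0 otherwise. */
static int should_add_upper_latin1(unsigned flags)
{
    const unsigned mask = ANYOF_INVERT | ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII;

    /* Mask first, then compare: (flags & mask) keeps only the two bits
     * of interest, and the comparison requires exactly the MATCHES_ALL
     * bit to remain. */
    return (flags & mask) == ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII;
}

/* Small self-check covering the four flag combinations. */
static void check_examples(void)
{
    assert(should_add_upper_latin1(0u) == 0);
    assert(should_add_upper_latin1(ANYOF_INVERT) == 0);
    assert(should_add_upper_latin1(ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII) == 1);
    assert(should_add_upper_latin1(ANYOF_INVERT
                                   | ANYOF_MATCHES_ALL_NON_UTF8_NON_ASCII) == 0);
}
```

Only the third combination (flag set, not inverted) returns 1. This corresponds to the case described in the patch comment, where the upper Latin1 code points should be added.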

--
Perl5 Master Repository
