[perl.git] branch blead, updated. v5.21.3-485-g570afc5

Karl Williamson Sat, 06 Sep 2014 20:19:59 -0700

In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/570afc52205374eff1efd9d307ca6ece2719dd2f?hp=7dc3b71e0eb1edf0a7cedd966a76e358434bbae7>


- Log -----------------------------------------------------------------
commit 570afc52205374eff1efd9d307ca6ece2719dd2f
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 3 20:00:28 2014 -0600

    regcomp.c: White-space only
    
    Properly indent code in blocks newly formed by the previous commit

M       regcomp.c

commit cd01acce420ea4e3e187994f198e24a6d4fcab81
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 3 19:52:05 2014 -0600

    regcomp.c: Refactor func so caller handles anomalies
    
    S_grok_bslash_N() is refactored to not know about the strictness level
    required by the caller, and to return things instead so that the caller
    can decide what action to take.
    
    This is in preparation for some changes in the caller's behavior in
    future commits.
    
    This has the effect of changing the parsing position or where a problem
    occurs shown in a warning message.

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c
M       t/re/reg_mesg.t

commit 520eec4f0b92945d9e42874ffac8096b939a521a
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 3 18:28:25 2014 -0600

    regcomp.c: Comment clarifications, nits

M       regcomp.c

commit 1eaa9028796b8bad514effe7ea2a37264d5195b7
Author: Karl Williamson <[email protected]>
Date:   Wed Sep 3 17:31:39 2014 -0600

    regcomp.c: Refactor one area to use common subroutine
    
    By using the inline function append_utf8_from_native_byte(), the details
    of this conversion are hidden from here.  Since that routine advances
    the parsing pointer with each byte, this has to be slightly refactored.

M       regcomp.c

commit 499333dc7a261e5b3794e032f578b461bd895084
Author: Karl Williamson <[email protected]>
Date:   Mon Sep 1 20:00:01 2014 -0600

    PATCH: [perl #122671] Many warnings in regcomp.c can occur twice
    
    This solves the problem by moving the warnings to be output only in
    pass2 of compilation.  The problem arises because almost all of pass1
    can be repeated under certain circumstances described in the ticket and
    the added comments of this patch.

M       pod/perldelta.pod
M       regcomp.c
M       t/lib/warnings/regcomp
M       t/re/reg_mesg.t

commit 3da4c246d54e1970c6c140305135aea1f1745fbd
Author: Karl Williamson <[email protected]>
Date:   Mon Sep 1 18:54:03 2014 -0600

    recomp.c: output given warning in only one pass
    
    Otherwise, it will be output multiple times.  This warning was untested
    for as well.

M       regcomp.c
M       t/re/reg_mesg.t

commit deda9400516b26041cef43bb345b354992a5da95
Author: Karl Williamson <[email protected]>
Date:   Mon Sep 1 16:44:38 2014 -0600

    regcomp.c: Vertically stack ternary
    
    for legibility

M       regcomp.c

commit d9218970031244a06745934121df17f0c84fe27a
Author: Karl Williamson <[email protected]>
Date:   Mon Sep 1 14:57:49 2014 -0600

    regcomp.c: Don't prematurely skip error checking
    
    The assertion in the comment changed by this commit was true only for
    pass1 of the regex compilation; not pass2.  This makes it true in both
    passes by moving it, and the code it was about past some error checking.
    This error checking was executed in pass1, but not pass2.  It also
    changes the warning to only be done in the second pass, part of
    [perl #122671].  A future commit will fix the others

M       regcomp.c

commit 3f9090cb7f77e45487c80fa744f544030eb60222
Author: Karl Williamson <[email protected]>
Date:   Mon Sep 1 14:48:02 2014 -0600

    regcomp.c: Move comment closer to code it applies to

M       regcomp.c

commit 4ea3d013da305b2f2c1dc4bd374270fa7a57c1ae
Author: Karl Williamson <[email protected]>
Date:   Thu Aug 28 21:12:39 2014 -0600

    regcomp.c: Remove unnecessary cast
    
    The macro does the appropriate cast, and this is slightly more legible.

M       regcomp.c

commit 8e35b0561082715dcfa5b85db6cfb4fd650e72f1
Author: Karl Williamson <[email protected]>
Date:   Tue Aug 26 15:34:25 2014 -0600

    regcomp.c: Make macro a lookup
    
    The recently introduced macro isMNEMONIC_CNTRL has a look-up and several
    tests in it, which occupy time and space.  Since it was only used for
    debugging, that did not matter much, but future commits will use it in
    more mainline code.  This commit changes it to be a single look-up,
    using up one of the spare bits available for that purpose in
    PL_charclass.  There are enough available bits that we aren't likely to
    run out, really ever.  (We can always add a 2nd word of bits if
    necessary.)

M       handy.h
M       l1_char_class_tab.h
M       regcomp.c
M       regen/mk_PL_charclass.pl

commit 549b4e78e20d6b000653c2b3d043c46bf7ba0d57
Author: Karl Williamson <[email protected]>
Date:   Tue Aug 26 17:29:31 2014 -0600

    regcomp.c: Extract functionality into a static function
    
    This is in preparation for it being used in more than one place in a
    future commit.

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c
-----------------------------------------------------------------------

Summary of changes:
 embed.fnc                |   7 +-
 embed.h                  |   3 +-
 handy.h                  |   5 +-
 l1_char_class_tab.h      |  56 ++---
 pod/perldelta.pod        |   6 +
 proto.h                  |   5 +-
 regcomp.c                | 518 ++++++++++++++++++++++++++++-------------------
 regen/mk_PL_charclass.pl |   4 +
 t/lib/warnings/regcomp   |  25 ++-
 t/re/reg_mesg.t          | 102 +++++-----
 10 files changed, 438 insertions(+), 293 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 54c7f97..d1c73d4 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2115,10 +2115,10 @@ Es      |regnode*|reg_node      |NN RExC_state_t 
*pRExC_state|U8 op
 Es     |UV     |reg_recode     |const char value|NN SV **encp
 Es     |regnode*|regpiece      |NN RExC_state_t *pRExC_state \
                                |NN I32 *flagp|U32 depth
-Es     |bool   |grok_bslash_N  |NN RExC_state_t *pRExC_state        \
+Es     |STRLEN |grok_bslash_N  |NN RExC_state_t *pRExC_state               \
                                |NULLOK regnode** nodep|NULLOK UV *valuep   \
-                               |NN I32 *flagp|U32 depth|bool in_char_class \
-                               |const bool strict
+                               |NN I32 *flagp|U32 depth                    \
+                               |NULLOK SV** substitute_parse
 Es     |void   |reginsert      |NN RExC_state_t *pRExC_state \
                                |U8 op|NN regnode *opnd|U32 depth
 Es     |void   |regtail        |NN RExC_state_t *pRExC_state \
@@ -2194,6 +2194,7 @@ Es        |const regnode*|dumpuntil|NN const regexp *r|NN 
const regnode *start \
                                |NULLOK const regnode *last \
                                |NULLOK const regnode *plast \
                                |NN SV* sv|I32 indent|U32 depth
+EnPs   |const char *|cntrl_to_mnemonic|const U8 c
 Es     |void   |put_code_point |NN SV* sv|UV c
 Es     |bool   |put_charclass_bitmap_innards|NN SV* sv     \
                                |NN char* bitmap            \
diff --git a/embed.h b/embed.h
index 2abc4e2..607cca8 100644
--- a/embed.h
+++ b/embed.h
@@ -913,6 +913,7 @@
 #define yylex()                        Perl_yylex(aTHX)
 #  if defined(DEBUGGING)
 #    if defined(PERL_IN_REGCOMP_C)
+#define cntrl_to_mnemonic      S_cntrl_to_mnemonic
 #define dump_trie(a,b,c,d)     S_dump_trie(aTHX_ a,b,c,d)
 #define dump_trie_interim_list(a,b,c,d,e)      S_dump_trie_interim_list(aTHX_ 
a,b,c,d,e)
 #define dump_trie_interim_table(a,b,c,d,e)     S_dump_trie_interim_table(aTHX_ 
a,b,c,d,e)
@@ -948,7 +949,7 @@
 #define get_ANYOF_cp_list_for_ssc(a,b) S_get_ANYOF_cp_list_for_ssc(aTHX_ a,b)
 #define get_invlist_iter_addr  S_get_invlist_iter_addr
 #define get_invlist_previous_index_addr        
S_get_invlist_previous_index_addr
-#define grok_bslash_N(a,b,c,d,e,f,g)   S_grok_bslash_N(aTHX_ a,b,c,d,e,f,g)
+#define grok_bslash_N(a,b,c,d,e,f)     S_grok_bslash_N(aTHX_ a,b,c,d,e,f)
 #define handle_regex_sets(a,b,c,d,e)   S_handle_regex_sets(aTHX_ a,b,c,d,e)
 #define invlist_array          S_invlist_array
 #define invlist_clone(a)       S_invlist_clone(aTHX_ a)
diff --git a/handy.h b/handy.h
index 0ee6774..511bba3 100644
--- a/handy.h
+++ b/handy.h
@@ -963,7 +963,8 @@ patched there.  The file as of this writing is 
cpan/Devel-PPPort/parts/inc/misc
 #  define _CC_QUOTEMETA                21
 #  define _CC_NON_FINAL_FOLD           22
 #  define _CC_IS_IN_SOME_FOLD          23
-/* Unused: 24-31
+#  define _CC_MNEMONIC_CNTRL           24
+/* Unused: 25-31
  * If more bits are needed, one could add a second word for non-64bit
  * QUAD_IS_INT systems, using some #ifdefs to distinguish between having a 2nd
  * word or not.  The IS_IN_SOME_FOLD bit is the most easily expendable, as it
@@ -1096,6 +1097,8 @@ EXTCONST U32 PL_charclass[];
                                             _generic_isCC(c, 
_CC_NON_FINAL_FOLD)
 #   define _IS_IN_SOME_FOLD_ONLY_FOR_USE_BY_REGCOMP_DOT_C(c) \
                                             _generic_isCC(c, 
_CC_IS_IN_SOME_FOLD)
+#   define _IS_MNEMONIC_CNTRL_ONLY_FOR_USE_BY_REGCOMP_DOT_C(c) \
+                                            _generic_isCC(c, 
_CC_MNEMONIC_CNTRL)
 #else   /* else we don't have perl.h */
 
     /* If we don't have perl.h, we are compiling a utility program.  Below we
diff --git a/l1_char_class_tab.h b/l1_char_class_tab.h
index ccc7014..fc262be 100644
--- a/l1_char_class_tab.h
+++ b/l1_char_class_tab.h
@@ -15,13 +15,13 @@
 /* U+04 EOT */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+05 ENQ */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+06 ACK */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* U+07 BEL */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* U+08 BS */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE),
-/* U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
+/* U+07 BEL */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
+/* U+08 BS */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
+/* U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_MNEMONIC_CNTRL),
+/* U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* U+0B VT */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
+/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
+/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* U+0E SO */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+0F SI */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+10 DLE */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -35,7 +35,7 @@
 /* U+18 CAN */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+19 EOM */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+1A SUB */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* U+1B ESC */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* U+1B ESC */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* U+1C FS */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+1D GS */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+1E RS */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -276,15 +276,15 @@
 /* U+02 STX */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+03 ETX */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x04 U+9C ST */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x05 U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE),
+/* 0x05 U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x06 U+86 SSA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x07 U+7F DEL */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x08 U+97 EPA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x09 U+8D RI */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x0A U+8E SS2 */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+0B VT */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
+/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
+/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* U+0E SO */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+0F SI */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+10 DLE */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -292,8 +292,8 @@
 /* U+12 DC2 */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+13 DC3 */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x14 U+9D OSC */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x15 U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* 0x16 U+08 BS */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x15 U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
+/* 0x16 U+08 BS */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x17 U+87 ESA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+18 CAN */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+19 EOM */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -310,7 +310,7 @@
 /* 0x24 U+84 IND */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x25 U+85 NEL */ 
(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
 /* 0x26 U+17 ETB */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x27 U+1B ESC */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x27 U+1B ESC */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x28 U+88 HTS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x29 U+89 HTJ */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2A U+8A VTS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -318,7 +318,7 @@
 /* 0x2C U+8C PLU */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2D U+05 ENQ */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2E U+06 ACK */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x2F U+07 BEL */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x2F U+07 BEL */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x30 U+90 DCS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x31 U+91 PU1 */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x32 U+16 SYN */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -539,15 +539,15 @@
 /* U+02 STX */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+03 ETX */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x04 U+9C ST */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x05 U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE),
+/* 0x05 U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x06 U+86 SSA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x07 U+7F DEL */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x08 U+97 EPA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x09 U+8D RI */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x0A U+8E SS2 */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+0B VT */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
+/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
+/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* U+0E SO */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+0F SI */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+10 DLE */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -556,7 +556,7 @@
 /* U+13 DC3 */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x14 U+9D OSC */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x15 U+85 NEL */ 
(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* 0x16 U+08 BS */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x16 U+08 BS */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x17 U+87 ESA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+18 CAN */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+19 EOM */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -571,9 +571,9 @@
 /* 0x22 U+82 BPH */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x23 U+83 NBH */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x24 U+84 IND */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x25 U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
+/* 0x25 U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x26 U+17 ETB */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x27 U+1B ESC */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x27 U+1B ESC */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x28 U+88 HTS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x29 U+89 HTJ */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2A U+8A VTS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -581,7 +581,7 @@
 /* 0x2C U+8C PLU */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2D U+05 ENQ */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2E U+06 ACK */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x2F U+07 BEL */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x2F U+07 BEL */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x30 U+90 DCS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x31 U+91 PU1 */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x32 U+16 SYN */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -802,15 +802,15 @@
 /* U+02 STX */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+03 ETX */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x04 U+9C ST */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x05 U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE),
+/* 0x05 U+09 HT */ 
(1U<<_CC_ASCII)|(1U<<_CC_BLANK)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x06 U+86 SSA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x07 U+7F DEL */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x08 U+97 EPA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x09 U+8D RI */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x0A U+8E SS2 */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+0B VT */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
+/* U+0C FF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
+/* U+0D CR */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
 /* U+0E SO */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+0F SI */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+10 DLE */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -818,8 +818,8 @@
 /* U+12 DC2 */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+13 DC3 */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x14 U+9D OSC */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x15 U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
-/* 0x16 U+08 BS */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x15 U+0A LF */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE)|(1U<<_CC_MNEMONIC_CNTRL),
+/* 0x16 U+08 BS */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x17 U+87 ESA */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+18 CAN */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* U+19 EOM */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -836,7 +836,7 @@
 /* 0x24 U+84 IND */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x25 U+85 NEL */ 
(1U<<_CC_CNTRL)|(1U<<_CC_PSXSPC)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_SPACE)|(1U<<_CC_VERTSPACE),
 /* 0x26 U+17 ETB */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x27 U+1B ESC */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x27 U+1B ESC */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x28 U+88 HTS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x29 U+89 HTJ */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2A U+8A VTS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
@@ -844,7 +844,7 @@
 /* 0x2C U+8C PLU */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2D U+05 ENQ */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x2E U+06 ACK */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
-/* 0x2F U+07 BEL */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
+/* 0x2F U+07 BEL */ 
(1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA)|(1U<<_CC_MNEMONIC_CNTRL),
 /* 0x30 U+90 DCS */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x31 U+91 PU1 */ (1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
 /* 0x32 U+16 SYN */ (1U<<_CC_ASCII)|(1U<<_CC_CNTRL)|(1U<<_CC_QUOTEMETA),
diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index 48e9c58..016c1bf 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -437,6 +437,12 @@ Stub declarations like C<sub f;> and C<sub f ();> no 
longer wipe out
 constants of the same name declared by C<use constant>.  This bug was
 introduced in perl 5.10.0.
 
+=item *
+
+Under some conditions a warning raised in compilation of regular
+expression patterns could be displayed multiple times.  This is now
+fixed.
+
 =back
 
 =head1 Known Problems
diff --git a/proto.h b/proto.h
index af28f6c..3ff1650 100644
--- a/proto.h
+++ b/proto.h
@@ -5356,6 +5356,9 @@ STATIC void       S_cv_dump(pTHX_ const CV *cv, const 
char *title)
 
 #  endif
 #  if defined(PERL_IN_REGCOMP_C)
+STATIC const char *    S_cntrl_to_mnemonic(const U8 c)
+                       __attribute__pure__;
+
 STATIC void    S_dump_trie(pTHX_ const struct _reg_trie_data *trie, HV* 
widecharmap, AV *revcharmap, U32 depth)
                        __attribute__nonnull__(pTHX_1)
                        __attribute__nonnull__(pTHX_3);
@@ -6767,7 +6770,7 @@ PERL_STATIC_INLINE IV*    
S_get_invlist_previous_index_addr(SV* invlist)
 #define PERL_ARGS_ASSERT_GET_INVLIST_PREVIOUS_INDEX_ADDR       \
        assert(invlist)
 
-STATIC bool    S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, regnode** 
nodep, UV *valuep, I32 *flagp, U32 depth, bool in_char_class, const bool strict)
+STATIC STRLEN  S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, regnode** 
nodep, UV *valuep, I32 *flagp, U32 depth, SV** substitute_parse)
                        __attribute__nonnull__(pTHX_1)
                        __attribute__nonnull__(pTHX_4);
 #define PERL_ARGS_ASSERT_GROK_BSLASH_N \
diff --git a/regcomp.c b/regcomp.c
index a1422d0..810cfef 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -570,80 +570,85 @@ static const scan_data_t zero_scan_data =
             REPORT_LOCATION_ARGS(offset));         \
 } STMT_END
 
+/* These have asserts in them because of [perl #122671] Many warnings in
+ * regcomp.c can occur twice.  If they get output in pass1 and later in that
+ * pass, the pattern has to be converted to UTF-8 and the pass restarted, they
+ * would get output again.  So they should be output in pass2, and these
+ * asserts make sure new warnings follow that paradigm. */
 
 /* m is not necessarily a "literal string", in this macro */
 #define reg_warn_non_literal_string(loc, m) STMT_START {                \
     const IV offset = loc - RExC_precomp;                               \
-    Perl_warner(aTHX_ packWARN(WARN_REGEXP), "%s" REPORT_LOCATION,      \
+    __ASSERT_(PASS2) Perl_warner(aTHX_ packWARN(WARN_REGEXP), "%s" 
REPORT_LOCATION,      \
             m, REPORT_LOCATION_ARGS(offset));       \
 } STMT_END
 
 #define        ckWARNreg(loc,m) STMT_START {                                   
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m REPORT_LOCATION,     \
+    __ASSERT_(PASS2) Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m 
REPORT_LOCATION,    \
            REPORT_LOCATION_ARGS(offset));              \
 } STMT_END
 
 #define        vWARN_dep(loc, m) STMT_START {                                  
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_warner(aTHX_ packWARN(WARN_DEPRECATED), m REPORT_LOCATION,    \
+    __ASSERT_(PASS2) Perl_warner(aTHX_ packWARN(WARN_DEPRECATED), m 
REPORT_LOCATION,   \
            REPORT_LOCATION_ARGS(offset));              \
 } STMT_END
 
 #define        ckWARNdep(loc,m) STMT_START {                                   
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_ck_warner_d(aTHX_ packWARN(WARN_DEPRECATED),                  \
+    __ASSERT_(PASS2) Perl_ck_warner_d(aTHX_ packWARN(WARN_DEPRECATED),         
        \
            m REPORT_LOCATION,                                          \
            REPORT_LOCATION_ARGS(offset));              \
 } STMT_END
 
 #define        ckWARNregdep(loc,m) STMT_START {                                
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_ck_warner_d(aTHX_ packWARN2(WARN_DEPRECATED, WARN_REGEXP),    \
+    __ASSERT_(PASS2) Perl_ck_warner_d(aTHX_ packWARN2(WARN_DEPRECATED, 
WARN_REGEXP),   \
            m REPORT_LOCATION,                                          \
            REPORT_LOCATION_ARGS(offset));              \
 } STMT_END
 
 #define        ckWARN2reg_d(loc,m, a1) STMT_START {                            
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_ck_warner_d(aTHX_ packWARN(WARN_REGEXP),                      \
+    __ASSERT_(PASS2) Perl_ck_warner_d(aTHX_ packWARN(WARN_REGEXP),             
        \
            m REPORT_LOCATION,                                          \
            a1, REPORT_LOCATION_ARGS(offset));  \
 } STMT_END
 
 #define        ckWARN2reg(loc, m, a1) STMT_START {                             
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m REPORT_LOCATION,     \
+    __ASSERT_(PASS2) Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m 
REPORT_LOCATION,    \
            a1, REPORT_LOCATION_ARGS(offset));  \
 } STMT_END
 
 #define        vWARN3(loc, m, a1, a2) STMT_START {                             
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_warner(aTHX_ packWARN(WARN_REGEXP), m REPORT_LOCATION,                
\
+    __ASSERT_(PASS2) Perl_warner(aTHX_ packWARN(WARN_REGEXP), m 
REPORT_LOCATION,               \
            a1, a2, REPORT_LOCATION_ARGS(offset));      \
 } STMT_END
 
 #define        ckWARN3reg(loc, m, a1, a2) STMT_START {                         
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m REPORT_LOCATION,     \
+    __ASSERT_(PASS2) Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m 
REPORT_LOCATION,    \
            a1, a2, REPORT_LOCATION_ARGS(offset));      \
 } STMT_END
 
 #define        vWARN4(loc, m, a1, a2, a3) STMT_START {                         
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_warner(aTHX_ packWARN(WARN_REGEXP), m REPORT_LOCATION,                
\
+    __ASSERT_(PASS2) Perl_warner(aTHX_ packWARN(WARN_REGEXP), m 
REPORT_LOCATION,               \
            a1, a2, a3, REPORT_LOCATION_ARGS(offset)); \
 } STMT_END
 
 #define        ckWARN4reg(loc, m, a1, a2, a3) STMT_START {                     
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m REPORT_LOCATION,     \
+    __ASSERT_(PASS2) Perl_ck_warner(aTHX_ packWARN(WARN_REGEXP), m 
REPORT_LOCATION,    \
            a1, a2, a3, REPORT_LOCATION_ARGS(offset)); \
 } STMT_END
 
 #define        vWARN5(loc, m, a1, a2, a3, a4) STMT_START {                     
\
     const IV offset = loc - RExC_precomp;                              \
-    Perl_warner(aTHX_ packWARN(WARN_REGEXP), m REPORT_LOCATION,                
\
+    __ASSERT_(PASS2) Perl_warner(aTHX_ packWARN(WARN_REGEXP), m 
REPORT_LOCATION,               \
            a1, a2, a3, a4, REPORT_LOCATION_ARGS(offset)); \
 } STMT_END
 
@@ -800,6 +805,33 @@ DEBUG_OPTIMISE_MORE_r(if(data){                            
          \
     PerlIO_printf(Perl_debug_log,"\n");                              \
 });
 
+#ifdef DEBUGGING
+
+/* is c a control character for which we have a mnemonic? */
+#define isMNEMONIC_CNTRL(c) _IS_MNEMONIC_CNTRL_ONLY_FOR_USE_BY_REGCOMP_DOT_C(c)
+
+STATIC const char *
+S_cntrl_to_mnemonic(const U8 c)
+{
+    /* Returns the mnemonic string that represents character 'c', if one
+     * exists; NULL otherwise.  The only ones that exist for the purposes of
+     * this routine are a few control characters */
+
+    switch (c) {
+        case '\a':       return "\\a";
+        case '\b':       return "\\b";
+        case ESC_NATIVE: return "\\e";
+        case '\f':       return "\\f";
+        case '\n':       return "\\n";
+        case '\r':       return "\\r";
+        case '\t':       return "\\t";
+    }
+
+    return NULL;
+}
+
+#endif
+
 /* Mark that we cannot extend a found fixed substring at this point.
    Update the longest found anchored substring and the longest found
    floating substrings if needed. */
@@ -5638,9 +5670,9 @@ S_pat_upgrade_to_utf8(pTHX_ RExC_state_t * const 
pRExC_state,
                    char **pat_p, STRLEN *plen_p, int num_code_blocks)
 {
     U8 *const src = (U8*)*pat_p;
-    U8 *dst;
+    U8 *dst, *d;
     int n=0;
-    STRLEN s = 0, d = 0;
+    STRLEN s = 0;
     bool do_end = 0;
     GET_RE_DEBUG_FLAGS_DECL;
 
@@ -5648,32 +5680,27 @@ S_pat_upgrade_to_utf8(pTHX_ RExC_state_t * const 
pRExC_state,
         "UTF8 mismatch! Converting to utf8 for resizing and compile\n"));
 
     Newx(dst, *plen_p * 2 + 1, U8);
+    d = dst;
 
     while (s < *plen_p) {
-        if (NATIVE_BYTE_IS_INVARIANT(src[s]))
-            dst[d]   = src[s];
-        else {
-            dst[d++] = UTF8_EIGHT_BIT_HI(src[s]);
-            dst[d]   = UTF8_EIGHT_BIT_LO(src[s]);
-        }
+        append_utf8_from_native_byte(src[s], &d);
         if (n < num_code_blocks) {
             if (!do_end && pRExC_state->code_blocks[n].start == s) {
-                pRExC_state->code_blocks[n].start = d;
-                assert(dst[d] == '(');
+                pRExC_state->code_blocks[n].start = d - dst - 1;
+                assert(*(d - 1) == '(');
                 do_end = 1;
             }
             else if (do_end && pRExC_state->code_blocks[n].end == s) {
-                pRExC_state->code_blocks[n].end = d;
-                assert(dst[d] == ')');
+                pRExC_state->code_blocks[n].end = d - dst - 1;
+                assert(*(d - 1) == ')');
                 do_end = 0;
                 n++;
             }
         }
         s++;
-        d++;
     }
-    dst[d] = '\0';
-    *plen_p = d;
+    *d = '\0';
+    *plen_p = d - dst;
     *pat_p = (char*) dst;
     SAVEFREEPV(*pat_p);
     RExC_orig_utf8 = RExC_utf8 = 1;
@@ -9353,7 +9380,7 @@ S_parse_lparen_question_flags(pTHX_ RExC_state_t 
*pRExC_state)
                 /*NOTREACHED*/
             case ONCE_PAT_MOD: /* 'o' */
             case GLOBAL_PAT_MOD: /* 'g' */
-                if (SIZE_ONLY && ckWARN(WARN_REGEXP)) {
+                if (PASS2 && ckWARN(WARN_REGEXP)) {
                     const I32 wflagbit = *RExC_parse == 'o'
                                          ? WASTED_O
                                          : WASTED_G;
@@ -9373,7 +9400,7 @@ S_parse_lparen_question_flags(pTHX_ RExC_state_t 
*pRExC_state)
                 break;
 
             case CONTINUE_PAT_MOD: /* 'c' */
-                if (SIZE_ONLY && ckWARN(WARN_REGEXP)) {
+                if (PASS2 && ckWARN(WARN_REGEXP)) {
                     if (! (wastedflags & WASTED_C) ) {
                         wastedflags |= WASTED_GC;
                        /* diag_listed_as: Useless (?-%s) - don't use /%s 
modifier in regex; marked by <-- HERE in m/%s/ */
@@ -9388,7 +9415,7 @@ S_parse_lparen_question_flags(pTHX_ RExC_state_t 
*pRExC_state)
                 break;
             case KEEPCOPY_PAT_MOD: /* 'p' */
                 if (flagsp == &negflags) {
-                    if (SIZE_ONLY)
+                    if (PASS2)
                         ckWARNreg(RExC_parse + 1,"Useless use of (?-p)");
                 } else {
                     *flagsp |= RXf_PMf_KEEPCOPY;
@@ -10520,7 +10547,6 @@ S_regpiece(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, 
U32 depth)
             if (max < min) {    /* If can't match, warn and optimize to fail
                                    unconditionally */
                 if (SIZE_ONLY) {
-                    ckWARNreg(RExC_parse, "Quantifier {n,m} with n > m can't 
match");
 
                     /* We can't back off the size because we have to reserve
                      * enough space for all the things we are about to throw
@@ -10529,6 +10555,7 @@ S_regpiece(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, 
U32 depth)
                     RExC_size = PREVOPER(RExC_size) - regarglen[(U8)OPFAIL];
                 }
                 else {
+                    ckWARNreg(RExC_parse, "Quantifier {n,m} with n > m can't 
match");
                     RExC_emit = orig_emit;
                 }
                 ret = reg_node(pRExC_state, OPFAIL);
@@ -10538,7 +10565,7 @@ S_regpiece(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, 
U32 depth)
                      && RExC_parse < RExC_end
                      && (*RExC_parse == '?' || *RExC_parse == '+'))
             {
-                if (SIZE_ONLY) {
+                if (PASS2) {
                     ckWARN2reg(RExC_parse + 1,
                                "Useless use of greediness modifier '%c'",
                                *RExC_parse);
@@ -10684,10 +10711,9 @@ S_regpiece(pTHX_ RExC_state_t *pRExC_state, I32 
*flagp, U32 depth)
     return(ret);
 }
 
-STATIC bool
+STATIC STRLEN
 S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, regnode** node_p,
-                      UV *valuep, I32 *flagp, U32 depth, bool in_char_class,
-                      const bool strict   /* Apply stricter parsing rules? */
+                      UV *valuep, I32 *flagp, U32 depth, SV** substitute_parse
     )
 {
 
@@ -10695,46 +10721,75 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
    and needs to handle the rest. RExC_parse is expected to point at the first
    char following the N at the time of the call.  On successful return,
    RExC_parse has been updated to point to just after the sequence identified
-   by this routine, and <*flagp> has been updated.
-
-   The \N may be inside (indicated by the boolean <in_char_class>) or outside a
-   character class.
-
-   \N may begin either a named sequence, or if outside a character class, mean
-   to match a non-newline.  For non single-quoted regexes, the tokenizer has
-   attempted to decide which, and in the case of a named sequence, converted it
+   by this routine, <*flagp> has been updated, and the non-NULL input pointers
+   have been set appropriately.
+
+   The typical case for this is \N{some character name}.  This is usually
+   called while parsing the input, filling in or ready to fill in an EXACTish
+   node, and the code point for the character should be returned, so that it
+   can be added to the node, and parsing continued with the next input
+   character.  But it may be that instead of a single character the \N{}
+   expands to more than one, a named sequence.  In this case any following
+   quantifier applies to the whole sequence, and it is easier, given the code
+   structure that calls this, to handle it from a different area of the code.
+   For this reason, the input parameters can be set so that it returns valid
+   only on one or the other of these cases.
+
+   Another possibility is for the input to be an empty \N{}, which for
+   backwards compatibility we accept, but generate a NOTHING node which should
+   later get optimized out.  This is handled from the area of code which can
+   handle a named sequence, so if called with the parameters for the other, it
+   fails.
+
+   Still another possibility is for the \N to mean [^\n], and not a single
+   character or explicit sequence at all.  This is determined by context.
+   Again, this is handled from the area of code which can handle a named
+   sequence, so if called with the parameters for the other, it also fails.
+
+   And the final possibility is for the \N to be called from within a bracketed
+   character class.  In this case the [^\n] meaning makes no sense, and so is
+   an error.  Other anomalous situations are left to the calling code to 
handle.
+
+   For non-single-quoted regexes, the tokenizer has attempted to decide which
+   of the above applies, and in the case of a named sequence, has converted it
    into one of the forms: \N{} (if the sequence is null), or \N{U+c1.c2...},
    where c1... are the characters in the sequence.  For single-quoted regexes,
    the tokenizer passes the \N sequence through unchanged; this code will not
    attempt to determine this nor expand those, instead raising a syntax error.
    The net effect is that if the beginning of the passed-in pattern isn't '{U+'
    or there is no '}', it signals that this \N occurrence means to match a
-   non-newline.
+   non-newline. (This mostly was done because of [perl #56444].)
 
-   Only the \N{U+...} form should occur in a character class, for the same
-   reason that '.' inside a character class means to just match a period: it
-   just doesn't make sense.
+   The API is somewhat convoluted due to historical and the above reasons.
 
    The function raises an error (via vFAIL), and doesn't return for various
-   syntax errors.  Otherwise it returns TRUE and sets <node_p> or <valuep> on
-   success; it returns FALSE otherwise. Returns FALSE, setting *flagp to
-   RESTART_UTF8 if the sizing scan needs to be restarted. Such a restart is
-   only possible if node_p is non-NULL.
-
+   syntax errors.  For other failures, it returns (STRLEN) -1.  For successes,
+   it returns a count of how many characters were accounted for by it.  (This
+   can be 0 for \N{}; 1 for it meaning [^\n]; and otherwise the number of code
+   points in the sequence.  It sets <node_p>, <valuep>, and/or
+   <substitute_parse> on success.
 
    If <valuep> is non-null, it means the caller can accept an input sequence
-   consisting of a just a single code point; <*valuep> is set to that value
-   if the input is such.
-
-   If <node_p> is non-null it signifies that the caller can accept any other
-   legal sequence (i.e., one that isn't just a single code point).  <*node_p>
-   is set as follows:
-    1) \N means not-a-NL: points to a newly created REG_ANY node;
-    2) \N{}:              points to a new NOTHING node;
+   consisting of a just a single code point; <*valuep> is set to the value
+   of the only or first code point in the input.
+
+   If <substitute_parse> is non-null, it means the caller can accept an input
+   sequence consisting of one or more code points; <*substitute_parse> is a
+   newly created mortal SV* in this case, containing \x{} escapes representing
+   those code points.
+
+   Both <valuep> and <substitute_parse> can be non-NULL.
+
+   If <node_p> is non-null, <substitute_parse> must be NULL.  This signifies
+   that the caller can accept any legal sequence other than a single code
+   point.  To wit, <*node_p> is set as follows:
+    1) \N means not-a-NL: points to a newly created REG_ANY node; return is 1
+    2) \N{}:              points to a new NOTHING node; return is 0
     3) otherwise:         points to a new EXACT node containing the resolved
-                          string.
-   Note that FALSE is returned for single code point sequences if <valuep> is
-   null.
+                          string; return is the number of code points in the
+                          string.  This will never be 1.
+   Note that failure is returned for single code point sequences if <valuep> is
+   null and <node_p> is not.
  */
 
     char * endbrace;    /* '}' following the name */
@@ -10743,6 +10798,8 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
                            stream */
     bool has_multiple_chars; /* true if the input stream contains a sequence of
                                 more than one character */
+    bool in_char_class = substitute_parse != NULL;
+    STRLEN count = 0;   /* Number of characters in this sequence */
 
     GET_RE_DEBUG_FLAGS_DECL;
 
@@ -10751,6 +10808,7 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
     GET_RE_DEBUG_FLAGS;
 
     assert(cBOOL(node_p) ^ cBOOL(valuep));  /* Exactly one should be set */
+    assert(! (node_p && substitute_parse)); /* At most 1 should be set */
 
     /* The [^\n] meaning of \N ignores spaces and comments under the /x
      * modifier.  The other meaning does not, so use a temporary until we find
@@ -10769,7 +10827,7 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
             if (in_char_class) {
                 vFAIL("\\N in a character class must be a named character: 
\\N{...}");
             }
-            return FALSE;
+            return (STRLEN) -1;
         }
         RExC_parse--;   /* Need to back off so nextchar() doesn't skip the
                            current char */
@@ -10778,7 +10836,7 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
        *flagp |= HASWIDTH|SIMPLE;
        RExC_naughty++;
         Set_Node_Length(*node_p, 1); /* MJD */
-       return TRUE;
+       return 1;
     }
 
     /* Here, we have decided it should be a named character or sequence */
@@ -10805,28 +10863,14 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
     }
 
     if (endbrace == RExC_parse) {   /* empty: \N{} */
-        bool ret = TRUE;
        if (node_p) {
            *node_p = reg_node(pRExC_state,NOTHING);
        }
-        else if (in_char_class) {
-            if (SIZE_ONLY && in_char_class) {
-                if (strict) {
-                    RExC_parse++;   /* Position after the "}" */
-                    vFAIL("Zero length \\N{}");
-                }
-                else {
-                    ckWARNreg(RExC_parse,
-                              "Ignoring zero length \\N{} in character class");
-                }
-            }
-            ret = FALSE;
-       }
-        else {
-            return FALSE;
+        else if (! in_char_class) {
+            return (STRLEN) -1;
         }
         nextchar(pRExC_state);
-        return ret;
+        return 0;
     }
 
     RExC_uni_semantics = 1; /* Unicode named chars imply Unicode semantics */
@@ -10838,90 +10882,103 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
      * point, and is terminated by the brace */
     has_multiple_chars = (endchar < endbrace);
 
-    if (valuep && (! has_multiple_chars || in_char_class)) {
-       /* We only pay attention to the first char of
-        multichar strings being returned in char classes. I kinda wonder
-       if this makes sense as it does change the behaviour
-       from earlier versions, OTOH that behaviour was broken
-       as well. XXX Solution is to recharacterize as
-       [rest-of-class]|multi1|multi2... */
-
+    /* We get the first code point if we want it, and either there is only one,
+     * or we can accept both cases of one and more than one */
+    if (valuep && (substitute_parse || ! has_multiple_chars)) {
        STRLEN length_of_hex = (STRLEN)(endchar - RExC_parse);
        I32 grok_hex_flags = PERL_SCAN_ALLOW_UNDERSCORES
-           | PERL_SCAN_DISALLOW_PREFIX
-           | (SIZE_ONLY ? PERL_SCAN_SILENT_ILLDIGIT : 0);
+                           | PERL_SCAN_DISALLOW_PREFIX
+
+                             /* No errors in the first pass (See [perl
+                              * #122671].)  We let the code below find the
+                              * errors when there are multiple chars. */
+                           | ((SIZE_ONLY || has_multiple_chars)
+                              ? PERL_SCAN_SILENT_ILLDIGIT
+                              : 0);
 
        *valuep = grok_hex(RExC_parse, &length_of_hex, &grok_hex_flags, NULL);
 
        /* The tokenizer should have guaranteed validity, but it's possible to
-        * bypass it by using single quoting, so check */
-       if (length_of_hex == 0
-           || length_of_hex != (STRLEN)(endchar - RExC_parse) )
-       {
-           RExC_parse += length_of_hex;        /* Includes all the valid */
-           RExC_parse += (RExC_orig_utf8)      /* point to after 1st invalid */
-                           ? UTF8SKIP(RExC_parse)
-                           : 1;
-           /* Guard against malformed utf8 */
-           if (RExC_parse >= endchar) {
-                RExC_parse = endchar;
+         * bypass it by using single quoting, so check.  Don't do the check
+         * here when there are multiple chars; we do it below anyway. */
+        if (! has_multiple_chars) {
+            if (length_of_hex == 0
+                || length_of_hex != (STRLEN)(endchar - RExC_parse) )
+            {
+                RExC_parse += length_of_hex;   /* Includes all the valid */
+                RExC_parse += (RExC_orig_utf8) /* point to after 1st invalid */
+                                ? UTF8SKIP(RExC_parse)
+                                : 1;
+                /* Guard against malformed utf8 */
+                if (RExC_parse >= endchar) {
+                    RExC_parse = endchar;
+                }
+                vFAIL("Invalid hexadecimal number in \\N{U+...}");
             }
-           vFAIL("Invalid hexadecimal number in \\N{U+...}");
-       }
 
-        if (in_char_class && has_multiple_chars) {
-            if (strict) {
-                RExC_parse = endbrace;
-                vFAIL("\\N{} in character class restricted to one character");
-            }
-            else {
-                ckWARNreg(endchar, "Using just the first character returned by 
\\N{} in character class");
-            }
+            RExC_parse = endbrace + 1;
+            return 1;
         }
-
-        RExC_parse = endbrace + 1;
     }
-    else if (! node_p || ! has_multiple_chars) {
 
-        /* Here, the input is legal, but not according to the caller's
-         * options.  We fail without advancing the parse, so that the
-         * caller can try again */
+    /* Here, we should have already handled the case where a single character
+     * is expected and found.  So it is a failure if we aren't expecting
+     * multiple chars and got them; or didn't get them but wanted them.  We
+     * fail without advancing the parse, so that the caller can try again with
+     * different acceptance criteria */
+    if ((! node_p && ! substitute_parse) || ! has_multiple_chars) {
         RExC_parse = p;
-        return FALSE;
+        return (STRLEN) -1;
     }
-    else {
+
+    {
 
        /* What is done here is to convert this to a sub-pattern of the form
-        * (?:\x{char1}\x{char2}...)
-        * and then call reg recursively.  That way, it retains its atomicness,
-        * while not having to worry about special handling that some code
-        * points may have.  toke.c has converted the original Unicode values
-        * to native, so that we can just pass on the hex values unchanged.  We
-        * do have to set a flag to keep recoding from happening in the
-        * recursion */
-
-       SV * substitute_parse = newSVpvn_flags("?:", 2, SVf_UTF8|SVs_TEMP);
+        * \x{char1}\x{char2}...
+         * and then either return it in <*substitute_parse> if non-null; or
+         * call reg recursively to parse it (enclosing in "(?: ... )" ).  That
+         * way, it retains its atomicness, while not having to worry about
+         * special handling that some code points may have.  toke.c has
+         * converted the original Unicode values to native, so that we can just
+         * pass on the hex values unchanged.  We do have to set a flag to keep
+         * recoding from happening in the recursion */
+
+       SV * dummy = NULL;
        STRLEN len;
        char *orig_end = RExC_end;
         I32 flags;
 
+        if (substitute_parse) {
+            *substitute_parse = newSVpvs("");
+        }
+        else {
+            substitute_parse = &dummy;
+            *substitute_parse = newSVpvs("?:");
+        }
+        *substitute_parse = sv_2mortal(*substitute_parse);
+
        while (RExC_parse < endbrace) {
 
            /* Convert to notation the rest of the code understands */
-           sv_catpv(substitute_parse, "\\x{");
-           sv_catpvn(substitute_parse, RExC_parse, endchar - RExC_parse);
-           sv_catpv(substitute_parse, "}");
+           sv_catpv(*substitute_parse, "\\x{");
+           sv_catpvn(*substitute_parse, RExC_parse, endchar - RExC_parse);
+           sv_catpv(*substitute_parse, "}");
 
            /* Point to the beginning of the next character in the sequence. */
            RExC_parse = endchar + 1;
            endchar = RExC_parse + strcspn(RExC_parse, ".}");
+
+            count++;
        }
-       sv_catpv(substitute_parse, ")");
+        if (! in_char_class) {
+            sv_catpv(*substitute_parse, ")");
+        }
 
-       RExC_parse = SvPV(substitute_parse, len);
+       RExC_parse = SvPV(*substitute_parse, len);
 
        /* Don't allow empty number */
-       if (len < 8) {
+       if (len < (STRLEN) ((substitute_parse) ? 6 : 8)) {
+            RExC_parse = endbrace;
            vFAIL("Invalid hexadecimal number in \\N{U+...}");
        }
        RExC_end = RExC_parse + len;
@@ -10929,15 +10986,17 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
        /* The values are Unicode, and therefore not subject to recoding */
        RExC_override_recoding = 1;
 
-       if (!(*node_p = reg(pRExC_state, 1, &flags, depth+1))) {
-            if (flags & RESTART_UTF8) {
-                *flagp = RESTART_UTF8;
-                return FALSE;
+        if (node_p) {
+            if (!(*node_p = reg(pRExC_state, 1, &flags, depth+1))) {
+                if (flags & RESTART_UTF8) {
+                    *flagp = RESTART_UTF8;
+                    return (STRLEN) -1;
+                }
+                FAIL2("panic: reg returned NULL to grok_bslash_N, 
flags=%#"UVxf"",
+                    (UV) flags);
             }
-            FAIL2("panic: reg returned NULL to grok_bslash_N, flags=%#"UVxf"",
-                  (UV) flags);
+            *flagp |= flags&(HASWIDTH|SPSTART|SIMPLE|POSTPONED);
         }
-       *flagp |= flags&(HASWIDTH|SPSTART|SIMPLE|POSTPONED);
 
        RExC_parse = endbrace;
        RExC_end = orig_end;
@@ -10946,7 +11005,7 @@ S_grok_bslash_N(pTHX_ RExC_state_t *pRExC_state, 
regnode** node_p,
         nextchar(pRExC_state);
     }
 
-    return TRUE;
+    return count;
 }
 
 
@@ -11052,7 +11111,7 @@ S_alloc_maybe_populate_EXACT(pTHX_ RExC_state_t 
*pRExC_state,
                 if (LOC || ! FOLD) {    /* /l defers folding until runtime */
                     *character = (U8) code_point;
                 }
-                else { /* Here is /i and not /l (toFOLD() is defined on just
+                else { /* Here is /i and not /l. (toFOLD() is defined on just
                           ASCII, which isn't the same thing as INVARIANT on
                           EBCDIC, but it works there, as the extra invariants
                           fold to themselves) */
@@ -11083,7 +11142,10 @@ S_alloc_maybe_populate_EXACT(pTHX_ RExC_state_t 
*pRExC_state,
                                                       ? FOLD_FLAGS_NOMIX_ASCII
                                                       : 0));
                 if (downgradable
-                    && folded == code_point
+                    && folded == code_point /* This quickly rules out many
+                                               cases, avoiding the
+                                               _invlist_contains_cp() overhead
+                                               for those.  */
                     && ! _invlist_contains_cp(PL_utf8_foldable, code_point))
                 {
                     OP(node) = EXACT;
@@ -11403,7 +11465,7 @@ tryagain:
            ret = reg_node(pRExC_state, CANY);
             RExC_seen |= REG_CANY_SEEN;
            *flagp |= HASWIDTH|SIMPLE;
-            if (SIZE_ONLY) {
+            if (PASS2) {
                 ckWARNdep(RExC_parse+1, "\\C is deprecated");
             }
            goto finish_meta_pat;
@@ -11559,8 +11621,9 @@ tryagain:
              * special treatment for quantifiers is not needed for such single
              * character sequences */
             ++RExC_parse;
-            if (! grok_bslash_N(pRExC_state, &ret, NULL, flagp, depth, FALSE,
-                                FALSE /* not strict */ )) {
+            if ((STRLEN) -1 == grok_bslash_N(pRExC_state, &ret, NULL, flagp,
+                                             depth, FALSE))
+            {
                 if (*flagp & RESTART_UTF8)
                     return NULL;
                 RExC_parse--;
@@ -11861,10 +11924,12 @@ tryagain:
                          * point sequence.  Handle those in the switch() above
                          * */
                         RExC_parse = p + 1;
-                        if (! grok_bslash_N(pRExC_state, NULL, &ender,
-                                            flagp, depth, FALSE,
-                                            FALSE /* not strict */ ))
-                        {
+                        if ((STRLEN) -1 == grok_bslash_N(pRExC_state, NULL,
+                                                         &ender,
+                                                         flagp,
+                                                         depth,
+                                                         FALSE
+                        )) {
                             if (*flagp & RESTART_UTF8)
                                 FAIL("panic: grok_bslash_N set RESTART_UTF8");
                             RExC_parse = p = oldp;
@@ -11903,7 +11968,7 @@ tryagain:
                            bool valid = grok_bslash_o(&p,
                                                       &result,
                                                       &error_msg,
-                                                      TRUE, /* out warnings */
+                                                      PASS2, /* out warnings */
                                                        FALSE, /* not strict */
                                                        TRUE, /* Output warnings
                                                                 for non-
@@ -11932,7 +11997,7 @@ tryagain:
                            bool valid = grok_bslash_x(&p,
                                                       &result,
                                                       &error_msg,
-                                                      TRUE, /* out warnings */
+                                                      PASS2, /* out warnings */
                                                        FALSE, /* not strict */
                                                        TRUE, /* Output warnings
                                                                 for non-
@@ -11955,7 +12020,7 @@ tryagain:
                        }
                    case 'c':
                        p++;
-                       ender = grok_bslash_c(*p++, SIZE_ONLY);
+                       ender = grok_bslash_c(*p++, PASS2);
                        break;
                     case '8': case '9': /* must be a backreference */
                         --p;
@@ -11994,7 +12059,7 @@ tryagain:
                                REQUIRE_UTF8;
                            }
                            p += numlen;
-                            if (SIZE_ONLY   /* like \08, \178 */
+                            if (PASS2   /* like \08, \178 */
                                 && numlen < 3
                                 && p < RExC_end
                                 && isDIGIT(*p) && ckWARN(WARN_REGEXP))
@@ -12011,7 +12076,7 @@ tryagain:
                        if (! RExC_override_recoding) {
                            SV* enc = PL_encoding;
                            ender = reg_recode((const char)(U8)ender, &enc);
-                           if (!enc && SIZE_ONLY)
+                           if (!enc && PASS2)
                                ckWARNreg(p, "Invalid escape in the specified 
encoding");
                            REQUIRE_UTF8;
                        }
@@ -12745,9 +12810,7 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, 
SV** return_invlist,
      * upon an unescaped ']' that isn't one ending a regclass.  To do both
      * these things, we need to realize that something preceded by a backslash
      * is escaped, so we have to keep track of backslashes */
-    if (SIZE_ONLY) {
-        UV depth = 0; /* how many nested (?[...]) constructs */
-
+    if (PASS2) {
         Perl_ck_warner_d(aTHX_
             packWARN(WARN_EXPERIMENTAL__REGEX_SETS),
             "The regex_sets feature is experimental" REPORT_LOCATION,
@@ -12755,6 +12818,9 @@ S_handle_regex_sets(pTHX_ RExC_state_t *pRExC_state, 
SV** return_invlist,
                 UTF8fARG(UTF,
                          RExC_end - RExC_start - (RExC_parse - RExC_precomp),
                          RExC_precomp + (RExC_parse - RExC_precomp)));
+    }
+    else {
+        UV depth = 0; /* how many nested (?[...]) constructs */
 
         while (RExC_parse < RExC_end) {
             SV* current = NULL;
@@ -13255,7 +13321,9 @@ S_add_above_Latin1_folds(pTHX_ RExC_state_t 
*pRExC_state, const U8 cp, SV** invl
         default:
             /* Use deprecated warning to increase the chances of this being
              * output */
-            ckWARN2reg_d(RExC_parse, "Perl folding rules are not up-to-date 
for 0x%02X; please use the perlbug utility to report;", cp);
+            if (PASS2) {
+                ckWARN2reg_d(RExC_parse, "Perl folding rules are not 
up-to-date for 0x%02X; please use the perlbug utility to report;", cp);
+            }
             break;
     }
 }
@@ -13442,7 +13510,6 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, 
U32 depth,
     if (UCHARAT(RExC_parse) == ']')
        goto charclassloop;
 
-parseit:
     while (1) {
         if  (RExC_parse >= stop_ptr) {
             break;
@@ -13522,19 +13589,51 @@ parseit:
            case 'H':   namedclass = ANYOF_NHORIZWS;    break;
             case 'N':  /* Handle \N{NAME} in class */
                 {
-                    /* We only pay attention to the first char of
-                    multichar strings being returned. I kinda wonder
-                    if this makes sense as it does change the behaviour
-                    from earlier versions, OTOH that behaviour was broken
-                    as well. */
-                    if (! grok_bslash_N(pRExC_state, NULL, &value, flagp, 
depth,
-                                      TRUE, /* => charclass */
-                                      strict))
-                    {
-                        if (*flagp & RESTART_UTF8)
-                            FAIL("panic: grok_bslash_N set RESTART_UTF8");
-                        goto parseit;
+                    SV *as_text;
+                    STRLEN cp_count = grok_bslash_N(pRExC_state, NULL, &value,
+                                                    flagp, depth, &as_text);
+                    if (*flagp & RESTART_UTF8)
+                        FAIL("panic: grok_bslash_N set RESTART_UTF8");
+                    if (cp_count != 1) {    /* The typical case drops through 
*/
+                        assert(cp_count != (STRLEN) -1);
+                        if (cp_count == 0) {
+                            if (strict) {
+                                RExC_parse++;   /* Position after the "}" */
+                                vFAIL("Zero length \\N{}");
+                            }
+                            else if (PASS2) {
+                                ckWARNreg(RExC_parse,
+                                        "Ignoring zero length \\N{} in 
character class");
+                            }
+                        }
+                        else { /* cp_count > 1 */
+                            /* We only pay attention to the first char of
+                             * multichar strings being returned in char
+                             * classes. I kinda wonder if this makes sense as
+                             * it does change the behaviour from earlier
+                             * versions, OTOH that behaviour was broken as
+                             * well. XXX Solution is to recharacterize as
+                             * [rest-of-class]|multi1|multi2...  */
+                                    if (strict) {
+                                        RExC_parse--;
+                                        vFAIL("\\N{} in character class 
restricted to one character");
+                                    }
+                                    else if (PASS2) {
+                                        ckWARNreg(RExC_parse, "Using just the 
first character returned by \\N{} in character class");
+                                    }
+                                break; /* <value> contains the first code
+                                          point. Drop out of the switch to
+                                          process it */
+                        } /* End of cp_count != 1 */
+
+                        /* This element should not be processed further in this
+                         * class */
+                        element_count--;
+                        value = save_value;
+                        prevvalue = save_prevvalue;
+                        continue;   /* Back to top of loop to get next char */
                     }
+                    /* Here, is a single code point, and <value> contains it */
                 }
                 break;
            case 'p':
@@ -13723,8 +13822,8 @@ parseit:
                    bool valid = grok_bslash_o(&RExC_parse,
                                               &value,
                                               &error_msg,
-                                               SIZE_ONLY,   /* warnings in pass
-                                                               1 only */
+                                               PASS2,   /* warnings only in
+                                                           pass 2 */
                                                strict,
                                                silence_non_portable,
                                                UTF);
@@ -13743,7 +13842,7 @@ parseit:
                    bool valid = grok_bslash_x(&RExC_parse,
                                               &value,
                                               &error_msg,
-                                              TRUE, /* Output warnings */
+                                              PASS2, /* Output warnings */
                                                strict,
                                                silence_non_portable,
                                                UTF);
@@ -13755,7 +13854,7 @@ parseit:
                    goto recode_encoding;
                break;
            case 'c':
-               value = grok_bslash_c(*RExC_parse++, SIZE_ONLY);
+               value = grok_bslash_c(*RExC_parse++, PASS2);
                break;
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7':
@@ -13795,7 +13894,7 @@ parseit:
                         if (strict) {
                             vFAIL("Invalid escape in the specified encoding");
                         }
-                        else if (SIZE_ONLY) {
+                        else if (PASS2) {
                             ckWARNreg(RExC_parse,
                                  "Invalid escape in the specified encoding");
                         }
@@ -13973,22 +14072,23 @@ parseit:
                                                      namedclass % 2 != 0,
                                                      posixes_ptr);
                 }
-                continue;   /* Go get next character */
            }
        } /* end of namedclass \blah */
 
-        /* Here, we have a single value.  If 'range' is set, it is the ending
-         * of a range--check its validity.  Later, we will handle each
-         * individual code point in the range.  If 'range' isn't set, this
-         * could be the beginning of a range, so check for that by looking
-         * ahead to see if the next real character to be processed is the range
-         * indicator--the minus sign */
-
         if (skip_white) {
             RExC_parse = regpatws(pRExC_state, RExC_parse,
                                 FALSE /* means don't recognize comments */ );
         }
 
+        /* If 'range' is set, 'value' is the ending of a range--check its
+         * validity.  (If value isn't a single code point in the case of a
+         * range, we should have figured that out above in the code that
+         * catches false ranges).  Later, we will handle each individual code
+         * point in the range.  If 'range' isn't set, this could be the
+         * beginning of a range, so check for that by looking ahead to see if
+         * the next real character to be processed is the range indicator--the
+         * minus sign */
+
        if (range) {
            if (prevvalue > value) /* b-a */ {
                const int w = RExC_parse - rangebegin;
@@ -14018,15 +14118,15 @@ parseit:
 
                     /* a bad range like \w-, [:word:]- ? */
                     if (namedclass > OOB_NAMEDCLASS) {
-                        if (strict || ckWARN(WARN_REGEXP)) {
-                            const int w =
-                                RExC_parse >= rangebegin ?
-                                RExC_parse - rangebegin : 0;
+                        if (strict || (PASS2 && ckWARN(WARN_REGEXP))) {
+                            const int w = RExC_parse >= rangebegin
+                                          ?  RExC_parse - rangebegin
+                                          : 0;
                             if (strict) {
                                 vFAIL4("False [] range \"%*.*s\"",
                                     w, w, rangebegin);
                             }
-                            else {
+                            else if (PASS2) {
                                 vWARN4(RExC_parse,
                                     "False [] range \"%*.*s\"",
                                     w, w, rangebegin);
@@ -14043,8 +14143,12 @@ parseit:
            }
        }
 
-        /* Here, <prevvalue> is the beginning of the range, if any; or <value>
-         * if not */
+        if (namedclass > OOB_NAMEDCLASS) {
+            continue;
+        }
+
+        /* Here, we have a single value, and <prevvalue> is the beginning of
+         * the range, if any; or <value> if not */
 
        /* non-Latin1 code point implies unicode semantics.  Must be set in
         * pass1 so is there for the whole of pass 2 */
@@ -16687,13 +16791,6 @@ Perl_save_re_context(pTHX)
 #endif
 
 #ifdef DEBUGGING
-
-/* Given that c is a control character, is it one for which we have a
- * mnemonic? */
-#define isMNEMONIC_CNTRL(c) ((isSPACE_A(c) && (c) != '\v')  \
-                          || (c) == '\a'                    \
-                          || (c) == '\b'                    \
-                          || (c) == ESC_NATIVE)
 /* Certain characters are output as a sequence with the first being a
  * backslash. */
 #define isBACKSLASHED_PUNCT(c)                                              \
@@ -16714,15 +16811,12 @@ S_put_code_point(pTHX_ SV *sv, UV c)
        sv_catpvn(sv, &string, 1);
     }
     else {
-        switch ((U8) c) {
-            case '\a': Perl_sv_catpvf(aTHX_ sv, "\\a"); break;
-            case '\b': Perl_sv_catpvf(aTHX_ sv, "\\b"); break;
-            case ESC_NATIVE: Perl_sv_catpvf(aTHX_ sv, "\\e"); break;
-            case '\f': Perl_sv_catpvf(aTHX_ sv, "\\f"); break;
-            case '\n': Perl_sv_catpvf(aTHX_ sv, "\\n"); break;
-            case '\r': Perl_sv_catpvf(aTHX_ sv, "\\r"); break;
-            case '\t': Perl_sv_catpvf(aTHX_ sv, "\\t"); break;
-            default: Perl_sv_catpvf(aTHX_ sv, "\\x{%02X}", (U8) c); break;
+        const char * const mnemonic = cntrl_to_mnemonic((char) c);
+        if (mnemonic) {
+            Perl_sv_catpvf(aTHX_ sv, "%s", mnemonic);
+        }
+        else {
+            Perl_sv_catpvf(aTHX_ sv, "\\x{%02X}", (U8) c);
         }
     }
 }
@@ -16914,7 +17008,7 @@ S_put_charclass_bitmap_innards(pTHX_ SV *sv, char 
*bitmap, SV** bitmap_invlist)
      * ASCII puncts are set, including an extra amount for the backslashed
      * ones.  */
     for (i = 0; i < NUM_ANYOF_CODE_POINTS; i++) {
-        if (BITMAP_TEST((U8 *) bitmap,i)) {
+        if (BITMAP_TEST(bitmap, i)) {
             *invlist_ptr = add_cp_to_invlist(*invlist_ptr, i);
             if (isPUNCT_A(i)) {
                 punct_count++;
diff --git a/regen/mk_PL_charclass.pl b/regen/mk_PL_charclass.pl
index aaefb46..8a682dc 100644
--- a/regen/mk_PL_charclass.pl
+++ b/regen/mk_PL_charclass.pl
@@ -47,6 +47,7 @@ my @properties = qw(
     XDIGIT
     VERTSPACE
     IS_IN_SOME_FOLD
+    MNEMONIC_CNTRL
 );
 
 # Read in the case fold mappings.
@@ -235,6 +236,9 @@ for my $ord (0..255) {
             $re = qr/\p{Is_Non_Final_Fold}/;
         } elsif ($name eq 'IS_IN_SOME_FOLD') {
             $re = qr/\p{_Perl_Any_Folds}/;
+        } elsif ($name eq 'MNEMONIC_CNTRL') {
+            # These are the control characters that there are mnemonics for
+            $re = qr/[\a\b\e\f\n\r\t]/;
         } else {    # The remainder have the same name and values as Unicode
             $re = eval "qr/\\p{$name}/";
             use Carp;
diff --git a/t/lib/warnings/regcomp b/t/lib/warnings/regcomp
index b55959e..f62f5f1 100644
--- a/t/lib/warnings/regcomp
+++ b/t/lib/warnings/regcomp
@@ -1,5 +1,6 @@
   regcomp.c    These tests have been moved to t/re/reg_mesg.t
-               except for those that explicitly test line numbers.
+               except for those that explicitly test line numbers
+                and those that don't have a <-- HERE in them.
 
 __END__
 use warnings 'regexp';
@@ -7,3 +8,25 @@ $r=qr/(??{ q"\\b+" })/;
 "a" =~ /a$r/; # warning should come from this line
 EXPECT
 \b+ matches null string many times in regex; marked by <-- HERE in m/\b+ <-- 
HERE / at - line 3.
+########
+# regcomp.c
+use warnings 'digit' ;
+my $a = qr/\o{1238456}\x{100}/;
+my $a = qr/[\o{6548321}]\x{100}/;
+no warnings 'digit' ;
+my $a = qr/\o{1238456}\x{100}/;
+my $a = qr/[\o{6548321}]\x{100}/;
+EXPECT
+Non-octal character '8'.  Resolved as "\o{123}" at - line 3.
+Non-octal character '8'.  Resolved as "\o{654}" at - line 4.
+########
+# regcomp.c.c
+use warnings;
+$a = qr/\c,/;
+$a = qr/[\c,]/;
+no warnings 'syntax';
+$a = qr/\c,/;
+$a = qr/[\c,]/;
+EXPECT
+"\c," is more clearly written simply as "l" at - line 3.
+"\c," is more clearly written simply as "l" at - line 4.
diff --git a/t/re/reg_mesg.t b/t/re/reg_mesg.t
index d8f47e1..9d75e39 100644
--- a/t/re/reg_mesg.t
+++ b/t/re/reg_mesg.t
@@ -320,20 +320,23 @@ push @death, @death_utf8;
 # In the following arrays of warnings, the value can be an array of things to
 # expect.  If the empty string, it means no warning should be raised.
 
-##
-## Key-value pairs of code/error of code that should have non-fatal regexp 
warnings.
-##
-my @warning = (
-    'm/\b*/' => '\b* matches null string many times {#} m/\b*{#}/',
-    'm/[:blank:]/' => 'POSIX syntax [: :] belongs inside character classes {#} 
m/[:blank:]{#}/',
 
-    "m'[\\y]'"     => 'Unrecognized escape \y in character class passed 
through {#} m/[\y{#}]/',
+# Key-value pairs of code/error of code that should have non-fatal regexp
+# warnings.  Most currently have \x{100} appended to them to force them to be
+# upgraded to UTF-8, and the first pass restarted.  Previously this would
+# cause some warnings to be output twice.  This tests that that behavior has
+# been fixed.
 
-    'm/[a-\d]/' => 'False [] range "a-\d" {#} m/[a-\d{#}]/',
-    'm/[\w-x]/' => 'False [] range "\w-" {#} m/[\w-{#}x]/',
-    'm/[a-\pM]/' => 'False [] range "a-\pM" {#} m/[a-\pM{#}]/',
-    'm/[\pM-x]/' => 'False [] range "\pM-" {#} m/[\pM-{#}x]/',
-    "m'\\y'"     => 'Unrecognized escape \y passed through {#} m/\y{#}/',
+my @warning = (
+    'm/\b*\x{100}/' => '\b* matches null string many times {#} 
m/\b*{#}\x{100}/',
+    'm/[:blank:]\x{100}/' => 'POSIX syntax [: :] belongs inside character 
classes {#} m/[:blank:]{#}\x{100}/',
+    "m'[\\y]\\x{100}'"     => 'Unrecognized escape \y in character class 
passed through {#} m/[\y{#}]\x{100}/',
+    'm/[a-\d]\x{100}/' => 'False [] range "a-\d" {#} m/[a-\d{#}]\x{100}/',
+    'm/[\w-x]\x{100}/' => 'False [] range "\w-" {#} m/[\w-{#}x]\x{100}/',
+    'm/[a-\pM]\x{100}/' => 'False [] range "a-\pM" {#} m/[a-\pM{#}]\x{100}/',
+    'm/[\pM-x]\x{100}/' => 'False [] range "\pM-" {#} m/[\pM-{#}x]\x{100}/',
+    'm/[\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}]/' => 'Using just the 
first character returned by \N{} in character class {#} m/[\N{U+100.300}{#}]/',
+    "m'\\y\\x{100}'"     => 'Unrecognized escape \y passed through {#} 
m/\y{#}\x{100}/',
     '/x{3,1}/'   => 'Quantifier {n,m} with n > m can\'t match {#} 
m/x{3,1}{#}/',
     '/\08/' => '\'\08\' resolved to \'\o{0}8\' {#} m/\08{#}/',
     '/\018/' => '\'\018\' resolved to \'\o{1}8\' {#} m/\018{#}/',
@@ -342,52 +345,59 @@ my @warning = (
     '/(?=a)*/' => '(?=a)* matches null string many times {#} m/(?=a)*{#}/',
     'my $x = \'\m\'; qr/a$x/' => 'Unrecognized escape \m passed through {#} 
m/a\m{#}/',
     '/\q/' => 'Unrecognized escape \q passed through {#} m/\q{#}/',
-    '/(?=a){1,3}/' => 'Quantifier unexpected on zero-length expression {#} 
m/(?=a){1,3}{#}/',
-    '/(a|b)(?=a){3}/' => 'Quantifier unexpected on zero-length expression {#} 
m/(a|b)(?=a){3}{#}/',
+
+    # Feel free to modify these 2 tests, should they start failing because the
+    # marker of where the problem is becomes wrong.  The current behavior is
+    # bad, always marking at the very end of the regex instead of where the
+    # problem is.  See [perl #122680] regcomp warning gives wrong position of
+    # problem.
+    '/(?=a){1,3}\x{100}/' => 'Quantifier unexpected on zero-length expression 
{#} m/(?=a){1,3}\x{100}{#}/',
+    '/(a|b)(?=a){3}\x{100}/' => 'Quantifier unexpected on zero-length 
expression {#} m/(a|b)(?=a){3}\x{100}{#}/',
+
     '/\_/' => "",
     '/[\_\0]/' => "",
     '/[\07]/' => "",
     '/[\006]/' => "",
     '/[\0005]/' => "",
-    '/[\8\9]/' => ['Unrecognized escape \8 in character class passed through 
{#} m/[\8{#}\9]/',
-                   'Unrecognized escape \9 in character class passed through 
{#} m/[\8\9{#}]/',
+    '/[\8\9]\x{100}/' => ['Unrecognized escape \8 in character class passed 
through {#} m/[\8{#}\9]\x{100}/',
+                   'Unrecognized escape \9 in character class passed through 
{#} m/[\8\9{#}]\x{100}/',
                   ],
-    '/[:alpha:]/' => 'POSIX syntax [: :] belongs inside character classes {#} 
m/[:alpha:]{#}/',
-    '/[:zog:]/' => 'POSIX syntax [: :] belongs inside character classes {#} 
m/[:zog:]{#}/',
-    '/[.zog.]/' => 'POSIX syntax [. .] belongs inside character classes {#} 
m/[.zog.]{#}/',
+    '/[:alpha:]\x{100}/' => 'POSIX syntax [: :] belongs inside character 
classes {#} m/[:alpha:]{#}\x{100}/',
+    '/[:zog:]\x{100}/' => 'POSIX syntax [: :] belongs inside character classes 
{#} m/[:zog:]{#}\x{100}/',
+    '/[.zog.]\x{100}/' => 'POSIX syntax [. .] belongs inside character classes 
{#} m/[.zog.]{#}\x{100}/',
     '/[a-b]/' => "",
-    '/[a-\d]/' => 'False [] range "a-\d" {#} m/[a-\d{#}]/',
-    '/[\d-b]/' => 'False [] range "\d-" {#} m/[\d-{#}b]/',
-    '/[\s-\d]/' => 'False [] range "\s-" {#} m/[\s-{#}\d]/',
-    '/[\d-\s]/' => 'False [] range "\d-" {#} m/[\d-{#}\s]/',
-    '/[a-[:digit:]]/' => 'False [] range "a-[:digit:]" {#} 
m/[a-[:digit:]{#}]/',
-    '/[[:digit:]-b]/' => 'False [] range "[:digit:]-" {#} m/[[:digit:]-{#}b]/',
-    '/[[:alpha:]-[:digit:]]/' => 'False [] range "[:alpha:]-" {#} 
m/[[:alpha:]-{#}[:digit:]]/',
-    '/[[:digit:]-[:alpha:]]/' => 'False [] range "[:digit:]-" {#} 
m/[[:digit:]-{#}[:alpha:]]/',
-    '/[a\zb]/' => 'Unrecognized escape \z in character class passed through 
{#} m/[a\z{#}b]/',
-    '/(?c)/' => 'Useless (?c) - use /gc modifier {#} m/(?c{#})/',
-    '/(?-c)/' => 'Useless (?-c) - don\'t use /gc modifier {#} m/(?-c{#})/',
-    '/(?g)/' => 'Useless (?g) - use /g modifier {#} m/(?g{#})/',
-    '/(?-g)/' => 'Useless (?-g) - don\'t use /g modifier {#} m/(?-g{#})/',
-    '/(?o)/' => 'Useless (?o) - use /o modifier {#} m/(?o{#})/',
-    '/(?-o)/' => 'Useless (?-o) - don\'t use /o modifier {#} m/(?-o{#})/',
-    '/(?g-o)/' => [ 'Useless (?g) - use /g modifier {#} m/(?g{#}-o)/',
-                    'Useless (?-o) - don\'t use /o modifier {#} m/(?g-o{#})/',
+    '/[a-\d]\x{100}/' => 'False [] range "a-\d" {#} m/[a-\d{#}]\x{100}/',
+    '/[\d-b]\x{100}/' => 'False [] range "\d-" {#} m/[\d-{#}b]\x{100}/',
+    '/[\s-\d]\x{100}/' => 'False [] range "\s-" {#} m/[\s-{#}\d]\x{100}/',
+    '/[\d-\s]\x{100}/' => 'False [] range "\d-" {#} m/[\d-{#}\s]\x{100}/',
+    '/[a-[:digit:]]\x{100}/' => 'False [] range "a-[:digit:]" {#} 
m/[a-[:digit:]{#}]\x{100}/',
+    '/[[:digit:]-b]\x{100}/' => 'False [] range "[:digit:]-" {#} 
m/[[:digit:]-{#}b]\x{100}/',
+    '/[[:alpha:]-[:digit:]]\x{100}/' => 'False [] range "[:alpha:]-" {#} 
m/[[:alpha:]-{#}[:digit:]]\x{100}/',
+    '/[[:digit:]-[:alpha:]]\x{100}/' => 'False [] range "[:digit:]-" {#} 
m/[[:digit:]-{#}[:alpha:]]\x{100}/',
+    '/[a\zb]\x{100}/' => 'Unrecognized escape \z in character class passed 
through {#} m/[a\z{#}b]\x{100}/',
+    '/(?c)\x{100}/' => 'Useless (?c) - use /gc modifier {#} m/(?c{#})\x{100}/',
+    '/(?-c)\x{100}/' => 'Useless (?-c) - don\'t use /gc modifier {#} 
m/(?-c{#})\x{100}/',
+    '/(?g)\x{100}/' => 'Useless (?g) - use /g modifier {#} m/(?g{#})\x{100}/',
+    '/(?-g)\x{100}/' => 'Useless (?-g) - don\'t use /g modifier {#} 
m/(?-g{#})\x{100}/',
+    '/(?o)\x{100}/' => 'Useless (?o) - use /o modifier {#} m/(?o{#})\x{100}/',
+    '/(?-o)\x{100}/' => 'Useless (?-o) - don\'t use /o modifier {#} 
m/(?-o{#})\x{100}/',
+    '/(?g-o)\x{100}/' => [ 'Useless (?g) - use /g modifier {#} 
m/(?g{#}-o)\x{100}/',
+                    'Useless (?-o) - don\'t use /o modifier {#} 
m/(?g-o{#})\x{100}/',
                   ],
-    '/(?g-c)/' => [ 'Useless (?g) - use /g modifier {#} m/(?g{#}-c)/',
-                    'Useless (?-c) - don\'t use /gc modifier {#} m/(?g-c{#})/',
+    '/(?g-c)\x{100}/' => [ 'Useless (?g) - use /g modifier {#} 
m/(?g{#}-c)\x{100}/',
+                    'Useless (?-c) - don\'t use /gc modifier {#} 
m/(?g-c{#})\x{100}/',
                   ],
       # (?c) means (?g) error won't be thrown
-     '/(?o-cg)/' => [ 'Useless (?o) - use /o modifier {#} m/(?o{#}-cg)/',
-                      'Useless (?-c) - don\'t use /gc modifier {#} 
m/(?o-c{#}g)/',
+     '/(?o-cg)\x{100}/' => [ 'Useless (?o) - use /o modifier {#} 
m/(?o{#}-cg)\x{100}/',
+                      'Useless (?-c) - don\'t use /gc modifier {#} 
m/(?o-c{#}g)\x{100}/',
                     ],
-    '/(?ogc)/' => [ 'Useless (?o) - use /o modifier {#} m/(?o{#}gc)/',
-                    'Useless (?g) - use /g modifier {#} m/(?og{#}c)/',
-                    'Useless (?c) - use /gc modifier {#} m/(?ogc{#})/',
+    '/(?ogc)\x{100}/' => [ 'Useless (?o) - use /o modifier {#} 
m/(?o{#}gc)\x{100}/',
+                    'Useless (?g) - use /g modifier {#} m/(?og{#}c)\x{100}/',
+                    'Useless (?c) - use /gc modifier {#} m/(?ogc{#})\x{100}/',
                   ],
-    '/a{1,1}?/' => 'Useless use of greediness modifier \'?\' {#} 
m/a{1,1}?{#}/',
-    '/b{3}  +/x' => 'Useless use of greediness modifier \'+\' {#} m/b{3}  
+{#}/',
-);
+    '/a{1,1}?\x{100}/' => 'Useless use of greediness modifier \'?\' {#} 
m/a{1,1}?{#}\x{100}/',
+    '/b{3}  +\x{100}/x' => 'Useless use of greediness modifier \'+\' {#} 
m/b{3}  +{#}\x{100}/',
+); # See comments before this for why '\x{100}' is generally needed
 
 my @warnings_utf8 = mark_as_utf8(
     'm/ã\b*ã/' => '\b* matches null string many times {#} m/ã\b*{#}ã/',

--
Perl5 Master Repository

[perl.git] branch blead, updated. v5.21.3-485-g570afc5

Reply via email to