In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/c7f317a9270a52c9028667b8adec18e94f450586?hp=2c0445268a1bb7696e04b8b9b324c3d6880bb18a>

- Log -----------------------------------------------------------------
commit c7f317a9270a52c9028667b8adec18e94f450586
Author: David Mitchell
Date:   Wed Apr 15 08:47:18 2015 +0100

    assertion failure on interpolated parse err
    
    RT# 124216
    
    When parsing the interpolated string "$X", where X is a unicode char that
    is not a legal variable name, failure to restore things properly during
    error recovery led to corrupted state and assertion failures.
    
    In more detail:
    
    When parsing a double-quoted string, S_sublex_push() saves most of the
    current parser state. On parse error, the save stack is popped back,
    which restores all that state. However, PL_lex_defer wasn't being saved,
    so if we were in the middle of handling a forced token, PL_lex_state gets
    restored from PL_lex_defer, and suddenly the lexer thinks we're back
    inside an interpolated string again. So S_sublex_done() gets called
    multiple times, too many scopes are popped, and things like PL_compcv are
    freed prematurely.
    
    Note that in order to reproduce:
    
    * we must be within a double quoted context;
    * we must be parsing a var (which causes a forced token);
    * the variable name must be illegal, which implies unicode, as
      chr(0..255) are all legal names;
    * the terminating string quote must be the last char of the input
      file, as this code:
    
        case LEX_INTERPSTART:
        if (PL_bufptr == PL_bufend)
            return REPORT(sublex_done());
    
      won't trigger an extra call to sublex_done() otherwise.
    
    I'm sure this bug affects other cases too, but this was the only way I
    found to reproduce.
-----------------------------------------------------------------------
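
The mechanism described in the log can be illustrated with a toy model of a save stack. This is only a sketch under stated assumptions, not the code in toke.c: `toy_parser`, `save_int`, `leave_scope` and `state_after_error` are hypothetical names, the state values are arbitrary, and the model assumes that the pending forced token is still consumed after the unwind, which copies the deferred state back into the current state.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lexer states; the names echo toke.c but the values are arbitrary. */
enum { LEX_NORMAL = 0, LEX_INTERPNORMAL = 1, LEX_KNOWNEXT = 2 };

/* Hypothetical stand-ins for PL_lex_state and PL_lex_defer. */
struct toy_parser {
    int lex_state;
    int lex_defer;
};

/* One save-stack entry: where to write, and the value to write back. */
struct saved_int {
    int *addr;
    int value;
};

struct save_stack {
    struct saved_int entries[8];
    size_t top;
};

/* Record the current value of *addr so that leave_scope() can restore it. */
static void save_int(struct save_stack *ss, int *addr)
{
    assert(ss->top < sizeof ss->entries / sizeof ss->entries[0]);
    ss->entries[ss->top].addr = addr;
    ss->entries[ss->top].value = *addr;
    ss->top++;
}

/* Pop entries back down to 'floor', restoring each saved value. */
static void leave_scope(struct save_stack *ss, size_t floor)
{
    while (ss->top > floor) {
        ss->top--;
        *ss->entries[ss->top].addr = ss->entries[ss->top].value;
    }
}

/* Returns the lexer state seen after error recovery.
 * save_defer == 0 models the old behaviour, nonzero models the fix. */
int state_after_error(int save_defer)
{
    struct toy_parser p = { LEX_NORMAL, LEX_NORMAL };
    struct save_stack ss = { .top = 0 };
    size_t floor = ss.top;

    /* Entering the quoted string: save state, then switch to it. */
    save_int(&ss, &p.lex_state);
    if (save_defer)
        save_int(&ss, &p.lex_defer);
    p.lex_state = LEX_INTERPNORMAL;

    /* Forcing a token: stash the current state in lex_defer. */
    p.lex_defer = p.lex_state;
    p.lex_state = LEX_KNOWNEXT;

    /* Parse error: unwind the save stack. */
    leave_scope(&ss, floor);

    /* Assumption: the forced token is still consumed afterwards,
     * which copies lex_defer back into lex_state. */
    p.lex_state = p.lex_defer;
    return p.lex_state;
}
```

In this model, `state_after_error(0)` ends in `LEX_INTERPNORMAL`, so the lexer believes it is still inside the string, while `state_after_error(1)` ends in `LEX_NORMAL` because the deferred state was restored along with everything else.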

Summary of changes:
 t/uni/parser.t | 26 +++++++++++++++++++++++++-
 toke.c         |  1 +
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/t/uni/parser.t b/t/uni/parser.t
index 9c39943..3d89249 100644
--- a/t/uni/parser.t
+++ b/t/uni/parser.t
@@ -9,7 +9,7 @@ BEGIN {
     skip_all_without_unicode_tables();
 }
 
-plan (tests => 51);
+plan (tests => 52);
 
 use utf8;
 use open qw( :utf8 :std );
@@ -197,3 +197,27 @@ like( $@, qr/Bad name after Foo'/, 'Bad name after Foo\'' );
     CORE::evalbytes "use charnames ':full'; use utf8; my \$x = \"\\N{abc$malformed_to_be}\"";
     like( $@, qr/Malformed UTF-8 character immediately after '\\N\{abc' at .* within string/, 'Malformed UTF-8 input to \N{}');
 }
+
+# RT# 124216: Perl_sv_clear: Assertion
+# If a parsing error occurred during a forced token within an interpolated
+# context, the stack unwinding failed to restore PL_lex_defer and so after
+# error recovery the state restored after the forced token was processed
+# was the wrong one, resulting in the lexer thinking we're still inside a
+# quoted string and things getting freed multiple times.
+#
+# \xe3\x80\xb0 are the utf8 bytes making up the character \x{3030}.
+# The \x{3030} char isn't a legal var name, and this triggers the error.
+#
+# NB: this only failed if the closing quote of the interpolated string is
+# the last char of the file (i.e. no trailing \n).
+
+{
+    no utf8;
+
+    fresh_perl_is(qq{use utf8; "\$\xe3\x80\xb0"}, <<EOF, { stderr => 1},
+Wide character in print at - line 1.\
+syntax error at - line 1, near "\$\xe3\x80\xb0"
+Execution of - aborted due to compilation errors.
+EOF
+    "RT# 124216");
+}
diff --git a/toke.c b/toke.c
index 2a99f0b..294cb8f 100644
--- a/toke.c
+++ b/toke.c
@@ -2342,6 +2342,7 @@ S_sublex_push(pTHX)
     SAVEI32(PL_lex_casemods);
     SAVEI32(PL_lex_starts);
     SAVEI8(PL_lex_state);
+    SAVEI8(PL_lex_defer);
     SAVESPTR(PL_lex_repl);
     SAVEVPTR(PL_lex_inpat);
     SAVEI16(PL_lex_inwhat);

--
Perl5 Master Repository
