In perl.git, the branch blead has been updated <http://perl5.git.perl.org/perl.git/commitdiff/7aa8cb0dec173dcfca4157e60634c74b97429a05?hp=0d1cf11425608e9be019f27a3a4575bc71c49e6b>
- Log ----------------------------------------------------------------- commit 7aa8cb0dec173dcfca4157e60634c74b97429a05 Author: Father Chrysostomos <[email protected]> Date: Sun Feb 8 21:29:59 2015 -0800 [perl #123677] Crash with token stack overflow In this naughty code snippet: s)$0{0h());qx(@0);qx(@0);qx(@0) the s)...)) is treated as a substition, with $0{0h( for the left part. When the lexer reaches the h( it tries to emit two tokens at once, '&' and a WORD token representing the h. To do that it pushes the WORD on to the pending token stack and then emits '&'. The next call to yylex will usually pop the token off the pending stack and use that, because the lexing state (PL_lex_state) is LEX_KNOWNEXT. However, when the parser sees '&', it immediately reports it as a syntax error, and tries to pop tokens to make sense of what it has, popping scopes in the process. Inside a quote-like operator, PL_lex_state is localised, so the value after this scope-popping is no longer LEX_KNOWNEXT, so the next call to yylex continues parsing â;qx...â and ignores the pending token. When it reaches the @0 inside the qx, it tries to push five pending tokens on to the stack at once, because thatâs how the implicit join works. But the stack only has room for five items. Since it already has one, the last item overflows, corrupting the parser state. Crashes ensue. If we check for the number of pending tokens and always emit any regardless of the lexing state, then we avoid the crash. This is arguably how it should have been written to begin with. This makes LEX_KNOWNEXT, and probably PL_lex_defer, redundant, but I will wait till after perl 5.22 before removing those, as the removal may break CPAN modules, and this is a little late in the dev cycle. M t/base/lex.t M toke.c commit 1f7c3e7c639e9586a88d4bfb034a8ff37d2b2cce Author: Father Chrysostomos <[email protected]> Date: Sun Feb 8 20:24:15 2015 -0800 toke.c: Assert that we donât overflow token stack With this assertion, the test case from #123743 fails sooner. M toke.c commit ba511db061a88439acb528a66c780ab574bb4fb0 Author: Father Chrysostomos <[email protected]> Date: Sun Feb 8 20:04:50 2015 -0800 Consistent spaces after dots in perldata M pod/perldata.pod ----------------------------------------------------------------------- Summary of changes: pod/perldata.pod | 2 +- t/base/lex.t | 6 ++++++ toke.c | 14 ++++++++------ 3 files changed, 15 insertions(+), 7 deletions(-) diff --git a/pod/perldata.pod b/pod/perldata.pod index d5df9e3..9b3f63a 100644 --- a/pod/perldata.pod +++ b/pod/perldata.pod @@ -560,7 +560,7 @@ in many forms. Case is ignored, and the Win32-specific forms like C<1.#INF> are understood, but on output the values are normalized to C<Inf> and C<NaN>. -The C<NaN> has two special features of its own. Firstly, it comes in +The C<NaN> has two special features of its own. Firstly, it comes in two flavors, quiet and signaling. What this means is depends on the platform. Secondly, it may have "payload" of a number of bits. The number of bits available again depends on the platform. (Though for diff --git a/t/base/lex.t b/t/base/lex.t index 66db28b..5449b46 100644 --- a/t/base/lex.t +++ b/t/base/lex.t @@ -488,3 +488,9 @@ print "ok $test - map{sub :lvalue...}\n"; $test++; # Used to fail an assertion [perl #123617] eval '"$a{ 1 m// }"; //'; + +# Pending token stack overflow [perl #123677] +{ + local $SIG{__WARN__}=sub{}; + eval q|s)$0{0h());qx(@0);qx(@0);qx(@0)|; +} diff --git a/toke.c b/toke.c index 388b272..24b5ed0 100644 --- a/toke.c +++ b/toke.c @@ -1907,6 +1907,7 @@ S_force_next(pTHX_ I32 type) tokereport(type, &NEXTVAL_NEXTTOKE); } #endif + assert(PL_nexttoke < C_ARRAY_LENGTH(PL_nexttype)); PL_nexttype[PL_nexttoke] = type; PL_nexttoke++; if (PL_lex_state != LEX_KNOWNEXT) { @@ -4311,13 +4312,8 @@ Perl_yylex(pTHX) SvREFCNT_dec(tmp); } ); - switch (PL_lex_state) { - case LEX_NORMAL: - case LEX_INTERPNORMAL: - break; - /* when we've already built the next token, just pull it out of the queue */ - case LEX_KNOWNEXT: + if (PL_nexttoke) { PL_nexttoke--; pl_yylval = PL_nextval[PL_nexttoke]; if (!PL_nexttoke) { @@ -4342,6 +4338,12 @@ Perl_yylex(pTHX) } return REPORT(next_type == 'p' ? pending_ident() : next_type); } + } + + switch (PL_lex_state) { + case LEX_NORMAL: + case LEX_INTERPNORMAL: + break; /* interpolated case modifiers like \L \U, including \Q and \E. when we get here, PL_bufptr is at the \ -- Perl5 Master Repository
