In perl.git, the branch smoke-me/deerhock has been created

<http://perl5.git.perl.org/perl.git/commitdiff/a4394f2da4e1e0b2d4c7841813fcfca17de61b3d?hp=0000000000000000000000000000000000000000>

        at  a4394f2da4e1e0b2d4c7841813fcfca17de61b3d (commit)

- Log -----------------------------------------------------------------
commit a4394f2da4e1e0b2d4c7841813fcfca17de61b3d
Author: David Nicol <[email protected]>
Date:   Sun Aug 19 23:05:40 2012 -0700

    toke.c:S_scan_heredoc: Add comment about <<\FOO

M       toke.c

commit 00e83d3eba9612d1c7be59ea2be040f08d5744d6
Author: Father Chrysostomos <[email protected]>
Date:   Sun Aug 19 23:05:06 2012 -0700

    [perl #65838] Allow here-doc with no final newline
    
    When reading a line of input while scanning a here-doc, if the line
    does not end in \n, then we know we have reached the end of input.  By
    simply tacking a \n on to the buffer, we can meet the expectations of
    the rest of the here-doc parsing code.  If it turns out the delimiter
    is not found on that line, it does not matter that we modified it, as
    we will croak anyway.
    
    I had to add a new flag to lex_next_chunk.  Before commit f0e67a1d2,
    S_scan_heredoc would read from the stream itself, without closing any
    handles.  So the next time through yylex, the eof code would supply
    the final implicit semicolon.
    
    Since f0e67a1d2, S_scan_heredoc has been calling lex_next_chunk, which
    takes care of reading from the stream and supplying any final ; at eof.
    The here-doc parser will just get confused as a result (<<';' would
    work without any terminator).  The new flag tells lex_next_chunk not
    to do anything at eof (not even closing handles and resetting the
    parser state), but to return false and leave everything as it was.
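
The newline-tacking idea described above can be sketched in isolation. This is a minimal illustration, not the code from toke.c: both helper names below are hypothetical, and the real lexer works on PL_linestr rather than on plain C strings.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a function from toke.c: returns a freshly
 * allocated copy of `line` that is guaranteed to end in '\n'.
 * The caller frees the result.  Returns NULL on allocation failure. */
char *ensure_trailing_newline(const char *line)
{
    size_t len = strlen(line);
    int needs_nl = (len == 0 || line[len - 1] != '\n');
    char *out = malloc(len + (needs_nl ? 1 : 0) + 1);

    if (out == NULL)
        return NULL;
    memcpy(out, line, len);
    if (needs_nl)
        out[len++] = '\n';   /* final line of input had no newline */
    out[len] = '\0';
    return out;
}

/* Hypothetical check: is `line` exactly the delimiter followed by '\n'?
 * This stands in for "the expectations of the rest of the here-doc
 * parsing code", which assumes every line is newline-terminated. */
int is_terminator_line(const char *line, const char *delim)
{
    size_t dlen = strlen(delim);

    return strncmp(line, delim, dlen) == 0
        && line[dlen] == '\n'
        && line[dlen + 1] == '\0';
}
```

With this in place, a terminator at the very end of the input ("EOF" with no newline) passes the same check as an ordinary terminator line once it has been normalised.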

M       t/op/heredoc.t
M       toke.c

commit b5fb6e3974c86cfc06c584ebcc34b54f195f8e53
Author: Father Chrysostomos <[email protected]>
Date:   Sun Aug 19 22:41:08 2012 -0700

    heredoc.t: Suppress deprecation warnings

M       t/op/heredoc.t

commit 175370a4bad6b8e76cdc9bfb8d861c1fc9f381e2
Author: Michael G. Schwern <[email protected]>
Date:   Fri Jun 12 15:35:00 2009 -0700

    Clean up heredoc.t
    
    * Make the tests more independent, mostly by decoupling the use of
      a single $string.  This will make it easier to expand on the test file
      later.
    
    * Replace ok( $foo eq $bar ) with is() for better diagnostics
    
    * Remove unnecessary STDERR redirection.  fresh_perl does that for you.
    
    * Fix fresh_perl to honor progfile and stderr arguments passed in
      rather than just blowing over them

M       t/op/heredoc.t
M       t/test.pl

commit e744486916cf331780c17304f63c885d2c2d82d0
Author: David Nicol <[email protected]>
Date:   Sun Aug 19 22:16:13 2012 -0700

    [perl #65838] Tests for here-docs without final newlines
    
    and a few error cases

M       MANIFEST
A       t/op/heredoc.t

commit c627236b875dd78a31c0380e5ac5c26c98ab2717
Author: Father Chrysostomos <[email protected]>
Date:   Sun Aug 19 02:45:38 2012 -0700

    [perl #114040] Parse here-docs correctly in quoted constructs
    
    When parsing code outside a string eval or quoted construct, the lexer
    reads one line at a time into PL_linestr.
    
    To parse a here-doc (hereinafter ‘deer hock’, because I spike lunar-
    isms), the lexer has to pull extra lines out of the input stream ahead
    of the current line, the value of PL_linestr remaining the same.
    
    In a string eval, the entire piece of code being parsed is in
    PL_linestr.
    
    To parse a deer hock inside a string eval, the lexer has to fiddle
    with the contents of PL_linestr, scanning for newline characters.
    
    Originally, S_scan_heredoc just followed those two approaches.
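
As a rough illustration of the string eval approach, the sketch below scans a single buffer for newline characters to locate a here-doc body. The function name and signature are hypothetical and are not taken from toke.c. It assumes the whole program text is in one NUL-terminated buffer, and it accepts a terminator on the final line even without a trailing newline.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from toke.c.  `buf` holds the whole text
 * being parsed and `pos` is the offset just past the <<LABEL token.
 * On success, stores the offset and length of the here-doc body and
 * returns 1; returns 0 if there is no line break or no terminator. */
int find_heredoc_body(const char *buf, size_t pos, const char *label,
                      size_t *body_start, size_t *body_len)
{
    size_t llen = strlen(label);
    const char *nl = strchr(buf + pos, '\n');
    const char *line;

    if (nl == NULL)
        return 0;                      /* nothing follows the current line */

    /* The body starts on the line after the one containing <<LABEL. */
    for (line = nl + 1; *line != '\0'; ) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        if (len == llen && strncmp(line, label, llen) == 0) {
            *body_start = (size_t)(nl + 1 - buf);
            *body_len   = (size_t)(line - (nl + 1));
            return 1;
        }
        if (eol == NULL)
            break;                     /* last line, no terminator found */
        line = eol + 1;
    }
    return 0;
}
```

The stream approach differs only in where the lines come from: instead of walking one buffer, the lexer pulls further lines from the input while leaving the current line untouched.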
    
    When the lexer encounters a quoted construct, it looks for the end-
    ing delimiter (reading from the input stream if necessary), puts the
    entire quoted thing (minus quotes) in PL_linestr, and then starts an
    inner lexing scope.
    
    This means that deer hocks would not nest properly outside of a string
    eval, because the body of the inner deer hock would be pulled out of
    the input stream *after* the outer deer hock.
    
    Larry Wall fixed that in commit fd2d095329 (Jan. 1997), so that this
    would work:
    
    <<foo
    ${\<<bar}
    ber
    bar
    foo
    
    He did so by following the string eval approach (looking for the deer
    hock body in PL_linestr) if the deer hock was inside another quoted
    construct.
    
    Later, commit a2c066523a (Mar. 1998) fixed this:
    
    s/^not /substr(<<EOF, 0, 0)/e;
      Ignored
    EOF
    
    by following the string eval approach only if the deer hock was inside
    another non-backtick deer hock, not just any quoted construct.
    
    The problem with the string eval approach inside a substitu-
    tion is that it only looks in PL_linestr, which only contains
    ‘substr(<<EOF, 0, 0)’ when the lexer is handling the second part of
    the s/// operator.
    
    But that unfortunately broke this:
    
    s/^not /substr(<<EOF, 0, 0)
      Ignored
    EOF
     /e;
    
    and this:
    
    print <<`EOF`;
    ${\<<EOG}
    echo stuff
    EOG
    EOF
    
    reverting it to the pre-fd2d095329 behaviour, because the outer quoted
    construct was treated as one line.
    
    Later on, commit 0244c3a403 (Mar. 1999) fixed this:
    
    eval 's/.../<<FOO/e
      stuff
    FOO
    ';
    
    which required a new approach not used before.  When the replacement
    part of the s/// is being parsed, PL_linestr contains ‘<<FOO’.  The
    body of the deer hock is not in the input stream (there isn’t one),
    but in what was the previous value of PL_linestr before the lexer
    encountered s///.
    
    So 0244c3a403 fixed that by recording pointers into the outer string
    and using them in S_scan_heredoc.  That commit, for some reason, was
    written such that it applied only to substitutions, and not to other
    quoted constructs.
    
    It also failed to take interpolation into account, and did not record
    the outer buffer position, but then tried to use it anyway, resulting
    in crashes in both these cases:
    
    eval 's/${ <<END }//';
    eval 's//${ <<END }//';
    
    It also failed to take multiline s///’s into account, resulting in
    neither of these working, because it lost track of the current cursor,
    leaving it at 'D' instead of the line break following it:
    
    eval '
    s//<<END
    /e;
    blah blah blah
    END
    ;1' or die $@;
    
    eval '
    s//<<END
    blah blah blah
    END
    /e;
    ;1' or die $@;
    
    S_scan_heredoc currently positions the cursor (s) at the last character
    of <<END if there is a line break on the same line.  There is an s++
    later on to account for that, but the code added by 0244c3a403
    bypassed it.
    
    So, in the end, deer hocks could only be nested in other quoted con-
    structs if the outer construct was in a string eval and was not s///,
    or was a non-backtick deer hock.
    
    This commit hopefully fixes most of the problems. :-)
    
    The s///-in-eval case is a little tricky.  We have to see whether the
    deer hock label is on the last line of the s///.  If it is, we have
    to peek into the outer buffer.  Otherwise, we have to treat it like a
    string eval.
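
The decision described above might be sketched as follows. This is an assumption-laden illustration, not the logic in toke.c: the enum, the function name, and the rule that a label counts as being on the last line when no text follows the next line break are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum body_source {
    BODY_IN_INNER_BUFFER,   /* treat like a string eval */
    BODY_IN_OUTER_BUFFER    /* peek into the enclosing buffer */
};

/* Hypothetical helper.  `inner` is the text of the quoted construct
 * currently being lexed (for example the replacement part of s///),
 * and `pos` is the offset just past the <<LABEL token. */
enum body_source choose_body_source(const char *inner, size_t pos)
{
    const char *nl = strchr(inner + pos, '\n');

    /* No line break, or nothing after it: the label is on the last line
     * of the construct, so the body must live in the outer buffer. */
    if (nl == NULL || nl[1] == '\0')
        return BODY_IN_OUTER_BUFFER;
    return BODY_IN_INNER_BUFFER;
}
```

Under these assumptions, a replacement of just `<<FOO` selects the outer buffer, while a replacement that contains the body and terminator after the label selects the inner one.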
    
    This commit does not deal with <<END inside the pattern of a multi-
    line s/// or in nested quotes.

M       t/comp/parser.t
M       toke.c
-----------------------------------------------------------------------

--
Perl5 Master Repository
