Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Tom Christiansen wrote:

> I am certainly in strong favor of a simple and visually distinctive
> solution, and find that the leading bit helps a lot. But I would
> probably have written that as:
>
>     die <<POEM =~ /[^!]*/g;
>         !The old lie
>         ! Dulce et decorum est
>         ! Pro patria mori.
>         POEM
>
> save for the whitespace on " POEM".

But Tom, that preserves all the white space both before and after the '!'! Michael's goal is to eliminate the leading white space, although he didn't like the '!' bit. So I'm not sure how you'd have written that if you'd have done it to the specification.

-- 
Glenn
=====
Even if you're on the right track, you'll get run over if you just sit there.
    -- Will Rogers
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
> But Tom, that preserves all the white space both before and after the
> '!'! Michael's goal is to eliminate the leading white space, although
> he didn't like the '!' bit. So I'm not sure how you'd have written
> that if you'd have done it to the specification.

Yeah, ok. I still think

    # Your stuff that you write
    #goes nicely right here
    # If you want it to print
    #sans mungeing to fear

is the nicest way to heredoc, where "#" is some distinctive string.

--tom
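For readers who want to try the marker idea in present-day Perl, here is a minimal sketch. It assumes "#" as the distinctive string (the message leaves the choice open), and strip_marker is a hypothetical helper name, not something proposed in the thread. Leading whitespace and the first marker on each line are removed; whatever follows the marker is kept untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: remove leading blanks plus the marker from every line.
# The marker defaults to '#', which is only an assumption for this sketch.
sub strip_marker {
    my ($text, $marker) = @_;
    $marker = '#' unless defined $marker;
    $text =~ s/^[ \t]*\Q$marker\E//gm;    # /m: ^ matches at each line start
    return $text;
}

my $verse = strip_marker(<<'VERSE');
    # Your stuff that you write
    #goes nicely right here
VERSE

print $verse;
```

Because the marker shows where the text really starts, any whitespace after it survives, which is how this approach keeps sub-indentation.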
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
> This is the problem that currently here-doc content must be relative
> to the left margin, so doesn't look nice with respect to nearby
> indented code.
>
> 2 Preserving sub-indentation.
>
> This is not _currently_ a problem. Perl _currently_ preserves
> indentation in here-docs. It is not until some other "solutions" gets
> in the way, that this problem is a problem. If problem 1 were solved
> by independently eliminating all leading white space from each line of
> the HERE document, then this problem suddenly appears. So what this
> "problem" is trying to state is that problem #1 cannot be solved
> (using your "current stumper" example below) by
>
>     die <<POEM =~ s/^\s*//m;
>
> because that affects the relative horizontal relationships between
> characters on different lines. [...] avoided when solving other
> problems, rather than being a problem today.

Once again, we see why a version of s/// that returns the result is desirable. You actually meant something more on the order of

    die <<POEM =~ m/\S.*/g;

but relying on knowing what die() does with a list. Wouldn't it be nice to be able just to say, positing a dyadic ~ binding operator for s///:

    die <<POEM ~ s/^\s*//gm;

I think you need the /g, too.

--tom
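A note for later readers: Perl 5.14 and newer have the /r modifier on s///, which returns the modified copy rather than a count, and comes close to the wish above. The sketch below assumes such a perl; the variable names are illustrative only. It also shows the sub-indentation problem under discussion, since every line loses all of its leading whitespace.

```perl
use strict;
use warnings;
use 5.014;    # the /r modifier needs Perl 5.14 or later

my $poem = <<'POEM';
    The old lie
     Dulce et decorum est
     Pro patria mori.
POEM

# /r returns the changed copy and leaves $poem alone.
# /m lets ^ match at every line start; /g repeats the edit on every line.
my $flat = $poem =~ s/^[ \t]+//gmr;

print $flat;
```

Without /g only the first line would be stripped, which is the point of the remark about needing /g.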
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Dave Storrs [EMAIL PROTECTED] writes:

[...]

> >     print <<FIRST_HERE_DOC; print <<SECOND_HERE_DOC;
> >     This is on the left margin.
> >      This is indented one char.
> >     FIRST_HERE_DOC
> >       This is indented one char.
> >      This is on the left margin.
> >      SECOND_HERE_DOC
>
> RFC 111 specifically disallows statements after the terminator
> because it is too confusing. I would say that the same logic should
> apply to the start of the here doc; I'm not sure, just from looking at
> it, if the example above is meant to be two interleaved heredocs, one
> heredoc after another, or what.

It's two statements, separated by a semicolon. What's wrong? (Or, if you don't like that, just take 2 here docs for the same statement). This is totally unlike the here-document line. The same (without indentation, of course) works for Perl today, and confuses no-one.

And just because Perl has some feature does not mean you are obligated to use it in all programs.

-- 
Ariel Scolnicov        |"GCAAGAATTGAACTGTAG"    | [EMAIL PROTECTED]
Compugen Ltd.          |Tel: +972-2-5713025 (Jerusalem) \ We recycle all our Hz
72 Pinhas Rosen St.    |Tel: +972-3-7658514 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555    http://3w.compugen.co.il/~ariels
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Thu, Sep 14, 2000 at 03:36:10PM -0700, Nathan Wiger wrote:

> See, this is just too inflexible. The main complaint that I've heard
> has been "You can't have leading or trailing whitespace around your
> terminator". This is a very common error made by everyone, and *this*
> is where Perl should DWIM.

See, I never understood this. If you're indenting the terminator, it implies you're also indenting the here-doc text. I mean, this doesn't make any sense:

    { { { {
        print <<TAG;
    I don't know what their gripe is. A critic is simply someone paid to
    render opinions glibly.
        TAG
    } } } }

Right? You're not going to just indent the terminator because you can. It's going to go along with indenting the text. So indenting the terminator and indenting the text are linked. If you do one, you want to do the other.

-- 
Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED]
Just Another Stupid Consultant Perl6 Kwalitee Ashuranse
Yet one of these kittens is not prepared to have a good time. It stands alone, away from the crowd. Its your kind of kitten. And now the time has come to climb into that car and shake the paw of destiny.
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael Schwern wrote:

> See, I never understood this. If you're indenting the terminator, it
> implies you're also indenting the here-doc text. I mean, this doesn't
> make any sense:
>
>     { { { {
>         print <<TAG;
>     I don't know what their gripe is. A critic is simply someone paid to
>     render opinions glibly.
>         TAG
>     } } } }
>
> Right? You're not going to just indent the terminator because you
> can. Its going to go along with indenting the text. So indenting the
> terminator and indenting the text are linked. If you do one, you want
> to do the other.

Don't tell me what I want to do :-)

        $chunk1 = <<CHUNK1;
    <table>
      <tr>
        <td class=m1>
          text that's in the table cell
        </td>
      </tr>
        CHUNK1
        $chunk2 = <<CHUNK2;
      <tr>
        <td class=m2>
          text that's in another table cell
        </td>
      </tr>
        CHUNK2
        $chunk3 = <<CHUNK3;
    </table>
        CHUNK3

The here-doc terminators all line up with the perl code. The generated program is nicely indented relative to the left margin.

-- 
Eric J. Roode, [EMAIL PROTECTED]     print scalar reverse sort
Senior Software Engineer             'tona ', 'reh', 'ekca', 'lre',
Myxa Corporation                     '.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern wrote:

> See, I never understood this. If you're indenting the terminator, it
> implies you're also indenting the here-doc text. I mean, this doesn't
> make any sense:
>
>     { { { {
>         print <<TAG;
>     I don't know what their gripe is. A critic is simply someone paid to
>     render opinions glibly.
>         TAG
>     } } } }

Sure it does, as Eric's shown:

    if ( $this && $that ) {
        while (<DATABASE>) {
            chomp;
            $record = quotemeta $_;
            if ( $record ) {
                ($rec, $name, $dob, $address, $joindate, $books)
                    = split /\s+/, $record;
                print <<END_OF_RECORD;
    Current record: $rec
    Name:    $name
    DOB:     $dob
    Address: $address

    The above person has been a member of the Perl 6 Book of the Month
    club since $joindate, purchasing a total of $books books.
                END_OF_RECORD
                push @records, $record;
            }
        }
    }

> So indenting the terminator and indenting the text are linked. If you
> do one, you want to do the other.

As I and many others have said, that's not necessarily true. I like all my code to line up, braces, parens, and all. It enhances readability, and is easier to scan.

Anyways, it seems both your and my needs could be met if we simply added an operator that does what you want. Otherwise we're forced to choose between two useful alternatives that are both valid. I could see using both your and "my" way in many different situations, so we should make them coexistent, not mutually exclusive.

-Nate
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Fri 15 Sep, Michael G Schwern wrote:

> On Fri, Sep 15, 2000 at 06:38:37PM +0100, Richard Proctor wrote:
> > 1) removes whitespace equivalent to the terminator (e)
> >    this is largely backward compatible as many existing heredocs are
> >    unlikely to have white space before the terminator.
> > 2) removes whitespace equivalent to the smallest whitespace (d)
> >
> > or are these the options that will satisfy everybody [no but it's
> > worth a try]
> >
> > 1) Does just what it does now
> > 2) implements (d) or (e)
>
> I'd say:
>
> 1) does what it does now mod RFC 111 (ie. you can put whitespace in
>    the terminator, but it doesn't affect anything)

I was assuming that the terminators changed ala RFC 111 whatever happens.

> 2) does (e).

These are equivalent to my second set of options.

> 3) distribute a collection of dequote() mutations with perl.

As a module presumably.

> 4) mention the s/// tricks in the documentation (<<POD =~ s/// seems
>    dead)

Yes.

> > [[there is still the tabs debate however]]
>
> Tabs are easy, don't expand them. Consider them as a literal
> character. This assumes that the code author is going to use the same
> keystrokes to indent their here-doc text as the terminator, about as
> safe an assumption as any for tabs.
>
> Maybe I'm being too simplistic, I don't use tabs anymore.

Yes you are, the problem comes with mixing editors - some use tabs for indented material, some don't, some reduce files using tabs etc etc. [I move between too many editors]. Perl should DWIM. I think that treating tabs=8 as the default would work for most people, even those who set tabs at other values as long as they are consistent - a "use tabs 4" could be used by them if they want to get the same behaviour if they mix tabs and spaces.

Richard

-- 
[EMAIL PROTECTED]
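To make the "tabs=8" option concrete, here is a small sketch of measuring indentation in columns after expanding tabs to a configurable width. The helper names expand_tabs and indent_columns are hypothetical, and the width argument only stands in for the "use tabs 4" pragma floated above, which does not exist in Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: expand each run of tabs to the next multiple of
# $width columns (default 8), using the prematch length as the column.
sub expand_tabs {
    my ($line, $width) = @_;
    $width ||= 8;
    1 while $line =~ s/\t+/' ' x (length($&) * $width - length($`) % $width)/e;
    return $line;
}

# Hypothetical helper: width of the leading whitespace, in columns.
sub indent_columns {
    my ($line, $width) = @_;
    my ($lead) = expand_tabs($line, $width) =~ /^([ ]*)/;
    return length $lead;
}

# A tab and eight spaces measure the same under tabs=8 ...
print indent_columns("\tPOEM"), " ", indent_columns("        POEM"), "\n";
# ... but not when the width is taken to be 4.
print indent_columns("\tPOEM", 4), "\n";
```

The alternative argued for elsewhere in the thread, comparing whitespace character by character, needs no width at all, which is the trade-off being debated.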
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
I'm happy with this solution, it seems to address everyone's needs. -Nate Michael G Schwern wrote: I'd say: 1) does what it does now mod RFC 111 (ie. you can put whitespace in the terminator, but it doesn't effect anything) 2) does (e).
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Thu, 14 Sep 2000 03:11:54 -0400, Michael G Schwern wrote:

> The current stumper, which involves problems 1, 2 and 3 is this:
>
>     if( $is_fitting && $is_just ) {
>         die <<POEM;
>         The old lie
>          Dulce et decorum est
>          Pro patria mori.
>         POEM
>     }
>
> I propose that this work out to
>
>     "The old lie\n Dulce et decorum est\n Pro patria mori.\n"
>
> and always work out to that, no matter how far left or right the
> expression be indented.

I happen to disagree, and here's why.

To me, here docs are like *literal extracts* from text documents that you want to reproduce. *Nothing* is supposed to be changed about it: the result should be *exactly* what it is in the here doc, apart from interpolation in double-quotish here docs. I very often insert (parts of) text files produced by other people, and I don't want to be forced to indent all of it, every single line.

However, the same does not count for the here doc terminator. This one very often trips me up. Since this may not be randomly indented, I lose sight of my code indentation, and as a consequence I forget closing braces for blocks etc. Annoying.

Another problem is the trailing whitespace: invisible, yet extremely important: there should be none.

Being freed of these two concerns, that boil down to one thing: leading and trailing whitespace, would be most welcome.

I do not mind having an option of losing some leading spaces or tabs for here docs. However, I'm already pretty sure that if this is optional, I won't ever use it. So please, do not force it down my throat.

-- 
Bart.
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Richard Proctor wrote:

> > Maybe I'm being too simplistic, I don't use tabs anymore.
>
> Yes you are, the problem comes with mixing editors - some use tabs for
> indented material some dont, some reduce files using tabs etc etc. [I
> move between too many editors]. Perl should DWIM. I think that
> treating tabs=8 as the default would work for most people, even those
> who set tabs at other values as long as they are consistent - a "use
> tabs 4" could be used by them if they want to get the same behaviour if
> they mix tabs and spaces.

Yes, but by being simplistic he eliminates the need to invent "use tabs 4". I have a don't care on this issue, but I lean towards Michael's assumption being valid enough... and if people don't mix tabs and spaces it works for any tab size setting, even the one true tab size setting of 8 characters.

-- 
Glenn
=====
There are two kinds of people, those who finish what they start, and so on...
    -- Robert Byrne
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Amen to the below.

So can we have an RFC 111 (v4) that gets rid of allowing stuff after the terminator? Even the ";" afterward seems useless... the ";" should be at the end of the statement, not the end of the here doc.

The only improvement to here docs I see in this RFC is to allow whitespace before/after the here doc terminator. The rest is handled adequately and consistently today, and Tom's dequote is adequate to eliminate leading white space... especially among people who cannot agree that a tab in a file means "mod 8" (which it does).

Michael G Schwern wrote:

> I can't think of much else I'd want to comment about the end of a
> here-doc than "this is the end of the here-doc" which is about as
> useful as "$i++ # add one to $i".
>
> There's a big difference. Every code block ends with a '}'. Every
> here doc ends with its own custom tag. Thus to state:
>
>     print <<EOF;
>     Four score and seven years ago...
>     EOF # end of print <<EOF line 23
>
> can currently be better written as:
>
>     print <<GETTYSBURG_ADDRESS;
>     Four score and seven years ago...
>     GETTYSBURG_ADDRESS
>
> The tag itself describes what the text is, similar to the way a
> well-named variable describes what's inside of it and removes the need
> for a descriptive comment. At a glance one can tell that
> 'GETTYSBURG_ADDRESS' closes the here-doc containing the Gettysburg
> Address, without having to maintain a comment. (I guarantee the line
> number mentioned in the comment will not be maintained.)
>
> Another reason for wanting to comment the closing of a code block is
> nesting. Simply searching for the previous '{' will not work.
> Here-docs cannot be nested and do not have this problem. Simply
> searching backwards for your here-doc tag will always work.

-- 
Glenn
=====
There are two kinds of people, those who finish what they start, and so on...
    -- Robert Byrne
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Wed, Sep 13, 2000 at 11:34:20PM -0700, Glenn Linderman wrote:

> The rest is handled adequately and consistently today, and Tom's
> dequote is adequate to eliminate leading white space... especially
> among people who cannot agree that a tab in a file means "mod 8"
> (which it does).

Damnit, I'm going to continue beating this horse until it stops twitching.

Tom and I had an extensive off-list discussion about this, and here's about where it left off (hopefully I'll get everything right).

We have three major problems and three proposed solutions:

Problems:
1 Allowing here-docs to be indented without affecting the output.
2 Preserving sub-indentation.
3 Preserving the output of the here-doc regardless of how its overall indentation is changed (ie. shifted left and right)

Solutions:
1 <<POD =~ s/some_regex//
2 dequote(<<POD)
3 indentation of the end-tag

Each solution has its strengths and weaknesses. Regexes can handle problem #1 but only #2 xor #3. However, they cover a wide variety of more general problems. dequote has the same problem. #1 is fine, but it can only do #2 xor #3. Not both.

The current stumper, which involves problems 1, 2 and 3 is this:

    if( $is_fitting && $is_just ) {
        die <<POEM;
        The old lie
         Dulce et decorum est
         Pro patria mori.
        POEM
    }

I propose that this work out to

    "The old lie\n Dulce et decorum est\n Pro patria mori.\n"

and always work out to that, no matter how far left or right the expression be indented.

    { { { { {
        if( $is_fitting && $is_just ) {
            die <<POEM;
            The old lie
             Dulce et decorum est
             Pro patria mori.
            POEM
        }
    } } } } }

Four spaces, two spaces, six spaces. Makes sense, everything lines up.

So far I have yet to see a regex or dequote() style proposal which can accommodate this.

So solution #1 is powerful, solution #2 is simple, solution #3 solves a set of common problems which the others do not (but doesn't provide the others' flexibility). All are orthogonal. All are fairly simple and fairly obvious. Allow all three.

My most common case for needing indented here-docs is this:

    { { { {   # I'm nested
        if($error) {
            warn "So there's this problem with the starboard warp coupling and oh shit I just ran off the right margin.";
        }
    } } } }

Usually I wind up doing this:

    { { { {   # I'm nested
        if($error) {
            warn "So there's this problem with the starboard ".
                 "warp coupling and oh shit I just ran off the ".
                 "right margin.";
        }
    } } } }

I'd love it if I could do this instead:

    { { { {   # I'm nested
        if($error) {
            warn <<ERROR =~ s/\n/ /;
            So there's this problem with the starboard warp
            coupling and hey, now I have lots of room to
            pummel you with technobabble!
            ERROR
        }
    } } } }

By combining two of the solutions, my problem is solved. I can indent my here-docs and yet keep the output a single line.

Show me where this fails and I'll shut up about it.

-- 
Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED]
Just Another Stupid Consultant Perl6 Kwalitee Ashuranse
Sometimes these hairstyles are exaggerated beyond the laws of physics
    - Unknown narrator speaking about Anime
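The combined effect asked for here (an indented here-doc that comes out as one line) can be approximated in current Perl with a helper function. This is only a sketch: one_line is a hypothetical name, and it strips all leading whitespace from each line instead of relying on the end tag, so it does not preserve sub-indentation. Note that joining every line needs /g on the newline substitution.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading whitespace from every line,
# then join the lines with single spaces into one line.
sub one_line {
    my ($text) = @_;
    $text =~ s/^[ \t]+//gm;    # drop the indentation of each line
    $text =~ s/\n\z//;         # drop the final newline
    $text =~ s/\n/ /g;         # join the remaining lines (/g is required)
    return $text;
}

my $message = one_line(<<'ERROR');
        So there's this problem with the starboard warp
        coupling and hey, now I have lots of room to
        pummel you with technobabble!
ERROR

print "$message\n";
```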
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern [EMAIL PROTECTED] writes:

[...]

> I propose that this work out to
>
>     "The old lie\n Dulce et decorum est\n Pro patria mori.\n"
>
> and always work out to that, no matter how far left or right the
> expression be indented.
>
>     { { { { {
>         if( $is_fitting && $is_just ) {
>             die <<POEM;
>             The old lie
>              Dulce et decorum est
>              Pro patria mori.
>             POEM
>         }
>     } } } } }
>
> Four spaces, two spaces, six spaces. Makes sense, everything lines
> up.
>
> So far I have yet to see a regex or dequote() style proposal which can
> accomdate this.

I really like this.

[...]

> Show me where this fails and I'll shut up about it.

Here are 2 problems I can think of. But please don't "shut up about it" -- I like the solution, but these need to be sorted out!

1. It requires the perl parser know about indentation. Of course we all know that tabs are 8 characters wide (I myself make a point of bludgeoning anyone who says otherwise), but do we really want to open this can of worms?

2. Existing practice for here docs will have the contents of the here doc on the left margin. People might want to preserve that. For instance, it makes sense if you're here-docking a bunch of 80 char lines.

(2) can be solved, and the ambiguous "no matter how far left or right the expression be indented" resolved, by saying that indentation of the here doc is relative to the terminator (*not* the statement that launched it). This might also make slightly better sense when you have 2 here docs in one line:

    print <<FIRST_HERE_DOC; print <<SECOND_HERE_DOC;
    This is on the left margin.
     This is indented one char.
    FIRST_HERE_DOC
      This is indented one char.
     This is on the left margin.
     SECOND_HERE_DOC

But (1) needs to be resolved (and don't say "use tabs 8"!).

-- 
Ariel Scolnicov        |"GCAAGAATTGAACTGTAG"    | [EMAIL PROTECTED]
Compugen Ltd.          |Tel: +972-2-5713025 (Jerusalem) \ We recycle all our Hz
72 Pinhas Rosen St.    |Tel: +972-3-7658514 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555    http://3w.compugen.co.il/~ariels
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
I've implemented a prototype of the indented here-doc tag I'm proposing.

http://www.pobox.com/~schwern/src/RFC-Prototype-0.02.tar.gz

It's RFC::Prototype::111, which is probably the wrong number. I'll have to add <<POD =~ s/// syntax.

Also, if anyone's good with filters I couldn't quite get the prototype working with Filter::Util::Call. I found myself needing to work line-by-line, and that whole "build up $_" was getting in my way, so I switched to Filter::Util::Exec and it works, but it makes debugging really hard.

=head1 NAME

RFC::Prototype::111 - Implements Perl 6 RFC 111

=head1 SYNOPSIS

    use RFC::Prototype::111;

    if( $is_fitting && $is_just ) {
        die <<"POEM";
        The old lie
         Dulce et decorum est
         pro patria mori
        POEM
    }

=head1 DESCRIPTION

Two changes.

1. Allows <<POD end tags to be indented. The amount of space a tag is indented is the amount which will be clipped off of each line of the here-doc. Tabs will B<NOT> be expanded.

2. <<POD end tags may now be followed by trailing whitespace

-- 
Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED]
Just Another Stupid Consultant Perl6 Kwalitee Ashuranse
When faced with desperate circumstances, we must adapt.
    - Seven of Nine
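The clipping rule in the DESCRIPTION can be sketched as a plain function. This is not the RFC::Prototype::111 source, which is not reproduced in this message; clip_indent is a hypothetical name, and the here-doc body and the whitespace found in front of the end tag are passed in as ordinary strings. Tabs are matched as literal characters, as the description says.

```perl
use strict;
use warnings;

# Hypothetical helper, not the prototype's own code.
# $body   : the here-doc text
# $indent : the literal whitespace found in front of the end tag
# Tabs are compared character by character and never expanded.
sub clip_indent {
    my ($body, $indent) = @_;
    $body =~ s/^\Q$indent\E//gm;
    return $body;
}

my $poem = clip_indent(
    "    The old lie\n     Dulce et decorum est\n     Pro patria mori.\n",
    "    ",
);
print $poem;
```

Lines indented less than the end tag are left as they are in this sketch; whether that should be an error instead is a design choice the message does not settle.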
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael,

I just noticed your post (I am at work). This is beginning to get there (maybe I should not have split the original 111).

In the prototype you only cover use of " quotes.

    if( ($pre_code, $quote_type, $curr_tag, $post_code) = $_ =~
        m/(.*)\<\<(")(\w+)"(.*)/ )

It needs to match

    (.*)\<\<((["'`])(\w+)\3|(\w+))(.*)

or something like that.

Richard Proctor
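One way the broader match could look is sketched below. It is an illustration under assumptions, not the prototype's code: parse_opener is a hypothetical helper, and the pattern does not try to tell a here-doc opener from a left shift written without spaces (for example "1 <<2").

```perl
use strict;
use warnings;

# Hypothetical helper: split a source line around a here-doc opener.
# Returns (pre_code, quote_type, tag, post_code), or an empty list when
# the line has no opener. A bare tag is reported with quote type '"',
# because Perl treats <<TAG like <<"TAG".
sub parse_opener {
    my ($line) = @_;
    if ($line =~ /^(.*?)<<(?:(["'`])(\w+)\2|(\w+))(.*)$/s) {
        my $quote = defined $2 ? $2 : '"';
        my $tag   = defined $3 ? $3 : $4;
        return ($1, $quote, $tag, $5);
    }
    return;
}

for my $src (q{die <<"POEM";}, q{print <<'EOT' . $x;}, q{print <<EOF;}) {
    my ($pre, $quote, $tag, $post) = parse_opener($src);
    print "tag=$tag quote=$quote\n";
}
```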
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Glenn Linderman wrote:

> Amen to the below.
>
> So can we have an RFC 111 (v4) that gets rid of allowing stuff after
> the terminator? Even the ";" afterward seems useless... the ";" should
> be at the end of the statement, not the end of the here doc.
>
> The only improvement to here docs I see in this RFC is to allow
> whitespace before/after the here doc terminator. The rest is handled
> adequately and consistently today, and Tom's dequote is adequate to
> eliminate leading white space... especially among people who cannot
> agree that a tab in a file means "mod 8" (which it does).

The semicolon, as you point out, belongs on the statement at the head of the here doc. The proposal to allow a semicolon at the end is mere window-dressing. Aesthetics only.

Personally, I have used editors and pretty-printers that could handle here-docs except that they thought that the "statement" without a semicolon meant that all subsequent lines should be indented. I have had to resort to:

    $foo = <<HERE;
    ...
    HERE
    ;
    other_statements();

Yes, the obvious solution is to get a better editor/pretty printer. Not always an option. But, as I said, it's mere aesthetics. Perhaps not worth changing the language to accommodate the minority of people who have inferior tools.

But why not allow a comment? Can't think of a use for one? Michael Schwern, whom you quote, points out that the here doc tag ought to be self-documenting, and he is 100% correct. But comments are used for more than documentation. Ever write a note to yourself or to the next programmer in a comment?

    $foo = <<TABLE_OF_GOODS;
    ...
    TABLE_OF_GOODS # must combine with TABLE_OF_SUPPLIES, below, someday

Sure, you can put that comment in a different place, with little harm. But as long as we're proposing allowing whitespace before/after the doc tag, comments are a Good Thing, imho.

-- 
Eric J. Roode, [EMAIL PROTECTED]     print scalar reverse sort
Senior Software Engineer             'tona ', 'reh', 'ekca', 'lre',
Myxa Corporation                     '.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Ariel Scolnicov wrote: 1. It requires the perl parser know about indentation. Of course we all know that tabs are 8 characters wide (I myself make a point of bludgeoning anyone who says otherwise), but do we really want to open this can of worms? Not so fast with those 8-column tabs. (But, I do NOT want to start a religious war here). At my company, we're required to have one tab stop, no spaces, between indentation levels. Boss likes 8 columns, which to my mind is way too much -- it doesn't take too many levels for your code to march off the right side of the screen. I prefer four columns. No problem -- I make my tab settings four columns. Which, for purposes of here docs and this proposal, works just as well. The REAL sinners are those who mix spaces and tabs. THAT's evil. :-) -- Eric J. Roode, [EMAIL PROTECTED] print scalar reverse sort Senior Software Engineer'tona ', 'reh', 'ekca', 'lre', Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
In Michael Schwern's prototype, expansion to treat both semicolons and comments at the end tag is possible by changing

    /^(\s*)$curr_tag\s*$/

to

    /^(\s*)$curr_tag\s*(;\s*)?(#.*)?$/

Richard
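The extended pattern can be tried on its own with a small wrapper. match_terminator is a hypothetical helper written for this illustration; it quotes the tag with \Q...\E and uses non-capturing groups, which are small departures from the pattern as posted.

```perl
use strict;
use warnings;

# Hypothetical helper: does $line close the here-doc named $tag?
# Returns the leading whitespace (possibly '') on a match, undef otherwise.
# An optional ";" and an optional "# comment" may follow the tag.
sub match_terminator {
    my ($line, $tag) = @_;
    return $line =~ /^(\s*)\Q$tag\E\s*(?:;\s*)?(?:#.*)?$/ ? $1 : undef;
}

for my $line ("    POEM", "POEM ; # done", "POEMS", "  POEM extra") {
    my $indent = match_terminator($line, "POEM");
    printf "%-16s %s\n", "'$line'",
        defined $indent ? "closes, indent " . length($indent) : "no match";
}
```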
Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
> Show me where this fails and I'll shut up about it.

Actually, to me this thread underscores how broken here docs are themselves. We already have q//, qq//, and qx// which duplicate their functions far more flexibly. Question: Do we really need here docs? Before you scream "Bloody murder", please read on...

> The current stumper, which involves problems 1, 2 and 3 is this:
>
>     if( $is_fitting && $is_just ) {
>         die <<POEM;
>         The old lie
>          Dulce et decorum est
>          Pro patria mori.
>         POEM
>     }
>
> I propose that this work out to
>
>     "The old lie\n Dulce et decorum est\n Pro patria mori.\n"

Let's look at what happens if we ignore here docs and instead use qq//:

    if( $is_fitting && $is_just ) {
        die qq/
            The old lie
             Dulce et decorum est
             Pro patria mori.
        /;
    }

Solves problem #1, indented terminator, except that it adds two newlines (more later). However, it leaves 2 and 3. Let's try adding in a regexp:

    if( $is_fitting && $is_just ) {
        (my $mesg = qq/
            The old lie
             Dulce et decorum est
             Pro patria mori.
        /) =~ s/\s{8}(.*?\n)/$1/g;
        die $mesg;
    }

But the dang =~ operator makes that ugly and hard to read, and requires a $mesg variable. So let's try RFC 164's approach to patterns then:

    if( $is_fitting && $is_just ) {
        die subst /\s{8}(.*?\n)/$1/g, qq/
            The old lie
             Dulce et decorum est
             Pro patria mori.
        /;
    }

Seems to work for me (and yes I'm working on a prototype of RFC 164's functions).

I think we're trying to jam a lot of stuff into here docs that maybe shouldn't be jammed in, especially since Perl already has the q// alternatives that are much more flexible. Don't get me wrong, I like here docs and all, but I wonder if it isn't time for them to go?

I think I'd actually much rather see a new qh// "quoted here doc" operator that solves these problems than trying to jam them all into the existing shell-like syntax, which is a leftover oddity, really.

-Nate
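A runnable variant of the qq// idea is sketched below, assuming eight spaces of indentation and a poem that contains no "/". It anchors the pattern with ^ and /m and names a literal space, rather than the bare \s{8} used in the message, because \s also matches newlines; that change is a substitution made for this sketch. The closing delimiter is kept at the left margin so that only the leading newline has to be removed.

```perl
use strict;
use warnings;

(my $mesg = qq/
        The old lie
         Dulce et decorum est
         Pro patria mori.
/) =~ s/^ {8}//gm;     # remove exactly eight leading spaces per line

$mesg =~ s/\A\n//;     # drop the newline that follows the opening delimiter

print $mesg;
```

The terminator-collision concern raised elsewhere in the thread still applies: a "/" anywhere in the text would end the string early.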
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
At 10:52 AM 9/14/00 -0700, Nathan Wiger wrote:

> Actually, to me this thread underscores how broken here docs are
> themselves. We already have q//, qq//, and qx// which duplicate their
> functions far more flexibly. Question: Do we really need here docs?

I have thought this before, but I think the answer is yes, for the circumstance of when the quoted material does or may contain the terminator character. No matter what you pick, you still only have one character as a terminator, and if you're quoting something big and sufficiently general (think Perl code), then it's a pain to check it each time to see if you've stuck in the terminator by mistake.

At any rate, this is what I tell my students when they realize that "..." can contain newlines and start to wonder about the raison d'etre of here documents.

> I think I'd actually much rather see a new qh// "quoted here doc"
> operator that solves these problems than trying to jam them all into
> the existing shell-like syntax, which is a leftover oddity, really.

-- 
Peter Scott
Pacific Systems Design Technologies
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
This whole debate has got silly.

RFC 111 V1 covered both the whitespace on the terminator and the indenting - there was a lot of debate that this was two things - more were in favour of the terminator and there was more debate on the indenting. Therefore I split this into two RFCs leaving RFC 111 just dealing with the terminator.

RFC 111 V3 represents what I believe was rough consensus (ALA IETF meaning) on the terminator issue. (The debate had been quiet for several weeks.) Michael Schwern has gone as far as doing a prototype that almost covers it and with the few things I have posted earlier today could be extended to handle all cases.

Next comes the issue of removing whitespace on the left of the content. There are several possibilities, these are now mostly in RFC 162. These are:

1) There is no processing of the input (current state)
2) All whitespace to the left is removed (my original idea)
3) Whitespace equivalent to the first line is removed (not a good solution)
4) Whitespace equivalent to the terminator is removed if possible (ALA Michael's prototype) - this could be workable.
5) Whitespace equivalent to the smallest amount of the content is removed (current RFC 162 preferred solution)

When measuring whitespace how does the system treat tabs? (be realistic and don't FLAME)

So where do we go from here?

A) Do we want one syntax or two? (HERE and THERE)? I would prefer one but would accept two.
B) Is there rough consensus on the terminator issue at least?
C) Which of the 5 cases of handling the whitespace in the content might be agreed upon?
D) Decide how to treat tabs in the indenting. (Suggest =8 spaces plus allow a pragma to override)
E) If the answer to A) is one and we have B) and we agree on 4) or 5) for the whitespace and some treatment of tabs, then I should cancel RFC 162 and just put everything back into RFC 111 (including Michael's prototype) and let's try and freeze it and move on to other things.

Peace!

Richard

-- 
[EMAIL PROTECTED]
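As an illustration of option 5, the sketch below removes the smallest indentation found on any non-blank line from every line. strip_common_indent is a hypothetical helper, and it counts a tab as one character, which is only one of the possible answers to the tab question above.

```perl
use strict;
use warnings;

# Hypothetical helper for option 5: find the smallest indentation among
# the non-blank lines and remove that many whitespace characters from
# the start of every line. A tab counts as a single character here.
sub strip_common_indent {
    my ($text) = @_;
    my $min;
    for my $line (split /\n/, $text) {
        next unless $line =~ /\S/;              # blank lines do not count
        my ($lead) = $line =~ /^([ \t]*)/;
        $min = length $lead if !defined $min || length($lead) < $min;
    }
    return $text unless $min;                   # nothing to remove
    $text =~ s/^[ \t]{$min}//gm;
    return $text;
}

print strip_common_indent("    a\n      b\n\n    c\n");
```

Unlike option 4, the result here does not depend on where the terminator sits, so a uniformly indented block can never keep any of its indentation.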
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
Nathan Wiger wrote:

> Actually, to me this thread underscores how broken here docs are
> themselves. We already have q//, qq//, and qx// which duplicate their
> functions far more flexibly. Question: Do we really need here docs?

Yes.

Try generating lots of HTML, Javascript, Postscript, or other languages without here docs. Example:

    print <<CODE_SNIPPET;
    // this is a javascript function
    function valid(s) {
        ...
        if (var2 = '"')) {
    // rest of code to be generated later.
    CODE_SNIPPET

There's a chunk of code for which '', "", qq//, qq, qq{}, are all inadequate. This kind of code happens A LOT in web programming.

I do not want to have to examine all of my generated strings to see what quoting character I can use this time around, and I do not want to risk breaking my program whenever I change the text in a code snippet ("oops! I added a bracket. gotta change the quoting character!").

-- 
Eric J. Roode, [EMAIL PROTECTED]     print scalar reverse sort
Senior Software Engineer             'tona ', 'reh', 'ekca', 'lre',
Myxa Corporation                     '.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Richard Proctor made some excellent comments, and asked: When measuring whitespace how does the system treat tabs? (be realistic and dont FLAME) I suggest that there be NO tab/space conversion. Not 8 columns, not 4 columns, nothing. If the here doc terminator has four tabs preceding it, then four tabs should be stripped from each of the lines in the string. If the terminator has one tab and four spaces, then one tab and four spaces should be stripped from each of the lines. Mixing spaces and tabs is basically evil, but if you're consistent about it, it's your own rope for you to trip over or hang yourself. I set my tab stops to four columns; at least one of my coworkers sets his tab stops to eight columns. We edit the same code with no problems. -- Eric J. Roode, [EMAIL PROTECTED] print scalar reverse sort Senior Software Engineer'tona ', 'reh', 'ekca', 'lre', Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Eric Roode wrote:

> I suggest that there be NO tab/space conversion.

I also suggest that no whitespace stripping/appending/etc/etc be done at all. If I write:

    if ( $its_all_good ) {
        print <<EOF;
                Thank goodness this text is centered!
        EOF
    }

That should print out:

                Thank goodness this text is centered!

Without forcing me to left-justify my EOF marker. Tying space-stripping to the placement of EOF is a Bad Idea, IMO. Do this if you want:

    if ( $its_all_good ) {
        (my $s = <<EOF) =~ s/\s{8}(.*?\n)/$1/g; print $s;
            Thank goodness this text isn't centered!
        EOF
    }

But this shouldn't be implicit in the language.

-Nate
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Nathan Wiger wrote:
> I also suggest that no whitespace stripping/appending/etc/etc be done at all. If I write:
> [...deletia...]
> But this shouldn't be implicit in the language.

That's a good argument for having a separate operator for these "enhanced here docs", say , rather than chucking the whole idea out the window.

-- Eric J. Roode
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern wrote:
> On Wed, Sep 13, 2000 at 11:34:20PM -0700, Glenn Linderman wrote:
> > The rest is handled adequately and consistently today, and Tom's dequote is adequate to eliminate leading white space... especially among people who cannot agree that a tab in a file means "mod 8" (which it does).
>
> Damnit, I'm going to continue beating this horse until it stops twitching.

That's fine, but it could have been done politely.

I'm all for solving problems, and this message attempts to specify 3 problems, but it needs more specification. You describe three problems, but it is not clear what the problems are, exactly, because the words you used to describe them must not describe the problem universally. Let me attempt to describe the problems more completely, and when I diverge onto the wrong problem, you can clarify it-- and then maybe we'll be communicating. I think you've also omitted some of the problems-- maybe they shouldn't be classified as major, but since they are related, and get in the way of some of the possible solutions, I think we should mention them all, so I've continued numbering.

> We have three major problems and three proposed solutions:
>
> Problems:
> 1 Allowing here-docs to be indented without affecting the output.

This is the problem that currently here-doc content must be relative to the left margin, so doesn't look nice with respect to nearby indented code.

> 2 Preserving sub-indentation.

This is not _currently_ a problem. Perl _currently_ preserves indentation in here-docs. It is not until some other "solution" gets in the way that this problem is a problem. If problem 1 were solved by independently eliminating all leading white space from each line of the HERE document, then this problem suddenly appears. So what this "problem" is trying to state is that problem #1 cannot be solved (using your "current stumper" example below) by

    die <<POEM =~ s/^\s*//m;

because that affects the relative horizontal relationships between characters on different lines.
So this problem only needs to be avoided when solving other problems, rather than being a problem today.

> 3 Preserving the output of the here-doc regardless of how its overall indentation is changed (i.e. shifted left and right)

This problem appears to be attempting to address what happens when indenting large blocks of code, with something equivalent to

    $code =~ s/^/^   /m; # N.B. that's 3 spaces after the 2nd ^ character

The effect of the indentation is desirable, but the current semantics of here documents result in two problems: your number 3, which actually subsumes your problem number 1, that the text result of the here document is different than it was before the indentation took place, and also the first additional problem below.

Additional problems:

4 An indented here-doc terminator is not recognized, because perl6 requires the here-doc terminator to be at the left boundary.

5 Because white space is not visible, white space after the here-doc terminator, which perl6 requires must be followed by end-of-line, can cause apparent here-doc terminators to not be recognized.

6 Because indenting a tab character with non-tab characters changes its starting point, its apparent size also changes, thus affecting the horizontal relationship between characters on different lines of a here-doc.

7 Because people don't all subscribe to the universal definition of the ASCII tab character as meaning proceed to the next (mod 8) horizontal boundary, the appearance of here-docs containing tabs in various environments differs in the horizontal relationship between characters on different lines of a here-doc. This can be particularly significant if there are different numbers of leading tabs on a line, or a mixture of tabs and spaces at the front of some lines, or tabs found after non-white space characters.

> Solutions
> 1 <<POD =~ s/some_regex//
> 2 dequote(<<POD)
> 3 indentation of the end-tag
>
> Each solution has its strengths and weaknesses. Regexes can handle problem #1 but only #2 xor #3.
> However, they cover a wide variety of more general problems. dequote has the same problem. #1 is fine, but it can only do #2 xor #3. Not both.

Agreed that there is unlikely to be a single solution that solves all the problems. So can we look at solutions to each of the problems, and then attempt to pick a set of solutions to make available in perl6 that covers the problem space? Before I do that, let's analyze the current stumper in terms of the problems above, to make sure we are talking about the same problems.

> The current stumper, which involves problems 1, 2 and 3 is this:
>
>     if( $is_fitting && $is_just ) {
>         die <<POEM;
>         The old lie
>           Dulce et decorum est
>           Pro patria mori.
>         POEM
>     }
>
> I propose that this work out to
>
>     "The old lie\n  Dulce et decorum est\n  Pro patria mori.\n"
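The tension between problems 1 and 2 is easy to reproduce in Perl 5. The sketch below contrasts stripping all leading whitespace per line with stripping only the indentation common to every line. strip_common_indent is a hypothetical helper written for this illustration, and it counts a tab and a space as one character each, which sidesteps rather than settles the tab question.

```perl
use strict;
use warnings;

my $poem = "    The old lie\n      Dulce et decorum est\n      Pro patria mori.\n";

# Approach A: strip all leading blanks on every line.
# Solves problem 1 but destroys the sub-indentation (problem 2).
(my $flat = $poem) =~ s/^[ \t]*//gm;

# Approach B (hypothetical helper): strip only the indentation shared by
# all non-blank lines, so relative indentation is preserved.
sub strip_common_indent {
    my ($text) = @_;
    my $min;
    for my $line (split /^/, $text) {
        next unless $line =~ /\S/;                 # ignore blank lines
        my ($lead) = $line =~ /^([ \t]*)/;
        $min = length $lead if !defined $min || length($lead) < $min;
    }
    $min = 0 unless defined $min;
    $text =~ s/^[ \t]{0,$min}//gm;
    return $text;
}

my $kept = strip_common_indent($poem);
print $flat, "---\n", $kept;
```

Approach B gives the result the stumper asks for, but only because it looks at the text itself rather than at the terminator.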
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
On Thu, 14 Sep 2000 10:52:16 -0700, Nathan Wiger wrote: We already have q//, qq//, and qx// which duplicate their functions far more flexibly. Question: Do we really need here docs? With your above functions, you always need to be able to escape the string end delimiter. Therefore, you will always have to escape backslashes. You don't need to escape backslashes, or anything else, in a single-quoted here-doc. Here-docs are extremely handy if you have to incorporate text from an external file, which perl is supposed to print out verbatim. Their disadvantage is that they'll always end with a newline. -- Bart.
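The escaping point can be checked directly in Perl 5. In the sketch below (sample text is mine), the single-quoted here-doc takes the text verbatim, while the q// version has to escape both the backslashes and its own delimiter to produce the same string.

```perl
use strict;
use warnings;

# Single-quoted here-doc: nothing is interpolated and nothing needs escaping.
my $verbatim = <<'END_TEXT';
C:\temp\new $HOME @list q/slashes/ 'single' "double"
END_TEXT

# The same text with q//: backslashes and the delimiter must be escaped,
# and the trailing newline has to be added by hand.
my $quoted = q/C:\\temp\\new $HOME @list q\/slashes\/ 'single' "double"/ . "\n";

print $verbatim eq $quoted ? "same string\n" : "different strings\n";
```

The here-doc version also shows the disadvantage mentioned above: its value always ends with a newline.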
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Thu, Sep 14, 2000 at 11:49:18AM -0700, Glenn Linderman wrote:
> I'm all for solving problems, and this message attempts to specify 3 problems, but it needs more specification. You describe three problems, but it is not clear what the problems are

Since we've been charging back and forth over this ground like a troop of doughboys over No Man's Land for the past month, I figured everyone knew the problem and proposed solutions. Your review accurately lays everything out.

> { { { { {
>     if( $is_fitting && $is_just ) {
>         die dequote_like('!', <<POEM);
>         !The old lie
>         !  Dulce et decorum est
>         !  Pro patria mori.
> POEM
>     } # this } had been omitted
> } } } } }

Things like this have come up, and to my eyes and fingers it's unacceptable. Some people like the explicit demarcation of the left boundary; I find it ugly and don't like the extra typing. It doesn't win me much over:

    die 'The old lie'.
        ' Dulce et decorum est'.
        ' Pro patria mori.';

I'd prefer if here-docs just DWIM. So we may want to add Yet Another problem. I forget what number you got up to, but it's basically "You shouldn't have to add anything but whitespace to the here-doc for indenting".

An additional problem with dequote() style solutions is that they are not as efficient. <<DOC =~ s/// and the terminator indentation can both be applied at compile time and deparse the whole mess into a simple string (as the prototype does), while the dequote() routine must be run over and over again at run-time. This can get nasty in hot loops.

    #!/usr/bin/perl -w
    use strict;
    use Benchmark;

    sub dequote_like {
        local $_ = shift;
        my ($leader); # common white space and common leading string
        if (/^\s*(?:([^\w\s]+).*\n)(?:\s*\1.*\n)+$/) {
            $leader = quotemeta($1);
        } else {
            $leader = '';
        }
        s/^\s*$leader//gm;
        return $_;
    }

    my $foo;
    timethese(shift || -3, {
        dequote => sub {
            $foo = dequote_like('!', <<POEM);
                !The old lie
                !  Dulce et decorum est
                !  Pro patria mori.
    POEM
        },
        terminator => sub {
            use RFC::Prototype::111;
            $foo = <<"POEM";
                The old lie
                  Dulce et decorum est
                  Pro patria mori.
                POEM
        },
    });

    Benchmark: running dequote, terminator, each for at least 3 CPU seconds...
       dequote:  2 wallclock secs ( 3.00 usr +  0.01 sys =  3.01 CPU) @ 39857.81/s (n=119972)
    terminator:  3 wallclock secs ( 3.00 usr +  0.02 sys =  3.02 CPU) @ 268209.93/s (n=809994)

dequote() comes out nearly seven times slower than the terminator approach (which is basically dequote() vs a plain string). So that's another problem to add to the list: "here-docs should be no slower than the equivalent string, indented or otherwise".

The syntax for <<POEM =~ s/regex/subst/; generally returns 1, and introducing a special case to make it return the string if the left hand side is a here-doc seems to be a pointless inconsistency. I think it's considered closer to the current trick of doing:

    print ($var = <<POEM) =~ s/regex/subst/; # or something like that

Another suggestion was <<POEM =~ m/re(ge)x/. The match would be run over each line and $1 used to generate the here-doc. Honestly, I'm not really the one who should be evangelizing this technique.

> but these subs [dequote] work in perl 5 today, so don't really need to be part of the RFC

They most definitely do. If we're going to propose them as a solution to the indented here-doc problem, it would be best to distribute a collection of commonly used ones as a module with perl.

-- Michael G Schwern http://www.pobox.com/~schwern/ [EMAIL PROTECTED]
Just Another Stupid Consultant Perl6 Kwalitee Ashuranse
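The remark about what s/// returns can be illustrated with Perl 5 as it stands. In this sketch (variable names are mine) the substitution is bound to an assignment, so the modified text lands in $text while the value of the whole expression is the number of substitutions made, not the string.

```perl
use strict;
use warnings;

# The parenthesised assignment is the target of s///; $count receives the
# return value of the substitution itself.
my $count = (my $text = <<POEM) =~ s/^\s*!//gm;
    !The old lie
    !  Dulce et decorum est
    !  Pro patria mori.
POEM

print "substitutions: $count\n";
print $text;
```

This is why the idiom needs a temporary variable before the text can be handed to print or die.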
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
Nathan Wiger wrote:
> Solves problem #1, indented terminator, except that it adds two newlines (more later).

I never found anything later about these extra newlines... so if this idea has merit, it needs to be finished.

> However, it leaves 2 and 3. Let's try adding in a regexp:
>
>     if( $is_fitting && $is_just ) {
>         (my $mesg = qq/
>             The old lie
>               Dulce et decorum est
>               Pro patria mori.
>         /) =~ s/\s{8}(.*?\n)/$1/g;
>         die $mesg;
>     }

I think $mesg winds up with the value of "1" the way you've coded it. You cured that issue with the RFC 164 syntax for subst, of course, but it could be cured without that, though it does require a temp var.

> I think we're trying to jam a lot of stuff into here docs that maybe shouldn't be jammed in

Yes, all we need is to recognize the terminator when embedded in white space on its line, and the rest can be handled with "here doc postprocessing functions", per my somewhat longer reply to Michael Schwern. I agree with the need for a multiple character termination sequence for easy to write here docs.

-- Glenn
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
On Thu, Sep 14, 2000 at 10:52:16AM -0700, Nathan Wiger wrote:
> Before you scream "Bloody murder", please read on...

I'll wait patiently for the end...

>     if( $is_fitting && $is_just ) {
>         die subst /\s{8}(.*?\n)/$1/g, qq/
>             The old lie
>               Dulce et decorum est
>               Pro patria mori.
>         /;
>     }
>
> Seems to work for me (and yes I'm working on a prototype of RFC 164's functions).

No, it still has all the problems of any other regex-based solution. If you shift the code right or left, it breaks (due to the \s{8}) and you're back to counting whitespace again. And as Glenn pointed out, what about that leading newline?

Can I scream now?

-- Michael G Schwern
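The fragility being described is easy to see by running the same regex over text at two indentation levels. A small sketch, where strip8 is a hypothetical wrapper around the quoted regex and the sample strings are mine:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the \s{8} regex from the quoted message.
sub strip8 {
    my ($text) = @_;
    $text =~ s/\s{8}(.*?\n)/$1/g;
    return $text;
}

# The same two lines, indented by 8 and then by 12 spaces,
# as if the enclosing block had been shifted right by one level.
my $at_8  = "        The old lie\n        Pro patria mori.\n";
my $at_12 = "            The old lie\n            Pro patria mori.\n";

my $ok      = strip8($at_8);    # fully dedented
my $shifted = strip8($at_12);   # four spaces are left behind on each line

print $ok, "---\n", $shifted;
```

After the shift the hard-coded count no longer matches the code's indentation, so the regex has to be edited by hand.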
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
Glenn Linderman wrote:
> I think $mesg winds up with the value of "1" the way you've coded it.

Sorry, I missed the placement of the (). $mesg is fine.

-- Glenn
Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))
Michael G Schwern wrote:
> No, it still has all the problems of any other regex-based solution. If you shift the code right or left, it breaks (due to the \s{8}) and you're back to counting whitespace again.

Y'know, I pointed out before why I think this is a superfluous issue. You have to either change your regexp, or change the indentation of your here docs terminator when you move your code around. And counting whitespace is not so hard as to justify breaking this:

    if ( $its_all_good ) {
        print <<EOF;
                Thank goodness this text is centered!
                I'd really hate for it to left-shift on me.
        EOF
    }

This should print out the text as shown verbatim. If you want reformatting of any kind, that's what regexes are for. The above is far more flexible, and your problem already has several other solutions, which you have yourself noted.

Plus how to address the whole can of worms with tabs vs. spaces, 4 or 8, trailing too, blank line stripping? Blech. regex.

> And as Glenn pointed out, what about that leading newline?

Handled by the regexp, actually (yep I tested it).

> Can I scream now?

Not yet, but I might! :-)

-Nate
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Thu, Sep 14, 2000 at 02:51:14PM -0700, Glenn Linderman wrote:
> Well, OK, so now we're talking shades of opinion. You'd agree it works, though, and quite effectively. But you'd disagree about its aesthetics, and its performance. The former is much less interesting to me than the latter.

Here-docs are all about aesthetics. Otherwise, we'd just be using regular strings.

> That's fair, except that they aren't equivalent: you'd need
>
>     die 'The old lie'."\n".
>         ' Dulce et decorum est'."\n".
>         ' Pro patria mori.'."\n";

Just to be silly...

    die join "\n", 'The old lie',
                   ' Dulce et decorum est',
                   ' Pro patria mori.','';

> Which is somewhat worse, compared to the here doc, even with "!" or other leading demarcation of choice (your choice, is, of course, none).

They're all yicky.

> > I'd prefer if here-docs just DWIM.
>
> Yes, but... what do you mean vs. what do others mean, and all these problems

Others can continue to put the here-doc tag flush left if they don't want this behavior. I'd like to keep it clear that I consider all the proposals orthogonal, each solving a different (yet often overlapping) set of problems.

> This leads me down another path: wouldn't it be nice to have a function to interpolate a string on demand?

Whoa! Hey, yes, great idea! Not so much for this problem, but I can definitely see a need for anyone that's writing any sort of templating system.

> Then you could hoist the here-doc processing above out of the loop, and still get the effects of interpolation inside the loop, which would make the performance of here-doc postprocessing much less critical... but this means defining variables to hold the intermediate results, and moving the here-doc to a different location, which might not be as friendly to the understanding of the script.

Right. Moving the text away from the point where it is used has maintenance problems.

PS Do you use 132 columns to write mail?

-- Michael G Schwern
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern wrote: I'd prefer if here-docs just DWIM. Yes, but... what do you mean vs. what do others mean, and all these problems Others can continue to put the here-doc tag flush left if they don't want this behavior. See, this is just too inflexible. The main complaint that I've heard has been "You can't have leading or trailing whitespace around your terminator". This is a very common error made by everyone, and *this* is where Perl should DWIM. The main complaint has *not* been "Man, I wish that indenting my terminator could tell Perl to automatically strip off that much leading whitespace", which is what you're purporting it to be. If we want to add this feature, which does not solve the existing problem - and there is a problem! - then I support the new "autostrip " operator which does this. However, this shouldn't be forced into the existing operator, since it prevents us from fixing a very important and annoying problem with current here docs. I don't mind disagreeing on a given issue, but this issue is one where everyone has enough brains on this list to resolve reasonably. I propose we all step back, recognize the difference between fixing problems and adding new features, and make it so the latter doesn't prevent the former. -Nate
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On 4 Sep 2000 21:32:00 -, Perl6 RFC Librarian [EMAIL PROTECTED] wrote:
> This and other RFCs are available on the web at http://dev.perl.org/rfc/
>
> =head1 TITLE
>
> Here Docs Terminators (Was Whitespace and Here Docs)
> [...]
> =head1 IMPLENTATION

Intentional? It's either 'IMPLANTATION', which is something that has to be done with Damian's brain into the perl6-core, so every operator is DWIM, or 'IMPLEMENTATION', something you seem to be describing here.

Just nitpicking.

-- H.Merijn Brand Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)
using perl5.005.03, 5.6.0 516 on HP-UX 10.20, HP-UX 11.00, AIX 4.2, AIX 4.3, DEC OSF/1 4.0 and WinNT 4.0 SP-6a, often with Tk800.022 and/or DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
I think it should be made explicit what happens if the here doc terminator itself contains comment characters or semicolons. This is my suggestion:

The here doc terminator must match as a string (that is, C<m/^\s*\Q$term\E\s*(?:\#|;\s*$)/> should match the line, where $term is the desired terminator). Otherwise the behaviour of C<print <<'END#17'> is unclear.

This issue was raised during previous discussion of the RFC.

-- Ariel Scolnicov, [EMAIL PROTECTED]
Compugen Ltd., 72 Pinhas Rosen St., Tel-Aviv 69512, ISRAEL
Tel: +972-2-5713025 (Jerusalem), +972-3-7658514 (Main office); Fax: +972-3-7658555
http://3w.compugen.co.il/~ariels
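A sketch of what "match as a string" could look like follows. It is not the regex quoted above: is_terminator is a hypothetical name, and this variant makes the trailing semicolon and comment optional so that a bare terminator line is accepted too, which is an assumption of mine about the intent.

```perl
use strict;
use warnings;

# Hypothetical check: does $line end a here doc whose tag is $term?
# \Q...\E makes the tag match literally, so '#' or ';' inside the tag
# are not read as a comment or statement separator.
sub is_terminator {
    my ($term, $line) = @_;
    return $line =~ /^\s*\Q$term\E\s*;?\s*(?:\#.*)?$/ ? 1 : 0;
}

for my $case (
    [ 'END',    "    END\n"       ],
    [ 'END',    "END; # done\n"   ],
    [ 'END#17', "END#17\n"        ],
    [ 'END',    "END#17\n"        ],   # also matches: tag END plus a comment
    [ 'END',    "END; \$i++;\n"   ],   # rejected: statement after terminator
) {
    my ($term, $line) = @$case;
    chomp(my $shown = $line);
    printf "%-8s %-14s %s\n", $term, "'$shown'", is_terminator($term, $line) ? 'yes' : 'no';
}
```

The fourth case is the ambiguity being raised: a line reading END#17 would close a here doc tagged END as well as one tagged END#17, so the rule has to say which reading wins.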
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Mon, Sep 04, 2000 at 09:32:00PM -, Perl6 RFC Librarian wrote:
> Perl6 should ignore any whitespace before the terminator of a heredoc on any line.

Good. I don't see anything wrong with this.

***BRAIN STORM!***

RFC 162 (http://dev.perl.org/rfc/162.html) wanted to allow indented here-docs, but had the problem of how to figure out what was code indentation and what was deliberate text indentation. For example:

    if( $payment $they_owe ) {
        print MAIL <<LETTER;
        Attention delinquent scum,

            We have noticed that you still owe us money which we graciously loaned you in your hour of need. Rocco will be by shortly to collect your kneecaps.
    LETTER
    }

The RFC proposes a operator which would strip whitespace off the front of the here-doc. Problem is preserving indentation. We can merge the two.

    if( $payment $they_owe ) {
        print MAIL <<LETTER;
        Attention delinquent scum,

            We have noticed that you still owe us money which we graciously loaned you in your hour of need. Rocco will be by shortly to collect your kneecaps.
        LETTER
    }

will notice that the closing 'LETTER' tag is indented and strip that amount of whitespace off the front. No regexes, no counting spaces (you just line it up with the left margin of the text) and it does what you mean.

In this case...

        print <<FOO;
        text
      ooops
        more text
        FOO

Perl will issue a warning because the leftmost margin of the here-doc text is to the left of the closing tag.

This feels like it should work, so something must be wrong. Shoot it full of holes, boys.

> Further it should ignore any whitespace, ";"s (and comments) that follow the terminator.

I've seen no reason to allow a semicolon other than "why not". It's a special case which adds no syntax sugar and simply represents unnecessary orthogonality. It just lets you be sloppy.

Comments, too. Since the here-doc tag is free-form, you can make it anything you'd like. Including a comment! For example, why would this:

    print <<EOF;
    Foo
    EOF # This is the end of the here-doc, my friends.

be any better than this?

    print <<THIS_IS_THE_END;
    Foo
    THIS_IS_THE_END

I can't think of much else I'd want to comment about the end of a here-doc than "this is the end of the here-doc" which is about as useful as "$i++ # add one to $i".

> Perl should also ignore whitespace between the << and the terminator.

I'm worried this might cause here docs to look too much like left shift operators. And consider the following ambiguity.

    use constant BAR => 2;

    $foo << BAR;
    Stuff
    BAR

    print $foo << BAR;

The first is a here-doc. The second is a left shift. They look veeery close.

> =head1 IMPLENTATION
>
> This should be a relatively simple addition to perl (I think just to scan_heredoc in toke.c + docs in perl5)

The disambiguation of a here-doc start from a binary left shift might add serious complexity to the parser.

-- Michael G Schwern
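For comparison, here is how Perl 5 tells the two apart today, as far as I understand its parser: "<<" where a term is expected starts a here-doc, and "<<" after a complete term is a left shift. The sketch reuses the BAR constant from the message; the variable names are mine.

```perl
use strict;
use warnings;
use constant BAR => 2;

my $foo = 8;

# "<<" where a value is expected (right after "="): a here-doc whose tag
# happens to be spelled like the constant.
my $doc = <<BAR;
Stuff
BAR

# "<<" after a complete term ($foo): the binary left shift, 8 << 2.
my $shifted = $foo << BAR;

print "doc: $doc";
print "shifted: $shifted\n";
```

Both forms already coexist, so the worry is mostly about how close they look once whitespace is allowed between the "<<" and the tag.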
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern wrote:
> The RFC proposes a operator which would strip whitespace off the front of the here-doc. Problem is preserving indentation. We can merge the two.

Actually, the two started merged. :-) They were split up after there were too many people for RFC 111 but against RFC 162. Personally, I'd rather see the recipe method of:

    print <<END_OF_DOC =~ s/^\s{0,5}//g;

(It's something like that) used for the "stripping leading whitespace" issue.

> I've seen no reason to allow a semicolon other than "why not". It's a special case which adds no syntax sugar and simply represents unnecessary orthogonality. It just lets you be sloppy.

Well, it does add some consistency. Admittedly, there's not a huge value-add so if it doesn't get in, hey, it doesn't get in. :-)

> I can't think of much else I'd want to comment about the end of a here-doc than "this is the end of the here-doc" which is about as useful as "$i++ # add one to $i".

If you have a potentially huge here doc it can help, just like a potentially huge if statement:

    if ( $cond ) {
        # 200 lines pass
    } # end if($cond) line 23

> I'm worried this might cause here docs to look too much like left shift operators. And consider the following ambiguity.
>
>     use constant BAR => 2;
>
>     $foo << BAR;
>     Stuff
>     BAR
>
>     print $foo << BAR;
>
> The first is a here-doc. The second is a left shift. They look veeery close.

Well, the ambiguity you mention actually already exists in a similar form; see Camel-3 p. 67. So I don't know that this would really hurt or help the situation.

-Nate
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
On Mon, Sep 04, 2000 at 05:36:32PM -0700, Nathan Wiger wrote:
> Actually, the two started merged. :-) They were split up after there were too many people for RFC 111 but against RFC 162. Personally, I'd rather see the recipe method of:
>
>     print <<END_OF_DOC =~ s/^\s{0,5}//g;

This still leaves the problem of having to count whitespace and having to change your regex if you reindent your code. In effect, it causes whitespace to become significant. Bleh.

> > I can't think of much else I'd want to comment about the end of a here-doc than "this is the end of the here-doc" which is about as useful as "$i++ # add one to $i".
>
> If you have a potentially huge here doc it can help, just like a potentially huge if statement:
>
>     if ( $cond ) {
>         # 200 lines pass
>     } # end if($cond) line 23

There's a big difference. Every code block ends with a '}'. Every here doc ends with its own custom tag. Thus to state:

    print <<EOF;
    Four score and seven years ago...
    EOF # end of print <<EOF line 23

can currently be better written as:

    print <<GETTYSBURG_ADDRESS;
    Four score and seven years ago...
    GETTYSBURG_ADDRESS

The tag itself describes what the text is, similar to the way a well-named variable describes what's inside of it and removes the need for a descriptive comment. At a glance one can tell that 'GETTYSBURG_ADDRESS' closes the here-doc containing the Gettysburg Address, without having to maintain a comment. (I guarantee the line number mentioned in the comment will not be maintained.)

Another reason for wanting to comment the closing of a code block is nesting. Simply searching for the previous '{' will not work. Here-docs cannot be nested and do not have this problem. Simply searching backwards for your here-doc tag will always work.

-- Michael G Schwern
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
This still leaves the problem of having to count whitespace and having to change your regex if you reindent your code. In effect, it causes whitespace to become significant. Bleh. It's much better to use the Cookbook method: it stands out better. Please observe. --tom
Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
Michael G Schwern wrote:
> >     print <<END_OF_DOC =~ s/^\s{0,5}//g;
>
> This still leaves the problem of having to count whitespace and having to change your regex if you reindent your code. In effect, it causes whitespace to become significant. Bleh.

How is this different from having to count the number of spaces you reindent your here doc terminator?

    print <<EOF;
       We want a total of 3 leading spaces stripped off
       EOF

    print <<EOF;
         Now we want a total of 5 leading spaces stripped
         EOF

As was already discussed, this approach is quite fragile; I think the consensus was trying to get away from this. Plus regexes are more general; they can strip leading funny chars as well:

    print <<EOF =~ s/^\s*\|*\s{0,5}//g;
        | I like to use the bar symbol to
        | logically partition off my here docs
    EOF

Anyways, there was a sizeable discussion on this already; here are some of the points discussed:

    http://www.mail-archive.com/perl6-language@perl.org/msg02556.html
    http://www.mail-archive.com/perl6-language@perl.org/msg03034.html
    http://www.mail-archive.com/perl6-language@perl.org/msg03035.html
    http://www.mail-archive.com/perl6-language@perl.org/msg03037.html
    http://www.mail-archive.com/perl6-language@perl.org/msg03048.html
    http://www.mail-archive.com/perl6-language@perl.org/msg03041.html

Not trying to cut your points off at the knees, but many of them were discussed already and I think the conclusions make sense.

-Nate
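One detail worth checking when trying the bar-stripping regex in Perl 5: with /g alone, ^ only matches at the start of the whole string, so only the first line is cleaned. Adding /m lets ^ match at the start of every line. A sketch using the assign-then-substitute idiom, since Perl 5 cannot bind s/// directly to a here-doc (the sample string is mine):

```perl
use strict;
use warnings;

my $raw = "    | I like to use the bar symbol to\n"
        . "    | logically partition off my here docs\n";

# /g without /m: ^ matches only once, at the very start of the string.
(my $first_only = $raw) =~ s/^\s*\|*\s{0,5}//g;

# /gm: ^ matches at the start of each line, so every bar is stripped.
(my $each_line = $raw) =~ s/^\s*\|*\s{0,5}//gm;

print $first_only, "---\n", $each_line;
```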
RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)
This and other RFCs are available on the web at
http://dev.perl.org/rfc/

=head1 TITLE

Here Docs Terminators (Was Whitespace and Here Docs)

=head1 VERSION

  Maintainer: Richard Proctor [EMAIL PROTECTED]
  Date: 16 Aug 2000
  Last Modified: 2 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Version: 3
  Number: 111
  Status: Developing

=head1 ABSTRACT

With a here doc print <<ZZZ; the ZZZ has to be at the start of a line and the text of the here doc is processed verbatim. This results in the terminator sticking out in the body of the document, makes indenting blocks of text difficult and causes errors and confusion. There are several FAQs that relate to this problem. This proposal tidies this up.

=head1 DESCRIPTION

Perl6 should ignore any whitespace before the terminator of a heredoc on any line. Further it should ignore any whitespace, ";"s (and comments) that follow the terminator. Perl should also ignore whitespace between the << and the terminator.

Discussion took place on allowing statements following the terminator, but generally these were thought of as a bad idea. So only ";" and comments should occur on the same line.

All of these should work:

    print <<EOL;
    EOL

    print <<EOL;
    EOL

    print <<EOL;
    EOL;

    print <<EOL
    EOL;

    print <<EOL ;
    EOL # this is the end of the here doc

    print <<EOL
    EOL;# this is the end of the here doc

    print <<EOL ;
    EOL;# this is the end of the here doc

But this should be an error:

    print <<EOL
    EOL; $i++;

=head1 IMPLENTATION

This should be a relatively simple addition to perl (I think just to scan_heredoc in toke.c + docs in perl5)

=head1 CHANGES

RFC111 V1 Had two concepts, one about the terminator and another about the content. This has been split into two concepts, this RFC and RFC 162.

RFC111 V2 Just had the termination issue, and # comments after the terminator

RFC111 V3 Adds the ";" as acceptable after the terminator (and more examples)

=head1 REFERENCES

RFC162 Filtering Heredocs (was originally part of RFC 111 V1)