Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-18 Thread Glenn Linderman

Tom Christiansen wrote:

 I am certainly in strong favor of a simple and visually distinctive
 solution, and find that the leading bit helps a lot.  But I would probably
 have written that as:

 die <<POEM =~ /[^!]*/g;
 !The old lie
 !  Dulce et decorum est
 !  Pro patria mori.
 POEM

 save for the whitespace on "   POEM".

But Tom, that preserves all the white space both before and after the '!'!
Michael's goal is to eliminate the leading white space, although he didn't like the
'!' bit.  So I'm not sure how you'd have written that if you'd have done it to the
specification.

--
Glenn
=
Even if you're on the right track,
you'll get run over if you just sit there.
   -- Will Rogers





Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-18 Thread Tom Christiansen

But Tom, that preserves all the white space both before and after the '!'!
Michael's goal is to eliminate the leading white space, although he didn't like the
'!' bit.  So I'm not sure how you'd have written that if you'd have done it to the
specification.

Yeah, ok.  I still think

# Your stuff that you write
#goes nicely right here
# If you want it to print
#sans mungeing to fear

is the nicest way to heredoc, where "#" is some distinctive string.
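
Perl 5 can already approximate this with a substitution. The following is
only a sketch: the helper name strip_marker and the choice of "|" as the
distinctive string are assumptions made for illustration, not anything
proposed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: remove optional leading whitespace plus a marker
# string from the start of every line of $text.
sub strip_marker {
    my ($marker, $text) = @_;
    $text =~ s/^[ \t]*\Q$marker\E//gm;
    return $text;
}

my $verse = strip_marker('|', <<'VERSE');
    |Your stuff that you write
    |  goes nicely right here
VERSE

print $verse;
```

Everything to the right of the marker survives untouched, so sub-indentation
is kept no matter how far the block is shifted.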

--tom



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-17 Thread Tom Christiansen

This is the problem that currently here-doc content must be relative to SNIP
indented code.

 2 Preserving sub-indentation.

This is not _currently_ a problem.  Perl _currently_ preserves indentatiSNIP
the way, that this problem is a problem.  If problem 1 were solved by inSNIP
the HERE document, then this problem suddenly appears.  So what this "prSNIP
(using your "current stumper" example below) by

  die <<POEM =~ s/^\s*//m;

because that affects the relative horizontal relationships between charaSNIP
avoided when solving other problems, rather than being a problem today.

Once again, we see why a version of s/// that returns the result
is desirable.  You actually meant something more on the order of

die <<POEM =~ m/\S.*/g;

but relying on knowing what die() does with a list.
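
What the match form does can be checked in today's Perl. A small sketch,
with the variable names chosen here for illustration: in list context m//g
returns every match, and since "." does not match a newline, each element
is one line without its leading whitespace and without its newline.

```perl
use strict;
use warnings;

my $poem = <<POEM;
    The old lie
      Dulce et decorum est
      Pro patria mori.
POEM

# In list context, m//g returns every match. /\S.*/ starts at the first
# non-blank character of a line and stops before the newline.
my @lines = $poem =~ m/\S.*/g;

# die() and print() concatenate their argument list, so the newlines
# dropped by the match have to be put back by hand.
my $message = join "\n", @lines;
print "$message\n";
```

Note that this form also discards the sub-indentation of the second and
third lines, which is problem 2 from the list under discussion.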

Wouldn't it be nice to be able just to say, positing a dyadic ~
binding operator for s///:

die <<POEM ~ s/^\s*//gm;

I think you need the /g, too.

--tom



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Ariel Scolnicov

Dave Storrs [EMAIL PROTECTED] writes:

[...]

  print <<FIRST_HERE_DOC; print <<SECOND_HERE_DOC;
  This is on the left margin.
   This is indented one char.
  FIRST_HERE_DOC
    This is indented one char.
   This is on the left margin.
   SECOND_HERE_DOC
 
   RFC 111 specifically disallows statements after the terminator
 because it is too confusing. I would say that the same logic should apply
 to the start of the here doc; I'm not sure, just from looking at it, if
 the example above is meant to be two interleaved heredocs, one heredoc
 after another, or what.

It's two statements, separated by a semicolon.  What's wrong?  (Or, if 
you don't like that, just take 2 here docs for the same statement).
This is totally unlike the here-document line.

The same (without indentation, of course) works for Perl today, and
confuses no-one.  And just because Perl has some feature does not mean 
you are obligated to use it in all programs.
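
For reference, a minimal sketch of the form that works in Perl today: two
here-docs started on one line, with the bodies following in the order the
tags appear. The tag names are reused from the example above; the join is
only there so the result can be inspected.

```perl
use strict;
use warnings;

# Two here-docs started on one line; the bodies follow in the same order.
my $out = join '', <<FIRST_HERE_DOC, <<SECOND_HERE_DOC;
This is on the left margin.
 This is indented one char.
FIRST_HERE_DOC
 This is indented one char.
This is on the left margin.
SECOND_HERE_DOC

print $out;
```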
-- 
Ariel Scolnicov|"GCAAGAATTGAACTGTAG"| [EMAIL PROTECTED]
Compugen Ltd.  |Tel: +972-2-5713025 (Jerusalem) \ We recycle all our Hz
72 Pinhas Rosen St.|Tel: +972-3-7658514 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555http://3w.compugen.co.il/~ariels



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Michael G Schwern

On Thu, Sep 14, 2000 at 03:36:10PM -0700, Nathan Wiger wrote:
 See, this is just too inflexible. The main complaint that I've heard has
 been "You can't have leading or trailing whitespace around your
 terminator". This is a very common error made by everyone, and *this* is
 where Perl should DWIM.

See, I never understood this.  If you're indenting the terminator, it
implies you're also indenting the here-doc text.  I mean, this doesn't
make any sense:

{ { { {
            print <<TAG;
I don't know what their
gripe is.  A critic is
simply someone paid to
render opinions glibly.
            TAG
} } } }

Right?  You're not going to just indent the terminator because you
can.  It's going to go along with indenting the text.

So indenting the terminator and indenting the text are linked.  If you
do one, you want to do the other.


-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
Yet one of these kittens is not prepared to have a good time.  It
stands alone, away from the crowd.  Its your kind of kitten.  And now
the time has come to climb into that car and shake the paw of destiny.



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Eric Roode

Michael Schwern wrote:
See, I never understood this.  If you're indenting the terminator, it
implies you're also indenting the here-doc text.  I mean, this doesn't
make any sense:

{ { { {
            print <<TAG;
I don't know what their
gripe is.  A critic is
simply someone paid to
render opinions glibly.
            TAG
} } } }

Right?  You're not going to just indent the terminator because you
can.  It's going to go along with indenting the text.

So indenting the terminator and indenting the text are linked.  If you
do one, you want to do the other.

Don't tell me what I want to do :-)

  $chunk1 = <<CHUNK1;
<table>
    <tr>
        <td class=m1>
            text that's in the table cell
        </td>
    </tr>
  CHUNK1

  $chunk2 = <<CHUNK2;
    <tr>
        <td class=m2>
            text that's in another table cell
        </td>
    </tr>
  CHUNK2

  $chunk3 = <<CHUNK3;
</table>
  CHUNK3
  

The here-doc terminators all line up with the perl code. 
The generated program is nicely indented relative to the left margin.

 --
 Eric J. Roode,  [EMAIL PROTECTED]   print  scalar  reverse  sort
 Senior Software Engineer'tona ', 'reh', 'ekca', 'lre',
 Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Nathan Wiger

Michael G Schwern wrote:
 
 See, I never understood this.  If you're indenting the terminator, it
 implies you're also indenting the here-doc text.  I mean, this doesn't
 make any sense:
 
 { { { {
             print <<TAG;
 I don't know what their
 gripe is.  A critic is
 simply someone paid to
 render opinions glibly.
             TAG
 } } } }

Sure it does, as Eric's shown:

   if ( $this && $that ) {
      while (<DATABASE>) {
         chomp;
         $record = quotemeta $_;
         if ( $record ) {
            ($rec, $name, $dob, $address, $joindate, $books)
                = split /\s+/, $record;
            print <<END_OF_RECORD;
Current record: $rec

Name:    $name
DOB:     $dob
Address: $address

The above person has been a member of the Perl 6 Book of the Month
club since $joindate, purchasing a total of $books books.
            END_OF_RECORD
            push @records, $record;
         }
      }
   }


 So indenting the terminator and indenting the text are linked.  If you
 do one, you want to do the other.

As I and many others have said, that's not necessarily true. I like all
my code to line up, braces, parens, and all. It enhances readability,
and is easier to scan.

Anyways, it seems both your and my needs could be met if we simply added
a  operator that does what you want. Otherwise we're forced to choose
between two useful alternatives that are both valid. I could see using
both your and "my" way in many different situations, so we should make
them coexistent, not mutually exclusive.

-Nate



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Richard Proctor

On Fri 15 Sep, Michael G Schwern wrote:
 On Fri, Sep 15, 2000 at 06:38:37PM +0100, Richard Proctor wrote:
  1)  removes whitespace equivalent to the terminator (e) this is largely
  backward compatible as many existing heredocs are unlikely to have white
  space before the terminator.
  
  2)  removes whitespace equivalent to the smallest whitespace (d)
  
  or are these the options that will satisfy everybody [no but it's worth a
  try]
  
  1)  Does just what it does now
  
  2)  implements (d) or (e)
 
 
 I'd say:
 
 1)  does what it does now mod RFC 111 (ie. you can put whitespace in the
terminator, but it doesn't affect anything)

I was assuming that the terminators changed ala RFC 111 whatever happens

 
 2)  does (e).

These are equivalent to my second set of options

 
 3) distribute a collection of dequote() mutations with perl.

As a module presumably

 
 4) mention the s/// tricks in the documentation (POD =~ s/// seems dead)
 

Yes.

  [[there is still the tabs debate however]]
 
 Tabs are easy, don't expand them.  Consider them as a literal
 character.  This assumes that the code author is going to use the same
 keystrokes to indent their here-doc text as the terminator, about as
 safe an assumption as any for tabs.
 
 Maybe I'm being too simplistic, I don't use tabs anymore.
 

Yes you are, the problem comes with mixing editors - some use tabs for
indented material some don't, some reduce files using tabs etc etc.  [I move
between too many editors].  Perl should DWIM.  I think that treating tabs=8
as the default would work for most people, even those who set tabs at other
values as long as they are consistent - a "use tabs 4" could be used by them
if they want to get the same behaviour if they mix tabs and spaces.
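
To make the "tabs=8" suggestion concrete, here is a rough sketch of
measuring indentation with tabs expanded to tab stops. The function name
expand_indent is made up for this example, and its $width argument merely
stands in for the hypothetical "use tabs 4" pragma; no such pragma exists.

```perl
use strict;
use warnings;

# Expand tabs in the leading whitespace of a line to the next multiple
# of $width columns (8 by default). The rest of the line is left alone.
sub expand_indent {
    my ($line, $width) = @_;
    $width ||= 8;
    my ($indent, $rest) = $line =~ /^([ \t]*)(.*)$/s;
    my $col = 0;
    for my $ch (split //, $indent) {
        $col = $ch eq "\t" ? $col + $width - ($col % $width) : $col + 1;
    }
    return (' ' x $col) . $rest;
}

print expand_indent("\tfoo\n");       # one tab becomes 8 spaces
print expand_indent("    \tfoo\n");   # 4 spaces + tab also ends at column 8
print expand_indent("\tfoo\n", 4);    # 4 spaces when the width is 4
```

The second call shows why mixing tabs and spaces is the hard case: two
visually identical indents can be different byte strings.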

Richard



-- 

[EMAIL PROTECTED]




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Nathan Wiger

I'm happy with this solution, it seems to address everyone's needs.

-Nate

Michael G Schwern wrote:
 
 I'd say:
 
 1)  does what it does now mod RFC 111 (ie. you can put whitespace in the
terminator, but it doesn't affect anything)
 
 2)  does (e).



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Bart Lateur

On Thu, 14 Sep 2000 03:11:54 -0400, Michael G Schwern wrote:

The current stumper, which involves problems 1, 2 and 3 is this:

   if( $is_fitting && $is_just ) {
        die <<POEM;
        The old lie
          Dulce et decorum est
          Pro patria mori.
        POEM
   }

I propose that this work out to 

"The old lie\n  Dulce et decorum est\n  Pro patria mori.\n"

and always work out to that, no matter how far left or right the
expression be indented.

I happen to disagree, and here's why. To me, here docs are like *literal
extracts* from text documents that you want to reproduce. *Nothing* is
supposed to be changed about it: the result should be *exactly* what it
is in the here doc, apart from interpolation in double-quotish here
docs.

I very often insert (parts of) text files produced by other people, and
I don't want to be forced to indenting all of it, every single line.

However, the same does not count for the here doc terminator. This one
very often trips me up. Since this may not be randomly indented, I lose
sight of my code indentation, and as a consequence I forget closing
braces for blocks etc. Annoying.

Another problem is the trailing whitespace: invisible, yet extremely
important: there should be none.

Being freed of these two concerns, that boil down to one thing: leading
and trailing whitespace, would be most welcome.

I do not mind having an option of losing some leading spaces or tabs
for here docs. However, I'm already pretty sure that if this is
optional, I won't ever use it. So please, do not force it down my
throat.

-- 
Bart.



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-15 Thread Glenn Linderman

Richard Proctor wrote:

  Maybe I'm being too simplistic, I don't use tabs anymore.
 

 Yes you are, the problem comes with mixing editors - some use tabs for
 indented material some don't, some reduce files using tabs etc etc.  [I move
 between too many editors].  Perl should DWIM.  I think that treating tabs=8
 as the default would work for most people, even those who set tabs at other
 values as long as they are consistent - a "use tabs 4" could be used by them
 if they want to get the same behaviour if they mix tabs and spaces.

Yes, but by being simplistic he eliminates the need to invent "use tabs 4".  I
have a don't care on this issue, but I lean towards Michael's assumption being
valid enough... and if people don't mix tabs and spaces it works for any tab size
setting, even the one true tab size setting of 8 characters.

--
Glenn
=
There  are two kinds of people, those
who finish  what they start,  and  so
on... -- Robert Byrne






Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Glenn Linderman

Amen to the below.  So can we have an RFC 111 (v4) that gets rid of allowing
stuff after the terminator?  Even the ";" afterward seems useless... the ";"
should be at the end of the statement, not the end of the here doc.  The only
improvement to here docs I see in this RFC is to allow whitespace before/after
the here doc terminator.  The rest is handled adequately and consistently today,
and Tom's dequote is adequate to eliminate leading white space... especially
among people who cannot agree that a tab in a file means "mod 8" (which it
does).

Michael G Schwern wrote:

   I can't think of much else I'd want to comment about the end of a
   here-doc than "this is the end of the here-doc" which is about as
   useful as "$i++ # add one to $i".

 There's a big difference.  Every code block ends with a '}'.  Every
 here doc ends with its own custom tag.  Thus to state:

 print <<EOF;

 Four score and seven years ago...

 EOF  # end of print EOF line 23

 can currently be better written as:

 print <<GETTYSBURG_ADDRESS;

 Four score and seven years ago...

 GETTYSBURG_ADDRESS

 The tag itself describes what the text is, similar to the way a
 well-named variable describes what's inside of it and removes the need
 for a descriptive comment.  At a glance one can tell that
 'GETTYSBURG_ADDRESS' closes the here-doc containing the Gettysburg
 Address, without having to maintain a comment.  (I guarantee the line
 number mentioned in the comment will not be maintained.)

 Another reason for wanting to comment the closing of a code block is
 nesting.  Simply searching for the previous '{' will not work.
 Here-docs cannot be nested and do not have this problem.  Simply
 searching backwards for your here-doc tag will always work.

--
Glenn
=
There  are two kinds of people, those
who finish  what they start,  and  so
on... -- Robert Byrne






Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Michael G Schwern

On Wed, Sep 13, 2000 at 11:34:20PM -0700, Glenn Linderman wrote:
 The rest is handled adequately and consistently today, and Tom's
 dequote is adequate to eliminate leading white space... especially
 among people who cannot agree that a tab in a file means "mod 8"
 (which it does).

Damnit, I'm going to continue beating this horse until it stops twitching.


Tom and I had an extensive off-list discussion about this, and here's
about where it left off (hopefully I'll get everything right).

We have three major problems and three proposed solutions:

Problems:
1 Allowing here-docs to be indented without affecting the output.
2 Preserving sub-indentation.
3 Preserving the output of the here-doc regardless of how its
  overall indentation is changed (ie. shifted left and right)

Solutions
1 <<POD =~ s/some_regex//
2 dequote(<<POD)
3 indentation of the end-tag

Each solution has its strengths and weaknesses.  Regexes can handle
problem #1 but only #2 xor #3.  However, they cover a wide variety of
more general problems.  dequote has the same problem.  #1 is fine, but
it can only do #2 xor #3.  Not both.

The current stumper, which involves problems 1, 2 and 3 is this:

   if( $is_fitting && $is_just ) {
        die <<POEM;
        The old lie
          Dulce et decorum est
          Pro patria mori.
        POEM
   }

I propose that this work out to 

"The old lie\n  Dulce et decorum est\n  Pro patria mori.\n"

and always work out to that, no matter how far left or right the
expression be indented.

   { { { { {
                    if( $is_fitting && $is_just ) {
                        die <<POEM;
                        The old lie
                          Dulce et decorum est
                          Pro patria mori.
                        POEM
   } } } } }

Four spaces, two spaces, six spaces.  Makes sense, everything lines
up.  So far I have yet to see a regex or dequote() style proposal
which can accommodate this.


So solution #1 is powerful, solution #2 is simple, solution #3 solves
a set of common problems which the others do not (but doesn't provide
the others' flexibility).  All are orthogonal.  All are fairly simple
and fairly obvious.  Allow all three.


My most common case for needing indented here-docs is this:

{   {   {   {  # I'm nested
if($error) {
warn "So there's this problem with the starboard warp coupling 
and oh shit I just ran off the right margin.";
}
}   }   }   }

Usually I wind up doing this:

{   {   {   {  # I'm nested
if($error) {
warn "So there's this problem with the starboard ".
 "warp coupling and oh shit I just ran off the ".
 "right margin.";
}
}   }   }   }

I'd love it if I could do this instead:

{   {   {   {  # I'm nested
if($error) {
warn <<ERROR =~ s/\n/ /;
So there's this problem with the starboard warp
coupling and hey, now I have lots of room to
pummel you with technobabble!
ERROR
}
}   }   }   }

By combining two of the solutions, my problem is solved.  I can indent
my here-docs and yet keep the output a single line.
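
The joining half of that wish can be had in Perl 5 today, at the cost of a
temporary variable. A minimal sketch, assuming the body stays at the left
margin (current Perl does not strip here-doc indentation) and using /g plus
a lookahead so that only interior newlines are replaced:

```perl
use strict;
use warnings;

# Copy the here-doc into a variable, then turn interior newlines into
# spaces. The lookahead leaves the final newline in place.
(my $error = <<ERROR) =~ s/\n(?=.)/ /g;
So there's this problem with the starboard warp
coupling and hey, now I have lots of room.
ERROR

warn $error;
```

A plain s/\n/ / without /g would replace only the first newline.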

Show me where this fails and I'll shut up about it.


-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
Sometimes these hairstyles are exaggerated beyond the laws of physics
  - Unknown narrator speaking about Anime



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Ariel Scolnicov

Michael G Schwern [EMAIL PROTECTED] writes:

[...]

 I propose that this work out to 
 
 "The old lie\n  Dulce et decorum est\n  Pro patria mori.\n"
 
 and always work out to that, no matter how far left or right the
 expression be indented.
 
 { { { { {
                     if( $is_fitting && $is_just ) {
                         die <<POEM;
                         The old lie
                           Dulce et decorum est
                           Pro patria mori.
                         POEM
 } } } } }
 
 Four spaces, two spaces, six spaces.  Makes sense, everything lines
 up.  So far I have yet to see a regex or dequote() style proposal
 which can accommodate this.

I really like this.

[...]

 Show me where this fails and I'll shut up about it.

Here are 2 problems I can think of.  But please don't "shut up about
it" -- I like the solution, but these need to be sorted out!

1. It requires the perl parser know about indentation.  Of course we
   all know that tabs are 8 characters wide (I myself make a point of
   bludgeoning anyone who says otherwise), but do we really want to
   open this can of worms?

2. Existing practice for here docs will have the contents of the here
   doc on the left margin.  People might want to preserve that.  For
   instance, it makes sense if you're here-docking a bunch of 80 char
   lines.

(2) can be solved, and the ambiguous "no matter how far left or right
the expression be indented" resolved, by saying that indentation of 
the here doc is relative to the terminator (*not* the statement that
launched it).  This might also make slightly better sense when you
have 2 here docs in one line:

print <<FIRST_HERE_DOC; print <<SECOND_HERE_DOC;
This is on the left margin.
 This is indented one char.
FIRST_HERE_DOC
  This is indented one char.
 This is on the left margin.
 SECOND_HERE_DOC

But (1) needs to be resolved (and don't say "use tabs 8"!).

-- 
Ariel Scolnicov|"GCAAGAATTGAACTGTAG"| [EMAIL PROTECTED]
Compugen Ltd.  |Tel: +972-2-5713025 (Jerusalem) \ We recycle all our Hz
72 Pinhas Rosen St.|Tel: +972-3-7658514 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555http://3w.compugen.co.il/~ariels



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Michael G Schwern

I've implemented a prototype of the indented here-doc tag I'm proposing.

http://www.pobox.com/~schwern/src/RFC-Prototype-0.02.tar.gz

It's RFC::Prototype::111, which is probably the wrong number.

I'll have to add <<POD =~ s/// syntax.  Also, if anyone's good with
filters I couldn't quite get the prototype working with
Filter::Util::Call.  I found myself needing to work line-by-line, and
that whole "build up $_" was getting in my way, so I switched to
Filter::Util::Exec and it works, but it makes debugging really hard.


=head1 NAME

RFC::Prototype::111 - Implements Perl 6 RFC 111


=head1 SYNOPSIS

  use RFC::Prototype::111;

  if( $is_fitting && $is_just ) {
      die <<"POEM";
      The old lie
        Dulce et decorum est
        pro patria mori
      POEM
  }


=head1 DESCRIPTION

Two changes.

1. Allows <<POD end tags to be indented.  The amount of space a tag is
   indented is the amount which will be clipped off of each line of
   the here-doc.  Tabs will B<NOT> be expanded.

2. <<POD end tags may now be followed by trailing whitespace
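
The clipping rule in change 1 can be sketched as an ordinary function. This
is not the prototype's code; clip_indent is a made-up name, and the caller
is assumed to pass in the whitespace found in front of the end tag.

```perl
use strict;
use warnings;

# Hypothetical helper: clip the literal string $indent (the whitespace that
# precedes the end tag) from the start of every line. Tabs are not expanded,
# so a tab only ever cancels a tab.
sub clip_indent {
    my ($indent, $text) = @_;
    $text =~ s/^\Q$indent\E//gm;
    return $text;
}

my $body = join '',
    "    The old lie\n",
    "      Dulce et decorum est\n",
    "      Pro patria mori.\n";

my $poem = clip_indent('    ', $body);
print $poem;
```

Lines that do not start with the end tag's exact whitespace are left as
they are in this sketch; a real implementation would have to decide whether
that case is an error.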


-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
When faced with desperate circumstances, we must adapt.
- Seven of Nine



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Richard Proctor



Michael,

I just noticed your post (I am at work).  This is beginning to get there (maybe I
should not have split the original 111).

In the prototype you only cover use of " quotes.

if( ($pre_code, $quote_type, $curr_tag, $post_code) = $_ =~
m/(.*)\<\<(")(\w+)"(.*)/ )

It needs to match (.*)((["'`])(\w+)\2)|(\w+))(.*) or something like that.
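
One way such a pattern could look, offered only as a sketch: it accepts the
three quoted forms and the bare form of the opening tag. The function name
parse_here_doc_start is an assumption for this example, and the pattern does
not try to tell a here-doc from a left shift such as $x<<3.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line that starts a here-doc into its tag,
# its quote character ('' for a bare tag), and the code before and after.
sub parse_here_doc_start {
    my ($line) = @_;
    my ($pre, $quote, $quoted_tag, $bare_tag, $post) =
        $line =~ /^(.*)<<(?:(["'`])(\w+)\2|(\w+))(.*)$/
        or return;
    my $tag = defined $quoted_tag ? $quoted_tag : $bare_tag;
    return ($tag, defined $quote ? $quote : '', $pre, $post);
}

for my $line ('print <<"EOT";', "print <<'EOT';", 'print <<`EOT`;', 'print <<EOT;') {
    my ($tag, $quote, $pre, $post) = parse_here_doc_start($line);
    print "tag=$tag quote=[$quote] pre=[$pre] post=[$post]\n";
}
```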

Richard Proctor







Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Eric Roode

Glenn Linderman wrote:
Amen to the below.  So can we have an RFC 111 (v4) that gets rid of allowing
stuff after the terminator?  Even the ";" afterward seems useless... the ";"
should be at the end of the statement, not the end of the here doc.  The only
improvement to here docs I see in this RFC is to allow whitespace before/after
the here doc terminator.  The rest is handled adequately and consistently 
today,
and Tom's dequote is adequate to eliminate leading white space... especially
among people who cannot agree that a tab in a file means "mod 8" (which it
does).

The semicolon, as you point out, belongs on the statement at the
head of the here doc. The proposal to allow a semicolon at the end
is mere window-dressing. Aesthetics only. Personally, I have used
editors and pretty-printers that could handle here-docs except that
they thought that the "statement" without a semicolon meant that
all subsequent lines should be indented. I have had to resort to:

$foo = <<HERE;
...
HERE
;
other_statements();

Yes, the obvious solution is to get a better editor/pretty printer.
Not always an option. 

But, as I said, it's mere aesthetics. Perhaps not worth changing the
language to accommodate the minority of people who have inferior tools.

But why not allow a comment?  Can't think of a use for one?
Michael Schwern, whom you quote, points out that the here doc tag
ought to be self-documenting, and he is 100% correct. But comments
are used for more than documentation. Ever write a note to yourself
or to the next programmer in a comment?

$foo = <<TABLE_OF_GOODS;
...
TABLE_OF_GOODS # must combine with TABLE_OF_SUPPLIES, below, someday

Sure, you can put that comment in a different place, with little harm.
But as long as we're proposing allowing whitespace before/after the
doc tag, comments are a Good Thing, imho.

 --
 Eric J. Roode,  [EMAIL PROTECTED]   print  scalar  reverse  sort
 Senior Software Engineer'tona ', 'reh', 'ekca', 'lre',
 Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Eric Roode

Ariel Scolnicov wrote:
1. It requires the perl parser know about indentation.  Of course we
   all know that tabs are 8 characters wide (I myself make a point of
   bludgeoning anyone who says otherwise), but do we really want to
   open this can of worms?

Not so fast with those 8-column tabs. (But, I do NOT want to start
a religious war here).

At my company, we're required to have one tab stop, no spaces, between
indentation levels. Boss likes 8 columns, which to my mind is way
too much -- it doesn't take too many levels for your code to march
off the right side of the screen. I prefer four columns.

No problem -- I make my tab settings four columns. Which, for purposes
of here docs and this proposal, works just as well.

The REAL sinners are those who mix spaces and tabs. THAT's evil. :-)

 --
 Eric J. Roode,  [EMAIL PROTECTED]   print  scalar  reverse  sort
 Senior Software Engineer'tona ', 'reh', 'ekca', 'lre',
 Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Richard Proctor



In Michael Schwern's prototype, expansion to treat both semicolons and comments
at the end tag is possible by changing

 /^(\s*)$curr_tag\s*$/

to

/^(\s*)$curr_tag\s*(;\s*)?(#.*)?$/
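
The extended pattern can be tried on a few candidate lines. A small sketch:
the tag value and the sample lines are invented here, and \Q...\E is added
as a precaution that the pattern above does not include.

```perl
use strict;
use warnings;

my $curr_tag = 'POEM';   # example tag; the prototype takes it from the opening line
my $end_tag  = qr/^(\s*)\Q$curr_tag\E\s*(;\s*)?(#.*)?$/;

my @samples = ('POEM', '    POEM   ', '  POEM;', "\tPOEM ; # end of poem",
               'POEMS', 'POEM extra');

for my $line (@samples) {
    my $verdict = $line =~ $end_tag ? 'terminator' : 'body text';
    print "[$line] $verdict\n";
}
```

The first four samples are accepted as terminators; "POEMS" and
"POEM extra" are treated as ordinary body text.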

Richard





Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Nathan Wiger

 Show me where this fails and I'll shut up about it.

Actually, to me this thread underscores how broken here docs are
themselves. We already have q//, qq//, and qx// which duplicate their
functions far more flexibly. Question: Do we really need here docs?
Before you scream "Bloody murder", please read on...

 The current stumper, which involves problems 1, 2 and 3 is this:
 
    if( $is_fitting && $is_just ) {
         die <<POEM;
         The old lie
           Dulce et decorum est
           Pro patria mori.
         POEM
    }
 
 I propose that this work out to
 
 "The old lie\n  Dulce et decorum est\n  Pro patria mori.\n"

Let's look at what happens if we ignore here docs and instead use qq//
instead:

   if( $is_fitting && $is_just ) {
        die qq/
        The old lie
          Dulce et decorum est
          Pro patria mori.
        /;
   }

Solves problem #1, indented terminator, except that it adds two newlines
(more later). However, it leaves 2 and 3. Let's try adding in a regexp:

   if( $is_fitting && $is_just ) {
        (my $mesg = qq/
        The old lie
          Dulce et decorum est
          Pro patria mori.
        /) =~ s/\s{8}(.*?\n)/$1/g;
        die $mesg;
   }
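
A variant of the same idea, given as a sketch rather than as a correction of
the above: anchoring the pattern with ^ and /m and restricting it to spaces
and tabs keeps it from consuming the newline that qq// leaves at the front,
which is then removed in a separate step. The 8-column indent is an
assumption that has to match the source layout.

```perl
use strict;
use warnings;

# Strip exactly 8 columns of spaces/tabs from the start of each line,
# then drop the newline that follows the opening qq/ delimiter.
(my $mesg = qq/
        The old lie
          Dulce et decorum est
          Pro patria mori.
        /) =~ s/^[ \t]{8}//gm;
$mesg =~ s/\A\n//;

print $mesg;
```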

But the dang =~ operator makes that ugly and hard to read, and requires a
$mesg variable. So let's try RFC 164's approach to patterns then:

   if( $is_fitting && $is_just ) {
        die subst /\s{8}(.*?\n)/$1/g, qq/
        The old lie
          Dulce et decorum est
          Pro patria mori.
        /;
   }

Seems to work for me (and yes I'm working on a prototype of RFC 164's
functions).

I think we're trying to jam a lot of stuff into here docs that maybe
shouldn't be jammed in, especially since Perl already has the q//
alternatives that are much more flexible. Don't get me wrong, I like
here docs and all, but I wonder if it isn't time for them to go?

I think I'd actually much rather see a new qh// "quoted here doc"
operator that solves these problems than trying to jam them all into the
existing shell-like syntax, which is a leftover oddity, really.

-Nate



Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Peter Scott

At 10:52 AM 9/14/00 -0700, Nathan Wiger wrote:

Actually, to me this thread underscores how broken here docs are
themselves. We already have q//, qq//, and qx// which duplicate their
functions far more flexibly. Question: Do we really need here docs?

I have thought this before, but I think the answer is yes, for the 
circumstance of when the quoted material does or may contain the terminator 
character.  No matter what you pick, you still only have one character as a 
terminator, and if you're quoting something big and sufficiently general 
(think Perl code), then it's a pain to check it each time to see if you've 
stuck in the terminator by mistake.

At any rate, this is what I tell my students when they realize that "..." 
can contain newlines and start to wonder about the raison d'etre of here 
documents.

I think I'd actually much rather see a new qh// "quoted here doc"
operator that solves these problems than trying to jam them all into the
existing shell-like syntax, which is a leftover oddity, really.

--
Peter Scott
Pacific Systems Design Technologies




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Richard Proctor

This whole debate has got silly.

RFC 111 V1 covered both the whitespace on the terminator and the 
indenting - there was a lot of debate that this was two things - more were
in favour of the terminator and there was more debate on the indenting.
Therefore I split this into two RFCs leaving RFC111 just dealing with the
terminator.

RFC 111V3 represents what I believe was rough consensus (a la IETF meaning)
on the terminator issue.  (The debate had been quiet for several weeks)

Michael Schwern has gone as far as doing a prototype that almost covers
it and with the few things I have posted earlier today could be extended
to handle all cases.

Next comes the issue of the removing whitespace on the left of the content.
There are several possibilities, these are now mostly in RFC 162.  These are:

1) There is no processing of the input (current state)

2) All whitespace to the left is removed (my original idea)

3) Whitespace equivalent to the first line is removed (not a good solution)

4) Whitespace equivalent to the terminator is removed if possible (a la
Michael's prototype) - this could be workable.

5) Whitespace equivalent to the smallest amount of the content is removed
(current RFC 162 preferred solution)
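
Option 5 is the least obvious of the list, so here is a rough sketch of it
in plain Perl. The name dequote_min is made up for this example, and the
sketch counts a tab as a single character, which sidesteps the tab question
rather than answering it.

```perl
use strict;
use warnings;

# Sketch of option 5: find the smallest indentation used by any non-blank
# line and remove that many leading whitespace characters from every line.
sub dequote_min {
    my ($text) = @_;
    my $min;
    for my $indent ($text =~ /^([ \t]*)\S/gm) {
        my $len = length $indent;
        $min = $len if !defined $min || $len < $min;
    }
    return $text unless $min;
    $text =~ s/^[ \t]{$min}//gm;
    return $text;
}

print dequote_min(
    "        The old lie\n" .
    "          Dulce et decorum est\n" .
    "          Pro patria mori.\n"
);
```

Unlike option 4, the result here does not depend on where the terminator
sits, so text that is meant to keep some leading whitespace cannot be
expressed.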

When measuring whitespace how does the system treat tabs?  (be realistic
and don't FLAME)

So where do we go from here?

A) Do we want one syntax or two?  (HERE and THERE)?  I would prefer
one but would accept two.

B) Is there rough consensus on the terminator issue at least?

C) Which of the 5 cases of handling the whitespace in the content might be
agreed upon?

D) Decide how to treat tabs in the indenting.  (Suggest =8 spaces plus
allow pragma to override)

E) If the answer to A) is one and we have B) and we agree on 4) or 5) for
the whitespace and some treatment of tabs, then I should cancel RFC 162 and
just put everything back into RFC 111 (including Michael's prototype) and let's
try and freeze it and move on to other things.

Peace!

Richard

-- 

[EMAIL PROTECTED]




Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Eric Roode

Nathan Wiger wrote:
Actually, to me this thread underscores how broken here docs are
themselves. We already have q//, qq//, and qx// which duplicate their
functions far more flexibly. Question: Do we really need here docs?

Yes.

Try generating lots of HTML, Javascript, Postscript, or other
languages without here docs. Example:

print <<CODE_SNIPPET;
// this is a javascript function
function valid(s) 
{
   ...
   if (var2 = '"'))
   {
// rest of code to be generated later.
CODE_SNIPPET

There's a chunk of code for which '', "", qq//, qq<>, qq{}, are all
inadequate. This kind of code happens A LOT in web programming.

I do not want to have to examine all of my generated strings to see
what quoting character I can use this time around, and I do not want
to risk breaking my program whenever I change the text in a code
snippet ("oops! I added a bracket. gotta change the quoting character!").

 --
 Eric J. Roode,  [EMAIL PROTECTED]   print  scalar  reverse  sort
 Senior Software Engineer'tona ', 'reh', 'ekca', 'lre',
 Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Eric Roode

Richard Proctor made some excellent comments, and asked:
When measuring whitespace how does the system treat tabs?  (be realistic
and don't FLAME)

I suggest that there be NO tab/space conversion. Not 8 columns, not
4 columns, nothing. If the here doc terminator has four tabs preceding
it, then four tabs should be stripped from each of the lines in the
string. If the terminator has one tab and four spaces, then one tab
and four spaces should be stripped from each of the lines.

Mixing spaces and tabs is basically evil, but if you're consistent
about it, it's your own rope for you to trip over or hang yourself.
I set my tab stops to four columns; at least one of my coworkers
sets his tab stops to eight columns. We edit the same code with no
problems.
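The "no conversion" rule amounts to treating the terminator's indentation as a literal string. A minimal Perl 5 sketch, assuming the terminator's leading whitespace has already been captured (the function name strip_literal_indent is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: $indent is exactly the whitespace that precedes the
# terminator.  It is matched as a literal string, so a tab only ever
# matches a tab and a space only ever matches a space.
sub strip_literal_indent {
    my ($indent, $text) = @_;
    $text =~ s/^\Q$indent\E//mg;
    return $text;
}

# Terminator indented by one tab and four spaces.
my $body = "\t    one\n\t      two\n        three\n";
print strip_literal_indent("\t    ", $body);
```

The third line starts with eight spaces rather than a tab and four spaces, so it is left alone; the result never depends on anyone's tab-stop setting.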

 --
 Eric J. Roode,  [EMAIL PROTECTED]   print  scalar  reverse  sort
 Senior Software Engineer'tona ', 'reh', 'ekca', 'lre',
 Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Nathan Wiger

Eric Roode wrote:
 
 I suggest that there be NO tab/space conversion.

I also suggest that no whitespace stripping/appending/etc/etc be done at
all. If I write:

  if ( $its_all_good ) {
 print <<EOF;
 Thank goodness this text is centered!
 EOF
  }

That should print out:

 Thank goodness this text is centered!

Without forcing me to left-justify my EOF marker. Tying space-stripping
to the placement of EOF is a Bad Idea, IMO. Do this if you want:

  if ( $its_all_good ) {
 (my $s = <<EOF) =~ s/\s{8}(.*?\n)/$1/g; print $s;
 Thank goodness this text isn't centered!
 EOF
  }

But this shouldn't be implicit in the language.

-Nate



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Eric Roode

Nathan Wiger wrote:

I also suggest that no whitespace stripping/appending/etc/etc be done at
all. If I write:
[...deletia...]
But this shouldn't be implicit in the language.

That's a good argument for having a separate operator for these
"enhanced here docs", say , rather than chucking the whole idea
out the window.

 --
 Eric J. Roode,  [EMAIL PROTECTED]   print  scalar  reverse  sort
 Senior Software Engineer'tona ', 'reh', 'ekca', 'lre',
 Myxa Corporation'.r', 'h ', 'uj', 'p ', 'ts';




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Glenn Linderman

Michael G Schwern wrote:

 On Wed, Sep 13, 2000 at 11:34:20PM -0700, Glenn Linderman wrote:
  The rest is handled adequately and consistently today, and Tom's
  dequote is adequate to eliminate leading white space... especially
  among people who cannot agree that a tab in a file means "mod 8"
  (which it does).

 Damnit, I'm going to continue beating this horse until it stops twitching.

That's fine, but it could have been done politely.

I'm all for solving problems, and this message attempts to specify 3
problems, but it needs more specification.  You describe three problems,
but it is not clear what the problems are, exactly, because the words you
used to describe them must not describe the problem universally.  Let me
attempt to describe the problems more completely, and when I diverge onto
the wrong problem, you can clarify it-- and then maybe we'll be
communicating.  I think you've also omitted some of the problems-- maybe
they shouldn't be classified as major, but since they are related, and get
in the way of some of the possible solutions, I think we should mention
them all, so I've continued numbering.

 We have three major problems and three proposed solutions:

 Problems:
 1 Allowing here-docs to be indented without affecting the output.

This is the problem that currently here-doc content must be relative to the
left margin, so doesn't look nice with respect to nearby indented code.

 2 Preserving sub-indentation.

This is not _currently_ a problem.  Perl _currently_ preserves indentation
in here-docs.  It is not until some other "solution" gets in the way that
this problem is a problem.  If problem 1 were solved by independently
eliminating all leading white space from each line of the HERE document,
then this problem suddenly appears.  So what this "problem" is trying to
state is that problem #1 cannot be solved (using your "current stumper"
example below) by

  die <<POEM =~ s/^\s*//m;

because that affects the relative horizontal relationships between
characters on different lines.  So this problem only needs to be avoided
when solving other problems, rather than being a problem today.
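The difference is easy to see in Perl 5 with an ordinary string standing in for the here-doc. This is a sketch only: it uses [ \t]* with /mg (so every line is processed and no newlines are eaten), and the fixed width of four spaces is an assumption made for the example.

```perl
use strict;
use warnings;

my $poem = "    The old lie\n      Dulce et decorum est\n      Pro patria mori.\n";

# Stripping ALL leading whitespace from every line flattens the poem:
(my $flat = $poem) =~ s/^[ \t]*//mg;

# Stripping only the common four-space margin keeps the sub-indentation:
(my $kept = $poem) =~ s/^ {4}//mg;

print "flattened:\n$flat\n";
print "margin removed:\n$kept";
```

The second form is the one that preserves the horizontal relationships between lines, at the cost of hard-coding the margin width.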

 3 Preserving the output of the here-doc regardless of how its
   overall indentation is changed (ie. shifted left and right)

This problem appears to be attempting to address what happens when
indenting large blocks of code, with something equivalent to

 $code =~ s/^/^   /m;  # N.B. that's 3 spaces after the 2nd ^ character

The effect of the indentation is desirable, but the current semantics of
here documents result in two problems: your number 3, which actually
subsumes your problem number 1, that the text result of the here document
is different than it was before the indentation took place, and also the
first additional problem below.

Additional problems:

4 An indented here-doc terminator is not recognized, because perl6 requires
the here-doc terminator to be at the left boundary.

5 Because white space is not visible, white space after the here-doc
terminator, which perl6 requires must be followed by end-of-line, can cause
apparent here-doc terminators to not be recognized.

6 Because indenting a tab character with non-tab characters changes its
starting point, its apparent size also changes, thus affecting the
horizontal relationship between characters on different lines of a here-doc.

7 Because people don't all subscribe to the universal definition of the
ASCII tab character as meaning proceed to the next (mod 8) horizontal
boundary, the appearance of here-docs containing tabs in various
environments differs in the horizontal relationship between characters on
different lines of a here-doc.  This can be particularly significant if
there are different numbers of leading tabs on a line, or a mixture of tabs
and spaces at the front of some lines, or tabs found after non-white space
characters.
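Problems 6 and 7 can be made concrete with a small tab expander. This is a sketch under the "next multiple of the tab width" reading described above; expand_tabs is a hypothetical helper written for this example (the core Text::Tabs module offers a similar expand function).

```perl
use strict;
use warnings;

# Hypothetical helper: replace each tab with enough spaces to reach the
# next column that is a multiple of $width (default 8).
sub expand_tabs {
    my ($line, $width) = @_;
    $width ||= 8;
    1 while $line =~ s/^([^\t]*)\t/$1 . ' ' x ($width - length($1) % $width)/e;
    return $line;
}

# Problem 6: indenting by three spaces moves "a" but not "b",
# because the tab shrinks from 7 columns to 4.
print expand_tabs("a\tb"),    "\n";
print expand_tabs("   a\tb"), "\n";

# Problem 7: the same bytes look different with 4-column tab stops.
print expand_tabs("a\tb", 4), "\n";
```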

 Solutions
 1 <<POD =~ s/some_regex//
 2 dequote(<<POD)
 3 indentation of the end-tag

 Each solution has their strengths and weaknesses.  Regexes can handle
 problem #1 but only #2 xor #3.  However, they cover a wide variety of
 more general problems.  dequote has the same problem.  #1 is fine, but
 it can only do #2 xor #3.  Not both.

Agreed that there is unlikely to be a single solution that solves all the
problems.  So can we look at solutions to each of the problems, and then
attempt to pick a set of solutions to make available in perl6 that covers
the problem space?  Before I do that, let's analyze the current stumper in
terms of the problems above, to make sure we are talking about the same
problems.

 The current stumper, which involves problems 1, 2 and 3 is this:

if( $is_fitting && $is_just ) {
 die <<POEM;
 The old lie
   Dulce et decorum est
   Pro patria mori.
 POEM
}

 I propose that this work out to

 "The old lie\n  Dulce et decorum est\n  Pro patria mori.\n"

Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Bart Lateur

On Thu, 14 Sep 2000 10:52:16 -0700, Nathan Wiger wrote:

We already have q//, qq//, and qx// which duplicate their
functions far more flexibly. Question: Do we really need here docs?

With your above functions, you always need to be able to escape the
string end delimiter. Therefore, you will always have to escape
backslashes.

You don't need to escape backslashes, or anything else, in a
single-quoted here-doc.

Here-docs are extremely handy if you have to incorporate text from an
external file, which perl is supposed to print out verbatim.

Their disadvantage is that they'll always end with a newline.
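Both points can be checked in current Perl 5. A small sketch; the sample text and variable names are invented for the example:

```perl
use strict;
use warnings;

# Single-quoted here-doc: backslashes, quotes and $ are all taken verbatim.
my $raw = <<'END';
path C:\temp\new costs $5 "quoted" 'too'
END

# The value always ends with a newline; chomp removes it when unwanted.
chomp(my $no_newline = $raw);

print $raw;
print "[$no_newline]\n";
```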

-- 
Bart.



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Michael G Schwern

On Thu, Sep 14, 2000 at 11:49:18AM -0700, Glenn Linderman wrote:
 I'm all for solving problems, and this message attempts to specify 3
 problems, but it needs more specification.  You describe three
 problems, but it is not clear what the problems are

Since we've been charging back and forth over this ground like a troop
of doughboys over No Man's Land for the past month, I figured everyone
knew the problem and proposed solutions.  Your review accurately lays
everything out.


{ { { { {
  if( $is_fitting && $is_just ) {
 die dequote_like('!', <<POEM);
 !The old lie
 !  Dulce et decorum est
 !  Pro patria mori.
 POEM
  } # this } had been omitted
} } } } }

Things like this have come up, and to my eyes and fingers it's
unacceptable.  Some people like the explicit demarcation of the left
boundary; I find it ugly and don't like the extra typing.  It doesn't
win me much over:

die 
'The old lie'.
'  Dulce et decorum est'.
'  Pro patria mori.';

I'd prefer if here-docs just DWIM.

So we may want to add Yet Another problem.  I forget what number you
got up to, but it's basically "You shouldn't have to add anything but
whitespace to the here-doc for indenting".

An additional problem with dequote() style solutions is they are not
as efficient.  <<DOC =~ s/// and the terminator indentation can both
be applied at compile time and deparse the whole mess into a simple
string (as the prototype does), while the dequote() routine must be
run over and over again at run-time.  This can get nasty in hot loops.

#!/usr/bin/perl -w

use strict;
use Benchmark;

sub dequote_like {
  local $_ = shift;
  my ($leader);  # common white space and common leading string
  if (/^\s*(?:([^\w\s]+).*\n)(?:\s*\1.*\n)+$/) {
$leader = quotemeta($1);
  } else {
$leader = '';
  }
  s/^\s*$leader//gm;
  return $_;
}

my $foo;
timethese(shift || -3,
  {
   dequote => sub {
$foo = dequote_like('!', <<POEM);
!The old lie
!  Dulce et decorum est
!  Pro patria mori.
POEM
   },
   terminator => sub {
   use RFC::Prototype::111;
   $foo = <<"POEM";
   The old lie
     Dulce et decorum est
     Pro patria mori.
   POEM
   },
  });

Benchmark: running dequote, terminator, each for at least 3 CPU seconds...
   dequote:  2 wallclock secs ( 3.00 usr +  0.01 sys =  3.01 CPU) @ 39857.81/s (n=119972)
terminator:  3 wallclock secs ( 3.00 usr +  0.02 sys =  3.02 CPU) @ 268209.93/s (n=809994)

dequote() comes out nearly seven times slower than the terminator
approach (which is basically dequote() vs a plain string).

So that's another problem to add to the list.  "here-docs should be no
slower than the equivalent string, indented or otherwise"


 The syntax for  <<POEM =~ s/regex/subst/;
 
 generally returns 1, and introducing a special case to make it
 return the string if the left hand side is a here-doc seems to be a
 pointless inconsistency.

I think it's considered closer to the current trick of doing:

print ($var = <<POEM) =~ s/regex/subst/;  # or something like that

Another suggestion was <<POEM =~ m/re(ge)x/.  The match would be run
over each line and $1 used to generate the here-doc.
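For illustration, the per-line match idea can be emulated in Perl 5 on an ordinary string. This is a sketch only: filter_lines is a hypothetical name, and passing non-matching lines through unchanged is an assumption, since the suggestion does not say what should happen to them.

```perl
use strict;
use warnings;

# Hypothetical helper: run $re against each line and keep $1.
# Lines that do not match are kept unchanged (an assumption).
sub filter_lines {
    my ($re, $text) = @_;
    return join '', map { /$re/ ? $1 : $_ } split /^/, $text;
}

my $doc = "    !The old lie\n    !  Dulce et decorum est\n    !  Pro patria mori.\n";
print filter_lines(qr/^\s*!(.*\n)/, $doc);
```

Here the capture group plays the role of the "(ge)" in the suggested syntax: whatever it captures becomes the line's content.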

Honestly, I'm not really the one who should be evangelizing this
technique.


 but these subs [dequote] work in perl 5 today, so don't really need
 to be part of the RFC

They most definitely do.  If we're going to propose them as a solution
to the indented here-doc problem, it would be best to distribute a
collection of commonly used ones as a module with perl.

-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
slick and shiny crust
over my hairy anus
constipation sucks
-- Ken Flagg



Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Glenn Linderman

Nathan Wiger wrote:

 Solves problem #1, indented terminator, except that it adds two newlines
 (more later).

I never found anything later about these extra newlines... so if this idea
has merit, it needs to be finished.

 However, it leaves 2 and 3. Let's try adding in a regexp:

if( $is_fitting && $is_just ) {
  (my $mesg = qq/
  The old lie
Dulce et decorum est
Pro patria mori.
  /) =~ s/\s{8}(.*?\n)/$1/g;
  die $mesg;
}

I think $mesg winds up with the value of "1" the way you've coded it.  You
cured that issue with the RFC 164 syntax for subst, of course, but it could
be cured without that, but does require a temp var.

 I think we're trying to jam a lot of stuff into here docs that maybe
 shouldn't be jammed in

Yes, all we need is to recognize the terminator when embedded in white space
on its line, and the rest can be handled with "here doc postprocessing
functions".  Per my somewhat longer reply to Michael Schwern.

I agree with need for a multiple character termination sequence for easy to
write here docs.

--
Glenn
=
There  are two kinds of people, those
who finish  what they start,  and  so
on... -- Robert Byrne






Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Michael G Schwern

On Thu, Sep 14, 2000 at 10:52:16AM -0700, Nathan Wiger wrote:
 Before you scream "Bloody murder", please read on...

I'll wait patiently for the end...


if( $is_fitting && $is_just ) {
  die subst /\s{8}(.*?\n)/$1/g, qq/
  The old lie
Dulce et decorum est
Pro patria mori.
  /;
}
 
 Seems to work for me (and yes I'm working on a prototype of RFC 164's
 functions).

No, it still has all the problems of any other regex-based solution.
If you shift the code right or left, it breaks (due to the \s{8}) and
you're back to counting whitespace again.  And as Glenn pointed out,
what about that leading newline?

Can I scream now?

-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
MORONS!



Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Glenn Linderman

Glenn Linderman wrote:

 I think $mesg winds up with the value of "1" the way you've coded it.

Sorry, I missed the placement of the ().  $mesg is fine.
--
Glenn
=
There  are two kinds of people, those
who finish  what they start,  and  so
on... -- Robert Byrne





Re: Drop here docs altogether? (was Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs))

2000-09-14 Thread Nathan Wiger

Michael G Schwern wrote:
 
 No, it still has all the problems of any other regex-based solution.
 If you shift the code right or left, it breaks (due to the \s{8}) and
 you're back to counting whitespace again.

Y'know, I pointed out before why I think this is a superfluous issue.
You have to either change your regexp, or change the indentation of your
here docs terminator when you move your code around. And counting
whitespace is not so hard to justify breaking this:

if ( $its_all_good ) {
 print <<EOF;
 Thank goodness this text is centered!
  I'd really hate for it to
   left-shift on me.
 EOF
}

This should print out the text as shown verbatim. If you want
reformatting of any kind, that's what regex's are for. The above is far
more flexible, and your problem already has several other solutions,
which you have yourself noted.

Plus how to address the whole can of worms with tabs - spaces, 4 or 8,
trailing too, blank line stripping? Blech. regex.

 And as Glenn pointed out, what about that leading newline?

Handled by the regexp, actually (yep I tested it).

 Can I scream now?

Not yet, but I might! :-)

-Nate



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Michael G Schwern

On Thu, Sep 14, 2000 at 02:51:14PM -0700, Glenn Linderman wrote:
 Michael G Schwern wrote:
 Well, OK, so now we're talking shades of opinion.  You'd agree it
 works, though, and quite effectively.  But you'd disagree about its
 aesthetics, and its performance.  The former is much less
 interesting to me than the latter.

Here-docs are all about aesthetics.  Otherwise, we'd just be using
regular strings.


 That's fair, except that they aren't equivalent: you'd need
 
die
'The old lie'."\n".
'  Dulce et decorum est'."\n".
'  Pro patria mori.'."\n";

Just to be silly...

die join "\n",
'The old lie',
'  Dulce et decorum est',
'  Pro patria mori.','';

 Which is somewhat worse, compared to the here doc, even with "!" or
 other leading demarcation of choice (your choice, is, of course,
 none).

They're all yicky.


  I'd prefer if here-docs just DWIM.

 Yes, but... what do you mean vs. what do others mean, and all these
 problems

Others can continue to put the here-doc tag flush left if they don't
want this behavior.  I'd like to keep it clear that I consider all the
proposals orthogonal, each solving a different (yet often overlapping)
set of problems.


 This leads me down another path: wouldn't it be nice to have a
 function to interpolate a string on demand?

Whoa!  Hey, yes, great idea!  Not so much for this problem, but I can
definitely see a need for anyone that's writing any sort of templating
system.


 Then you could hoist the here-doc processing above out of the loop,
 and still get the effects of interpolation inside the loop, which
 would make the performance of here-doc postprocessing much less
 critical... but this means defining variables to hold the
 intermediate results, and moving the here-doc to a different
 location, which might not be as friendly to the understanding of the
 script.

Right.  Moving the text away from the point where it is used has
maintenance problems.
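A rough idea of what "interpolate on demand" could look like in Perl 5, as a sketch only: interpolate is a hypothetical function, it handles nothing beyond simple $name scalars looked up in a hash, and it is not the built-in the quoted text is wishing for.

```perl
use strict;
use warnings;

# Hypothetical helper: substitute $name placeholders from %vars at call
# time.  Unknown names are left in place untouched.
sub interpolate {
    my ($template, %vars) = @_;
    $template =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : "\$$1"/ge;
    return $template;
}

# The template is built once, outside the loop...
my $template = 'Dear $name, you owe $amount.' . "\n";

# ...and filled in on each iteration.
for my $row ([ 'Alice', 10 ], [ 'Bob', 25 ]) {
    print interpolate($template, name => $row->[0], amount => $row->[1]);
}
```

This shows the trade-off being discussed: the text is processed once, but it now lives away from the point where it is used.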


PS  Do you use 132 columns to write mail?

-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
Sometimes these hairstyles are exaggerated beyond the laws of physics
  - Unknown narrator speaking about Anime



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-14 Thread Nathan Wiger

Michael G Schwern wrote:
 
   I'd prefer if here-docs just DWIM.
 
  Yes, but... what do you mean vs. what do others mean, and all these
  problems
 
 Others can continue to put the here-doc tag flush left if they don't
 want this behavior.

See, this is just too inflexible. The main complaint that I've heard has
been "You can't have leading or trailing whitespace around your
terminator". This is a very common error made by everyone, and *this* is
where Perl should DWIM.

The main complaint has *not* been "Man, I wish that indenting my
terminator could tell Perl to automatically strip off that much leading
whitespace", which is what you're purporting it to be.

If we want to add this feature, which does not solve the existing
problem - and there is a problem! - then I support the new "autostrip
" operator which does this.

However, this shouldn't be forced into the existing << operator, since
it prevents us from fixing a very important and annoying problem with
current here docs.

I don't mind disagreeing on a given issue, but this issue is one where
everyone has enough brains on this list to resolve reasonably. I propose
we all step back, recognize the difference between fixing problems and
adding new features, and make it so the latter doesn't prevent the
former.

-Nate



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-06 Thread H . Merijn Brand

On 4 Sep 2000 21:32:00 -, Perl6 RFC Librarian [EMAIL PROTECTED] wrote:
 This and other RFCs are available on the web at
   http://dev.perl.org/rfc/
 
 =head1 TITLE
 
 Here Docs Terminators (Was Whitespace and Here Docs)
[...]
 =head1 IMPLENTATION

Intentional? It's either 'IMPLANTATION', which is something that has to be
done with Damian's brain into the perl6-core, so every operator is DWIM, or
'IMPLEMENTATION', something you seem to be describing here.

Just nitpicking.

-- 
H.Merijn Brand   Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)
using perl5.005.03, 5.6.0  516 on HP-UX 10.20, HP-UX 11.00, AIX 4.2, AIX 4.3,
 DEC OSF/1 4.0 and WinNT 4.0 SP-6a,  often with Tk800.022 and/or DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/




Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-04 Thread Ariel Scolnicov


I think it should be made explicit what happens if the here doc
terminator itself contains comment characters or semicolons.  This is
my suggestion:

The here doc terminator must match as a string (that is,
C<m/^\s*\Q$term\E\s*(?:\#|;\s*$)/> should match the line, where $term is
the desired terminator).

Otherwise the behaviour of C<print <<'END#17'> is unclear.  This issue 
was raised during previous discussion of the RFC.
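The suggested pattern can be tried out directly in Perl 5. A small sketch: the wrapper name is_terminator_line is hypothetical, and the pattern is used exactly as quoted above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern quoted above.
sub is_terminator_line {
    my ($term, $line) = @_;
    return $line =~ m/^\s*\Q$term\E\s*(?:\#|;\s*$)/ ? 1 : 0;
}

for my $case ([ 'EOF', "  EOF;\n" ], [ 'EOF', "EOF # done\n" ],
              [ 'EOF', "EOF\n" ],    [ 'END', "END#17\n" ],
              [ 'END#17', "END#17\n" ]) {
    my ($term, $line) = @$case;
    chomp(my $shown = $line);
    printf "%-8s %-12s %s\n", $term, "'$shown'", is_terminator_line($term, $line);
}
```

Two things stand out when running it. As quoted, the pattern requires a trailing comment or semicolon, so a bare terminator line does not match; presumably an end-of-line alternative would be needed as well. And the line END#17 matches terminator END (read as "END" plus a comment) but not terminator END#17, which is the ambiguity raised here.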
-- 
Ariel Scolnicov|"GCAAGAATTGAACTGTAG"| [EMAIL PROTECTED]
Compugen Ltd.  |Tel: +972-2-5713025 (Jerusalem) \ We recycle all our Hz
72 Pinhas Rosen St.|Tel: +972-3-7658514 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555http://3w.compugen.co.il/~ariels



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-04 Thread Michael G Schwern

On Mon, Sep 04, 2000 at 09:32:00PM -, Perl6 RFC Librarian wrote:
 Perl6 should ignore any whitespace before the terminator of a heredoc on any
 line. 

Good.  I don't see anything wrong with this.

***BRAIN STORM!***

RFC 162 (http://dev.perl.org/rfc/162.html) wanted to allow indented
here-docs, but had the problem of how to figure out what was code
indentation and what was deliberate text indentation.  For example:

if( $payment  $they_owe ) {
print MAIL LETTER;
Attention delinquent scum,

We have noticed that you still owe us money
which we graciously loaned you in your hour of need.
Rocco will be by shortly to collect your kneecaps.
LETTER
}

The RFC proposes a  operator which would strip whitespace off the
front of the here-doc.  Problem is preserving indentation.  We can
merge the two.

if( $payment  $they_owe ) {
print MAIL LETTER;
Attention delinquent scum,

We have noticed that you still owe us money
which we graciously loaned you in your hour of need.
Rocco will be by shortly to collect your kneecaps.
LETTER
}

 will notice that the closing 'LETTER' tag is indented and strip
that amount of whitespace off the front.  No regexes, no counting
spaces (you just line it up with the left margin of the text) and it
does what you mean.

In this case...

print FOO;
text
ooops
more text
FOO

Perl will issue a warning because the leftmost margin of the here-doc
text is to the left of the closing tag.
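The rule, including the warning case, can be approximated in Perl 5 as a post-processing function. This is a sketch, not the proposed parser change: strip_to_terminator is a hypothetical name, the terminator's indentation is passed in by hand, and stripping whatever whitespace an offending line does have is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: $indent is the whitespace in front of the closing
# tag.  Lines that start further left trigger a warning.
sub strip_to_terminator {
    my ($indent, $text) = @_;
    my @out;
    for my $line (split /^/, $text) {
        if ($line =~ s/^\Q$indent\E//) {
            # normal case: the terminator's indentation was removed
        }
        elsif ($line =~ /\S/) {
            warn "here-doc text starts left of the terminator: $line";
            $line =~ s/^[ \t]+//;    # assumption: strip what is there
        }
        push @out, $line;            # blank lines pass through untouched
    }
    return join '', @out;
}

print strip_to_terminator("    ", "    text\n  ooops\n    more text\n");
```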

This feels like it should work, so something must be wrong.  Shoot it
full of holes, boys.


 Further it should ignore any whitespace ";"s (and comments) that
 follow the terminator.  

I've seen no reason to allow a semicolon other than "why not".  It's a
special case which adds no syntax sugar and simply represents
unnecessary orthogonality.  It just lets you be sloppy.

Comments, too.  Since the here-doc tag is free-form, you can make it
anything you'd like.  Including a comment!

For example, why would this:

print <<EOF;
Foo
EOF  # This is the end of the here-doc, my friends.

be any better than this?

print <<THIS_IS_THE_END;
Foo
THIS_IS_THE_END

I can't think of much else I'd want to comment about the end of a
here-doc than "this is the end of the here-doc" which is about as
useful as "$i++ # add one to $i".


 Perl should also ignore whitespace between the << and
 the terminator.  

I'm worried this might cause here docs to look too much like left shift
operators.  And consider the following ambiguity.

use constant BAR => 2;

$foo << BAR;
Stuff
BAR

print $foo << BAR;

The first is a here-doc.  The second is a left shift.  They look
veeery close.


 =head1 IMPLENTATION
 
 This should be a relatively simple addition to perl 
 (I think just to scan_heredoc in toke.c + docs in perl5)

The disambiguation of a here-doc start from a binary left shift might
add serious complexity to the parser.


-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
"None of our men are "experts."... because no one ever considers
himself expert if he really knows his job."  
-- From Henry Ford Sr., "My Life and Work," p. 86 (1922): 



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-04 Thread Nathan Wiger

Michael G Schwern wrote:
 
 The RFC proposes a  operator which would strip whitespace off the
 front of the here-doc.  Problem is preserving indentation.  We can
 merge the two.

Actually, the two started merged. :-) They were split up after there
were too many people for RFC 111 but against RFC 162. Personally, I'd
rather see the recipe method of:

   print <<END_OF_DOC =~ s/^\s{0,5}//g;

(It's something like that) used for the "stripping leading whitespace"
issue. 
 
 I've seen no reason to allow a semicolon other than "why not".  It's a
 special case which adds no syntax sugar and simply represents
 unnecessary orthogonality.  It just lets you be sloppy.

Well, it does add some consistency. Admittedly, there's not a huge
value-add so if it doesn't get in, hey, it doesn't get in. :-)
 
 I can't think of much else I'd want to comment about the end of a
 here-doc than "this is the end of the here-doc" which is about as
 useful as "$i++ # add one to $i".

If you have a potentially huge here doc it can help, just like a
potentially huge if statement:

  if ( $cond ) {

 # 200 lines pass

  }  # end if($cond) line 23


 I'm worried this might cause here docs to look too much like left shift
 operators.  And consider the following ambiguity.
 
 use constant BAR => 2;
 
 $foo << BAR;
 Stuff
 BAR
 
 print $foo << BAR;
 
 The first is a here-doc.  The second is a left shift.  They look
 veeery close.

Well, the ambiguity you mention actually already exists in a similar
form; see Camel-3 p. 67. So I don't know that this would really hurt or
help the situation.

-Nate



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-04 Thread Michael G Schwern

On Mon, Sep 04, 2000 at 05:36:32PM -0700, Nathan Wiger wrote:
 Actually, the two started merged. :-) They were split up after there
 were too many people for RFC 111 but against RFC 162. Personally, I'd
 rather see the recipe method of:
 
print <<END_OF_DOC =~ s/^\s{0,5}//g;

This still leaves the problem of having to count whitespace and having
to change your regex if you reindent your code.  In effect, it causes
whitespace to become significant.  Bleh.


  I can't think of much else I'd want to comment about the end of a
  here-doc than "this is the end of the here-doc" which is about as
  useful as "$i++ # add one to $i".
 
 If you have a potentially huge here doc it can help, just like a
 potentially huge if statement:
 
   if ( $cond ) {
 
  # 200 lines pass
 
   }  # end if($cond) line 23

There's a big difference.  Every code block ends with a '}'.  Every
here doc ends with its own custom tag.  Thus to state:

print <<EOF;

Four score and seven years ago...

EOF  # end of print <<EOF line 23

can currently be better written as:

print <<GETTYSBURG_ADDRESS;

Four score and seven years ago...

GETTYSBURG_ADDRESS

The tag itself describes what the text is, similar to the way a
well-named variable describes what's inside of it and removes the need
for a descriptive comment.  At a glance one can tell that
'GETTYSBURG_ADDRESS' closes the here-doc containing the Gettysburg
Address, without having to maintain a comment.  (I guarantee the line
number mentioned in the comment will not be maintained.)


Another reason for wanting to comment the closing of a code block is
nesting.  Simply searching for the previous '{' will not work.
Here-docs cannot be nested and do not have this problem.  Simply
searching backwards for your here-doc tag will always work.


-- 

Michael G Schwern  http://www.pobox.com/~schwern/  [EMAIL PROTECTED]
Just Another Stupid Consultant  Perl6 Kwalitee Ashuranse
BOFH excuse #356:

the daemons! the daemons! the terrible daemons!



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-04 Thread Tom Christiansen

This still leaves the problem of having to count whitespace and having
to change your regex if you reindent your code.  In effect, it causes
whitespace to become significant.  Bleh.

It's much better to use the Cookbook method: it stands out better.
Please observe.

--tom



Re: RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-04 Thread Nathan Wiger

Michael G Schwern wrote:
 
 print <<END_OF_DOC =~ s/^\s{0,5}//g;
 
 This still leaves the problem of having to count whitespace and having
 to change your regex if you reindent your code.  In effect, it causes
 whitespace to become significant.  Bleh.

How is this different from having to count the number of spaces you
reindent your here doc terminator?

   print <<EOF;
   We want a total of
   3 leading spaces stripped off
   EOF

   print <<EOF;
   Now we want a total of
  5 leading spaces stripped
  EOF

As was already discussed, this approach is quite fragile; I think the
consensus was trying to get away from this.

Plus regex's are more general; they can strip leading funny chars as
well:

   print <<EOF =~ s/^\s*\|*\s{0,5}//g;
  |   I like to use the bar symbol to 
  | logically partition off my here docs
 EOF

Anyways, there was a sizeable discussion on this already, here are some
of the points discussed:

http://www.mail-archive.com/perl6-language@perl.org/msg02556.html
http://www.mail-archive.com/perl6-language@perl.org/msg03034.html
http://www.mail-archive.com/perl6-language@perl.org/msg03035.html
http://www.mail-archive.com/perl6-language@perl.org/msg03037.html
http://www.mail-archive.com/perl6-language@perl.org/msg03048.html
http://www.mail-archive.com/perl6-language@perl.org/msg03041.html

Not trying to cut your points off at the knees, but many of them were
discussed already and I think the conclusions make sense.

-Nate



RFC 111 (v3) Here Docs Terminators (Was Whitespace and Here Docs)

2000-09-04 Thread Perl6 RFC Librarian

This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1  TITLE

Here Docs Terminators (Was Whitespace and Here Docs)

=head1 VERSION

Maintainer: Richard Proctor [EMAIL PROTECTED]
Date: 16 Aug 2000
Last Modified: 2 Sep 2000
Mailing List: [EMAIL PROTECTED]
Version: 3
Number: 111
Status: Developing

=head1 ABSTRACT

With a here doc print <<ZZZ; the ZZZ has to be at the start of a line and the
text of the here doc is processed verbatim.  This results in the terminator
sticking out in the body of the document, makes indenting blocks of text
difficult and causes errors and confusion.

There are several FAQs that relate to this problem.  This proposal tidies
this up.

=head1 DESCRIPTION

Perl6 should ignore any whitespace before the terminator of a heredoc on any
line.  Further it should ignore any whitespace, ";"s (and comments) that
follow the terminator.  Perl should also ignore whitespace between the << and
the terminator.  

Discussion took place on allowing statements following the terminator, but
generally these were thought of as a bad idea.  So only ";" and comments
should occur on the same line.

  All of these should work:
  
  print <<EOL;
EOL
  print << EOL;
   EOL
  print << EOL;
   EOL;
  print << EOL
   EOL;
  print <<EOL ;
EOL # this is the end of the here doc
  print <<EOL
EOL;# this is the end of the here doc
  print <<EOL ;
EOL;# this is the end of the here doc

  But this should be an error:
  
  print <<EOL
  EOL; $i++;
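The terminator rule described in this section can be approximated with a single Perl 5 pattern. This is only a sketch of the matching rule, not a patch to the tokenizer; the function name is_rfc111_terminator is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: does $line close a here-doc whose tag is $term?
# Allowed around the tag: leading whitespace, an optional ";", and an
# optional "#" comment.  Anything else (such as a statement) is rejected.
sub is_rfc111_terminator {
    my ($term, $line) = @_;
    return $line =~ m/^\s*\Q$term\E\s*;?\s*(?:\#.*)?$/ ? 1 : 0;
}

for my $line ("EOL\n", "   EOL\n", "   EOL;\n",
              "EOL # this is the end of the here doc\n",
              "EOL;# this is the end of the here doc\n",
              "  EOL; \$i++;\n") {
    chomp(my $shown = $line);
    printf "%-42s %s\n", "'$shown'", is_rfc111_terminator('EOL', $line);
}
```

The first five lines are accepted and the last, which carries a statement after the terminator, is rejected.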


=head1 IMPLENTATION

This should be a relatively simple addition to perl 
(I think just to scan_heredoc in toke.c + docs in perl5)

=head1 CHANGES

RFC111 V1 Had two concepts, one about the terminator and another about the
content.  This has been split into two concepts, this RFC and RFC 162.

RFC111 V2 Just had the termination issue, and # comments after the
terminator

RFC111 V3 Adds the ";" as acceptable after the terminator (and more examples)

=head1 REFERENCES

RFC 162: Filtering Heredocs (was originally part of RFC 111 V1)