Michael G Schwern wrote:
> On Thu, Sep 14, 2000 at 11:49:18AM -0700, Glenn Linderman wrote:
> > I'm all for solving problems, and this message attempts to specify 3
> > problems, but it needs more specification. You describe three
> > problems, but it is not clear what the problems are
>
> Since we've been charging back and forth over this ground like a troop
> of doughboys over No Man's Land for the past month, I figured everyone
> knew the problem and proposed solutions. Your review accurately lays
> everything out.
OK, I'll try to keep a running list of the problems and solutions... and non-solutions.
> Things like this have come up, and to my eyes and fingers it's
> unacceptable.
Well, OK, so now we're talking shades of opinion. You'd agree it works, though, and quite
effectively. But you'd disagree about its aesthetics, and its performance. The former is
much less interesting to me than the latter.
> Some people like the explicit demarcation of the left
> boundary, I find it ugly and don't like the extra typing. It doesn't
> win me much over:
>
> die
> ' The old lie'.
> ' Dulce et decorum est'.
> ' Pro patria mori.';
That's fair, except that they aren't equivalent: you'd need
die
' The old lie'."\n".
' Dulce et decorum est'."\n".
' Pro patria mori.'."\n";
Which is somewhat worse, compared to the here doc, even with "!" or other leading
demarcation of choice (your choice is, of course, none).
> I'd prefer if here-docs just DWIM.
Yes, but... what do you mean vs. what do others mean, and all these problems....
> So we may want to add Yet Another problem. I forget what number you
> got up to, but it's basically "You shouldn't have to add anything but
> whitespace to the here-doc for indenting".
That's not so much a problem as a restriction on the solution space. This restriction
requires that indentation be inferred from something, or specified somehow. Inferring it
is problematic.
The only practical inference is via the position of the terminator relative to the rest of
the text, clearly the solution you are driving me towards. I have nothing particularly
against that solution, except that it isn't until you find the terminator that you can
figure out how much white space should be stripped from each of the lines. Nothing else
could practically be aligned with the rest of the text. Because all existing here docs
specify their terminator at the left margin, this could work as long as it is introduced
concurrently with allowing leading/trailing white space.
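
A minimal sketch of that inference, written as an ordinary function so it can be tried
today. The name strip_by_terminator is hypothetical, and the function assumes it is handed
the raw body with the indented terminator as its last line; a real implementation would
live in the parser.

```perl
use strict;
use warnings;

# Hypothetical helper: $raw is the here-doc body followed by its indented
# terminator line. Whatever white space precedes the terminator is removed,
# character for character, from the front of every body line.
sub strip_by_terminator {
    my ($raw, $terminator) = @_;
    my @lines = split /^/m, $raw;          # keep each line's newline
    my $last  = pop @lines;                # the terminator line
    my ($indent) = $last =~ /^([ \t]*)\Q$terminator\E\s*$/
        or die "terminator '$terminator' not found on the last line\n";
    s/^\Q$indent\E// for @lines;           # strip exactly that prefix
    return join '', @lines;
}

my $raw  = "        The old lie\n"
         . "          Dulce et decorum est\n"
         . "        POEM\n";
my $text = strip_by_terminator($raw, 'POEM');
print $text;
```

Note that the amount to strip is only known once the last line has been read, which is
the objection raised above, and that white space beyond the terminator's indentation
survives on the second line.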
Your example with the desire for some leading white space on each line can't be solved with
RFC 162's current solution.
I think that if leading white space is stripped without a visible demarcation sequence, it
should only work if the leading white space is identical on each line... and that a warning
(and no stripping) should occur if there is an inconsistency in the exact character sequence
that would be stripped. This is, I think, the only way not to open the can of worms about
the definition of how big a tab character is.
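
As a rough sketch of that rule (strip_exact_prefix is a hypothetical name, and this
version treats every line, blank ones included, as needing the prefix):

```perl
use strict;
use warnings;

# Hypothetical helper: remove $prefix from every line, but only when every
# line starts with exactly that character sequence. Otherwise warn and
# return the text untouched, so a tab is never silently equated with spaces.
sub strip_exact_prefix {
    my ($text, $prefix) = @_;
    my @lines = split /^/m, $text;         # keep each line's newline
    for my $line (@lines) {
        if (index($line, $prefix) != 0) {
            warn "inconsistent leading white space; nothing stripped\n";
            return $text;
        }
    }
    substr($_, 0, length($prefix), '') for @lines;
    return join '', @lines;
}

my $consistent = strip_exact_prefix("    The old lie\n    Pro patria mori.\n", "    ");
my $mixed_in   = "    The old lie\n\tPro patria mori.\n";   # second line uses a tab
my $mixed      = strip_exact_prefix($mixed_in, "    ");     # warns, returns input as is
```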
> An additional problem with dequote() style solutions is they are not
> as efficient.
This is an excellent point... for one-use here docs, it is irrelevant to the overall script
performance if the work is done at compile or run time. But when they are used in loops, it
can make a difference... a significant difference. This is a problem we don't want to
introduce, and dequote-like solutions introduce it.
This could be "solved" by hoisting the here-doc and related dequote processing out of the
loop, and into variables, except for the desire to do interpolation of the content.
There seems to be no way to do interpolation of an existing string except eval, which would
require constructing syntax around the interpolation, such as
sub interpolate { eval "qq\000" . $_[0] . "\0"; }
and this doesn't work so great if lexicals are mentioned in the parameter... it would have
to be done inline to get lexicals to work right.
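
To make the lexical problem concrete, here is a small sketch. It uses "~" as the qq
delimiter instead of the NUL bytes above, purely as an assumption for readability (the
template must then not contain "~"), and greet_via_helper and greet_inline are hypothetical
names.

```perl
use strict;
use warnings;

# Eval-based helper in the style shown above; "~" is an assumed delimiter.
sub interpolate {
    my ($template) = @_;
    return eval "qq~$template~";
}

sub greet_via_helper {
    my $name = 'Wilfred';                  # lexical, not visible inside interpolate()
    return interpolate('Hello, $name');    # the eval fails to compile under strict
}

sub greet_inline {
    my $name     = 'Wilfred';
    my $template = 'Hello, $name';
    return eval "qq~$template~";           # eval in the same scope sees the lexical
}

my $via_helper = greet_via_helper();       # undef, with the error left in $@
my $inline     = greet_inline();           # interpolated as intended
```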
This leads me down another path: wouldn't it be nice to have a function to interpolate a
string on demand?
Then you could hoist the here-doc processing above out of the loop, and still get the
effects of interpolation inside the loop, which would make the performance of here-doc
postprocessing much less critical... but this means defining variables to hold the
intermediate results, and moving the here-doc to a different location, which might not be
as friendly to the understanding of the script.
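
A sketch of what that hoisting might look like. No such built-in exists, so interpolate
here is a hypothetical stand-in that substitutes $name placeholders from a hash rather
than doing true Perl interpolation, and dequote is an assumed helper that strips the
leading white space and "!" marker.

```perl
use strict;
use warnings;

# Assumed dequote(): drop leading blanks, the "!" marker, and one space.
sub dequote {
    my ($text) = @_;
    $text =~ s/^[ \t]*![ ]?//gm;
    return $text;
}

# Hypothetical interpolate(): replaces $name with $vars{name} when supplied,
# and leaves unknown placeholders untouched. Not real Perl interpolation.
sub interpolate {
    my ($template, %vars) = @_;
    $template =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : "\$$1"/ge;
    return $template;
}

# The dequote cost is paid once, outside the loop...
my $template = dequote(<<'POEM');
    ! $who quoted:
    ! Dulce et decorum est
POEM

# ...and only the interpolation happens on each iteration.
my @out;
for my $who ('Owen', 'Glenn') {
    push @out, interpolate($template, who => $who);
}
print @out;
```

The cost mentioned above is visible here: the here-doc now sits away from the loop that
uses it, with $template as the intermediate variable.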
Another direction to take in this regard would be via RFC 18. If some of the processing for
a sequence of code could be done at compile time and it could rewrite that code to what gets
left for runtime, then you wouldn't need to hoist the code out of the loop... instead, you
write something like the following. Clearly your poem doesn't need interpolation, but in
general it would be useful.
sub dequote_interpolate : immediate
{
    local $_ = shift;
    my ($leader); # common white space and common leading string
    if (/^\s*(?:([^\w\s]+).*\n)(?:\s*\1.*\n)+$/) {
        $leader = quotemeta($1);
    } else {
        $leader = '';
    }
    s/^\s*$leader//gm;
    return "interpolate ( $_ );"; # this gets left in place of the call.
    # $_ would get interpolated into what is left, but that could contain other
    # references to other variables that would get interpolated later.
    # could use the following return, instead, to not depend on the existence of
    # interpolate
    # return "eval \"qq\\000$_\\0\"";
}
while ( ... )
{ print OUTFILE dequote_interpolate (<<'POEM');
! The old lie
! Dulce et decorum est
! Pro patria mori.
POEM
}
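
The leader-detection half of that can be tried without the proposed ": immediate"
attribute. The sketch below is an ordinary run-time sub (the name dequote is an
assumption) using the same two patterns, so it shows what the stripping does but not the
compile-time rewriting.

```perl
use strict;
use warnings;

# Run-time stand-in for the stripping logic: find the common leading
# punctuation string (if any), then remove leading white space plus that
# leader from the start of every line.
sub dequote {
    my ($text) = @_;
    my $leader = '';
    if ($text =~ /^\s*(?:([^\w\s]+).*\n)(?:\s*\1.*\n)+$/) {
        $leader = quotemeta($1);
    }
    $text =~ s/^\s*$leader//gm;
    return $text;
}

my $poem = dequote(<<'POEM');
    ! The old lie
    ! Dulce et decorum est
    ! Pro patria mori.
POEM
print $poem;
```

With these patterns the single space after each "!" is kept, so every output line still
begins with one space.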
I mention these ideas, because they are neat ideas with lots of general applicability, even
though probably 90% of the cases would be covered by stripping the amount of white space in
front of the here-doc terminator.
--
Glenn
=====
There are two kinds of people, those
who finish what they start, and so
on... -- Robert Byrne