Quirky comments

Jonathan Lang Sat, 16 Jun 2007 12:11:41 -0700

In "[svn:perl6-synopsis] r14421 - doc/trunk/design/syn",
Damian Conway wrote:

> brian wrote:
> > So, if this is the case, how will a new Perl 6 user debug a program
> > failure when part of their program mysteriously disappears because
> > they just happened to have =begin at the beginning of a line?
>
> The same way they debug it when part of their program mysteriously
> disappears because they just happened to have # at the beginning of a
> line: by learning to distinguish between code and commentary.
>
> Except, of course, the Pod mysteriously vanishing will be considerably
> easier to debug, because ALL lines starting with =begin vanish, whereas
> only some lines beginning with # do.


By this reasoning, a case could be made for declaring that all lines
that begin with '#' are comments (either block or line, depending on
the second character), no matter what.  That way, you'll have more
consistency between block comments and Pod Sections (which, from
Perl's perspective, should be equivalent).

The price to be paid (in both cases) is that you have to finagle any
code that would normally start a line with a '#' or '=' respectively,
such as the aforementioned block quote.  Admittedly, this isn't hard
to do: starting the line with whitespace or an unspace will do the
trick.  (But ending the previous line with an unspace won't, as
comments are apparently found and turned into whitespace by the
lexer.)

There is an additional price to be paid in the case of '#': you'd have
to distinguish between end-of-line comments (which cease to be
comments if placed in quotes) and line comments (which are always
comments, no matter what).  In effect, you would have four kinds of
comments, not counting Pod sections:

              starts line?
 bracketed?   yes      no
 no           line     end-of-line
 yes          block    embedded

The semantic similarities would be as follows:

* line-oriented comments (line and block) don't care about quotes;
character-oriented comments (end-of-line and embedded) do.
* block and embedded comments continue until explicitly terminated;
line and end-of-line comments are terminated by newline.
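
As a rough illustration of the taxonomy above, the two yes/no questions
can be mapped onto the four comment kinds and their two properties.
This is only a Python model of the proposal, not anything taken from
S02; the names CommentKind and classify_comment are hypothetical.

```python
from typing import NamedTuple


class CommentKind(NamedTuple):
    name: str
    ignores_quotes: bool   # True: still a comment even inside a quoted string
    ends_at_newline: bool  # False: runs until an explicit closing bracket


def classify_comment(starts_line: bool, bracketed: bool) -> CommentKind:
    """Map the two questions from the table onto one of the four kinds."""
    names = {
        (True, False): "line",
        (False, False): "end-of-line",
        (True, True): "block",
        (False, True): "embedded",
    }
    return CommentKind(
        name=names[(starts_line, bracketed)],
        # Line-oriented comments (line and block) don't care about quotes.
        ignores_quotes=starts_line,
        # Unbracketed comments (line and end-of-line) stop at the newline.
        ends_at_newline=not bracketed,
    )
```

For instance, classify_comment(starts_line=True, bracketed=True) yields
the "block" kind, which ignores quotes and needs an explicit terminator.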

--

Another quirk concerning comments: if I'm reading S02 correctly, C<\#>
is an unspaced comment whenever it would normally be a comment, and is
only an escaped pound sign when it would normally be a quoted pound
sign.  This is one (possibly the only) case where backslashing a
special character does _not_ shut off its special behavior.  As a
result, you have to quote a pound sign if you want to use it in a
pattern.  If this behavior is kept (I don't care either way), it
should probably be noted in "Learning Perl 6" or the like, as a
"gotcha".

--

I also can't seem to find any means of starting a comment from within
a quote, aside from Pod Sections (and, if my first suggestion is
adopted, line and block comments).  Perhaps C<\#> should count as the
start of a comment when appearing in a quote?  This has the advantage
that almost every appearance of that pair of characters will act to
comment out what follows; the only exception would be when it appears
as part of the C<\\#> sequence, which is easily tested for.  It does,
however, mean that you can't start a line that's within a quote with
C<\#> in order to start that line with a literal pound sign.  C<\ #>
would work, though, as would indenting the quote in the case of a
heredoc.
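
The "easily tested for" check might look something like the following
Python sketch.  It assumes that backslashes inside the quote pair up
from left to right, and that the comment simply runs to the end of the
line passed in; neither detail is spelled out above, and the function
name split_quoted_comment is hypothetical.

```python
from typing import Optional, Tuple


def split_quoted_comment(line: str) -> Tuple[str, Optional[str]]:
    """Split one line of quoted text at the first backslash-pound pair.

    Returns (text, comment); comment is None when no marker is found.
    A pound sign preceded by a doubled backslash is left alone, because
    the first backslash escapes the second rather than the pound sign.
    """
    i = 0
    while i < len(line):
        if line[i] == "\\" and i + 1 < len(line):
            if line[i + 1] == "#":
                # Backslash-pound: everything after it is commentary.
                return line[:i], line[i + 2:]
            # Any other escape (including a doubled backslash): skip both
            # characters so the escaped one is never re-examined.
            i += 2
            continue
        i += 1
    return line, None
```

Under these assumptions a backslash followed by a space and then a
pound sign is not a comment marker, which matches the C<\ #> workaround.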

--

Also from S02:

 Although we say that the unspace hides the whitespace from the
 parser, it does not hide whitespace from the lexer. As a
 result, unspace is not allowed within a token.

Technically true; but what does this mean?  If I were to say

 foo\ bar

would the lexer generate a pair of tokens ('foo' and 'bar') that don't
have any intervening whitespace (which might be important in some
cases that involve whitespace-based disambiguation), or would it
complain about finding an unspace where it doesn't belong?  I hope
that the former is true, although Larry seems to have been very
conscientious about making sure that whitespace is never forbidden
between tokens unless a dot can be placed there instead.  Still,
letting unspace split tokens provides a more general solution.

--

Finally, from S02:

 Since there is a newline before the first =, the POD form of
 comment counts as whitespace equivalent to a newline.

This rationale doesn't hold, for two reasons.  First, there is not
going to be a newline before the first = in a Pod Section if said Pod
Section starts the file.  Second, the stated behavior doesn't follow
from the premise.  Given the logic that Pod Sections are effectively
stripped out of the file before anything else happens, one would
expect:

 say qq:to'END';
 =begin comment
 END
 =end comment
 END

to be equivalent to:

 say qq:to'END';
 END

instead of:

 say qq:to'END';

 END

However, the latter is what occurs under the current rule.  I submit
that Pod Sections shouldn't be equivalent to whitespace; they should
be equivalent to empty strings.  Likewise with line and block
comments: all line-oriented comments should remove all traces of the
line(s) being commented out, including the trailing newline
character(s).  There will still be a newline between the last line
before the comment and the first line after it (assuming that there
_is_ a line before the comment): the trailing newline of the preceding
line.
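
The difference between the two rules can be modelled with a small
Python sketch.  It is a toy preprocessor, not how any Perl 6
implementation handles Pod: it assumes a Pod Section runs from a line
starting with =begin through the next line starting with =end, ignores
nesting, and uses the hypothetical name strip_pod.

```python
def strip_pod(source: str, replacement: str) -> str:
    """Replace each =begin ... =end section with the given replacement.

    replacement="\n" models "Pod counts as whitespace equivalent to a
    newline"; replacement="" models "Pod is equivalent to an empty string".
    """
    out = []
    in_pod = False
    for line in source.splitlines(keepends=True):
        if not in_pod and line.startswith("=begin"):
            in_pod = True
            continue
        if in_pod:
            if line.startswith("=end"):
                in_pod = False
                out.append(replacement)
            # Every line inside the section, delimiters included, is dropped.
            continue
        out.append(line)
    return "".join(out)


heredoc = "say qq:to'END';\n=begin comment\nEND\n=end comment\nEND\n"
as_empty_string = strip_pod(heredoc, "")      # the proposed rule
as_newline = strip_pod(heredoc, "\n")         # the current rule
```

With the empty-string rule the heredoc body comes out empty; with the
newline rule it contains a single blank line, as in the example above.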

--
Jonathan "Dataweaver" Lang
