Re: perlpodspec, draft 1

Tim Gim Yee Mon, 20 Aug 2001 13:29:32 -0700
A while ago, Sean M. Burke wrote:

> Pod parsers should not, by default, try to coerce apostrophe (') and
> quote (") into smart quotes (little 9's, 66's, 99's, etc), nor try to
> turn backtick (`) into anything else but a single backtick character
> (distinct from an openquote character!), nor "--" into anything but
> two minus signs.  They I<must never> do any of those things to text
> in CE<lt>...> sequences, and never I<ever> to text in verbatim
> paragraphs.

Hm.  Not that I disagree with the above, but it makes me wonder whether
a parser/formatter should still try to auto-markup text.  The following
lines are absent from the perlpod rewrite:

- In particular, you can leave things like this verbatim in your text:
-
-    Perl
-    FILEHANDLE
-    $variable
-    function()
-    manpage(3r)

> Pod parsers, when processing a series of verbatim paragraphs one
> after another, should consider them to be one large verbatim
> paragraph that happens to contain blank lines.  I.e., these two
> lines, which have an blank line between them:
>
>       use Foo;
>
>       print Foo->VERSION
>
> should be unified into one paragraph ("\tuse Foo;\n\n\tprint
> Foo->VERSION") before being passed to the formatter or other
> processor.  Parsers may also allow an option for overriding this.

This is probably already implicit, but I will make it explicit.  The blank
lines between verbatim paragraphs constitute significant whitespace, and a
parser must pass those blank lines verbatim to the formatter.  In other
words, if there are 4 blank lines between two verbatim paragraphs, the
formatter should see and render 4 blank lines.  It would be an error to do
otherwise.
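
To make the point concrete, here is a minimal sketch (in Python, purely
for illustration) of a parser step that joins consecutive verbatim
paragraphs and keeps the exact run of blank lines between them.  The
function name merge_verbatim and the (kind, text) tuples are
hypothetical, and command paragraphs are simply lumped in with ordinary
ones here; a real parser would do more.

```python
import re

def merge_verbatim(pod_text):
    """Split Pod text into paragraphs and join adjacent verbatim ones.

    Hypothetical helper: returns a list of (kind, text) tuples where kind
    is "verbatim" or "ordinary" (command paragraphs count as ordinary).
    """
    # The capture group keeps each blank-line separator in the result, so
    # parts alternates: paragraph, separator, paragraph, separator, ...
    parts = re.split(r"(\n(?:[ \t]*\n)+)", pod_text)
    blocks = []
    for i in range(0, len(parts), 2):
        para = parts[i]
        if not para.strip():
            continue  # empty leading or trailing chunk
        # A verbatim paragraph is one whose first character is a space or tab.
        kind = "verbatim" if para[:1] in (" ", "\t") else "ordinary"
        if kind == "verbatim" and blocks and blocks[-1][0] == "verbatim":
            # Re-attach the original separator, so N blank lines stay N.
            blocks[-1] = ("verbatim", blocks[-1][1] + parts[i - 1] + para)
        else:
            blocks.append((kind, para))
    return blocks
```

With this sketch, two verbatim paragraphs separated by 4 blank lines
come back as a single block that still contains those 4 blank lines.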

Speaking of concatenating multiple paragraphs into one, it would be nice
if a parser would pass the data in a "=begin X"/"=end X" block to the
formatter in one single block.

And although perlpod already illustrates it clearly, I will restate the
obvious: "=begin X" is terminated by a matching "=end X".  A
parser/formatter must not terminate "=begin X" with "=end Y", "=end Z", or
any "=end" command without a matching format name.  A "=begin X" without a
matching "=end X" is an error.
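
The matching rule could be checked along these lines.  This is only a
sketch in Python: check_begin_end is a hypothetical name, and the use of
a stack (which would allow nested "=begin" blocks) is an assumption, not
something perlpod spells out.

```python
import re

def check_begin_end(lines):
    """Return error strings for =begin/=end commands that do not pair up."""
    stack = []   # (format name, line number) of each open =begin
    errors = []
    for lineno, line in enumerate(lines, start=1):
        m = re.match(r"=(begin|end)\b\s*(\S*)", line)
        if not m:
            continue
        command, name = m.groups()
        if command == "begin":
            stack.append((name, lineno))
        elif name and stack and stack[-1][0] == name:
            stack.pop()  # "=end X" closes the innermost "=begin X"
        else:
            # Wrong format name, missing format name, or nothing open.
            errors.append(f"line {lineno}: =end {name} matches no open =begin")
    for name, lineno in stack:
        errors.append(f"line {lineno}: =begin {name} has no matching =end")
    return errors
```

An "=end Y" after "=begin X" is reported and does not close the X
block, so the unterminated "=begin X" is reported as well.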

> pipe).  Pod parsers should also understand "EE<lt>lchevron>" and
> "EE<lt>rchevron>" as legacy codes for characters 171 and 187, i.e.,
> "left-pointing double angle quotation mark" = "left pointing
> guillemet" and "right-pointing double angle quotation mark" = "right
> pointing guillemet".  (These look like little "<<" and ">>", and they
> are now preferably expressed with the HTML/XHTML codes "EE<lt>laquo>"
> and "EE<lt>raquo>".)

Are there a lot of pod documents already using E<lchevron> and E<rchevron>?
Is there another reason they should be supported, given that E<laquo> and
E<raquo> are each 3 characters shorter to type? :)
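
For what it's worth, supporting the legacy names looks cheap.  A sketch
in Python, where the table and the function name resolve_entity are
hypothetical and only cover codes mentioned in this thread:

```python
# Hypothetical lookup table: E<> names mentioned in this thread, mapped to
# their character numbers.  lchevron/rchevron are aliases of laquo/raquo.
ENTITIES = {
    "lchevron": 171, "rchevron": 187,
    "laquo": 171, "raquo": 187,
    "nbsp": 160, "lt": 60, "gt": 62, "sol": 47,
}

def resolve_entity(name):
    """Return the character for the contents of an E<...> code, or None."""
    if name.isdecimal():
        return chr(int(name))        # numeric form such as E<160>
    if name in ENTITIES:
        return chr(ENTITIES[name])   # named form such as E<laquo>
    return None                      # unknown: left to the caller
```

So the legacy names cost two extra table entries and no extra code.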

> Some pod formatters output to formats that implement nonbreaking
> spaces as an individual character (which I'll call "NBSP"), and
> others output to formats that implement nonbreaking spaces just as
> spaces wrapped in a "don't break this across lines" code.  Note that
> at the level of pod, both sorts of codes can occur: pod can contain a
> NBSP character (whether as a literal, or as a "EE<lt>160>" or
> "EE<lt>nbsp>" sequence); and pod can contain "SE<lt>foo
> IE<lt>barE<gt> baz>" sequences, where "mere spaces" (character 32) in
> such sequences are taken to represent nonbreaking spaces.

S<> confuses me.  Does a tab ("\t") map to a NBSP also?  Maybe it
should be space-expanded to the next tabstop, then those spaces
translated to NBSPs.  What about ("\n")?  I can see a text editor
filling/wrapping a S<> into two lines.  Should multiple consecutive
spaces map to a single NBSP?  I imagine leading and trailing spaces
are significant, so "S< foo bar >" becomes "&nbsp;foo&nbsp;bar&nbsp;"
in HTML.  I wonder if "S<<  foo bar  >>" maps to the same thing, or do
the "<<" and ">>" slurp up all adjacent whitespace, leaving only
"foo&nbsp;bar"?

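One literal reading of the draft, sketched in Python for an HTML
formatter: only character 32 is translated, one NBSP per space, with
leading and trailing spaces kept.  The name s_code_to_html and the
optional tabsize argument (the tab-expansion idea above, with columns
counted from the start of the S<> content) are assumptions; the draft
settles neither tabs nor newlines.

```python
def s_code_to_html(content, tabsize=None):
    """Render the text inside S<...> for HTML under a literal reading.

    Only plain spaces (character 32) become "&nbsp;".  Tabs and newlines
    are left untouched unless tabsize is given, in which case tabs are
    first expanded to spaces (an assumption, not part of the draft).
    """
    if tabsize is not None:
        content = content.expandtabs(tabsize)
    # One NBSP per space: runs of spaces are not collapsed.
    return content.replace(" ", "&nbsp;")
```

Under this reading, s_code_to_html(" foo bar ") gives
"&nbsp;foo&nbsp;bar&nbsp;", and two consecutive spaces give two NBSPs.
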
> Note that section names might contain markup.  I.e., if a section
> starts with:
>
>   =head2 About the C<-M> Operator
>
> (or "=item About the CE<lt>-M> Operator"), then a link to it would
> look like this:
>
>   L<somedoc/About the C<-M> Operator>

Would L<somedoc/About the -M Operator> work?
Would L<somedoc/About the B<-M> Operator> work?

Given:

    =head2 About the B<C<-M>> Operator

Would the link look like none, any, some, or all of these?

    L<somedoc/About the B<C<-M>> Operator>
    L<somedoc/About the C<B<-M>> Operator>
    L<somedoc/About the C<-M> Operator>
    L<somedoc/About the -M Operator>
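
One way "all of these" could work is for the formatter to compare
section names with the markup stripped.  A sketch in Python, assuming
only single-angle-bracket B, C, F, I and S codes and no ">" inside them;
plain_section_name is a hypothetical name, and the draft does not say
links are resolved this way.

```python
import re

# Innermost formatting code: a code letter, "<", text with no angle
# brackets, then ">".  E<> and L<> are deliberately not handled here.
_CODE = re.compile(r"[BCFIS]<([^<>]*)>")

def plain_section_name(name):
    """Strip nested formatting codes, e.g. B<C<-M>> becomes -M."""
    previous = None
    while previous != name:
        previous = name
        # Each pass unwraps the innermost codes; repeat until stable.
        name = _CODE.sub(r"\1", name)
    return name
```

All four link targets above, and the "=head2" text itself, reduce to
"About the -M Operator" and would therefore compare equal.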

How would I create links to these?

    =item rot13($text)
    =item rot13 [TEXT]
    =item $obj->rot13($text)
    =item $grfg = $obj->rot13($text)

Would the following work?  For all of the above?

    L<somedoc/rot13>
    L<somedoc/rot13()>
    L<somedoc/rot13($text)>
    L<somedoc/rot13(TEXT)>

> Authors wanting to link to a particular (absolute) URL, must do so
> only with "LE<lt>scheme:...>" sequences (like
> LE<lt>http://www.perl.org>), and must not attempt "LE<lt>Some Site
> Name|scheme:...>" sequences.  This restriction avoids many problems
> in parsing and rendering LE<lt>...> sequences.

The whole reason I'd want L<scheme:...> is so I could do
L<text|scheme:...>.  L<http://www.perl.org/> is really not much better
than http://www.perl.org/ sans L<>, given that pod2xxx should mark it
up for me anyway.

What sort of parsing and rendering problems does this restriction
avoid?  I'm clueless as to any parsing problems.  I'm guessing the
rendering problems relate to non-hypertext formats.  Rendering...

    Just another L<Perl|http://www.perl.com/> Hacker

...as either of these is undesirable:

    Just another "Perl" Hacker
    Just another "http://www.perl.com/" Hacker

The first sacrifices link info, and the second destroys the author's intent.
Is this a good compromise?

    Just another "Perl" <URL:http://www.perl.com/> Hacker

Maybe too cluttered.  Maybe a footnote?

    Just another "Perl"[2] Hacker

And at the bottom of the page or end of the document:

    ____________________________________________________
    Footnotes:
        1. http://search.cpan.org/
        2. http://www.perl.com/
        3. http://www.perldoc.com/
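
A rough sketch of the footnote idea in Python, for a plain-text
formatter.  The function name render_with_footnotes and the regex are
hypothetical; it only handles the simple L<text|scheme:...> form with no
nested codes, and it numbers footnotes from 1 within the text it is
given.

```python
import re

# L<text|scheme:...> with no nested codes; the scheme test is an assumption.
_LINK = re.compile(r"L<([^|<>]+)\|([A-Za-z][A-Za-z0-9+.-]*:[^<>]+)>")

def render_with_footnotes(text):
    """Replace L<text|url> with "text"[n] and collect the URLs in order."""
    urls = []

    def replace(match):
        label, url = match.groups()
        if url not in urls:
            urls.append(url)  # the same URL reuses its footnote number
        return f'"{label}"[{urls.index(url) + 1}]'

    body = _LINK.sub(replace, text)
    return body, urls
```

The returned list is what the formatter would print under "Footnotes:"
at the bottom of the page or end of the document.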

BTW, how would pod2text and pod2man render the following?

    Set the L<input record separator|perlvar/$E<sol>> to ""...

Do they just scrap the link info?

    Set the "input record separator" to ""...

And could I have written it like so?

    Set the L<input record separator|perlvar/"$/"> to ""...

Or must the "/" always be escaped?


-- 
Tim Gim Yee