Re: allowing L

2009-11-30 Thread David E. Wheeler
On Nov 30, 2009, at 7:28 PM, David E. Wheeler wrote:

> And I'd be happy to test Pod::Simple's existing implementation of this 
> feature, and make sure it works properly for man pages and plain text. I can 
> probably squeeze some time to do that this week.

FWIW, Pod::Simple already parses things as you expect. Here's the relevant 
comment:

  # L
  # L or L
  # L or L or L<"sec">
  # L
  # L or L
  # L or L or L
  # L
  # L

So I'll just need to add some more tests I think, to make sure that nothing 
unexpected happens (like a misplaced "|" or "/").

For my own personal use, Test::Pod is what most needs to be updated to allow 
this syntax. Hrm. Looks like it uses Pod::Simple; so I'll have to look to see 
what needs to be changed to get it to allow L.

Best,

David

Re: expanding =begin

2009-11-30 Thread David E. Wheeler
On Nov 30, 2009, at 4:14 PM, Ricardo Signes wrote:

> I'd like to extend this definition a bit.  I would replace the second 
> paragraph
> with:
> 
>   "=begin formatname"
>   "=begin formatname parameter"
>   This marks the following paragraphs (until the matching "=end
>   formatname") as being for some special kind of processing.  Unless
>   "formatname" begins with a colon, the contained non‐command
>   paragraphs are data paragraphs.  But if "formatname" does begin
>   with a colon, then non‐command paragraphs are ordinary paragraphs
>   or data paragraphs.  This is discussed in detail in the section
>   "About Data Paragraphs and "=begin/=end" Regions".
> 
>   It is advised that formatnames match the regexp
>   "m/\A:?[-a-zA-Z0-9_]+\z/".  Everything following whitespace after the
>   formatname is a parameter that may be used by the formatter when dealing
>   with this region.  Implementors should anticipate future
>   expansion in the semantics and syntax of the first parameter to
>   "=begin"/"=end"/"=for".
> 
> This allows for constructions like:
> 
>  =begin syntax javascript
> 
>  =end syntax
> 
> ...or...
> 
>  =begin table width(10) height(9)
> 
>  =end table
> 
> ...or...
> 
>  =begin dialect Pod6
> 
>  =end dialect
> 
> I believe several parsers already allow this implicitly.

Makes sense to me, but I think that you need to update the regex to include the 
(optional) parameter. Something like:

  C

Best,

David

Re: allowing L

2009-11-30 Thread David E. Wheeler
On Nov 30, 2009, at 4:41 PM, Russ Allbery wrote:

>> As for (1), I think that it is actually not difficult, and merely
>> appeared so due to the software involved at the time.  David and I
>> whipped up what I believe is a fairly complete and mostly unambiguous
>> grammar:

We forgot to account for L<< >>. So we need to exclude < and > from any text in 
L<>, but allow them in L<< >>. Otherwise they're the same.
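
A rough sketch of that distinction, in Python for illustration only. The
names SINGLE, DOUBLE and link_content are hypothetical, it only handles
exactly one or two angle brackets, and it assumes the usual Pod rule that
doubled delimiters are padded with whitespace. Nested formatting codes are
ignored entirely.

```python
import re

# L<...>: the content may not contain "<" or ">".
SINGLE = re.compile(r'L<([^<>]+)>')
# L<< ... >>: the content may contain "<" and ">"; the delimiters
# are padded with whitespace on the inside.
DOUBLE = re.compile(r'L<<\s+(.+?)\s+>>')

def link_content(code):
    """Return the text between the delimiters of a single L code."""
    m = DOUBLE.fullmatch(code) or SINGLE.fullmatch(code)
    if m is None:
        raise ValueError("not a well-formed L code: %r" % code)
    return m.group(1)

print(link_content("L<perlpod>"))
print(link_content("L<< a -> b|http://foo.com >>"))
```

Beyond the delimiters, the content can be handed to the same link parser in
both cases.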

> I was in favor of this originally and am still in favor of it now.  I
> think we should support it and am happy to find a way of representing the
> results textually in Pod::Text and Pod::Man.  It will produce much nicer
> HTML output.

And I'd be happy to test Pod::Simple's existing implementation of this feature, 
and make sure it works properly for man pages and plain text. I can probably 
squeeze some time to do that this week.

Best,

David

Re: allowing L

2009-11-30 Thread Russ Allbery
Ricardo Signes writes:

> The long-lost question of "Why can't we have L?" came up this
> week on p5p.  David Wheeler and I spoke about the issue briefly, because
> we couldn't find anything in perlpod or perlpodspec that really
> specifically addressed the reason other than "because it would be
> difficult."

> I did a fair bit of archive-diving and found two basic arguments:

> (1) Sean Burke was fighting against very painful parsing code and wanted to
> keep things as simple as possible, which was "not very simple" to begin
> with.  Adding this seemed too hard, and it was declared off limits.

> (2) It was unclear how non-hypertext formatters would choose to render links
> with text.

> As to (2), I think it's up to the formatter, many of them already deal
> with this problem, and I do not think any formatter author will be
> terribly inconvenienced by it.

> As for (1), I think that it is actually not difficult, and merely
> appeared so due to the software involved at the time.  David and I
> whipped up what I believe is a fairly complete and mostly unambiguous
> grammar:

I was in favor of this originally and am still in favor of it now.  I
think we should support it and am happy to find a way of representing the
results textually in Pod::Text and Pod::Man.  It will produce much nicer
HTML output.

-- 
Russ Allbery (r...@stanford.edu) 


allowing L

2009-11-30 Thread Ricardo Signes

The long-lost question of "Why can't we have L?" came up this week
on p5p.  David Wheeler and I spoke about the issue briefly, because we couldn't
find anything in perlpod or perlpodspec that really specifically addressed the
reason other than "because it would be difficult."

I did a fair bit of archive-diving and found two basic arguments:

(1) Sean Burke was fighting against very painful parsing code and wanted to
keep things as simple as possible, which was "not very simple" to begin
with.  Adding this seemed too hard, and it was declared off limits.

(2) It was unclear how non-hypertext formatters would choose to render links
with text.

As to (2), I think it's up to the formatter, many of them already deal with
this problem, and I do not think any formatter author will be terribly
inconvenienced by it.

As for (1), I think that it is actually not difficult, and merely appeared so
due to the software involved at the time.  David and I whipped up what I
believe is a fairly complete and mostly unambiguous grammar:

  # Grammar for L<> in Pod5

  link-code       = "L<"
                    ( link-text "|" ) ?
                    ( link-target )
                    ">"

  link-text       = [^|]+

  link-target     = pod-target
                  | man-target
                  | internal-target
                  | url-target

  pod-target      = perl-module-name
                    ( "/" section-name ) ?

  man-target      = [-\w]+ ( "(" digit ")" ) ?
                    ( "/" section-name ) ?

  internal-target = "/" section-name
                  | "/" quoted-sec-name
                  | quoted-sec-name

  section-name    = [^|/]+

  quoted-sec-name = DQUOTE section-name DQUOTE

  url-target      = \w+ ":" [^:\s] \S+
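
To show that the grammar is mechanical to implement, here is a minimal
sketch in Python that parses the text between "L<" and ">". It is an
illustration under stated assumptions, not how any existing Pod parser
works: parse_link and the pattern names are hypothetical, the MODULE pattern
stands in for perl-module-name (which the grammar leaves undefined), and a
bare name that could be either a module or a man page is treated as a pod
target.

```python
import re

SECTION = r'[^|/]+'                      # section-name
MODULE = r'[A-Za-z_]\w*(?:::\w+)*'       # assumed shape of perl-module-name

URL = re.compile(r'\w+:[^:\s]\S+')
INTERNAL = re.compile(r'/?"(?P<quoted>' + SECTION + r')"|/(?P<bare>' + SECTION + r')')
POD = re.compile(r'(?P<name>' + MODULE + r')(?:/(?P<section>' + SECTION + r'))?')
MAN = re.compile(r'(?P<page>[-\w]+)(?:\((?P<sect>\d)\))?(?:/(?P<section>' + SECTION + r'))?')

def parse_link(content):
    """Split link content into (text, kind, target, section)."""
    text, bar, target = content.partition("|")
    if not bar:                          # no "|": the whole content is the target
        text, target = None, content
    elif not text:
        raise ValueError("empty link text")
    if URL.fullmatch(target):
        return (text, "url", target, None)
    m = INTERNAL.fullmatch(target)
    if m:
        return (text, "internal", None, m["quoted"] or m["bare"])
    m = POD.fullmatch(target)            # tried before MAN: bare names are pod
    if m:
        return (text, "pod", m["name"], m["section"])
    m = MAN.fullmatch(target)
    if m:
        page = m["page"] + ("(%s)" % m["sect"] if m["sect"] else "")
        return (text, "man", page, m["section"])
    raise ValueError("unrecognized link target: %r" % target)

print(parse_link("the docs|http://foo.com"))
print(parse_link("Foo::Bar/SYNOPSIS"))
print(parse_link("crontab(5)"))
```

Trying the pod pattern before the man pattern is just one way to settle the
overlap between the two rules; a real formatter might prefer to check @INC
or the man path instead.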

The only ambiguity that is obvious to me is also obvious to perlpodspec.
L<xyzzy> could mean either xyzzy.pm in @INC or the man page for xyzzy.  The
L form disambiguates in favor of man.  This is already a known issue
and is not complicated by allowing Lhttp://foo.com>

So... what's the problem?

-- 
rjbs


expanding =begin

2009-11-30 Thread Ricardo Signes

From perlpodspec:

   "=begin formatname"
   This marks the following paragraphs (until the matching "=end
   formatname") as being for some special kind of processing.  Unless
   "formatname" begins with a colon, the contained non‐command
   paragraphs are data paragraphs.  But if "formatname" does begin
   with a colon, then non‐command paragraphs are ordinary paragraphs
   or data paragraphs.  This is discussed in detail in the section
   "About Data Paragraphs and "=begin/=end" Regions".

   It is advised that formatnames match the regexp
   "m/\A:?[-a-zA-Z0-9_]+\z/".  Implementors should anticipate future
   expansion in the semantics and syntax of the first parameter to
   "=begin"/"=end"/"=for".

I'd like to extend this definition a bit.  I would replace the second paragraph
with:

   "=begin formatname"
   "=begin formatname parameter"
   This marks the following paragraphs (until the matching "=end
   formatname") as being for some special kind of processing.  Unless
   "formatname" begins with a colon, the contained non‐command
   paragraphs are data paragraphs.  But if "formatname" does begin
   with a colon, then non‐command paragraphs are ordinary paragraphs
   or data paragraphs.  This is discussed in detail in the section
   "About Data Paragraphs and "=begin/=end" Regions".

   It is advised that formatnames match the regexp
   "m/\A:?[-a-zA-Z0-9_]+\z/".  Everything following whitespace after the
   formatname is a parameter that may be used by the formatter when dealing
   with this region.  Implementors should anticipate future
   expansion in the semantics and syntax of the first parameter to
   "=begin"/"=end"/"=for".

This allows for constructions like:

  =begin syntax javascript

  =end syntax

...or...

  =begin table width(10) height(9)

  =end table

...or...

  =begin dialect Pod6

  =end dialect

I believe several parsers already allow this implicitly.
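
As a sketch of what a parser would need to do under the proposed wording,
here is a small Python illustration that splits a "=begin" line into its
formatname and parameter. The names BEGIN and parse_begin are hypothetical,
and the parameter is kept as one opaque string, on the assumption that
interpreting it is left to the formatter.

```python
import re

# The advised formatname pattern, followed by an optional parameter:
# everything after the whitespace that follows the formatname.
BEGIN = re.compile(
    r'=begin\s+(?P<formatname>:?[-a-zA-Z0-9_]+)'
    r'(?:\s+(?P<parameter>\S.*?))?\s*'
)

def parse_begin(line):
    """Return (formatname, parameter); parameter is None when absent."""
    m = BEGIN.fullmatch(line)
    if m is None:
        raise ValueError("not a valid =begin line: %r" % line)
    return m["formatname"], m["parameter"]

print(parse_begin("=begin syntax javascript"))
print(parse_begin("=begin table width(10) height(9)"))
print(parse_begin("=begin :html"))
```

A formatter that does not understand the parameter can ignore it and still
match the region against "=end formatname".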

-- 
rjbs