Re: Which Pod module should be used / subclassed?

2019-05-09 Thread Karl Williamson

On 5/5/19 4:52 PM, Harald Jörg wrote:

Hello Pod-People,

I hope that I don't start a religious war with this question.
I found this mailing list advertised in the Pod::Simple docs, so I would
accept a bias towards Pod::Simple based solutions :)

So here's what I want to do: Extract Pod from a bunch of (ca 100) Perl
modules and Pod files and convert it to HTML.  Or XHTML.  Anything good
for today's browsers is fine.  Sounds pretty TIMTOWTDI, but every way
I've tried so far has minor issues, and I've some requirements for which
I haven't found an existing solution yet.

The issues are rather harmless:

  - Pod::POM is what's used today in the software to create the HTML
docs.  It fails to process L<https://www.perl.org> links correctly.

  - Pod::Simple::HTML produces invalid HTML (nested 'a' elements) when a
heading or item contains a Link like this:

  =head1 Start at L

  - Pod::Simple::XHTML apparently makes no effort to find content for the
<title> element (nor does core pod2html, BTW).

And there are a few things I miss.  They could be implemented in a
subclass of either of these, or even provided as an enhancement via Pull
Request:

  - A custom link resolver: I want links to documents within the project
to be relative, but link to other CPAN modules to be
absolute. Preferably to metacpan.org instead of search.cpan.org.

  - A custom table of contents

  - Custom (or just different) backlinks to top of page

  - Decent heuristics for page titles (Pod::Simple::PullParser does that
marvelously)

  - ...and some more, but not enough to roll out my own converter.

So, which of the modules is considered "state of the art" by the Pod
People?  Which one of them is least likely to be deprecated?

Do others have similar requirements?



I seem to be the one mostly maintaining Pod::Simple these days.  I think 
the design of Pod::Simple is basically sound.  If you came up with 
reasonable pull requests for either of those modules, I would apply them.


All those TODOs in the code are from the original author, I believe, and 
they would be nice to have, but the code basically works and no one has 
felt the need to spend the effort to implement them.
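
As a rough illustration of the custom link resolver asked about above, a subclass of Pod::Simple::XHTML can override its resolve_pod_page_link method. This is only a sketch under stated assumptions: the package name My::Docs::XHTML, the My::Project namespace and the relative "path.html" layout are hypothetical, not anything from the project in question.

```perl
use strict;
use warnings;

package My::Docs::XHTML;    # hypothetical subclass name
use parent 'Pod::Simple::XHTML';

# Hypothetical: modules under this namespace belong to the project.
my $PROJECT = qr/\AMy::Project(?:::|\z)/;

sub resolve_pod_page_link {
    my ($self, $to, $section) = @_;

    # Project-internal targets get a relative URL (assumed layout).
    if (defined $to && "$to" =~ $PROJECT) {
        my $path = "$to";
        $path =~ s{::}{/}g;
        my $url = "$path.html";
        $url .= '#' . $self->idify("$section", 1) if defined $section;
        return $url;
    }

    # Everything else falls back to the stock behaviour, which
    # prepends perldoc_url_prefix to the module name.
    return $self->SUPER::resolve_pod_page_link($to, $section);
}

package main;

my $parser = My::Docs::XHTML->new;
$parser->perldoc_url_prefix('https://metacpan.org/pod/');
$parser->output_string(\my $html);
$parser->parse_string_document(
    "=pod\n\nSee L<My::Project::Util> and L<Pod::Simple>.\n\n=cut\n"
);
print $html;
```

Setting perldoc_url_prefix explicitly keeps the absolute links pointing at metacpan.org regardless of the default in the installed Pod::Simple version.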


Re: Bug #98326 for Pod-Checker: Can we make “A non-empty Z<>” a warning and not an error

2018-05-28 Thread Karl Williamson

On 05/22/2018 07:18 PM, Dan Muey wrote:

Greetings!

Per Karl Williamson’s request[1] before he makes any changes we’d like to run 
the idea past you all and get your feedback:

http://perldoc.perl.org/perlpodspec.html says about Z<>:

“This code is unusual in that it should have no content. That is, a processor may 
complain if it sees Z<potatoes>. Whether or not it complains, the potatoes 
text should ignored.”

Z<potatoes> seems to fit under warnings (i.e. “may complain” not “should 
explode”) better because it “may not necessarily cause trouble, but indicate mediocre 
style.”

I have an edge case where I essentially need inline comments in POD for some parser 
notation (https://rt.cpan.org/Public/Bug/Display.html?id=98322) and the only option 
ATM is “mediocre style” of hacking Z<>.

Or, if not by default, can we have a way, a flag maybe, to ignore certain 
errors that we grok and are OK with?

Alternatively, a way to inhibit 'POD ERRORS' section from being rendered as 
part of the POD (e.g. send it to STDERR).

A fourth option would be to add a specific inline-comment formatter so you could 
# without error and without hacking Z<>. (it would be like Z<> but 
barf if it was empty)

After the RT discussion “making non-empty Z<> merely warn” seems OK, we just 
wanted it to be discussed here first. Thanks!

—
Dan Muey

[1] https://rt.cpan.org/Ticket/Display.html?id=98326#txn-1787110




It does seem to me that treating this condition as a fatal error is 
wrong, and likely to be an accident of implementation.  Pod::Checker 
makes all but a very few of the warnings generated by Pod::Simple fatal. 
 Each such is listed as an exception.  My guess is that the exceptions 
were added one by one as needed, and this one just didn't come up.


Saying Z<potatoes> as opposed to Z<> makes no difference in the pod that 
gets generated, so making it fatal just seems wrong.  So I think we 
should accommodate this request, and I intend to issue a PR to do so, if 
no objections are raised here.


What to do about L<Foo Bar> and L<"Foo Bar">

2018-05-28 Thread Karl Williamson

podspec says this:

Previous versions of perlpod allowed for a "L<section>" syntax (as in
"L<Object Attributes>"), which was not easily distinguishable from
"L<name>" syntax and for "L<"section">" which was only slightly less
ambiguous. This syntax is no longer in the specification, and has been
replaced by the "L</section>" syntax (where the slash was formerly
optional). Pod parsers should tolerate the "L<"section">" syntax, for
a while at least. The suggested heuristic for distinguishing
"L<section>" from "L<name>" is that if it contains any whitespace,
it's a section. Pod processors should warn about this being deprecated
syntax.

Pod::Simple accepts these without complaint.

If I change things to complain, a bunch of things in the perl core are 
found to be in violation, even of the deprecated syntax.


The question is what to do?

1) We could leave things as they always have been, to let sleeping dogs 
lie.  It's worked for so long that we're not seriously going to stop 
accepting these.


2) Raise the warnings, either on both cases or just the deprecated one.

3) Don't raise warnings, but change Pod::Checker to do so, under the 
theory that you won't be using that unless you want to know the iffy 
things.  Maybe make the deprecated come out always, and the tolerated 
only for level 2 warnings.


I'm leaning towards option 3).
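
The podspec heuristic quoted above can be sketched in a few lines. This is not the code in Pod::Simple or Pod::Checker; classify_link_content and its return values are made up for illustration, and it assumes the content has already been found to contain no "/" or "|".

```perl
use strict;
use warnings;

# Classify the raw content of an L<...> code that has no "/" or "|".
# Returns 'tolerated' for the L<"section"> form, 'deprecated' for a bare
# L<section> (guessed from the presence of whitespace), else 'name'.
sub classify_link_content {
    my ($content) = @_;
    return 'tolerated'  if $content =~ /\A".*"\z/s;
    return 'deprecated' if $content =~ /\s/;
    return 'name';
}

for my $content ('"Foo Bar"', 'Foo Bar', 'Foo::Bar') {
    printf "L<%s> => %s\n", $content, classify_link_content($content);
}
```

Under option 3, a checker could report the 'deprecated' case always and the 'tolerated' case only at the higher warning level.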


Re: Bug #98326 for Pod-Checker: Can we make “A non-empty Z<>” a warning and not an error

2018-05-22 Thread Karl Williamson

On 05/22/2018 07:18 PM, Dan Muey wrote:

Greetings!

Per Karl Williamson’s request[1] before he makes any changes we’d like to run 
the idea past you all and get your feedback:

http://perldoc.perl.org/perlpodspec.html says about Z<>:

“This code is unusual in that it should have no content. That is, a processor may 
complain if it sees Z<potatoes>. Whether or not it complains, the potatoes 
text should ignored.”

Z<potatoes> seems to fit under warnings (i.e. “may complain” not “should 
explode”) better because it “may not necessarily cause trouble, but indicate mediocre 
style.”

I have an edge case where I essentially need inline comments in POD for some parser 
notation (https://rt.cpan.org/Public/Bug/Display.html?id=98322) and the only option 
ATM is “mediocre style” of hacking Z<>.

Or, if not by default, can we have a way, a flag maybe, to ignore certain 
errors that we grok and are OK with?

Alternatively, a way to inhibit 'POD ERRORS' section from being rendered as 
part of the POD (e.g. send it to STDERR).

A fourth option would be to add a specific inline-comment formatter so you could 
# without error and without hacking Z<>. (it would be like Z<> but 
barf if it was empty)

After the RT discussion “making non-empty Z<> merely warn” seems OK, we just 
wanted it to be discussed here first. Thanks!

—
Dan Muey

[1] https://rt.cpan.org/Ticket/Display.html?id=98326#txn-1787110



Perhaps this was added after you filed your ticket:

"$parser->complain_stderr( SOMEVALUE )"
 If you set this attribute to a true value, it will send reports of
 parsing errors to STDERR. By default, this attribute's value is false,
 meaning that no output is sent to STDERR.

 Setting "complain_stderr" also sets "no_errata_section".

Please try that and see if it is sufficient to solve your issue.
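
A minimal sketch of trying that attribute, using Pod::Simple::Text as the formatter. The render helper and the sample pod are made up for illustration; only complain_stderr, output_string and parse_string_document are Pod::Simple methods.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# Render a pod string to plain text, optionally diverting parse
# complaints to STDERR instead of a "POD ERRORS" section.
sub render {
    my ($pod, $to_stderr) = @_;
    my $parser = Pod::Simple::Text->new;
    $parser->complain_stderr(1) if $to_stderr;
    $parser->output_string(\my $out);
    $parser->parse_string_document($pod);
    return $out;
}

my $pod = "=pod\n\nA stray Z<potatoes> code.\n\n=cut\n";

print "--- default ---\n",         render($pod, 0);
print "--- complain_stderr ---\n", render($pod, 1);
```

With the attribute set, the complaint about the non-empty Z<> should appear on STDERR and the rendered text should carry no errata section.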


Interesting article: Researchers hide information in plain text

2018-05-10 Thread Karl Williamson

https://www.sciencedaily.com/releases/2018/05/180510150231.htm


Re: Pod::Simple output as POD

2018-05-09 Thread Karl Williamson

On 05/08/2018 07:05 PM, David E. Wheeler wrote:

On May 8, 2018, at 18:48, John SJ Anderson  wrote:


I suspect the plea for counsel was more intended for David, but I’ll pipe up 
from the peanut gallery and say, “why not both?” It seems like the ideal thing 
to put under a feature flag.


Actually, I wasn't thinking specifically of him.


I’m sorry, I’ve lost all context on this thread after two years. What’s it for 
again? Flag sounds okay, but better is to use =encoding.


This is a new module started by John, to extract the pod portions from 
say a .pm.  I think he said he thought he got about 80% of it done.  Now 
I'm completing it.  It requires hooks in BlackBox and other .pm's to 
enable it to really work.



Of course, that just changes this decision into “which one should be the 
default and which one should need to be enabled?”, but perhaps thinking about 
it in those terms will make it more clear which has the higher utility value?


Yeah, whichever is going to be more valuable for your intended audience.


The intended audience is me initially.  So I've decided to write it for 
my use, which wants the results in ASCII or UTF-8.


I think this is as much as I'm prepared to chew on for now, and if 
someone else wishes to extend it in the future to have the option John 
suggested, I agree it would be nice, and don't think it would be very hard.


Looking for another pod tip

2018-05-09 Thread Karl Williamson
I have an item text list.  Not all the items have content besides the 
text, and so the pod would collapse them together into adjacent lines, 
whereas I want them separated.  I did this by adding a NBSP, but then I 
get an extra line that I'd rather not have.


Here's an example

 Category "LC_NUMERIC": Numeric formatting
 This indicates how numbers should be formatted for human readability,
 for example the character used as the decimal point.

 Category "LC_MONETARY": Formatting of monetary amounts


 Category "LC_TIME": Date/Time formatting


 Category "LC_MESSAGES": Error and other messages
 This is used by Perl itself only for accessing operating system error
 messages via $! and $^E.


I don't know if your email client will collapse the lines, but there are 
two empty ones after the two items that don't have accompanying text. 
Any ideas as to how to get rid of one?


Re: =item * foo bar

2018-05-09 Thread Karl Williamson

On 05/08/2018 09:57 PM, Dan Book wrote:
On Tue, May 8, 2018 at 11:32 PM, Karl Williamson 
<pub...@khwilliamson.com <mailto:pub...@khwilliamson.com>> wrote:


There is code in Pod::Simple that "tolerates" (meaning accepts as a
bullet item) this pod line that would normally be illegal by
perlpodspec.

I wonder if anyone is around who remembers why this was added.  I
didn't see details in an internet search


I don't think it's illegal by perlpodspec. perlpodspec allows "=item 
[text]" where text is not a * followed by spaces, but it doesn't say 
anything about * followed by spaces and then other words.


My reading of that is that this should be treated as

=item text

which of course is legal, but the * would be output as a star and not 
translated into a bullet, and that is what happens here.


If I had known about this earlier, I would have used it in pods to make 
it look better.  So maybe the spec could be changed to specifically 
allow this.  Or are there commonly used implementations that don't 
support it?


Re: Pod::Simple output as POD

2018-05-07 Thread Karl Williamson

On 05/13/2016 12:24 PM, David E. Wheeler wrote:

On May 13, 2016, at 11:03 AM, Karl Williamson <pub...@khwilliamson.com> wrote:


If we wanted to be cute, we could call it Pod::Simple::SimplyPod, with you 
know, only one, natural, ingredient, and no harmful additives.


But is it organic? Or Biodynamic?

D



The marketing term Biodynamic doesn't seem to have survived the test of 
time, at least in my corner of Trumpistan.


So, I converted the name to JustPod, and am trying to finish that up.
I had to suspend work on it a couple of years ago, and am just now able 
to get back to it.


Changes to BlackBox were needed.

I left it mostly working, and foolishly didn't leave notes to myself 
about what else was needed, so now I'm working on test files in the 
distribution to make sure that the pod extraction is working.  We have a 
bunch of files in the t/corpus directory, and I can see how well this 
works on each of them.


One thing that might not ever be precise is retaining the file's white 
space, as opposed to squeezing out unnecessary strings of multiple ones 
to just one blank.


And I'm running into something that I know I had not previously gotten 
as far as (which is encouraging), and I'm writing now for counsel.


What to do about input files that are encoded in some alien encoding, 
like Japanese 2202?  The Pod::Simple docs say it translates the pod into 
perl's internal representation.  But should the extracted pod also be in 
perl's representation, or should it be translated back to the original 
encoding?  The second way would be a way to really extract the pod 
portions of the original.


But I'm thinking it should be perl's, so that downstream modules can use 
it as-is.  But I'm open to other reasoned opinions.
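
The two choices can be illustrated with the core Encode module. This sketch assumes the input is ISO-2022-JP and uses a made-up three-character string; it says nothing about how JustPod itself handles encodings.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for the bytes read from an ISO-2022-JP encoded file.
my $text           = "\x{65E5}\x{672C}\x{8A9E}";
my $original_bytes = encode('iso-2022-jp', $text);

# Choice 1: hand downstream code Perl's internal characters.
my $chars = decode('iso-2022-jp', $original_bytes);

# Choice 2: translate back, so the output matches the input file's bytes.
my $round_trip = encode('iso-2022-jp', $chars);

# A middle road: emit UTF-8 regardless of the input encoding.
my $utf8 = encode('UTF-8', $chars);

printf "characters: %d, original bytes: %d, UTF-8 bytes: %d\n",
    length $chars, length $original_bytes, length $utf8;
```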


Re: Pod::Simple output as POD

2016-05-13 Thread Karl Williamson

On 05/11/2016 07:38 PM, John SJ Anderson wrote:



On May 11, 2016, at 17:52, Ron Savage  wrote:
On 12/05/16 10:39, David E. Wheeler wrote:


Which also seems a little weird. Maybe Pod::Simple::PodFormat?


Pod::Simple::ExtractPod is good, but another possibility is Pod::Simple::JustPod.


With only a _tiny_ bit of my tongue in my cheek, I’ll throw out 
Pod::Simple::PlainOldPOD ...

8^)

j.



I'm leaning towards Pod::Simple::JustPod.  I think that captures the 
essence, and seems to me to fit the paradigm of the output format.


If we wanted to be cute, we could call it Pod::Simple::SimplyPod, with 
you know, only one, natural, ingredient, and no harmful additives.


Re: Pod::Simple issues

2016-04-29 Thread Karl Williamson

On 04/29/2016 01:58 PM, Shawn H Corey wrote:

On Fri, 29 Apr 2016 13:34:21 -0600
Karl Williamson <pub...@khwilliamson.com> wrote:


Nested L<> are illegal.  Pretending inner one is X<> so can
continue looking for other errors.


That would be Z<>




That would generate an additional warning that it wasn't empty.  The 
mechanism is to divert the incoming text into the X<> so it doesn't do 
anything bad.  I suppose we could set a flag for the Z<> to suppress the 
warning.


Re: AW: Working on CPAN Testers fails for Pod::Simple::Search

2016-04-29 Thread Karl Williamson

On 04/24/2016 11:34 PM, Marek Rouchal wrote:

Does this mean that there is a "find"-like function in Pod::Simple that
replaces Pod::Find? That would be an opportunity to discontinue
Pod::Find along with Pod::Parser...

-Marek





Looking at the man page, it looks like Pod::Simple::Search performs a 
function similar to that of Pod::Find.
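
For example, locating the pod for a module by name, roughly the job that pod_where does in Pod::Find, might look like this with Pod::Simple::Search. It is a sketch based on the documented find method; the module name looked up is just an example.

```perl
use strict;
use warnings;
use Pod::Simple::Search;

# find() takes a module, pod page or script name and returns the path
# of the file holding its pod, searching @INC by default.
my $search = Pod::Simple::Search->new;
my $path   = $search->find('Pod::Simple');

print defined $path ? "Found: $path\n" : "Not found\n";
```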




Re: AW: pod2usage and Pod::Find, Pod::PlainText

2016-04-25 Thread Karl Williamson

On 04/23/2016 03:26 PM, Marek Rouchal wrote:

Thanks for the hint... two thoughts, feedback welcome:
1. Pod::PlainText used to be part of the core... but since now Pod::Usage 
depends on Pod::Simple, I think the tests should be restructured to use that, 
or as a last resort, Pod::Text

Is there any real difference between PlainText and Pod::Simple::Text ?


2. Pod::Find might deserve a separate distribution, but again the test of 
Pod::Usage should not depend on it.
Hope to find some time to get that done in the next days...

-Marek

-Original Message-
It has been a goal to remove Pod::Parser from the core perl distribution.

It turns out there is a dependency in 2 test files for pod2usage upon Pod::Find 
and Pod::PlainText, which are parts of Pod::Parser.

The test files are Pod-Usage/t/pod/pod2usage.t
 and Pod-Usage/t/pod/pod2usage2.t

Note that Pod::Usage itself doesn't depend on Pod::Parser, just two test files 
do.  I don't understand this part of perl at all.  So I'm wondering what to do 
about this.  Could the tests just be deleted?  Is there a current alternative 
to the functionality of these modules?








pod2usage and Pod::Find, Pod::PlainText

2016-04-23 Thread Karl Williamson

It has been a goal to remove Pod::Parser from the core perl distribution.

It turns out there is a dependency in 2 test files for pod2usage upon 
Pod::Find and Pod::PlainText, which are parts of Pod::Parser.


The test files are Pod-Usage/t/pod/pod2usage.t
   and Pod-Usage/t/pod/pod2usage2.t

Note that Pod::Usage itself doesn't depend on Pod::Parser, just two test 
files do.  I don't understand this part of perl at all.  So I'm 
wondering what to do about this.  Could the tests just be deleted?  Is 
there a current alternative to the functionality of these modules?


Thanks


Re: Assume CP1252

2015-01-13 Thread Karl Williamson

On 01/12/2015 01:27 PM, Karl Williamson wrote:

On 01/12/2015 12:49 PM, David E. Wheeler wrote:

On Jan 12, 2015, at 11:46 AM, Karl Williamson
pub...@khwilliamson.com wrote:


I ran across this link, but didn't see what action was taken on it:
http://www.w3.org/TR/newline


Pardon my ignorance. Does that mean that `s/Latin-1/CP1252/g` could be
a mistake on EBCDIC?

David



Yes, that's essentially what I meant when I said in an earlier email
that NEL is THE new-line character on os390, which generally runs using
EBCDIC.  The code point for NEL in cp1252 is a horizontal ellipsis, and
not a next line, but on some platforms, like os390, it means next
line.   This is a conflict.

However, now that I think about it, when I look at os390 runs, I rarely
see NELs.  Maybe there is a filter that translates them to \n before the
pod sees it, but sometimes, I do see NEL all over the place but no \n.
I'll ask on the perl-mvs list about this.



tl;dr:  I was wrong to think there was a problem in s/latin1/cp1252/ for 
EBCDIC.


In researching the issue in order to create an intelligent posting, I 
found the answer.


It is an undocumented subtlety with Perl's EBCDIC implementation, that I 
was surprised I didn't know, as I've been pretty deep into that 
implementation.


And it's interesting (at least to me), so I'll document it here (as well 
as make corrections to perlebcdic.pod).


As many of you know, ASCII has both CR and LF characters that are used 
variously as line termination characters.  Old Apple used CR, and 
Windows uses the combination CR-LF.  Perl handled the Apple issue by 
swapping the meanings of \r and \n there; it handles CR-LF by having an 
I/O layer that makes CR-LF appear as a single \n internally so the 
gotchas are hidden from most applications.


In addition, Unicode defines the NEL (next line) character, which is 
another alternative line terminator.  Its code point is the one that 
CP1252 uses instead to mean a horizontal ellipsis.


It turns out that NEL is the character that os390 uses as its line 
terminator, not CR nor LF.  It is called NL in EBCDIC.  (NL is 
unfortunately a synonym for LF in ASCII and Unicode terminology.)


What Perl does to handle this is to simply swap the NEL and LF code 
points.  That makes \n mean NEL instead of LF.  Apparently LF is unused 
in EBCDIC applications, so it works.  There is official support for this 
swap, as Unicode's definition of how to get UTF-8 to work on EBCDIC 
platforms says to do the swap.


It does mean that NL doesn't mean the character that a native EBCDIC 
speaker would think.


But the bottom line is that because of this character swapping, the NEL 
characters in EBCDIC appear as \n, so aren't a problem for CP1252.


Re: Assume CP1252

2015-01-12 Thread Karl Williamson

On 01/12/2015 06:25 AM, Shawn H Corey wrote:

On Sun, 11 Jan 2015 20:57:26 -0700
Karl Williamson pub...@khwilliamson.com wrote:


To be clear, I think that assuming 1252 when there is no =encoding
line is a good idea.  But I'm leery of overriding an actual =encoding
line.


Agreed.


I could possibly be persuaded, if someone wants to make it, by the 
argument that 'latin1' is kind of colloquial, and someone using it may 
very well not be familiar with the possibility that they really mean 
cp1252.  But, if so, there needs to be a way for someone to say I 
really mean it and not be overridden by us.  Perhaps that could be
=encoding ISO-8859-1.



Q: What if there is more than one =encoding line? Does it switch
encoding part way thru a POD?




Error while formatting with Pod::Perldoc::ToMan:
 Nested processed encoding. at 
/usr/share/perl/5.18/Pod/Simple/BlackBox.pm line 380.




Re: Assume CP1252

2015-01-12 Thread Karl Williamson

On 01/12/2015 12:37 PM, David E. Wheeler wrote:

On Jan 12, 2015, at 11:18 AM, Karl Williamson pub...@khwilliamson.com wrote:


To be clear, I think that assuming 1252 when there is no =encoding
line is a good idea.  But I'm leery of overriding an actual =encoding
line.


Agreed.


I’m okay with this.


I could possibly be persuaded, if someone wants to make it, by the argument that 'latin1' 
is kind of colloquial, and someone using it may very well not be familiar with the 
possibility that they really mean cp1252.  But, if so, there needs to be a way for 
someone to say I really mean it and not be overridden by us.  Perhaps
that could be =encoding ISO-8859-1.


If we *were* to assume CP1252 for Latin-1, I would want it to be consistent 
with the precedent set by the W3C.


That sounds reasonable.


 Sean supplied this link:


   http://www.w3.org/TR/encoding/#names-and-labels

Here’s the list of labels that they translate to Windows-1252:


ansi_x3.4-1968
ascii
cp1252
cp819
csisolatin1
ibm819
iso-8859-1
iso-ir-100
iso8859-1
iso88591
iso_8859-1
iso_8859-1:1987
l1
latin1
us-ascii
windows-1252
x-cp1252

In their interpretation, no label ever resolves to iso-8859-1. Pretty 
interesting.
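
If that precedent were followed, the mapping amounts to a lookup over the labels listed above. A sketch only: effective_encoding is a hypothetical helper, not anything in Pod::Simple, and labels outside the list are passed through unchanged.

```perl
use strict;
use warnings;

# The labels from the W3C list quoted above, all treated as Windows-1252.
my %W1252_LABEL = map { $_ => 1 } qw(
    ansi_x3.4-1968 ascii cp1252 cp819 csisolatin1 ibm819 iso-8859-1
    iso-ir-100 iso8859-1 iso88591 iso_8859-1 iso_8859-1:1987 l1 latin1
    us-ascii windows-1252 x-cp1252
);

# Hypothetical helper: map an =encoding label to the encoding to decode with.
sub effective_encoding {
    my ($label) = @_;
    (my $key = lc $label) =~ s/\A\s+|\s+\z//g;    # case-fold and trim
    return $W1252_LABEL{$key} ? 'cp1252' : $label;
}

print "$_ => ", effective_encoding($_), "\n" for qw(Latin1 ISO-8859-1 utf8);
```

Note that under this table an explicit ISO-8859-1 label is also redirected, so it could not double as an "I really mean it" override.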


I ran across this link, but didn't see what action was taken on it:
http://www.w3.org/TR/newline






Q: What if there is more than one =encoding line? Does it switch
encoding part way thru a POD?




Error while formatting with Pod::Perldoc::ToMan:
Nested processed encoding. at /usr/share/perl/5.18/Pod/Simple/BlackBox.pm line 
380.


I recently changed this error, because that was a pretty useless message. The new message 
is "Cannot have multiple =encoding directives". Also, it is no longer fatal, 
but is passed to scream(), which means it would be a failure for Test::Pod, but won’t 
break tools that generate docs.

   http://github.com/theory/pod-simple/commit/cb884b5

Best,

David





Re: Assume CP1252

2015-01-12 Thread Karl Williamson

On 01/12/2015 12:49 PM, David E. Wheeler wrote:

On Jan 12, 2015, at 11:46 AM, Karl Williamson pub...@khwilliamson.com wrote:


I ran across this link, but didn't see what action was taken on it:
http://www.w3.org/TR/newline


Pardon my ignorance. Does that mean that `s/Latin-1/CP1252/g` could be a 
mistake on EBCDIC?

David



Yes, that's essentially what I meant when I said in an earlier email 
that NEL is THE new-line character on os390, which generally runs using 
EBCDIC.  The code point for NEL in cp1252 is a horizontal ellipsis, and 
not a next line, but on some platforms, like os390, it means next 
line.   This is a conflict.


However, now that I think about it, when I look at os390 runs, I rarely 
see NELs.  Maybe there is a filter that translates them to \n before the 
pod sees it, but sometimes, I do see NEL all over the place but no \n. 
I'll ask on the perl-mvs list about this.


Re: Assume CP1252

2015-01-11 Thread Karl Williamson

On 01/10/2015 11:35 PM, David E. Wheeler wrote:

On Jan 10, 2015, at 5:48 PM, Sean Burke sbu...@cpan.org wrote:


Helleu, Pod pals!
Short version about Re: Assume CP1252-- I advise: yes, assume CP1252 where 
technically you were expecting Latin-1.


Thanks for chiming in, Sean.


I agree completely, go for it!

Yes:
* assume that input is CP1252 in the absence of any encoding being declared
* assume that input is CP1252 if the declared encoding is Latin-1

As far as I know, that amicable bait-and-switch (i.e., construing Latin-1 to 
actually mean the superset CP1252) means in practice that everybody wins, and 
nobody loses, and DWIM prevails yet again.


Right, I vaguely remember you telling me this before. I forgot about #2 (and 
the HTML 5 precedent).


I think I oppose overruling someone's =encoding line.  The reason that 
1252 is effectively a superset of latin1 is because it reuses the C1 
controls to mean something else, and we don't expect those controls to 
actually appear in a pod document.  That is quite likely, except for 
one, NEL, U+85, which is the usual line separator on some platforms, 
notably os390 (that code point is the horizontal ellipsis in 1252).


It strikes me as wrong anyway to say we know better than the coder. 
There needs to be a way for a coder to specify the coding and not have 
that specification ignored by us.  We do not have the foresight to know 
the possible circumstances where Latin1 is the correct value and 1252 is 
not.  We could be wrong, and we should provide an easy workaround for 
our wrongness.  The most straightforward approach, and the one that will 
lead to the least resentment against us when we are wrong, is simply 
not to second-guess what the coder has said.


os390 is proof that there is at least one platform that Perl runs on 
where 1252 is not a superset of Latin1.  There could be special casing 
for that platform.  But if we're wrong there, we could be wrong 
elsewhere.  It just seems a bad idea to think we know better than the 
coder.




Re: Assume CP1252

2015-01-11 Thread Karl Williamson

On 01/11/2015 11:01 AM, Karl Williamson wrote:

On 01/10/2015 11:35 PM, David E. Wheeler wrote:

On Jan 10, 2015, at 5:48 PM, Sean Burke sbu...@cpan.org wrote:


Helleu, Pod pals!
Short version about Re: Assume CP1252-- I advise: yes, assume
CP1252 where technically you were expecting Latin-1.


Thanks for chiming in, Sean.


I agree completely, go for it!

Yes:
* assume that input is CP1252 in the absence of any encoding being
declared
* assume that input is CP1252 if the declared encoding is Latin-1

As far as I know, that amicable bait-and-switch (i.e., construing
Latin-1 to actually mean the superset CP1252) means in practice that
everybody wins, and nobody loses, and DWIM prevails yet again.


Right, I vaguely remember you telling me this before. I forgot about
#2 (and the HTML 5 precedent).


I think I oppose overruling someone's =encoding line.  The reason that
1252 is effectively a superset of latin1 is because it reuses the C1
controls to mean something else, and we don't expect those controls to
actually appear in a pod document.  That is quite likely, except for
one, NEL, U+85, which is the usual line separator on some platforms,
notably os390 (that code point is the horizontal ellipsis in 1252).

It strikes me as wrong anyway to say we know better than the coder.
There needs to be a way for a coder to specify the coding and not have
that specification ignored by us.  We do not have the foresight to know
the possible circumstances where Latin1 is the correct value and 1252 is
not.  We could be wrong, and we should provide an easy workaround for
our wrongness.  The most straightforward approach, and the one that will
lead to the least resentment against us when we are wrong, is simply
not to second-guess what the coder has said.

os390 is proof that there is at least one platform that Perl runs on
where 1252 is not a superset of Latin1.  There could be special casing
for that platform.  But if we're wrong there, we could be wrong
elsewhere.  It just seems a bad idea to think we know better than the
coder.



To be clear, I think that assuming 1252 when there is no =encoding line 
is a good idea.  But I'm leery of overriding an actual =encoding line.




Re: Pod::Simple can treat binary as pod due to liberal/inconsistent regexp patterns

2015-01-08 Thread Karl Williamson

On 01/08/2015 11:17 AM, Randy Stauner wrote:

 IIRC the first liberal rx is to detect start of POD just like the Perl 
(language) parser does, i.e. it pauses parsing for instructions until the next =cut

Oh. Can someone dig into the Perl parser and confirm this?

 I think POD parsers should do the same.

My suspicion is that, even if that’s true, the Parser ignores
everything in a __DATA__ or __END__ block.


Here is an example I worked up when writing tests for metacpan:
Everything after __DATA__ is data, but the pod parser will also find pod
if it's there
https://gist.github.com/rwstauner/98f97e6cd64c972d9b71



I don't understand the parser very well, but if someone wants a crack at 
it, here is the only portion of it that sets the state to being in pod.  The 
context is that the first character on the line is an =, and tmp holds 
the character that follows that =.  I think 's' points to the input 
starting at tmp, so that tmp == *s:


    if (PL_expect == XSTATE && isALPHA(tmp) &&
        (s == PL_linestart+1 || s[-2] == '\n') )
    {
        if ((PL_in_eval && !PL_rsfp && !PL_parser->filtered)
            || PL_lex_state != LEX_NORMAL) {
            d = PL_bufend;
            while (s < d) {
                if (*s++ == '\n') {
                    incline(s);
                    if (strnEQ(s,"=cut",4)) {
                        s = strchr(s,'\n');
                        if (s)
                            s++;
                        else
                            s = d;
                        incline(s);
                        goto retry;
                    }
                }
            }
            goto retry;
        }
        s = PL_bufend;
        PL_parser->in_pod = 1;
        goto retry;
    }



Re: Pod::Simple can treat binary as pod due to liberal/inconsistent regexp patterns

2015-01-07 Thread Karl Williamson

On 01/06/2015 07:55 AM, Randy Stauner wrote:

This came up in discussing a metacpan bug
(https://github.com/CPAN-API/cpan-api/issues/364#issuecomment-66864855)...

A perl module can technically have perl code, pod, and even spans of
binary (in a data token, or maybe even a here doc).

To my surprise, the pod parser matched a line like =F\0 in the binary
blob and began treating the document as pod.

The matching is inconsistent though:
A very liberal regexp matched the binary and triggered the start of the
document:

if($line =~ m/^=([a-zA-Z]+)/s) {

https://github.com/theory/pod-simple/blob/b72a3a74bd7ba1a27ba397923f913a12f053e906/lib/Pod/Simple/BlackBox.pm#L158

Later on the line is re-processed to see what kind of pod it is and no
longer matches the more strict regexp:

if($line =~ m/^(=[a-zA-Z][a-zA-Z0-9]*)(?:\s+|$)(.*)/s) {

https://github.com/theory/pod-simple/blob/b72a3a74bd7ba1a27ba397923f913a12f053e906/lib/Pod/Simple/BlackBox.pm#L243

So in a document that had no pod, the pod parser returned a bunch of
binary blobs
because it matched a very loose regexp, started the document, and then
found no actual pod (so basically everything afterwards is treated as a
pod paragraph).

I asked David about the inconsistency and he asked that I bring it up here.

Shouldn't the more strict regexp be used in both places?


I think so.  Looking at the regexes though, I didn't know that 
directives could be capitals, and I thought that digits had to always be 
the last character (or characters ?) in a directive.  It seems to me 
that both regexes should be tightened.
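
The mismatch between the two patterns quoted above can be reproduced in isolation. The sample lines here are made up; the regexes are copied from the message.

```perl
use strict;
use warnings;

# The two patterns from BlackBox.pm as quoted above.
my $loose  = qr/^=([a-zA-Z]+)/s;
my $strict = qr/^(=[a-zA-Z][a-zA-Z0-9]*)(?:\s+|$)(.*)/s;

my %sample = (
    'directive' => "=head1 NAME\n",
    'binary'    => "=F\0\x01\x02\n",
);

my %result;
for my $name (sort keys %sample) {
    my $line = $sample{$name};
    $result{$name} = {
        loose  => ($line =~ $loose  ? 1 : 0),
        strict => ($line =~ $strict ? 1 : 0),
    };
    printf "%-9s loose=%d strict=%d\n",
        $name, $result{$name}{loose}, $result{$name}{strict};
}
```

The real directive matches both patterns, while the binary line matches only the loose one, which is the combination that starts a pod document and then finds no directive in it.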




On the first pass the parser marks the line as pod (presumably matching
a directive)
but on the second pass the line doesn't match any patterns and it all
falls through as a paragraph.

This inconsistency allows binary data to be treated as a pod document.
Is there a recommended way to parse the pod out of a document that might
have binary data in it?


I don't know about this.



Re: Existing tools for extracting POD from a source file?

2014-01-13 Thread Karl Williamson

On 11/28/2013 12:20 AM, Kent Fredric wrote:

We've had a lot of problems lately with the fact Parrot uses Perldoc
to simply extract pod statements from a generated file and emit them
into another file.

This stems mostly from the fact perldoc over-zealously drops privs.

But the gist of it is: parrot calls  `perldoc -ud target source`, and
we're having a nightmare because neither source nor target can be read
when UID=nobody, and even chmodding the relevant files for some reason
doesn't help ( mostly, because the directories themselves are not
readable or writeable by UID=nobody, meaning we'd have to chmod the
entire parentage of the directory to just avoid perldocs attempt at
making things secure ).

So, is there a simple tool already on CPAN that will extract POD
segments from arbitrary files and spew them into other files without
requiring priv dropping?

It seems trivial to write one, but if something already exists that'd
be helpful.

Presently, I'm just looking to patch parrot during build to use the
new tool instead of perldoc.



You may have missed this

https://github.com/genehack/pod-simple/tree/add-pod-simple-pod

which is incomplete.  What has parrot done in the meantime?
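
For reference, the "trivial to write" extractor mentioned above could look roughly like the sketch below. It is not the code on that branch and not an existing CPAN tool: extract_pod is a hypothetical name, and the sketch assumes that any line starting with a directive opens pod and that =cut closes it, ignoring the finer points of perlpodspec (blank-line rules, here-docs, and so on). It never changes user or group IDs.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every pod block, from its first directive
# through the closing =cut, from $in_fh to $out_fh.
sub extract_pod {
    my ($in_fh, $out_fh) = @_;
    my $in_pod = 0;
    while (my $line = <$in_fh>) {
        if ($line =~ /^=([a-zA-Z][a-zA-Z0-9]*)(?:\s|$)/) {
            if ($1 eq 'cut') {
                print {$out_fh} $line if $in_pod;   # keep the closing =cut
                $in_pod = 0;
                next;
            }
            $in_pod = 1;    # any other directive opens or continues pod
        }
        print {$out_fh} $line if $in_pod;
    }
    return;
}

# Usage with in-memory filehandles; real code would open files instead.
my $source = "print 1;\n\n=head1 NAME\n\nDemo\n\n=cut\n\nprint 2;\n";
my $pod    = '';
open my $in,  '<', \$source or die "cannot read source: $!";
open my $out, '>', \$pod    or die "cannot write pod: $!";
extract_pod($in, $out);
close $out;
print $pod;
```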


Re: Tables in PODs

2013-09-28 Thread Karl Williamson

On 09/20/2013 08:43 AM, Nicholas Clark wrote:

On Fri, Sep 20, 2013 at 10:30:13AM -0400, Shawn H Corey wrote:

Is there any specification for tables in PODs? I haven't been able to
find any. Isn't it about time tables were added? I have attached a
specification for them for your review.


There is a perl 6 specification for tables. See

https://raw.github.com/perl6/specs/master/S26-documentation.pod

and an implementation for Perl 5, at least for HTML:

https://metacpan.org/module/Perl6::Pod

However, the big question I don't know the answer to is if anyone has
implemented code to output tables to man pages. man can do tables (very
nicely), but I know approximately zero roff, so I don't know how easy it
is (or isn't)


I do know roff.  Looking briefly at the spec, it looks reasonably easy 
to do.


Or how good plain text output for tables is.

Nicholas Clark





Re: Is Pod::Simple::POD worth pursuing?

2013-05-22 Thread Karl Williamson

On 05/21/2013 08:16 PM, Ricardo Signes wrote:

* John SJ Anderson geneh...@genehack.org [2013-05-21T19:33:14]

* Is this a worthwhile idea? (The recent "How do I get Pod::Simple to
extract pod" thread suggests the answer is yes.)


It's hard to judge this without the context in which you're considering it.
The GH issue to which you linked is largely context-free.

That said, wanting the ability to say "gimme just the Pod from this Pod
document" seems pretty reasonable.  Your code looks nice and simple.  I'd
rename it from POD to Pod so it's easier to remember.



We cannot remove Pod::Parser from the core until podcheck.t stops using 
it.  The version of podcheck.t that works with Marc Green's Pod::Checker 
uses Pod::Parser for only one remaining purpose: to extract the pod from 
a file.


In order to remove this dependency, one of these things will have to happen:

1) We put John's new code in core, and podcheck.t uses it.
2) I steal John's code and put it in podcheck.t
3) I reimplement what John's code does, for podcheck.t

I have been trying to avoid #3.


Re: How do I get Pod::Simple to extract pod from its containing file?

2013-03-06 Thread Karl Williamson

On 03/06/2013 06:15 AM, Nicholas Clark wrote:

On Tue, Jan 29, 2013 at 11:52:43AM -0700, Karl Williamson wrote:

On 01/26/2013 08:37 PM, Karl Williamson wrote:

On 01/26/2013 07:44 PM, Russ Allbery wrote:

Karl Williamson pub...@khwilliamson.com writes:

On 01/26/2013 02:23 PM, Russ Allbery wrote:

Karl Williamson pub...@khwilliamson.com writes:



With Pod::Parser, you just do
 parse_from_file($in_fh, $out_fh)



and it outputs the pod to $out_fh.  Pod::Simple has a method of the same
name which is supposed to emulate the Pod::Parser method, but when I run
it, nothing is output.



Oh!  I misunderstood, sorry.  parse_from_file() does indeed invoke the
parser with the appropriate actions, but what you meant was that
Pod::Parser's null parser, when not subclassed, just printed the POD
back out again, so you could use it as a way to extract the POD from a
file.  I believe Pod::Simple's null parser does nothing at all, so you
get an empty file.

Yes.  I think that's an incompatibility.



If I turn on DEBUGing, it's doing a lot.  Is there some trivial way to
extract the pod?



So no one thinks there is a trivial way to get this extraction.  Would
someone make a suggestion as to the easiest way to do so using modules
that will continue to ship with the Perl 5 core (unlike Pod::Parser)?


No-one answered this, did they?


No


Is it possible to extract the pod by subclassing Pod::Simple, and the
subclass being a null parser that prints out the Pod that it was given?

Nicholas Clark



Yes, but I was hoping there was something that is less work, or already 
done, or done for some other purpose, such as Pod::Checker, and I can 
re-use the relevant parts.


Re: Should Pod::Simple test for repeated elements in a dl

2013-02-13 Thread Karl Williamson

On 01/27/2013 02:47 PM, David E. Wheeler wrote:

On Jan 26, 2013, at 12:03 PM, Karl Williamson pub...@khwilliamson.com wrote:


I do not propose warning for something like the above.  The warning would only 
be for repeated uses of the exact same item name, like

=item -

=item -

=item -


Oh, okay. Works for me.

David




Shall I create a ticket for this?
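
The proposed warning can be sketched as below. This is an illustration only, not Pod::Simple's implementation: repeated_items is a hypothetical helper that works on raw pod text, and it assumes that "*" and plain numbers are list markers rather than item names, so they are never reported.

```perl
use strict;
use warnings;

# Hypothetical checker: report =item names that occur more than once
# within the same =over/=back list.
sub repeated_items {
    my ($pod) = @_;
    my @stack;       # one hash of seen item names per open =over
    my @warnings;
    my $line_no = 0;
    for my $line (split /\n/, $pod) {
        $line_no++;
        if ($line =~ /^=over\b/) {
            push @stack, {};
        }
        elsif ($line =~ /^=back\b/) {
            pop @stack;
        }
        elsif (@stack && $line =~ /^=item\s+(.*?)\s*$/) {
            my $name = $1;
            # Bullets and numbers are markers, not names (assumption).
            next if $name eq '*' || $name =~ /^\d+\.?$/;
            push @warnings, "line $line_no: repeated item '$name'"
                if $stack[-1]{$name}++;
        }
    }
    return @warnings;
}

my @found = repeated_items("=over\n\n=item -\n\n=item -\n\n=item -\n\n=back\n");
print "$_\n" for @found;
```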


Re: How do I get Pod::Simple to extract pod from its containing file?

2013-01-29 Thread Karl Williamson

On 01/26/2013 08:37 PM, Karl Williamson wrote:

On 01/26/2013 07:44 PM, Russ Allbery wrote:

Karl Williamson pub...@khwilliamson.com writes:

On 01/26/2013 02:23 PM, Russ Allbery wrote:

Karl Williamson pub...@khwilliamson.com writes:



With Pod::Parser, you just do
parse_from_file($in_fh, $out_fh)



and it outputs the pod to $out_fh.  Pod::Simple has a method of the same
name which is supposed to emulate the Pod::Parser method, but when I run
it, nothing is output.



Did you flush $out_fh?  Pod::Parser did that but Pod::Simple doesn't, so
it's possible if you're doing something short that the entire output was
still buffered.



Pod::Man and Pod::Text use this Pod::Simple method and do have test suites
for it and it seems to work.



See the attached program.  The resultant file is 0 length.


Oh!  I misunderstood, sorry.  parse_from_file() does indeed invoke the
parser with the appropriate actions, but what you meant was that
Pod::Parser's null parser, when not subclassed, just printed the POD
back out again, so you could use it as a way to extract the POD from a
file.  I believe Pod::Simple's null parser does nothing at all, so you
get an empty file.

Yes.  I think that's an incompatibility.



If I turn on DEBUGing, it's doing a lot.  Is there some trivial way to
extract the pod?



So no one thinks there is a trivial way to get this extraction.  Would 
someone make a suggestion as to the easiest way to do so using modules 
that will continue to ship with the Perl 5 core (unlike Pod::Parser)?


How do I get Pod::Simple to extract pod from its containing file?

2013-01-26 Thread Karl Williamson

With Pod::Parser, you just do
parse_from_file($in_fh, $out_fh)

and it outputs the pod to $out_fh.  Pod::Simple has a method of the same 
name which is supposed to emulate the Pod::Parser method, but when I run 
it, nothing is output.


Re: How do I get Pod::Simple to extract pod from its containing file?

2013-01-26 Thread Karl Williamson

On 01/26/2013 02:23 PM, Russ Allbery wrote:

Karl Williamson pub...@khwilliamson.com writes:


With Pod::Parser, you just do
parse_from_file($in_fh, $out_fh)



and it outputs the pod to $out_fh.  Pod::Simple has a method of the same
name which is supposed to emulate the Pod::Parser method, but when I run
it, nothing is output.


Did you flush $out_fh?  Pod::Parser did that but Pod::Simple doesn't, so
it's possible if you're doing something short that the entire output was
still buffered.

Pod::Man and Pod::Text use this Pod::Simple method and do have test suites
for it and it seems to work.



See the attached program.  The resultant file is 0 length.


simple.pl
Description: Perl program


Re: How do I get Pod::Simple to extract pod from its containing file?

2013-01-26 Thread Karl Williamson

On 01/26/2013 07:44 PM, Russ Allbery wrote:

Karl Williamson pub...@khwilliamson.com writes:

On 01/26/2013 02:23 PM, Russ Allbery wrote:

Karl Williamson pub...@khwilliamson.com writes:



With Pod::Parser, you just do
parse_from_file($in_fh, $out_fh)



and it outputs the pod to $out_fh.  Pod::Simple has a method of the same
name which is supposed to emulate the Pod::Parser method, but when I run
it, nothing is output.



Did you flush $out_fh?  Pod::Parser did that but Pod::Simple doesn't, so
it's possible if you're doing something short that the entire output was
still buffered.



Pod::Man and Pod::Text use this Pod::Simple method and do have test suites
for it and it seems to work.



See the attached program.  The resultant file is 0 length.


Oh!  I misunderstood, sorry.  parse_from_file() does indeed invoke the
parser with the appropriate actions, but what you meant was that
Pod::Parser's null parser, when not subclassed, just printed the POD
back out again, so you could use it as a way to extract the POD from a
file.  I believe Pod::Simple's null parser does nothing at all, so you
get an empty file.

Yes.  I think that's an incompatibility.



If I turn on DEBUGing, it's doing a lot.  Is there some trivial way to 
extract the pod?


Re: Pod::Simple doesn't warn when the text of a definition =item matches /[\*\d]/; and a fix to this bug

2013-01-18 Thread Karl Williamson

On 01/18/2013 10:33 AM, David E. Wheeler wrote:

On Jan 18, 2013, at 7:31 AM, Ricardo Signes perl@rjbs.manxome.org wrote:


I hope we can agree that if the test results bit gets sorted out, we're in
favor of this warning..?


I have submitted https://github.com/theory/pod-simple/pull/44


+1. Will merge it if there are no objections.


+1



Re: Non-ASCII data in POD

2012-04-26 Thread Karl Williamson

On 04/25/2012 09:25 PM, Russ Allbery wrote:

Grant McLean gr...@mclean.net.nz writes:


My thoughts on the second issue are that we could modify Pod::Simple to
'whine' if it sees non-ASCII bytes but no =encoding.  This in turn would
cause Test::Pod to pick up the error and help people fix it.


I would be in favor of that.



FYI, this test is already among the checks that are run on the pods that 
are included with the Perl core.
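
The check itself is small. Below is a minimal sketch of the idea; it is not the code from Pod::Simple, Test::Pod or the core checks. needs_encoding is a hypothetical helper, and it assumes its argument is the pod portion of a file as a byte string.

```perl
use strict;
use warnings;

# Hypothetical check: true if $pod contains a non-ASCII byte but
# declares no =encoding.
sub needs_encoding {
    my ($pod) = @_;
    return 0 if $pod =~ /^=encoding\s+\S/m;
    return $pod =~ /[^\x00-\x7F]/ ? 1 : 0;
}

print needs_encoding("=head1 NAME\n\ncaf\xC3\xA9\n")
    ? "should have =encoding\n"
    : "ok\n";
```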


Re: [rt.cpan.org #74389] Pod::Simple::Pullparser get_title should ignore X<...>

2012-03-02 Thread Karl Williamson

On 03/02/2012 12:34 AM, David E. Wheeler wrote:

On Jan 29, 2012, at 3:15 PM, David E. Wheeler wrote:


And "NAME" and not "NAME "

It should probably not just become an empty string, but it should collapse
whitespace around it, so pathological cases like:

=head1 NAME X<foo>  THIS X<bar>  TUNE
X<baz>

...should be "NAME THIS TUNE"

But in the simpler case, I think that "NAME" and not "NAME " is actually likely
to come up.


Okay, so if I follow this thread correctly, the upshot is that:

• Pod::Simple::HTML needs to be fixed so that it does not include the contents of X<>
• The parser overall should be adjusted to remove superfluous whitespace


FWIW, I could use confirmation on this.


What sort of confirmation do you need?  I agree with the two bulleted 
items, if you're looking for that kind of confirmation.


Meanwhile, here's a test case showing the original bug with PullParser:

diff --git a/t/pulltitl.t b/t/pulltitl.t
index 22934f5..a846048 100644
--- a/t/pulltitl.t
+++ b/t/pulltitl.t
@@ -7,7 +7,7 @@ BEGIN {

  use strict;
  use Test;
-BEGIN { plan tests => 116 };
+BEGIN { plan tests => 117 };

  #use Pod::Simple::Debug (5);

@@ -408,6 +408,13 @@ ok( $t && $t->type eq 'start' && $t->tagname, 'Document' );
  }

  ###
+print "# Testing a title with an X<>, at line ", __LINE__, "\n";
+my $p = Pod::Simple::PullParser->new;
+$p->set_source( \qq{\n=head1 NAME\nX<Some entry>\n} );
+
+ok $p->get_title(), 'NAME';
+
+###
  ###



That fails with:

not ok 116
# Test 116 got: "NAME Some entry" (t/pulltitl.t at line 415)
# Expected: "NAME"
#  t/pulltitl.t line 415 is: ok $p->get_title(), 'NAME';

So it seems as though X<> issues may be all over the place, eh?

Best,

David
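
The behaviour agreed on in this thread (drop the X<> entry, then collapse the whitespace left behind) can be sketched on a plain string as follows. This is not how PullParser's get_title works internally: title_without_index is a hypothetical helper, and it assumes simple, non-nested X<...> codes with single angle brackets.

```perl
use strict;
use warnings;

# Hypothetical helper: remove X<...> index entries from a title string
# and tidy the whitespace around them.
sub title_without_index {
    my ($title) = @_;
    $title =~ s/X<[^<>]*>//g;      # drop simple, non-nested X<...> codes
    $title =~ s/\s+/ /g;           # collapse runs of whitespace
    $title =~ s/^\s+|\s+$//g;      # trim both ends
    return $title;
}

print title_without_index("NAME X<foo>  THIS X<bar>  TUNE\nX<baz>"), "\n";
print title_without_index("NAME\nX<Some entry>"), "\n";
```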






Re: On verbatim paragraphs immediately following an =item command; and Pod::Simple

2012-02-19 Thread Karl Williamson

On 02/18/2012 11:26 PM, Russ Allbery wrote:

Marc Green pongu...@gmail.com writes:


A simple (yet lengthy) question on verbatim paragraphs that immediately
follow an =item command, and how Pod::Simple treats them:



Given the following POD,



=over



=item *



 verbatim code snippet



=back



Should the =item be considered empty, followed by a verbatim paragraph,
or should the =item's contents include the verbatim paragraph?


Semantically, I think the =item's contents should include the verbatim
paragraph.  Otherwise, there's no way to have verbatim information inside
an item, such as an example of usage of a method that's documented via
=over/=item/=back.


+1




What brought this technicality to my attention is that Pod::Simple
interprets it as the former (emphasis added):



$ perl -MPod::Simple::DumpAsText -E'exit Pod::Simple::DumpAsText->filter(shift)->any_errata_seen' test_verb.pod
++Document
 \ "start_line" => "1"
  ++over-bullet
   \ "indent" => "4"
   \ "start_line" => "1"
    *++item-bullet
     \ "start_line" => "3"
    --item-bullet
    ++VerbatimFormatted
     \ "xml:space" => "preserve"
     \ "start_line" => "5"
      * " verbatim code snippet"
    --VerbatimFormatted*
  --over-bullet
--Document


It looks like I work around this in Pod::Man and Pod::Text by accident,
and in a way that might actually break if this were fixed.  Hrm.  I can't
tell for sure.  I *think* it would be okay anyway, but I'm not positive.
I'm not sure how the method calls would change if this behavior were
changed.

Basically, at present, I'm assuming that all the paragraphs are inside the
item until I see another item, whether they're passed in as the text of
the item or not.



Deprecation of alternate text in hyperlinks

2012-01-23 Thread Karl Williamson

Version 1.50 of Pod::Parser adds a check and message indicating that 
L<text|hyperlink> is deprecated.  This is based on the following 
sentences in perlpodspec, which have been there since its inception in 2001:


Authors wanting to link to a particular (absolute) URL, must do so
only with "L<scheme:...>" codes (like
L<http://www.perl.org>), and must not attempt "L<Some Site
Name|scheme:...>" codes.  This restriction avoids many problems
in parsing and rendering L<...> codes.

Elsewhere in the document, it says that the handler should handle these, 
as in the example


 L<Perl.org|http://www.perl.org/>

The new Pod::Parser has just been installed in blead, and about 10 pods 
run afoul of this new check, including things like


L<perl...@perl.org|mailto:perl...@perl.org>

My question is should there really be a message for this kind of use, 
and if so, should it extend to mailto: links?


Re: Deprecation of alternate text in hyperlinks

2012-01-23 Thread Karl Williamson

On 01/23/2012 10:05 AM, David E. Wheeler wrote:

On Jan 23, 2012, at 8:06 AM, Karl Williamson wrote:


The new Pod::Parser has just been installed in blead, and about 10 pods run 
afoul of this new check, including things like

L<perl...@perl.org|mailto:perl...@perl.org>

My question is should there really be a message for this kind of use, and if 
so, should it extend to mailto: links?


No. Support for L<name|scheme:...> was added to perlpodspec in 2009.
Pod::Simple has supported it for years; Test::Pod started officially allowing it in
v1.41, IIRC.

Relevant changes:

   https://github.com/theory/pod-simple/commit/1e61e819debf9c7c23907d7bb9e37855665fd595
   http://perl5.git.perl.org/perl.git/commit/f6e963e4dd62b8e3c01b31f4a4dd57e47e104997
   https://github.com/theory/test-pod/commit/ae6a44894eda4fd09fb412d837efe543628cd7d6

Discussion:

   http://code.activestate.com/lists/perl-pod-people/1393/

Best,

David




So, you're saying, I believe, that the text in perlpodspec that was the 
motivation for these changes should be removed, and that Pod::Parser 
should revert to its old behavior of not checking for this.


Is that so?
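
To see how many pods use the construct under discussion, a scan along the following lines would do. It is a sketch under stated assumptions, not the check in Pod::Parser or Test::Pod: url_links_with_text is a hypothetical helper, it handles only single-angle-bracket, non-nested L<> codes, and it uses the URL test m/\A\w+:[^:\s]\S*\z/, which is the heuristic perlpodspec gives for telling URL links from other links, as far as I recall.

```perl
use strict;
use warnings;

# Hypothetical helper: return [text, url] pairs for every
# L<text|scheme:...> code found in $pod.
sub url_links_with_text {
    my ($pod) = @_;
    my @found;
    while ($pod =~ /L<([^<>|]+)\|([^<>]+)>/g) {
        my ($text, $target) = ($1, $2);
        # "Pod::Simple" must not count as a URL: after the scheme's
        # colon the next character may not be another colon.
        push @found, [$text, $target]
            if $target =~ m/\A\w+:[^:\s]\S*\z/;
    }
    return @found;
}

my $sample = 'See L<Perl.org|http://www.perl.org/>, L<the docs|Pod::Simple>, '
           . 'L<http://www.perl.org> and L<me|mailto:someone@example.org>.';
printf "%s -> %s\n", @$_ for url_links_with_text($sample);
```

Only the first and last links in the sample are reported; the module link and the bare URL are left alone.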



Re: GSoC Status Update

2011-08-24 Thread Karl Williamson

On 08/23/2011 11:03 PM, Marc Green wrote:

Hello everyone,

This is my last status update, as GSoC ended today (Monday, at the time
of writing). [I am not able to send this out now because I do not have
access to my email (well, to the Internet). I am going to send it out
tomorrow or the day after.]

I believe my project has been successful, and I am positive my mentors
would agree. That being said, I am *still* not done with either
Pod::Html nor Pod::Checker. Well, I am done with all the coding --
they're both in their final stages in that regard, but they are yet to
be merged into core and tested (and complained about). I plan to see
them through though, and I plan on being the maintainer for both of them
if possible.

I was able to wrap up everything with Pod::Checker this week, so all I
need to do is to clean up its commit history (which will get done within
the next few days). Actually, that is a lie, I was not able to get
everything wrapped up. There is one thing in particular that I did not
get done: splitting Pod::Checker into Pod::Checker and
Pod::Checker::Internals, the latter being-a Pod::Simple and the former
having-a Pod::Checker::Internals. This split was conjured up in order to
remove Pod::Simple from Pod::Checker's interface. However, as I just
said, I was not able to do this. Instead, I added a warning in
Pod::Checker's documentation explaining that no user should EVER use any
aspect of Pod::Checker's interface that has to do with Pod::Simple
unless it is documented. Perhaps this is a task that can be tackled by
me or someone else in the future.

Oh, actually, my lie in the previous paragraph was a lie for another
reason too. My mentor, rjbs, offered to do some grunt work on
Pod::Checker for me so I could focus on the bigger stuff. These tasks
include removing the 5.14isms I used throughout and making Pod::Checker
its own dist (apart from Pod::Parser). These are not done yet, but he
and I are going to remain in contact so that the new Pod::Checker can
see the light of day.

Pod::Html shall also see the light of day. My second mentor, theory,
released a new version of Pod::Simple which contains some new code I
added to allow me to finish Pod::Html a while back. [At the time of
writing this it might not be released, but he assured me it would be in
the next few days, so I will take his word.] Now that it is released, I
can rebase its commit history to remove some hacks that let me use the
aforementioned new code. Pod::Html will then be merged into core and
hopefully be a success.

I feel this has been a wonderful experience for me; my foot is in the
Perl community door now, and I have gotten a taste of what it is like to
contribute to an open source project. It has also been a great
experience for the Perl community (I can imagine), as who doesn't like
seemingly free, beneficial code. (Seemingly beneficial, perhaps.)

Thank you everyone who helped me this summer, in particular rjbs,
theory, and rafl. I appreciate it!

Marc


Marc++


Re: Removal of specific Pod::Checker warnings

2011-08-12 Thread Karl Williamson

On 08/11/2011 12:54 PM, Ricardo Signes wrote:

* Marc Green pongu...@gmail.com [2011-08-11T06:40:17]


perlpodspec states "Pod processors must tolerate a bare "=item" as if it
were "=item *"." Is Pod::Checker's behavior still in line with perlpodspec?
Is the use of '=item' without any parameters deprecated? Or should that
warning be removed from Pod::Checker?


Pod::Checker's behavior isn't wrong, but its claims are.  It says:

   =item without any parameters is deprecated

No, it isn't.  Maybe somebody wishes that it was, but it isn't.  It sounds like
nobody thinks it needs to be.  I think it's fine for Pod::Checker to have
opinions of style, in some cases, but I don't think this makes any sense.  The
meaning of =item is well-documented.  I think the warning can and should go.


+1




Given that there is clearly a use for =itemless =over/=back blocks, should
it still be a warning? I think no, and instead, Pod::Checker should warn
about an empty =over/=back block, one that contains nothing but whitespace.


You've already heard my opinion on this one, but for everyone else:  I think
this warning is bogus.  =over/=back without =item is well-documented.  Some
formatters don't handle it correctly, but better to fix them than to suggest
that this is in any way problematic Pod.

If someone wants to come forward and tell us that, say, the four most-used Pod
formatters will actually *lose* these sections, that's a different matter.  But
that isn't my experience.



I agree that there shouldn't be a warning if there are things 
within the =over/=back that aren't =item's.  I'm not sure about the case 
where there is only white space.  I could be persuaded it is a useful 
warning, which Marc was originally going to implement; or I could be 
persuaded it is not worth warning about.  The Perl core has several cases where 
machine-generated pods have empty =over/=back sections.  These mean only 
that there was a potential section that the generating code wasn't smart 
enough to realize was empty here, and omit the surrounding pod directives.


Just FYI, I implemented several additional checks in the core's pod test 
program, podcheck.t, that I think may warrant being used everywhere. 
These are:

 Should have =encoding statement because have non-ASCII
 =encoding must be first command (if present)
 There is no NAME
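
The first of these is a plain byte test. The other two can be sketched on raw pod text as below. This is not the code in podcheck.t: name_and_encoding_problems is a hypothetical helper, and it assumes a leading =pod does not count as a command for the purposes of the "first command" rule.

```perl
use strict;
use warnings;

# Hypothetical helper: return the messages that apply to $pod.
sub name_and_encoding_problems {
    my ($pod) = @_;
    my @problems;

    # All directives in order, ignoring =pod (assumption, see above).
    my @commands = grep { $_ ne 'pod' } $pod =~ /^=([a-zA-Z][a-zA-Z0-9]*)/mg;

    if (grep { $_ eq 'encoding' } @commands) {
        push @problems, '=encoding must be first command (if present)'
            if $commands[0] ne 'encoding';
    }
    push @problems, 'There is no NAME'
        unless $pod =~ /^=head1\s+NAME\s*$/m;

    return @problems;
}

print "$_\n"
    for name_and_encoding_problems("=head1 SYNOPSIS\n\nfoo\n\n=encoding utf8\n");
```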


Re: GSoC Status Update

2011-07-27 Thread Karl Williamson

On 07/27/2011 07:58 AM, Marc Green wrote:


  I am happy to announce that I have made much progress on porting
  Pod::Checker this week. I have made a list of all the errors that
  Pod::Simple already checks for, and by comparing that to what Pod::Checker
  additionally checks for, I can efficiently implement the rest. So that is
  what I have been doing. There is a minor snag in one of the error checks,
  the one that warns if there is any text after a =pod directive, because
  Pod::Simple does not offer any way to access said text. To overcome this I
  am adding such a feature to Pod::Simple::BlackBox, so I should resume
  porting the error checks shortly.

When I looked at this before I found there tended to be significant
disagreement over whether the Pod::Checker checks were actually good
checks that ought to be included in Pod::Simple.

I know this is opening a huge can of worms but I'd be interested if you
could post the list of checks you're adding to Pod::Simple.

Michael


I am not adding checks to Pod::Simple, I was advised that would be a bad
idea (and harder to do). Rather, I am rewriting Pod::Checker to have
Pod::Simple as a superclass instead of Pod::Parser, and in doing so I
need to rewrite the checks *within Pod::Checker* using Pod::Simple.

Rereading my email I realize my ambiguity, but I hope I have now cleared
up any confusion. If not, let me know.

Also, if you still want to see what error checks I am rewriting, they
are available at
https://github.com/marcgreen/perl-pod-checker/tree/edit-bb/cpan/Pod-Parser.
There are three files: ps-errors, pc-errors, and pc-errors-todo. The
first is a list of what Pod::Simple checks for, the second is what
Pod::Checker checks for, and the third is a list of the checks I have
left to rewrite.

Thanks for your concern,
Marc


Here are a couple of pod checker errors that are in error, AFAICT:

One is that it warns on any E<> above 255 as being out of range.  I 
think this is plain wrong, as people do this and it works.  Perhaps 
there are some circumstances when it is wrong, I don't know.


The other is that it warns that use of a link to a man page with a 
section number is deprecated.  We have discussed that on this list 
before, and as I remember it, the consensus was it should not be deprecated.


Re: Z<> in =item

2011-06-26 Thread Karl Williamson

On 06/26/2011 05:34 AM, Shawn H Corey wrote:

On 11-06-25 11:53 PM, Karl Williamson wrote:

In perldiag.pod, there is a line like this

=item Z<>500 Server error

All the other items form a definition list. My guess is that this is to
make sure that the 500 isn't mistaken for a numbered =item in the list.
However, with html, anyway, I don't see any difference in the output
with and without the Z<>, and podchecker ignores the Z<> and says that
the list has mismatched item types.

Can someone explain?


Originally, these are the only valid =item's:

=item *

=item 1

=item 1.

=item definition


These are invalid but frequently occur:

=item * bulleted?

=item 1 numbered?

=item 1. numbered?


They all should be treated as a definition but seldom are. That means,
an `=item Z<> anything` should be treated like a definition.
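
The difference between the rule described above and what a looser formatter might do can be sketched as follows. Both functions are hypothetical and are not taken from any Pod module; they assume the argument is the text after "=item" with surrounding whitespace removed, and that a bare =item acts like "=item *".

```perl
use strict;
use warnings;

# The rule as described above: only "*", "1" or "1." on their own are
# bullets or numbers; anything with more text is a definition.
sub item_type_strict {
    my ($arg) = @_;
    return 'bullet' if $arg eq '' || $arg eq '*';
    return 'number' if $arg =~ /^\d+\.?$/;
    return 'definition';
}

# A looser formatter that looks only at the leading token.
sub item_type_loose {
    my ($arg) = @_;
    return 'bullet' if $arg eq '' || $arg =~ /^\*(?:\s|$)/;
    return 'number' if $arg =~ /^\d+\.?(?:\s|$)/;
    return 'definition';
}

for my $arg ('*', '1.', '500 Server error', 'Z<>500 Server error') {
    printf "%-22s strict=%-10s loose=%s\n",
        $arg, item_type_strict($arg), item_type_loose($arg);
}
```

Under the loose rule "500 Server error" is taken for a numbered item, while the Z<> prefix makes it a definition under both rules, which is presumably why perldiag uses it.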




So then, does the attached patch look ok?
From ee770e42cab702ec6a23e2a97f0833a051758c55 Mon Sep 17 00:00:00 2001
From: Karl Williamson pub...@khwilliamson.com
Date: Sun, 26 Jun 2011 11:35:45 -0600
Subject: [PATCH] perlpod: Add info about using Z in =items

---
 pod/perlpod.pod |   15 ++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/pod/perlpod.pod b/pod/perlpod.pod
index 068afe4..ee7d715 100644
--- a/pod/perlpod.pod
+++ b/pod/perlpod.pod
@@ -156,7 +156,11 @@ And perhaps most importantly, keep the items consistent: either use
 "=item *" for all of them, to produce bullets; or use "=item 1.",
 "=item 2.", etc., to produce numbered lists; or use "=item foo",
 "=item bar", etc.--namely, things that look nothing like bullets or
-numbers.
+numbers.  (If you have a list that contains both: 1) things that don't
+look like bullets nor numbers,  plus 2) things that do, you should 
+preface the bullet- or number-like items with C<ZE<lt>E<gt>>.  See
+L<ZE<lt>E<gt>|/ZE<lt>E<gt> -- a null (zero-effect) formatting code>
+below for an example.)
 
 If you start with bullets or numbers, stick with them, as
 formatters use the first =item type to decide how to format the
@@ -535,6 +539,15 @@ "EE<lt>...E<gt>" code sometimes.  For example, instead of
 the "E<lt>" so they can't be considered
 the part of a (fictitious) "NE<lt>...E<gt>" code.
 
+Another use is to indicate that I<stuff> in C<=item ZE<lt>E<gt>I<stuff>...>
+is not to be considered to be a bullet or number.  For example,
+without the C<ZE<lt>E<gt>>, the line
+
+ =item Z<>500 Server error
+
+could possibly be parsed as an item in a numbered list when it isn't
+meant to be.
+
 =for comment
  This was formerly explained as a zero-width character.  But it in
  most parser models, it parses to nothing at all, as opposed to parsing
-- 
1.7.1



Z<> in =item

2011-06-25 Thread Karl Williamson

In perldiag.pod, there is a line like this

=item Z<>500 Server error

All the other items form a definition list.  My guess is that this is to 
make sure that the 500 isn't mistaken for a numbered =item in the list. 
 However, with html, anyway, I don't see any difference in the output 
with and without the Z<>, and podchecker ignores the Z<> and says that 
the list has mismatched item types.


Can someone explain?


Re: Pod::Html's cross referencing of C<> links

2011-05-20 Thread Karl Williamson

On 05/20/2011 03:16 PM, Ricardo Signes wrote:

* Marc Green pongu...@gmail.com [2011-05-20T16:24:21]

links. More specifically, I understand how it resolves L<> links, but I am
confused as to why you resolve C<> links. From reading the source, I
gather that C<> links are resolved by searching pod documents for =item
directives, and storing their text in a global hash.


Marc is referring to comments like this:

my %Pages = (); # associative array used to find the location
 #   of pages referenced by L<> links.
my %Items = (); # associative array used to find the location
 #   of =item directives referenced by C<>
 #   links

...

# scan_items - scans the pod specified by $pod for =item directives.  we
#  will use this information later on in resolving C<> links.

&c.



My guess is that it's just plain wrong, so no wonder it's confusing. 
Perhaps it's reflecting an early design, or perhaps it's just a typo, 
and L<> was meant instead of C<>.  L<> can link to =items provided they are 
of a type that permits that.  Currently, the only ones that qualify are 
those in what HTML calls definition lists, at least in Pod::Html.


Re: no deprecation warning for L<section>

2011-04-29 Thread Karl Williamson

On 04/29/2011 04:14 AM, Michael Stevens wrote:

On Thu, Apr 28, 2011 at 07:12:49PM -0400, Ricardo Signes wrote:

* Michael Stevens mstev...@etla.org [2011-04-28T17:03:36]

Has it got a victim^Wvolunteer?


Yup.  Marc Green (the student) and David Wheeler and I will have our first
meeting to kick things off in a few days.  From there on, a state of constant
progress!



From my experiments in the area I predict the problem will be getting
people interested and willing to accept or reject patches.

Michael



FWIW, one of the reasons I've brought up a bunch of stuff concerning 
this lately, is that I'm about to add to the Perl 5.15 core a revision 
of podcheck.t which extends Pod::Checker.  I had wondered if I should 
add checking for the deprecated constructs, but now I'll let 
Pod::Checker do so.



People on this list might be interested in the extensions to 
Pod::Checker, some of which might be considered for pulling back into 
Pod::Checker.  Attached is the current pod for podcheck.t.

=pod

=head1 NAME

podcheck.t - Look for possible problems in the Perl pods

=head1 SYNOPSIS

 cd t
 ./perl -I../lib porting/podcheck.t [--show_all] [--cpan] [--counts]
[ FILE ...]
 ./perl -I../lib porting/podcheck.t --regen

=head1 DESCRIPTION

podcheck.t is an extension of Pod::Checker.  It looks for pod errors and
potential errors in the files given as arguments, or if none specified, in all
pods in the distribution workspace, except those in the cpan directory (unless
C<--cpan> is specified).  It does additional checking beyond that done by
Pod::Checker, and keeps a database of known potential problems, and will
fail a pod only if the number of such problems differs from that given in the
database.  It also suppresses the C<(section) deprecated> message from
Pod::Checker, since specifying the man page section number is quite proper to 
do.

The additional checks it makes are:

=over

=item Cross-pod link checking

Pod::Checker verifies that links to an internal target in a pod are not
broken.  podcheck.t extends that (when called without FILE arguments) to
external links.  It does this by gathering up all the possible targets in the
workspace, and cross-checking them.  The database has a list of known targets
outside the workspace, so podcheck.t will not raise a warning for
using those.  It also checks that a non-broken link points to just one target.
(The destination pod could have two targets with the same name.)

=item An internal link that isn't so specified

If a link is broken, but there is an existing internal target of the same
name, it is likely that the internal target was meant, and the C</> is
missing from the C<LE<lt>E<gt>> pod command.

=item Verbatim paragraphs that wrap in an 80 column window

It's annoying to have lines wrap when displaying pod documentation in a
terminal window.  This checks that all such lines fit, and for those that
don't, it tells you how much needs to be cut in order to fit.  However,
if you're fixing these, keep in mind that some terminal/pager combinations
require really a maximum of 79 or 78 columns to display properly.

Often, the easiest thing to do to gain space for these is to lower the indent
to just one space.

=item Missing or duplicate NAME or missing NAME short description

A pod can't be linked to unless it has a unique name.
And a NAME should have a dash and short description after it.

=item =encoding statement issues

This indicates if an C<=encoding> statement should be present, or moved to the
front of the pod.

=item Items that perhaps should be links

There are mentions of apparent files in the pods that perhaps should be links
instead, using C<LE<lt>...E<gt>>

=item Items that perhaps should be C<FE<lt>...E<gt>>

What look like path names enclosed in C<CE<lt>...E<gt>> should perhaps have 
C<FE<lt>...E<gt>> mark-up instead.

=back

A number of issues raised by podcheck.t and by the base Pod::Checker are not
really problems, but merely potential problems.  After inspecting them and
deciding that they aren't real problems, it is possible to shut up this program
about them, unlike base Pod::Checker.  To do this, call podcheck.t with the
C<--regen> option to regenerate the database.  This tells it that all existing
issues are not to be mentioned again.

This isn't fool-proof.  The database merely keeps track of the number of these
potential problems of each type for each pod.  If a new problem of a given
type is introduced into the pod, podcheck.t will spit out all of them.  You
then have to figure out which is the new one, and whether it should be changed.
But doing it this way insulates the database from having to keep track of line
numbers of problems, which may change, or the exact wording of each problem
which might also change without affecting whether it is a problem or not.
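The count-based bookkeeping described above could look something like the sketch below. It is a hedged illustration in Python, not the real podcheck.t implementation: the function name, the tuple layout of the messages, and the dictionary standing in for the database are all hypothetical.

```python
from collections import Counter

def problems_to_report(known_counts, messages):
    """Return the messages that should be shown to the user.

    known_counts: {(pod, problem_type): accepted_count}, a stand-in for
                  the database (hypothetical layout).
    messages:     list of (pod, problem_type, text) found in this run.
    """
    current = Counter((pod, ptype) for pod, ptype, _ in messages)
    report = []
    for pod, ptype, text in messages:
        # Only counts are stored, so when the count for a type grows,
        # every message of that type is reported, not just the new one.
        if current[(pod, ptype)] > known_counts.get((pod, ptype), 0):
            report.append((pod, ptype, text))
    return report
```

Because no line numbers or message texts are stored, the accepted entries survive edits that merely move or reword an existing problem.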

Also, if the count of potential problems of a given type for a pod decreases,
the database must be regenerated so that it knows the new number.  The program
gives 

no deprecation warning for L<section>

2011-04-27 Thread Karl Williamson

I was reading podspec, and saw this

 Previous versions of perlpod allowed for a L<section> syntax (as in 
L<Object Attributes>), which was not easily distinguishable from 
L<name> syntax and for L<"section"> which was only slightly less 
ambiguous.  This syntax is no longer in the specification, and has been 
replaced by the L</section> syntax (where the slash was formerly 
optional).  Pod parsers should tolerate the L<"section"> syntax, for a 
while at least.  The suggested heuristic for distinguishing L<section> 
from L<name> is that if it contains any whitespace, it's a section. 
Pod processors should warn about this being deprecated syntax.


I notice that perldoc does not warn on this being deprecated.  Is this 
by design?
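
The whitespace heuristic quoted from the spec above can be sketched as follows. This is a minimal Python illustration, not code from perldoc or any Pod module: the function name and the returned tuple are assumptions, and it only handles link text that contains neither a slash nor a vertical bar.

```python
def classify_legacy_link(content):
    """Classify the text inside a slash-less L<...> link.

    Returns (kind, target, deprecated), where kind is "section" or "name".
    Assumption: content contains no "/" and no "|".
    """
    # The quoted form is always a section, and always deprecated.
    if len(content) >= 2 and content.startswith('"') and content.endswith('"'):
        return ("section", content[1:-1], True)
    # Suggested heuristic: any whitespace means it is a section.
    if any(ch.isspace() for ch in content):
        return ("section", content, True)
    # Otherwise treat it as a page name, which is still valid syntax.
    return ("name", content, False)
```

A formatter that follows the spec's advice would emit its deprecation warning whenever the third element is true.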


Re: Lack of html anchor for =item * foo

2011-04-27 Thread Karl Williamson

On 04/26/2011 11:02 AM, David E. Wheeler wrote:


FWIW, Pod::Simple::XHTML doesn't output an ID for <dt>s, either.

The Perl core docs have roughly 700 links to <dt>s.  For example, 
perlfunc uses =item's for all its functions.


Lack of html anchor for =item * foo

2011-04-26 Thread Karl Williamson

I discovered that in html output of lists that have elements of the form
=item * foo
no <a> anchor is generated for foo; this is different from lists of the form
=item foo

The first case generates a <ul> list, and the second a <dl> list.
The problem is that in the first form, any link in the file to 'foo' is 
broken, since there is no anchor for it.


Is this deliberate?  Should it be changed?


Re: Lack of html anchor for =item * foo

2011-04-26 Thread Karl Williamson

On 04/26/2011 11:02 AM, David E. Wheeler wrote:

On Apr 26, 2011, at 9:51 AM, Karl Williamson wrote:


I discovered that in html output of lists that have elements of the form
=item * foo
no <a> anchor is generated for foo; this is different from lists of the form
=item foo

The first case generates a <ul> list, and the second a <dl> list.
The problem is that in the first form, any link in the file to 'foo' is broken, 
since there is no anchor for it.

Is this deliberate?  Should it be changed?


I think it is deliberate because

 =item * foo

Is no different from

 =item *

 foo

That is, it's just a bullet, it has no name associated with it. <dt>s, OTOH, do 
have a name.

FWIW, Pod::Simple::XHTML doesn't output an ID for <dt>s, either.



When I do a perldoc -ohtml, what module is getting called that does 
generate an ID for <dt>s?


Blanks and underscores in html links

2011-04-26 Thread Karl Williamson

Look at
http://search.cpan.org/~rjbs/perl-5.12.3/pod/perlsyn.pod

There is a heading in the original source
=head2 Switch statements

The anchor that is generated somehow on the web is
<h2><a class='u' href='#___top' title='click to go to top of document'
name=Switch_statements_
>Switch statements ...

Note that the space in the original is translated into an underscore, 
and the addition of several trailing underscores.  This means that the 
link on the page that goes like

See also L</Switch statements>.

doesn't work, as it gets translated into
<a href="#Switch_statements" class="podlinkpod">

Can someone explain the trailing underscores?


Re: How can one put a table into a pod

2011-04-25 Thread Karl Williamson

On 04/23/2011 11:53 PM, David E. Wheeler wrote:

On Apr 23, 2011, at 10:09 PM, Karl Williamson wrote:


I was thinking that PseudoPod implemented most of what might be needed, and so 
why not ship that.

Its table spec looks quite simple, and perhaps sufficient.


+1


However, in thinking about this some more, I think we need to be able to 
at least specify centered column headings, and spans.  This is easily 
done with html and tbl.



tbl's is also pretty simple; it allows, without my looking at the 
documentation, at least columns that have justification of left, center, and 
the default 'alpha' which is essentially left, but I can't remember the 
difference, and numeric, so the decimal points line up.  You can also create 
spans, and bold, etc.


I expect one can use B<> and friends within PseudoPod tables, yes?

I do think we should keep Pod as Pod; troff is something else entirely.


The other specification I'm familiar with is html, and it offers far more power 
than needed.


Yeah, and one can always =begin html to do that.

Best,

David






How can one put a table into a pod

2011-04-23 Thread Karl Williamson
I don't know how to put a table into a pod.  One can simulate it by 
using as-is formatting, but it's not very good.


The documentation in perlpod seems to indicate that in

  =begin html

  <br>Figure 1.<br><IMG SRC="figure1.png"><br>

  =end html

  =begin text

    ---------------
    |  foo        |
    |        bar  |
    ---------------

  ^^^^ Figure 1. ^^^^

  =end text



the 'text' part acts as a fall back, and that an html formatter will not 
output it, even though it knows how to.  Is that the case?  If so, that 
could be used to put a table into the pod that works for the formatters 
that one knows how to specify tables for, and have the as-is function as 
a fallback.  But how does one guarantee that the format version and the 
fallback aren't both output?


Re: How can one put a table into a pod

2011-04-23 Thread Karl Williamson

On 04/23/2011 01:09 PM, Karl Williamson wrote:

I don't know how to put a table into a pod. One can simulate it by using
as-is formatting, but it's not very good.

The documentation in perlpod seems to indicate that in

=begin html

<br>Figure 1.<br><IMG SRC="figure1.png"><br>

=end html

=begin text

---
| foo |
| bar |
---

 Figure 1. 

=end text



the 'text' part acts as a fall back, and that an html formatter will not
output it, even though it knows how to. Is that the case? If so, that
could be used to put a table into the pod that works for the formatters
that one knows how to specify tables for, and have the as-is function as
a fallback. But how does one guarantee that the format version and the
fallback aren't both output?



It's worse than I thought.  I ran some experiments.  It appears that the 
various formatters don't recognize 'text', and so there's no way to 
specify a fall back.  Perhaps there is a 'text' formatter.  I don't know 
what it would be.


There also doesn't appear to be a way to extend the pod language in a 
backwards compatible way.


Am I missing something?


Re: How can one put a table into a pod

2011-04-23 Thread Karl Williamson

On 04/23/2011 06:23 PM, Allison Randal wrote:

On 04/23/2011 03:12 PM, Karl Williamson wrote:

On 04/23/2011 01:09 PM, Karl Williamson wrote:

I don't know how to put a table into a pod. One can simulate it by using
as-is formatting, but it's not very good.


There also doesn't appear to be a way to extend the pod language in a
backwards compatible way.

Am I missing something?


You can subclass Pod::Simple to produce a formatter capable of parsing
all normal pod, plus tables. For a good example of this, see
Pod::PseudoPod. The table formatting it uses is demonstrated in:

http://search.cpan.org/~arandal/Pod-PseudoPod-0.16/lib/Pod/PseudoPod/Tutorial.pod#Tables


(Or, perhaps this implementation of tables will be enough for your
purposes, and you won't need your own subclass.)

Allison



That explains how to do it.  Thanks.  I would like something like this 
for the core Perl 5 documentation.  Are there reasons besides inertia 
for this to not be shipped with the Perl core?


Re: How can one put a table into a pod

2011-04-23 Thread Karl Williamson

On 04/23/2011 09:58 PM, Russ Allbery wrote:

Karl Williamson <pub...@khwilliamson.com> writes:


It's worse than I thought.  I ran some experiments.  It appears that the
various formatters don't recognize 'text', and so there's no way to
specify a fall back.  Perhaps there is a 'text' formatter.  I don't know
what it would be.


pod2text (Pod::Text).

There have been many proposals over the years for a table representation
in POD, but the general consensus has always been that tables are
inherently complex enough that the result would be straying away from the
plain in Plain Old Documentation.  Table markup in wikis is probably
about as simple as one can get away with and still generate a useful
table, and it's still pretty complex.  Tables are inherently hard to do
properly, and there's an immediate demand for additional features (row
spanning, column spanning, headings, etc.).

I don't object to supporting it in Pod::Text and Pod::Man, but I'd have to
ask someone else to write the initial implementation.  I think support in
Pod::Man at least would be fairly important before deciding to add tables
to Perl's core documentation, but the research on how to use tbl properly
is more than I currently have time for.



I used to be considered a [nt]roff guru.  I would still rather use it 
than MS Word, but I find the Linux implementations lacking, and actually 
don't have much need to write documents.  Anyway, I could easily write 
the Pod::Man part (famous last words); but I don't know how widespread 
tbl is, or its quality on Linux.  I imagine the nroff output of tbl 
could be used for Pod::Text, again spoken with no investigation.


Re: How can one put a table into a pod

2011-04-23 Thread Karl Williamson

On 04/23/2011 10:13 PM, David E. Wheeler wrote:

On Apr 23, 2011, at 6:38 PM, Karl Williamson wrote:


That explains how to do it.  Thanks.  I would like something like this for the 
core Perl 5 documentation.  Are there reasons besides inertia for this to not 
be shipped with the Perl core?


Tuits. If you or someone else would like to propose an addition to perlpodspec 
for tables, I'm sure discussion would be welcome here, and we could come to 
some consensus on how it should work. Then it's a SMoP.

Best,

David




I was thinking that PseudoPod implemented most of what might be needed, 
and so why not ship that.


Its table spec looks quite simple, and perhaps sufficient.  tbl's is 
also pretty simple; it allows, without my looking at the documentation, 
at least columns that have justification of left, center, and the 
default 'alpha' which is essentially left, but I can't remember the 
difference, and numeric, so the decimal points line up.  You can also 
create spans, and bold, etc.  The other specification I'm familiar with 
is html, and it offers far more power than needed.


Re: Why does Pod::Checker deprecate section numbers in links to man pages?

2010-06-01 Thread karl williamson

Nicholas Clark wrote:

On Fri, May 28, 2010 at 10:33:13PM -0600, karl williamson wrote:

And in fact, recommends not using L<> to anything other than another pod.

This seems like a useful feature, which is supported at least in html.

I've searched the archives but not found anything.


git blame on the lines in question in blead from
cpan/Pod-Parser/lib/Pod/Checker.pm
points to commit 92e3d63aacb66085fea74c3f951f09e136337b97

Update to Pod::Parser 1.17, from Brad Appleton.


Some grovelling in CPAN reveals that the code was added in release 1.091 of
Pod-Parser:

http://search.cpan.org/diff?from=PodParser-1.09&to=PodParser-1.091&w=1

commented out in release 1.093 of Pod-Parser:

http://search.cpan.org/diff?from=PodParser-1.092&to=PodParser-1.093&w=1

uncommented in release 1.14 of Pod-Parser:

http://search.cpan.org/diff?from=PodParser-1.13&to=PodParser-1.14&w=1

and documented in release 1.15 of Pod-Parser:

http://search.cpan.org/diff?from=PodParser-1.14&to=PodParser-1.15&w=1


No idea whether Brad Appleton, Marek Rouchal, or someone else initiated
these changes

On Sat, May 29, 2010 at 01:01:35PM -0700, Russ Allbery wrote:

karl williamson <pub...@khwilliamson.com> writes:


And in fact, recommends not using L<> to anything other than another pod.
This seems like a useful feature, which is supported at least in html.

Yes, I disagree with this as well and have POD conversion software that
relies on being able to use L<> for man pages, URLs, and several other
things that aren't POD.

If you have tools available to create real links for man pages, you want
to allow specifying man pages in L<>, since otherwise you have to make
guesses at whether something(1) is a man page reference or something else.
So this is a bad thing to deprecate.


Agreed. I loathe heuristics, because they mean that you can't predict
how your document will be parsed, and effectively mean that you can't
write some totally legitimate code examples without them mistakenly being
treated as links.

Nicholas Clark



FWIW, I submitted a bug report for this module on an unrelated issue a 
month ago, and about the same time sent an email directly to Marek about 
a p5p discussion, and have gotten no responses.


Why does Pod::Checker deprecate section numbers in links to man pages?

2010-05-28 Thread karl williamson

And in fact, recommends not using L<> to anything other than another pod.

This seems like a useful feature, which is supported at least in html.

I've searched the archives but not found anything.

Thanks in advance

Karl Williamson