Pod::StdParser

Marek Rouchal DAT CAD HW Tel 25849 Mon, 07 Aug 2000 00:42:56 -0700
Hello all,

I didn't manage to read all of your discussion, but let me elaborate a
little on Pod::StdParser or Pod::Compiler and its benefits.

First of all I'd like to stress that "perlpod" has to be rephrased to
reflect the abilities of existing parsers. For example, I see a lot of
possible misinterpretations when trying to auto-link bare text like
"foo(1)", "func()", "$var", "@arr" and so on, which IMHO should at least
read C<$var> etc. or even L<foo(1)>. I have nothing against extension of
the L<...> syntax (Pod::Hyperlink already knows about L<http:...>) to deal
with these situations. However, the black magic should be reduced. All
this has to be discussed first, of course.

On last night's train ride (4 hrs) I played with Pod::Checker and got
halfway through re-coding it into a Pod::Compiler, which IMHO makes sense
for *all* POD *translators* that need simple access to the *logical*
contents of a POD file rather than the low-level directives (as in
Pod::Select).

This especially includes:

* recognizing valid commands and choking on malformed ones
* parsing =over, =item, =back into genuine lists (list objects with item
  subobjects)
* providing short, plain \w+ identifiers for every =head ..., =item
  ..., X<...>, which can be used as hyperlink destinations
* recognizing =for, =begin ... =end
* recognizing valid entities (E<100>, E<0122>, E<0xAE>, E<auml>)
* parsing L<...> into comprehensive objects (having ->text, ->type,
  ->destination, ->destination_node etc.)
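
The last two items can be illustrated with a minimal sketch, written in
Python for illustration and not taken from Pod::Compiler. The function
names decode_entity and parse_link, the type labels ("url", "man",
"section", "page") and the simplified L<...> grammar are assumptions made
for this example; only the fields text, type, destination and
destination_node come from the list above.

```python
import re
from html.entities import name2codepoint

# Assumption: a target like "http:..." is a URL, while "Pod::Usage" is not.
URL_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:(?!:)")


def decode_entity(body):
    """Decode the body of E<...>: hex (0x..), octal (leading 0), decimal or a name."""
    if re.fullmatch(r"0[xX][0-9a-fA-F]+", body):
        return chr(int(body, 16))
    if re.fullmatch(r"0[0-7]+", body):
        return chr(int(body, 8))
    if re.fullmatch(r"[0-9]+", body):
        return chr(int(body))
    if body in name2codepoint:
        return chr(name2codepoint[body])
    raise ValueError("unknown entity E<%s>" % body)


def parse_link(body):
    """Split the body of L<...> into text, type, destination and node."""
    text, sep, target = body.partition("|")
    if not sep:
        # No explicit link text: the target doubles as the text.
        text, target = body, body
    if URL_RE.match(target):
        return {"text": text, "type": "url",
                "destination": target, "destination_node": None}
    if target.startswith('"'):
        # L<"section"> points into the current document.
        page, node = "", target.strip('"')
    else:
        page, slash, node = target.partition("/")
        node = node.strip('"') if slash else None
    if not page:
        kind = "section"
    elif re.fullmatch(r"[^()\s]+\(\d\w*\)", page):
        kind = "man"          # e.g. foo(1)
    else:
        kind = "page"
    return {"text": text, "type": kind,
            "destination": page or None, "destination_node": node}
```

A translator could then ask for parse_link("the docs|Pod::Usage/SYNOPSIS")
and read the destination and node from the result instead of re-parsing
the link text itself.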

Thus this module reflects the POD capabilities defined in perlpod and all
*translators* can readily make use of this. I agree that this requires a
rewrite of the existing stuff...

Still there are additional benefits:

* Pod::Compiler will inherit from Storable, making e.g. the two-pass HTML
  translation much more efficient: Compile once, extract nodes with ids,
  store object tree, repeat for all PODs, then read each object tree and
  translate, while resolving all hyperlinks.
  Hyperlink-free destination formats of course don't need this.
* Pre-defined methods "as_text" and "as_pod": as_text gives access to the
  plain text content (ISO-8859-1 encoded), stripping all markup. as_pod
  allows for re-creation of correct POD. This will make a Pod::Lint a
  five-minute coding exercise ;-)
* Text objects within S<...> have an attribute set so that conversion of
  whitespace is easily done.
* From within each object one has access to the whole Pod object tree
  (thanks to the Tree::DAG_Node module), so that e.g. in an =item you can
  easily see what this item's sisters are. Or: You can see that a
  Pod::italic object is nested within another Pod::italic, so that you can
  turn off italic mode in order to emphasise within an emphasis ;-)
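
The two-pass scheme from the first item above can be sketched roughly as
follows. This is an illustration in Python, with pickle standing in for
Storable; compile_pod is a deliberately tiny stand-in that only collects
=head titles and L<page/node> links, and all function names, the
".tree" file suffix and the "page.html#id" target format are assumptions
of this example.

```python
import os
import pickle
import re


def make_id(title):
    """Reduce a heading to a plain word-character identifier."""
    return re.sub(r"\W+", "_", title).strip("_")


def compile_pod(source):
    """Stand-in compiler: keep only =head titles and L<page/node> links."""
    heads = re.findall(r"^=head\d\s+(.+)$", source, re.M)
    links = re.findall(r"L<([\w:]+)/([^>]+)>", source)
    return {"heads": heads, "links": links}


def tree_path(store_dir, name):
    """File in which the compiled tree of one POD is stored."""
    return os.path.join(store_dir, name.replace("::", "_") + ".tree")


def first_pass(pods, store_dir):
    """Compile each POD once, record its node ids, store the tree."""
    index = {}
    for name, source in pods.items():
        tree = compile_pod(source)
        for title in tree["heads"]:
            index[(name, title)] = "%s.html#%s" % (
                name.replace("::", "/"), make_id(title))
        with open(tree_path(store_dir, name), "wb") as fh:
            pickle.dump(tree, fh)
    return index


def second_pass(pods, store_dir, index):
    """Reload each stored tree and resolve its links against the index."""
    resolved = {}
    for name in pods:
        with open(tree_path(store_dir, name), "rb") as fh:
            tree = pickle.load(fh)
        # Links to unknown nodes resolve to None and can be reported.
        resolved[name] = [index.get(link) for link in tree["links"]]
    return resolved
```

The point of the split is that no POD is parsed twice: the second pass
works only on the stored trees plus the index collected in the first.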

A typical translator would then only have to:

* traverse the object tree
* provide markup for the different objects
* sanity-check its own =for and =begin blocks
* escape entities and/or text appropriately
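
As a rough illustration of these steps, here is a small HTML translator
over a toy object tree, again in Python. The Node class, its kind names
and the MARKUP table are hypothetical and only mimic the parent/sister
access described earlier; the =for/=begin check is left out.

```python
from html import escape


class Node:
    """Tree node with a parent pointer (hypothetical stand-in for a Pod object)."""

    def __init__(self, kind, text="", children=()):
        self.kind, self.text, self.parent = kind, text, None
        self.children = list(children)
        for child in self.children:
            child.parent = self

    def sisters(self):
        """All other children of this node's parent."""
        if self.parent is None:
            return []
        return [n for n in self.parent.children if n is not self]

    def depth_of(self, kind):
        """Count this node and its ancestors that are of the given kind."""
        count, node = 0, self
        while node is not None:
            count += node.kind == kind
            node = node.parent
        return count


# Markup provided by this particular translator, one entry per object kind.
MARKUP = {
    "document": ("", ""),
    "head1": ("<h1>", "</h1>\n"),
    "list": ("<ul>\n", "</ul>\n"),
    "item": ("<li>", "</li>\n"),
}


def translate(node):
    """Traverse the tree depth-first and emit HTML."""
    if node.kind == "text":
        return escape(node.text)          # escape text for the target format
    inner = "".join(translate(child) for child in node.children)
    if node.kind == "italic":
        if node.depth_of("italic") % 2:
            return "<i>" + inner + "</i>"
        # Italic nested in italic: switch italics off to emphasise.
        return "</i>" + inner + "<i>"
    if node.kind not in MARKUP:
        raise ValueError("no markup defined for %r" % node.kind)
    open_tag, close_tag = MARKUP[node.kind]
    return open_tag + inner + close_tag


doc = Node("document", children=[
    Node("head1", children=[Node("text", "A & B")]),
    Node("list", children=[
        Node("item", children=[Node("italic", children=[
            Node("text", "outer "),
            Node("italic", children=[Node("text", "inner")]),
            Node("text", " again"),
        ])]),
        Node("item", children=[Node("text", "second")]),
    ]),
])
html = translate(doc)
```

A different target format would replace only MARKUP, the escaping and the
italic handling; the traversal itself stays the same.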

More complex translators (e.g. HTML, LaTeX, FrameMaker) would

* keep track of the nodes
* consider hyperlinks
* create a TOC, index
* wrap up all PODs in a hyperlinked collection

This leaves us - AFAICS - with the speed discussion. I agree that for
perldoc and pod2text one does not want to wait for any OO exercises,
nor in Tk::Pod. But take Tk::Pod as an example: there one wants
hyperlinks and would therefore have to recode big parts of what
Pod::Compiler already does. In that case the tradeoff is IMHO in favor of
a strict OO approach that renders the POD 100% correctly.

In a few days I'll go on another train ride and hope to have a first
working version of Pod::Compiler then. I'll post it here for you to
evaluate it and most of all its performance.

If Pod::Compiler works satisfactorily and we agree to make broad use of
it, I would happily recode Pod::Checker and my Pod::HTML and could
(hopefully) quickly come up with a Pod::MIF (for FrameMaker). Tim, would
you consider rewriting Pod::LaTeX?

All the best,

Marek