Hello all,
I didn't manage to read all of your discussion, but let me elaborate a
little on Pod::StdParser or Pod::Compiler and its benefits.
First of all I'd like to stress that "perlpod" has to be rephrased to
reflect the abilities of existing parsers. For example, I see a lot of
possible misinterpretations when trying to auto-link bare text like
"foo(1)", "func()", "$var", "@arr" and so on, which IMHO should at least
read C<$var> etc. or even L<foo(1)>. I have nothing against extension of
the L<...> syntax (Pod::Hyperlink already knows about L<http:...>) to deal
with these situations. However, the black magic should be reduced. All
this has to be discussed first, of course.
On last night's train ride (4 hrs) I played with Pod::Checker and re-coded
it half-way through into a Pod::Compiler, which IMHO makes sense for *all*
POD *translators* that need simple access to the *logical* contents of a
POD file, rather than the low-level directives (like in Pod::Select).
This especially includes:
* recognizing valid commands and choking on malformed ones
* parsing =over,=item,=back into genuine lists (list objects with item
subobjects)
* providing short, plain identifiers (matching \w+) for every =head ..., =item
..., X<...>, which can be used as hyperlink destinations
* recognizing =for, =begin ... =end
* recognizing valid entities (E<100>, E<0122>, E<0xAE>, E<auml>)
* parsing L<...> into comprehensive objects (having ->text, ->type,
->destination, ->destination_node etc.)
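To make the list handling concrete, here is a minimal sketch of how =over/=item/=back could be folded into nested list structures. It is not the Pod::Compiler code: plain hashes stand in for the list and item objects, and the names parse_lists and content_target are invented for this example. As a simplification it also chokes on an =over block that has content before its first =item.

```perl
use strict;
use warnings;

# Return the array that ordinary content is appended to: the root's
# children, or the children of the most recent item of the open list.
sub content_target {
    my ($top) = @_;
    return $top->{children} if $top->{type} eq 'root';
    die "content before first =item\n" unless @{ $top->{items} };
    return $top->{items}[-1]{children};
}

# Hypothetical helper: split POD into paragraphs and build a tree in
# which every =over ... =back block is a list with item subobjects.
sub parse_lists {
    my ($pod) = @_;
    my $root  = { type => 'root', children => [] };
    my @stack = ($root);                      # innermost open container last

    for my $para (grep { /\S/ } split /\n\s*\n/, $pod) {
        $para =~ s/\A\s+|\s+\z//g;            # trim surrounding whitespace
        my $top = $stack[-1];
        if ($para =~ /\A=over\b/) {
            my $list = { type => 'list', items => [] };
            push @{ content_target($top) }, $list;
            push @stack, $list;
        }
        elsif ($para =~ /\A=item\b\s*(.*)\z/s) {
            die "=item outside =over\n" unless $top->{type} eq 'list';
            push @{ $top->{items} },
                { type => 'item', label => $1, children => [] };
        }
        elsif ($para =~ /\A=back\b/) {
            die "=back without =over\n" unless $top->{type} eq 'list';
            pop @stack;
        }
        else {
            push @{ content_target($top) }, { type => 'text', text => $para };
        }
    }
    die "missing =back\n" if @stack > 1;
    return $root;
}

my $tree = parse_lists(<<'END_POD');
=over 4

=item apple

A fruit.

=over 4

=item red

=back

=item pear

=back
END_POD

my $list = $tree->{children}[0];
print scalar @{ $list->{items} }, " items, first is '$list->{items}[0]{label}'\n";
```

The nested =over ends up among the children of the "apple" item, so a translator never has to track list depth itself.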
Thus this module reflects the POD capabilities defined in perlpod and all
*translators* can readily make use of this. I agree that this requires a
rewrite of the existing stuff...
Still there are additional benefits:
* Pod::Compiler will inherit from Storable, making e.g. the two-pass HTML
translation much more efficient: Compile once, extract nodes with ids,
store object tree, repeat for all PODs, then read each object tree and
translate, while resolving all hyperlinks.
Hyperlink-free destination formats of course don't need this.
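The two-pass scheme could look roughly like the sketch below. Only store and retrieve are real Storable functions; compile_pod and make_id are stand-ins invented here, and the "tree" is just a hash holding the =head1 nodes and the raw L<page/section> targets.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical id scheme: lowercase, non-word runs become underscores.
sub make_id {
    my ($title) = @_;
    (my $id = lc $title) =~ s/\W+/_/g;
    return $id;
}

# Stand-in for the real compile step (an assumption, not Pod::Compiler).
sub compile_pod {
    my ($name, $pod) = @_;
    my @nodes;
    while ($pod =~ /^=head1\s+(.+)$/mg) {
        my $title = $1;
        push @nodes, { id => make_id($title), title => $title };
    }
    my @links = $pod =~ /L<([^>]+)>/g;
    return { name => $name, nodes => \@nodes, links => \@links };
}

my %pods = (
    intro => "=head1 Getting Started\n\nSee L<api/Link Objects>.\n",
    api   => "=head1 Link Objects\n\n"
           . "Back to L<intro/Getting Started> or L<intro/Nowhere>.\n",
);

# Pass 1: compile once, collect node ids, store each object tree.
my $dir = tempdir(CLEANUP => 1);
my %index;
for my $name (sort keys %pods) {
    my $tree = compile_pod($name, $pods{$name});
    $index{"$name/$_->{id}"} = 1 for @{ $tree->{nodes} };
    store($tree, "$dir/$name.sto");
}

# Pass 2: read each tree back and resolve its links against the index.
my @report;
for my $name (sort keys %pods) {
    my $tree = retrieve("$dir/$name.sto");
    for my $link (@{ $tree->{links} }) {
        my ($page, $section) = split m{/}, $link, 2;
        my $target = "$page/" . make_id($section);
        push @report,
            "$name -> $target: " . ($index{$target} ? 'ok' : 'unresolved');
    }
}
print "$_\n" for @report;
```

The point of the stored trees is that pass 2 never re-parses any POD; it only needs the global index, which is complete once pass 1 has seen every file.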
* Pre-defined methods "as_text" and "as_pod": as_text gives access to the
plain text content (ISO-8859-1 encoded), stripping all markup. as_pod
allows for re-creation of correct POD. This will make a Pod::Lint a
five-minute coding exercise ;-)
* Text objects within S<...> have an attribute set so that conversion of
whitespace is easily done.
* From within each object one has access to the whole Pod object tree
(thanks to the Tree::DAG_Node module), so that e.g. in an =item you can
easily see what this item's sisters are. Or: You can see that a
Pod::italic object is nested within another Pod::italic, so that you can
turn off italic mode in order to emphasise within an emphasis ;-)
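As a rough illustration of what that tree access buys you, the sketch below fakes the mother/daughter links with plain hashes and weak parent references instead of Tree::DAG_Node. The helpers node, sisters and inside are hypothetical names, not part of any existing module.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical node constructor: bare strings become text nodes, and every
# child gets a weak back-reference to its parent to avoid reference cycles.
sub node {
    my ($type, @kids) = @_;
    my $self = { type => $type, parent => undef, children => [] };
    for my $kid (@kids) {
        $kid = { type => 'text', text => $kid, parent => undef, children => [] }
            unless ref $kid;
        $kid->{parent} = $self;
        weaken $kid->{parent};
        push @{ $self->{children} }, $kid;
    }
    return $self;
}

# All other children of the same parent.
sub sisters {
    my ($n) = @_;
    return () unless $n->{parent};
    return grep { $_ != $n } @{ $n->{parent}{children} };
}

# True if any ancestor of $n has the given type.
sub inside {
    my ($n, $type) = @_;
    for (my $p = $n->{parent}; $p; $p = $p->{parent}) {
        return 1 if $p->{type} eq $type;
    }
    return 0;
}

# Models the POD fragment: I<outer I<inner> text>
my $inner = node('italic', 'inner');
my $outer = node('italic', 'outer ', $inner, ' text');
my $para  = node('para', $outer);

printf "inner is nested in italic: %s\n", inside($inner, 'italic') ? 'yes' : 'no';
printf "outer is nested in italic: %s\n", inside($outer, 'italic') ? 'yes' : 'no';
printf "inner has %d sisters\n", scalar(my @s = sisters($inner));
```

A translator that finds inside($node, 'italic') true for an italic node can then switch to upright text for that span.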
A typical translator would then only have to:
* traverse the object tree
* provide markup for the different objects
* sanity-check its own =for and =begin blocks
* escape entities and/or text appropriately
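For a simple target format, those steps might boil down to something like the following sketch. The tree layout (hashes with type, text and children) and the names to_html, escape_html and %TAG are assumptions made for the example, not the planned Pod::Compiler interface.

```perl
use strict;
use warnings;

# Characters that must be escaped in the destination format (HTML here).
my %ESC = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');

sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>])/$ESC{$1}/g;
    return $text;
}

# Markup the translator provides for each object type it knows about.
my %TAG = (para => 'p', bold => 'b', italic => 'i', code => 'code');

# Walk the object tree depth-first, escaping text and wrapping the
# translated children of every other node in that node's markup.
sub to_html {
    my ($node) = @_;
    return escape_html($node->{text}) if $node->{type} eq 'text';
    my $tag = $TAG{ $node->{type} }
        or die "no markup for '$node->{type}'\n";
    my $body = join '', map { to_html($_) } @{ $node->{children} };
    return "<$tag>$body</$tag>";
}

my $tree = {
    type     => 'para',
    children => [
        { type => 'text', text => 'if a < b & ' },
        { type => 'italic', children => [ { type => 'text', text => 'note' } ] },
    ],
};

my $html = to_html($tree);
print "$html\n";
```

Everything format-specific sits in the two lookup tables, so another hyperlink-free back end would mostly swap those tables.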
More complex translators (e.g. HTML, LaTeX, FrameMaker) would
* keep track of the nodes
* consider hyperlinks
* create a TOC, index
* wrap up all PODs in a hyperlinked collection
This leaves us - AFAICS - with the speed discussion. I agree that for
perldoc and pod2text one does not want to wait for any OO exercises.
Nor in Tk::Pod. But take Tk::Pod as an example: Here one wants
hyperlinks and therefore would have to recode big parts of
Pod::Compiler. Here the tradeoff is IMHO in favor of a strict OO approach
that renders the POD 100% ok.
In a few days I'll go on another train ride and hope to have a first
working version of Pod::Compiler then. I'll post it here so you can
evaluate it and, most of all, its performance.
If Pod::Compiler works satisfactorily and we agree to make broad use of
it, I would happily recode Pod::Checker and my Pod::HTML and could
(hopefully) quickly come up with a Pod::MIF (for FrameMaker). Tim, would
you consider rewriting Pod::LaTeX?
All the best,
Marek