I must say, I am a bit disappointed that the discussions about the future
of documentation in Perl has died. Or was everyone fully occupied
by YAPC::NA? I spent last week with my family on a stormy island,
without sufficient internet access, so was unable to stirr things up
again, but maybe this email will bring the focus back.
Damian challenged me by asking what I think how Perl6's documentation
should be done. When I think about documentation, I do not (immediately)
think about some mark-up language; that is just a minor component.
My focus is on the whole process: from writer to reader. Although Damain
says to have studied OODoc, this most import features of that system
were ignored: simplifying the documentation process, improving the
Each time I re-read the list below, there are things I wish to change
or add: it is neither complete nor final. But I cannot wait longer to
post it. It's not my wish to extend this into a detailed requirements
document, just to set a focus of discussion. Maybe someone wants to
comment on it?
======== Documentation of code (i.e. Perl6)
In this text, we try to determine the environment for the optimal
documentation system for Perl6, but applicable to any other programming
Everyone will have his/her own weights on different aspects, and you
may even totally not agree with some of the listed remarks: it is open
to discussion. Some of the wishes contradict an other as well.
The sole goal: the best documentation for Perl6
It is very important to keep in mind that documentation is made for
some target community to be read. Write-only texts are useless.
=== Target communities
Documentation is added to code, to provide additional information about
the code to inform some target community. There are different target
communities possible for the same piece of code, which should all be
served as good as possible.
Traditionally, you see
1) code comments, for maintainers of the software
2) manual-pages, for everyone else
But more specific user groups can be defined:
1) code comments, for maintainers
2) developer manuals (the complete interface)
3) user manuals (distribution external interfaces)
4) selective look-up (for perldoc -f or IDE)
There are a few fundaments for good documentation:
- it must be written
- it must be correct
- it must be consistent in structure
- it must be consistent in content
- it must be accessible (find back/pleasant to read)
The best way to get people into writing documentation, is to make it as
simple as possible. This means:
- reduce the need to read man-pages or books to be able to create it,
- reduce the amount of text to be typed,
- avoid the need for additional tools to be installed,
- reduce the need for configuration.
All for the sake of laziness. The less time people need to manage their
documentation environment, the more time they have to write quality texts.
To ease the burden of writing docs, documentation generating tools
should use as much information from the code as is useful for the
target community. Replication between code and docs make changes
a double effort. Manual replicated of text between files (like the
inclusion of the license text in each file) during programming is an
Each documentation fragment belongs to some part of the implementation.
This may be a distribution, a file, a class or grammar or package,
a method, rule or sub, a positional or named parameter, and so on.
This relation comes natural (because of the mixture of code and doc),
or enforced (via some reference syntax).
The documentation fragments need some markup. Many mark-up languages
exist, which do have more or less the same features. Two of those are
POD and PDD S26. Within one distribution, it is useful to use the
same kind of markup syntax. Document generators should only get a
minimal abstract interface to collect the results of the markup
parser, for instance a $markup->produceHtml($fragment, ...)
The markup language used should be capable of addressing the things
that a document writer wishes to express, not on what certain output
back-end can handle (those can always ignore things they cannot handle)
The documentation and the related code must be cross checkable, on
matters they overlap. Better to avoid replication, in which case
there is no overlap to be checked.
The user should be stimulated to write in a good style. One of the
ways to achieve this, is to avoid the need to write the same sentence
over and over again. For instance: "This method returns a boolean,
to indicate success" is a sentence to avoid. (Template based) auto-
generation could be used introduce abbreviations for often used
Produced manuals should by default be checked for the completeness (like
Pod::Coverage), correctness in syntax (like Pod::Checker), and the
used references. If possible, spell-checking (like Pod::Spell) should
be invoked automatically.
There is a set of components we will always find in (UNIX) manual pages:
the one-line purpose (name), the synopsis, extended description, the subs
and methods, the "see also", authors, and license. The order and location
of these documentation fragments, and their exact names are arbitrary.
Only the back-end can decide how, whether, and in which order these
The doc-writers should have general information about which documentation
components are minimally needed by the back-ends, for instance the
name and the license. A short-list of chapter names suffices.
The documentation generating back-ends shall have the same idea about the
structure and meaning of the contributed documentation. The back-ends
only generate end-user texts, without any need for interpretation of the
doc fragments. Only this way, systems like search.cpan.org can be of
On the documentation writer's side, there usually is a serious problem:
writers do not know enough about the readers: their level of education,
their actual interests, and the media they use to read the documentation.
This results in inconsistent documentation between distributions.
For instance, some people put internal interfaces into the manual-pages,
where other do not want to bother the readers hence include them as
comments. In Perl6, we have scoped subs and private methods, so it
is much clearer whether a component is available to everyone or not.
We may produce different manual pages.
Iff the used markup-language permits the author to specify the commands
which change the back-end's output (in the anarchistic tradition of Perl),
therewith endangering the consistency in output style or frustrating the
automatic processing of the content by other back-ends, then there must
be a simple way for the back-ends to protect themselves. There should
be a standard way to remove this cruft from the documentation fragments.
The back-end, which produces the document the user will read, can be
traditional UNIX manual-pages, HTML web-pages, a printed book, whatever.
Of course, you want to produce documents which fit as good as possible
to the possibilities which a certain output medium gives. POD(5) can
be used to produce web-pages, but the features to link between document
elements are far below the levels we are used to for HTML web-pages.
It should be very simple to create anchors and references to very
specific locations in the text, like a single option description.
Preferably without the need to define destination anchor points: the
documentation where you point to can be in a different package, not
under your control.
Documentation fragments are usually written in coding order.
The programmer's activities are often quite chaotic: functions and
methods are written in the order that the programmer needs them.
Related components are often close together in the file, but there is
no role for manual-order during this development process. What is the
optimal order for the user to consume the fragments? A very workable
solution is to group the items (for instance, constructors, accessors,
...) into text sections, and within those groups use alphabetic sorting.
Each section may need some introductory text and examples.
To make doc fragments groupable is a requirement, and back-ends will
work-out how that grouping is used.
=== Generation documentation
The documentation generation process could look schematically
something like this:
for each file in the MANIFEST
parse Perl into its AST
extract doc and code info from AST into doc-tree
for each doc-fragment in doc-tree
preform consistency, structural checks on doc-tree
collect inheritance information into doc-tree
for each package, class, grammar, pod in doc-tree
call generator back-end(s)
syntax check produced man-pages