Whatever markup language is used, it needs to have the following
characteristics, in addition to the common visual markup and control
capabilities:
1. The markup must identify chapters, sub-chapters,
sub-sub-chapters, etc., all with the appropriate titles. Each title
should contain metadata identifying its parent chapter (or
multiple parents) as well as its hierarchical level. Some would argue
that while it is nice to know an author's suggested reading sequence or
hierarchy (usually delineated in the Table of Contents), the document
should contain enough semantic metadata (or extremely descriptive
chapter titles) that a reader who wishes to define their own
reading sequence can do so.
2. The markup should have a way to embed metadata that is attached at
the chapter, sub-chapter, and paragraph level (used for alternate
keywords, semantic indices and clues, assistant authors, last changed
dates, previous versions, etc.). The metadata would be hidden in normal
reading views, but could be found by search engines and exposed in
specialized readers. This allows much more flexible and reliable
searching and document analysis.
3. Embedded metadata indicating page break 'hints', so printing
processes can know where new pages can effectively start when needed.
Screen display engines can ignore these pagination cues, if the reader
so desires.
4. A way to embed hyperlinks, with embedded identifying metadata,
defining local or external links, alternate link text, etc.
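
To make the four requirements above a little more concrete, here is a
rough sketch in Python. The element and attribute names (section, meta,
parents, level, pagebreak, link) are invented purely for illustration
and do not come from any existing document definition language. The
sketch only shows that hidden metadata of this kind can drive keyword
searches and a reader-chosen reading sequence.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every element and attribute name here is an assumption.
SAMPLE = """\
<doc>
  <section id="intro" level="1" parents="" title="Introduction">
    <meta keywords="overview, goals" changed="2007-01-15"/>
    <para>Why the document exists.</para>
  </section>
  <section id="nouns" level="2" parents="intro" title="Nouns" pagebreak="allowed">
    <meta keywords="arrays, data" changed="2007-02-01"/>
    <para>Arrays and atoms.</para>
    <link href="#verbs" scope="local" alt="See the verbs section"/>
  </section>
  <section id="verbs" level="2" parents="intro nouns" title="Verbs">
    <meta keywords="functions" changed="2007-02-03"/>
    <para>Functions applied to nouns.</para>
  </section>
</doc>
"""


def load_sections(xml_text):
    """Collect the per-section metadata into plain dictionaries."""
    root = ET.fromstring(xml_text)
    sections = []
    for sec in root.iter("section"):
        meta = sec.find("meta")
        raw = meta.get("keywords", "") if meta is not None else ""
        sections.append({
            "id": sec.get("id"),
            "title": sec.get("title"),
            "level": int(sec.get("level", "1")),
            # A section may name more than one parent (requirement 1).
            "parents": sec.get("parents", "").split(),
            # Hidden keywords for search engines (requirement 2).
            "keywords": [k.strip() for k in raw.split(",") if k.strip()],
            # Page-break hint that a screen reader may ignore (requirement 3).
            "pagebreak": sec.get("pagebreak") == "allowed",
        })
    return sections


def find_by_keyword(sections, keyword):
    """Return ids of sections whose hidden keywords include `keyword`."""
    return [s["id"] for s in sections if keyword in s["keywords"]]


def reading_order(sections, wanted_ids):
    """Return section titles in the order the reader asked for."""
    by_id = {s["id"]: s for s in sections}
    return [by_id[i]["title"] for i in wanted_ids if i in by_id]


sections = load_sections(SAMPLE)
print(find_by_keyword(sections, "functions"))
print(reading_order(sections, ["verbs", "intro"]))
```

A specialized reader could expose the same dictionaries directly, while
a normal reading view would simply never display them.
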
The choice of PDF as a rendering format has as much to do with
requirements and availability as anything. If the goal is to publish a
document that has a predefined format (say 8.5 x 11 pages) that isn't to
be re-formatted or edited by the readers, PDF is a reasonable choice.
Readers are available for most platforms, and many PCs come with Adobe
Reader pre-installed. PDF provides a simple way to distribute
print-ready docs to computer-challenged readers.
This isn't to say that the original doc should be designed in PDF. PDF
supports few of the above-mentioned features, as it is primarily a
fixed-display rendering format. Adobe has made it easy to convert any
document, in any format, into a PDF document, simply by printing it to
the "Adobe printer" when the PDF printer driver is installed.
The original doc should probably be created in an XML-based document
definition language, to include all of the document metadata discussed
above, allowing it to be rendered in many different rendering formats,
as required. I am not sure that any of the current doc definition
languages support the requirements I describe above.
Skip Cave
<<<>>
Oleg Kobchenko wrote:
--- Chris Burke <[EMAIL PROTECTED]> wrote:
This discussion surfaced in 2004 under "[Jforum] Learning J"
http://www.jsoftware.com/pipermail/general/2004-January/016691.html
http://www.jsoftware.com/pipermail/general/2004-January/016690.html
http://www.jsoftware.com/pipermail/general/2004-January/016689.html
LaTeX is an interesting format, but it is loose, has a limited
audience and is not amenable to simple transformation with
tools like XSLT.
XML and XSLT are the best tools for the job, because they allow
transformations to be written declaratively in a simple way, without
writing parsers and programmatic tools.
In fact, XSL-FO is THE standard thanks to which XSLT came to be
as an expectedly ubiquitous byproduct, far surpassing its
initial purpose.
Unlike LaTeX, a good input/raw format for creating documents
is a particular HTML version; e.g., HTML 4.01 is a good common
denominator of standards supported by most platforms, browsers
and applications, from email to word processors.
Moreover, HTML 4.01 is rich enough semantically and stylistically
to describe a wide variety of documents. It is a long-established
standard:
http://www.w3.org/TR/html401/
It is what the XML variety is directly based upon
http://www.w3.org/TR/xhtml1/
which means that it is amenable to XSLT transformation to
virtually anything, specifically into a format that can be
used by PDF publishing software.
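
As a small illustration of why well-formed XHTML is convenient input,
here is a Python sketch. It is not XSLT: it swaps in the standard
library's ElementTree as a stand-in, because Python ships no XSLT
processor. The output notation (asterisks for heading depth, indented
paragraphs) is made up for the example and is not Publish addon markup.
The point is only that no hand-written parser is needed.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A tiny well-formed XHTML document, e.g. what a cleanup tool might emit.
SAMPLE_XHTML = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample</title></head>
  <body>
    <h1>Learning J</h1>
    <p>An <em>introductory</em> text.</p>
    <h2>Nouns</h2>
    <p>Arrays and atoms.</p>
  </body>
</html>
"""


def to_outline(xhtml_text):
    """Turn headings and paragraphs into lines of plain text.

    The output notation is invented for this example only.
    """
    root = ET.fromstring(xhtml_text)
    lines = []
    for el in root.iter():
        # Strip the namespace prefix that ElementTree adds to tag names.
        tag = el.tag.replace("{%s}" % XHTML_NS, "")
        # Flatten nested inline markup and normalize whitespace.
        text = " ".join("".join(el.itertext()).split())
        if tag in ("h1", "h2", "h3"):
            lines.append("*" * int(tag[1]) + " " + text)
        elif tag == "p":
            lines.append("    " + text)
    return lines


for line in to_outline(SAMPLE_XHTML):
    print(line)
```

A real XSLT stylesheet would express the same mapping as templates
matching h1, h2 and p, which is the declarative style argued for above.
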
For XSL-FO there are free and commercial products that do
specifically that: produce PDF from XML using the style and format
defined in XSLT.
In the case of the J Publish addon, a possible approach to handling
J documentation would be:
- use Tidy to convert it to XHTML: http://tidy.sourceforge.net/
- apply a text-type-output XSLT with Publish addon markup
- apply Publish addon to produce PDF
There may be a small driver tool that will take care
of folders, multiple files, images, etc.
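
A rough sketch of what such a driver might look like for the first step
only, in Python. The function name and the folder layout are invented
for illustration, and the tidy options shown (-q, -asxhtml, -o) should
be checked against the installed Tidy version. The sketch only plans the
command lines and does not run them; the XSLT and Publish addon steps
are left out because their interfaces are not described here.

```python
import os
import tempfile


def plan_tidy_commands(src_dir, out_dir):
    """Return one tidy command line per .htm/.html file under src_dir.

    Hypothetical helper: nothing is executed, so the plan can be
    inspected (or handed to subprocess.run) by the caller.
    """
    commands = []
    for folder, subdirs, files in os.walk(src_dir):
        subdirs.sort()  # walk folders in a predictable order
        for name in sorted(files):
            base, ext = os.path.splitext(name)
            if ext.lower() not in (".htm", ".html"):
                continue  # images and other files are not converted
            # Mirror the source folder structure under out_dir.
            rel = os.path.relpath(folder, src_dir)
            target = os.path.normpath(
                os.path.join(out_dir, rel, base + ".xhtml"))
            source = os.path.join(folder, name)
            commands.append(["tidy", "-q", "-asxhtml", "-o", target, source])
    return commands


# Small demonstration on a throwaway folder tree.
demo = tempfile.mkdtemp()
os.makedirs(os.path.join(demo, "src", "user"))
for rel in ("index.htm", "logo.png", os.path.join("user", "verbs.html")):
    open(os.path.join(demo, "src", rel), "w").close()

for cmd in plan_tidy_commands(os.path.join(demo, "src"),
                              os.path.join(demo, "out")):
    print(" ".join(cmd))
```

Keeping the planning separate from the execution makes it easy to check
which files would be touched before anything is converted.
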
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm