Whatever markup language is used, it needs to have the following characteristics, in addition to the common visual markup and control capabilities:

1. The markup must identify chapters, sub-chapters, sub-sub-chapters, etc., all with the appropriate titles. Each title should contain metadata identifying its parent chapter (or multiple parents) as well as its hierarchical level. Some would argue that while it is nice to know an author's suggested reading sequence or hierarchy (usually delineated in the Table of Contents), the document should contain enough semantic metadata (or extremely descriptive chapter titles) that a reader who wishes to define their own reading sequence can do so.

2. The markup should have a way to embed metadata that is attached at the chapter, sub-chapter, & paragraph level (used for alternate keywords, semantic indices and clues, assistant authors, last changed dates, previous versions, etc.). The metadata would be hidden in normal reading views, but could be found by search engines and exposed in specialized readers. This allows much more flexible and reliable searching and document analysis.

3. Embedded metadata indicating page break 'hints', so printing processes can know where new pages can effectively start when needed. Screen display engines can ignore these pagination cues, if the reader so desires.

4. A way to embed hyperlinks, with embedded identifying metadata, defining local or external links, alternate link text, etc.
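
To make the four requirements concrete, here is a minimal Python sketch of what such markup might look like and how a specialized reader could use it. Every element and attribute name in it (`section`, `parents`, `level`, `meta`, `pagebreak-hint`, `link`) is invented for illustration; I am not claiming any existing document language defines them.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: all element and attribute names are made up for this sketch.
DOC = """\
<document>
  <section id="ch1" level="1" parents="" title="Introduction">
    <meta keywords="overview, motivation" last-changed="2007-03-01"/>
    <para>Why structured documents matter.</para>
  </section>
  <section id="ch1.1" level="2" parents="ch1" title="Searching">
    <meta keywords="index, retrieval" last-changed="2007-03-02"/>
    <pagebreak-hint/>
    <para>See <link target="ch2" kind="local" alt="Rendering chapter">chapter 2</link>.</para>
  </section>
  <section id="ch2" level="1" parents="" title="Rendering">
    <meta keywords="PDF, print" last-changed="2007-03-03"/>
    <para>Output formats.</para>
  </section>
</document>
"""

root = ET.fromstring(DOC)


def outline(root):
    """Requirement 1: (level, id, title, parents) for each section, in document order."""
    return [
        (int(s.get("level")), s.get("id"), s.get("title"), s.get("parents").split())
        for s in root.iter("section")
    ]


def find_by_keyword(root, word):
    """Requirement 2: search the hidden metadata rather than the visible text."""
    hits = []
    for s in root.iter("section"):
        meta = s.find("meta")
        if meta is None:
            continue
        keywords = [k.strip().lower() for k in meta.get("keywords", "").split(",")]
        if word.lower() in keywords:
            hits.append(s.get("id"))
    return hits


def page_break_hints(root):
    """Requirement 3: ids of sections where a print process may start a new page."""
    return [s.get("id") for s in root.iter("section") if s.find("pagebreak-hint") is not None]


def links(root):
    """Requirement 4: (target, kind, alternate text) for every embedded link."""
    return [(a.get("target"), a.get("kind"), a.get("alt")) for a in root.iter("link")]


def reading_order(root, ids):
    """A reader-defined sequence: titles in whatever order the reader asks for."""
    by_id = {s.get("id"): s for s in root.iter("section")}
    return [by_id[i].get("title") for i in ids]
```

Keeping the sections flat and recording the parent ids as metadata, rather than nesting the elements, is what lets a section name more than one parent.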

The choice of PDF as a rendering format has as much to do with requirements and availability as anything. If the goal is to publish a document that has a predefined format (say 8.5 x 11 pages) that isn't to be re-formatted or edited by the readers, PDF is a reasonable choice. Readers are available for most platforms, and many PCs come with Adobe Reader pre-installed. PDF provides a simple way to distribute print-ready docs to computer-challenged readers.

This isn't to say that the original doc should be designed in PDF. PDF supports few of the above-mentioned features, as it is primarily a fixed-display rendering format. Adobe has made it easy to convert any document, in any format, into a PDF document, simply by printing it to the "Adobe printer" when the PDF printer driver is installed.

The original doc should probably be created in an XML-based document definition language, to include all of the document metadata discussed above, allowing it to be rendered in many different rendering formats, as required. I am not sure that any of the current doc definition languages support the requirements I describe above.
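
As a rough illustration of "one source, many renderings", the Python sketch below renders a small XML source two ways: as HTML for screen display, which ignores the page-break hint, and as plain text for printing, which honors it with a form feed. The element names and both renderers are assumptions of mine, not part of any existing document language or tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source markup; the element names are invented for this sketch.
SOURCE = """\
<document>
  <section level="1" title="Introduction">
    <para>Structured source, many outputs.</para>
  </section>
  <section level="2" title="Details">
    <pagebreak-hint/>
    <para>Fixed layout is only one target.</para>
  </section>
</document>
"""


def to_html(root):
    """Screen rendering: headings and paragraphs; page-break hints are ignored."""
    out = []
    for sec in root.iter("section"):
        level = int(sec.get("level"))
        out.append(f"<h{level}>{escape(sec.get('title'))}</h{level}>")
        for para in sec.iter("para"):
            out.append(f"<p>{escape(para.text or '')}</p>")
    return "\n".join(out)


def to_print_text(root):
    """Print rendering: indented plain text; a hint becomes a form feed."""
    out = []
    for sec in root.iter("section"):
        if sec.find("pagebreak-hint") is not None:
            out.append("\f")  # form feed: a new page may start here
        indent = "  " * (int(sec.get("level")) - 1)
        out.append(indent + sec.get("title").upper())
        for para in sec.iter("para"):
            out.append(indent + (para.text or ""))
    return "\n".join(out)


root = ET.fromstring(SOURCE)
html_view = to_html(root)
print_view = to_print_text(root)
```

A PDF renderer would be a third function over the same tree; the source itself never needs to know which targets exist.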

Skip Cave

<<<>>
Oleg Kobchenko wrote:
--- Chris Burke <[EMAIL PROTECTED]> wrote:
This discussion surfaced in 2004 under "[Jforum] Learning J"

http://www.jsoftware.com/pipermail/general/2004-January/016691.html
http://www.jsoftware.com/pipermail/general/2004-January/016690.html
http://www.jsoftware.com/pipermail/general/2004-January/016689.html

LaTeX is an interesting format, but it is loose, has a limited
audience and is not amenable to simple transformation with
tools like XSLT.

XML and XSLT are the best tools for the job, because they allow
transformations to be written declaratively and simply, without
writing parsers and programmatic tools.

In fact, XSL-FO is THE standard thanks to which XSLT came to be
as an expectedly ubiquitous byproduct, far surpassing its initial purpose.

Unlike LaTeX, a good input/raw format for creating documents
is a particular HTML version; e.g. HTML 4.01 is a good common
denominator of standards supported by most platforms, browsers
and applications, from email to word processors.
Moreover, HTML 4.01 is rich enough semantically and stylistically
to describe a wide variety of documents. It is a long-established
standard:
   http://www.w3.org/TR/html401/
It is what the XML variety is directly based upon
   http://www.w3.org/TR/xhtml1/
which means that it is amenable to XSLT transformation into
virtually anything, specifically into a format that can be
used by PDF publishing software.
For XSL-FO there are free and commercial products that do
specifically that: produce PDF from XML using the style and format
defined in XSLT.

In the case of the J Publish addon, a possible approach to handling
J documentation would be:
 - use Tidy to convert it to XHTML: http://tidy.sourceforge.net/
 - apply a text-type-output XSLT with Publish addon markup
 - apply Publish addon to produce PDF

There could also be a small driver tool that takes care
of folders, multiple files, images, etc.
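
The second step above (XHTML in, text markup out) can be prototyped before any stylesheet is written. The Python sketch below is a stand-in for the XSLT, not XSLT itself: a small rule table maps XHTML elements to output line prefixes. The prefixes (`.h1`, `.h2`, `.p`) are placeholders I made up; they are not the Publish addon's actual markup, which would have to be substituted in.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Declarative rule table: XHTML element name -> output line prefix.
# These prefixes are hypothetical placeholders, NOT real Publish addon markup.
RULES = {
    "h1": ".h1 ",
    "h2": ".h2 ",
    "p": ".p ",
}


def xhtml_to_text_markup(xhtml):
    """Emit one output line per element that has a rule, in document order."""
    root = ET.fromstring(xhtml)
    lines = []
    for el in root.iter():
        tag = el.tag.replace(XHTML_NS, "")  # drop the XHTML namespace prefix
        if tag in RULES:
            # Flatten inline children (<tt>, <em>, ...) and collapse whitespace.
            text = " ".join("".join(el.itertext()).split())
            lines.append(RULES[tag] + text)
    return "\n".join(lines)


# A tiny well-formed XHTML page, such as Tidy might produce from an HTML file.
SAMPLE = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Vocabulary</title></head>
  <body>
    <h1>Vocabulary</h1>
    <p>The verb <tt>+</tt> adds.</p>
    <h2>Examples</h2>
    <p>Try it   in a session.</p>
  </body>
</html>
"""

result = xhtml_to_text_markup(SAMPLE)
```

The rule table plays the role that template rules would play in a text-output XSLT stylesheet; elements without a rule, such as `title`, are simply skipped.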



For information about J forums see http://www.jsoftware.com/forums.htm

