Re: [Groff] moving TOC to start

Keith MARSHALL Wed, 28 Sep 2005 05:38:01 -0700

Werner Lemberg wrote, quoting me:
>> Since I first saw this technique, in Tadziu's own `tmac.diss'
>> documentation, I've been toying with ideas on how I might adapt it
>> for `pdfroff'.  I'd actually like to engineer the concept directly
>> into groff itself.  [...]
>
> I'm not convinced that this is the right way.  For example, it is
> recommended today not to call LaTeX directly but to use texi2dvi or a
> similar script (or using an IDE like AUCTeX) which decently runs the
> (la)tex program as often as needed, together with all the necessary
> postprocessors like BibTeX or makeindex.


I'm not sure that you've fully understood my proposal; this is exactly
how pdfroff currently works, and I don't foresee any change in this
strategy, in respect of resolving references or index entries.

>> I'd be willing to work on such an implementation, in parallel with
>> ongoing development of `pdfroff' and `pdfmark.tmac', if there is any
>> interest in pursuing the concept further.
>
> Please compare your proposal with the (IMHO simpler) approach of
> running groff multiple times.  I'd really like to see pdfroff as an
> equivalent to texi2dvi.

Currently pdfroff runs multiple groff passes, up to a maximum of four,
to resolve cross references, and generate a reference dictionary; (I
make the assumption that, if the reference dictionaries from passes
three and four aren't identical, then there is a layout stability
problem, which will require manual intervention to resolve).  Naturally,
if two consecutive passes earlier than the fourth produce identical
dictionaries, then no further passes are required, so none are performed.
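
To illustrate the pass-limiting logic described above, here is a minimal
Python sketch.  It is not pdfroff's actual code; `resolve_references' and
`run_pass' are hypothetical names, and `run_pass' stands in for one groff
pass which reads the previous reference dictionary and returns a new one.

```python
from typing import Callable, Dict, Tuple

RefDict = Dict[str, str]


def resolve_references(
    run_pass: Callable[[RefDict], RefDict],  # hypothetical: one groff pass
    max_passes: int = 4,
) -> Tuple[RefDict, int, bool]:
    """Repeat passes until two consecutive dictionaries are identical.

    Returns (dictionary, passes_used, stable).  `stable' is False when
    the pass limit was reached without two consecutive passes agreeing,
    which is the case assumed to need manual intervention.
    """
    previous: RefDict = {}
    current: RefDict = previous
    for n in range(1, max_passes + 1):
        current = run_pass(previous)
        # The first pass has nothing to be compared with.
        if n > 1 and current == previous:
            return current, n, True
        previous = current
    return current, max_passes, False
```

With a document whose references settle immediately, this stops after the
second pass; if the dictionaries never agree, it gives up after the fourth
and reports the result as unstable.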

I don't propose any change in this strategy, for resolving cross
references within a document.

After generating the reference dictionary, pdfroff then performs two
further passes, one to capture the table of contents into its own
PostScript file, the second to capture the document body text.  The two
PostScript files so generated, together with an optional third
describing cover sheet and title page layout, are passed to GhostScript,
in the correct order to produce the finished PDF document.

This strategy does work, but its handling of the collation of table of
contents to the start of the document is, IMHO, ugly and inefficient.
I use the `ms' macro set, which has its own built-in table of contents
generation mechanism; the problem with this is that it places the table
of contents at the *end* of the document, rather than at the beginning,
where it belongs.  UTP suggests that the solution to this is to manually
collate the table of contents pages to their proper location, after
printing the document; this might be ok for a small print run of a
document which is never intended for anything other than output to
paper; it is quite unacceptable, IMHO, for publishing in electronic
formats such as PDF.

To work around this, as noted above, pdfroff collects table of contents
and body text into two separate PostScript files, using independent
groff passes to collect each file.  The problem is that each of these
files contains as many pages as the other, this being the total number
of pages in the complete document, both table of contents and body.  The
two files are differentiated by using the `\O0' escape to suppress
output of body text while generating the table of contents file, and
table of contents output, when generating the body text file.  However,
this mechanism of output suppression doesn't inhibit the output of page
markers, so the table of contents file in particular contains a number
of initial blank pages, where the body text would normally be placed,
while the body text file has trailing blank pages which would normally
contain the table of contents.  To eliminate these blank pages, pdfroff
runs the PostScript output through a `sed' filter, which discards pages
with no content -- crude, ugly, but effective; unfortunately, it has the
potential side effect of discarding pages which were intentionally left
blank.

All I am proposing is that we add a capability in groff, which would
allow us to mark distinct "collating segments" within the troff
intermediate output stream.  This is effectively what Tadziu's method
does; he simply uses the existing page marks, with a page number of 1,
to indicate the starting point of a new segment.  He then assumes that
there are two such segments, and that the second should be collated
before the first.  My proposal extends this concept to provide a more
flexible, and controllable, collating mechanism; it would still rely on
an external program, (my so-called grocol), to perform the collation, in
a similar fashion to Tadziu's `sed' script.
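
As a rough illustration of the two-segment case, here is a Python sketch
of that kind of collation, applied to troff intermediate output.  It is
neither Tadziu's `sed' script nor grocol; the name `collate' is
hypothetical, and the sketch assumes that page marks appear as `p1' (or
`p 1') on a line of their own, and that the trailer begins at a spelled
out `x trailer' or `x stop' line.

```python
import re
from typing import List

# Assumption: a new segment starts at a page mark with page number 1.
PAGE_ONE = re.compile(r"^p\s*1\s*$")
# Assumption: the trailer commands are written out in full.
TRAILER = re.compile(r"^x\s+(trailer|stop)\b")


def collate(lines: List[str]) -> List[str]:
    """Move the second page-1 segment in front of the first."""
    header: List[str] = []
    segments: List[List[str]] = []
    trailer: List[str] = []
    for line in lines:
        if trailer or TRAILER.match(line):
            trailer.append(line)        # everything from the trailer on
        elif PAGE_ONE.match(line):
            segments.append([line])     # a page 1 mark opens a segment
        elif segments:
            segments[-1].append(line)   # content of the current segment
        else:
            header.append(line)         # device prologue before page 1
    if len(segments) != 2:
        return list(lines)              # not the assumed layout; no change
    body, contents = segments
    return header + contents + body + trailer
```

The sketch only reorders lines; it doesn't renumber pages, and it ignores
any font or size state which a moved segment might inherit from the text
that originally preceded it.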

I hope this makes the idea clearer.

Best regards,
Keith.

