On Sat, 21 Apr 2018 08:16:36 -0500 Nate Bargmann <n...@n0nb.us> wrote:
> > For lack of a better term, I think it's an abstraction mismatch.
> >
> > The ditroff language presupposes a dot-addressable canvas, onto
> > which lines and strings of text are drawn.  That model fits most
> > printers (these days) and terminals.  But it doesn't describe HTML
> > at all.
>
> I suppose it depends on what one expects from the generated HTML.  As
> one who reads pages more than writes them, I've been impressed with
> the presentation on the man-pages project Web site (hosted at
> kernel.org).  For example, here is the rendering of groff_man(7):
>
> http://man7.org/linux/man-pages/man7/groff_man.7.html
>
> I couldn't find the generator being used in the Git repository and a
> lot of it may be done with CSS.  The text is rendered using the <pre>
> tag so it looks much like tty output though it is not fully justified
> yet the text blocks are indented.  Aside from the justification, the
> rest looks very familiar to me.

I find very little to commend that version.  In fact, it's an excellent
example of the widespread dunderheaded monospace manpage rendering on
the web.  I invite you to compare it with something better:

	https://linux.die.net/man/7/groff_man

(Better URL, too, btw: "man/7/groff_man" captures everything
"man-pages/man7/groff_man.7.html" does in half the space.)

(One has to wonder, though, at the age of that document.  For
amusement, follow the link in See Also.)

There's nothing in the manpage input text that specifies the font
family to be used.  The PostScript rendering uses proportional fonts
and italics, much more pleasant to read than nroff output in a
terminal.  Why shouldn't the HTML output make the highest and best use
of its medium, instead of poorly emulating 40-year-old obsolete
hardware?

> Note that I am only working with Groff's man macro package and do
> understand that other macro packages may have greater demands on the
> HTML generator.

It's not the demands on the HTML *generator* that present the problem.
The input to grohtml is devoid of information HTML itself demands, and
is thick with information it can't use.

	HTML needs:       titles, paragraphs, tables
	ditroff provides: positions, text, fonts

The only reason it works at all is that the pre-grohtml preprocessor
sneaks some useful information to the postprocessor via ditroff
escapes.  That allows grohtml to generate MathML from eqn output, for
example.

While it's possible to work that way, I see no advantage to squirting
most of the needed information to the postprocessor through the
formatter, when the formatter's work is discarded by the postprocessor.
Why bother?  Just translate the macros directly.  Which, I suppose, is
what Ingo is doing already.

--jkl