[Etoile-discuss] LaTeX: The good, the bad, and the just plain wrong.

David Chisnall Sun, 29 Jul 2007 07:45:40 -0700

I've just finished writing a book and a thesis using LaTeX, and so I  
think I am now in a good position to evaluate the system, and suggest  
ideas we should steal and ideas we should abandon.


_The Good_

There are a few really nice things about LaTeX.  The first is that  
it's easy to type.  Commands are of the form \foo{wibble}, which is a  
lot easier than something like <foo>wibble</foo>, for example.  If  
we're considering using something a bit more advanced than a plain  
text editor, this isn't a problem for the format, it's a problem for  
the UI.  I haven't seen a word processor that lets you enter markup  
information faster than I can enter LaTeX markup in vim (particularly  
since I have written a 'do what I mean' script for the more verbose  
parts).

The real advantages of something like LaTeX are the clear separation  
of presentation and content, and the ability to use semantic markup.   
When I am writing a long document, I am concerned with the  
structure.  I don't care how it will look, because everything will
get re-flowed as I write more, and having it all dancing around
is just distracting.  I want to say 'this is a code listing in C,'
'this is a section heading' etc., and worry about how they appear
later.

Finally, LaTeX produces really gorgeous output.  Something like Word,  
which tries to make ligatures of really strange combinations of  
letters, just looks hideous in comparison.  The OS X text system is  
not far off, so GNUstep has the potential to give us similar quality  
if I poke my SoC student hard enough.

_The Not-So-Good_

LaTeX is built on TeX.  I don't actually use LaTeX, I use LaTeX plus  
a load of packages, plus a load of my own additions.  This presents  
real problems for a text editor.  Consider the following two snippets:

\code{foo += bar}

\codefile[caption={A short listing}]{examples/short.c}

In these two snippets, the argument to the first should be
spell-checked according to the rules for C (i.e. spell check strings
and comments, nothing else).  In the second, the value associated
with the caption key should be spell-checked, but nothing else.
Unfortunately, an editor has no way of knowing which of these  
arguments are commands and which are displayed text.  Something like  
aspell knows about some basic LaTeX commands, and will spell check  
those correctly.  Everything else is guesswork.  Since TeX is
Turing-complete, the only way of knowing this for certain is to
execute the program and see if it outputs the text.  TeX has the
same limitation as PostScript in this respect; the only way of
executing page 100 is to execute pages 0-99 and discard the result.
Even on a Core 2 Duo,
it takes a good ten to thirty seconds to typeset a decent length  
document, which is prohibitively expensive for an interactive process  
such as a spell-check-as-you-type feature.
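
As a rough sketch of what such guesswork looks like in practice, the
Python below keeps a hand-written table describing how the arguments of
each known command should be treated, and falls back to a guess for
anything else.  The RULES table, its keys and the function names are
all invented for illustration; \code and \codefile are the custom
commands from the snippets above, and an editor would need a similar
entry for every macro an author defines.

```python
import re

# Hypothetical per-command rules.  "body" says how to treat the mandatory
# argument; "options" lists the optional keys whose values are prose.
RULES = {
    "code": {"body": "c"},                                  # body is C source
    "codefile": {"options": {"caption"}, "body": "skip"},   # body is a path
}

# \name[optional]{mandatory}, matched against a whole one-line snippet.
COMMAND = re.compile(r"\\(\w+)(?:\[(.*)\])?\{(.*)\}")
# key={braced value} or key=bare value inside the optional argument.
OPTION = re.compile(r"(\w+)=(?:\{([^{}]*)\}|([^,]*))")
# String literals, block comments and line comments in C source.
C_TEXT = re.compile(r'"((?:[^"\\]|\\.)*)"|/\*(.*?)\*/|//(.*)')


def c_text(source):
    """Return the contents of strings and comments in a line of C."""
    found = []
    for match in C_TEXT.finditer(source):
        text = next(g for g in match.groups() if g is not None)
        found.append(text.strip())
    return found


def spellcheck_regions(snippet):
    """Return the pieces of a snippet that a spell checker should see."""
    match = COMMAND.fullmatch(snippet.strip())
    if match is None:
        return [snippet]          # not a command: ordinary prose
    name, options, body = match.groups()
    rule = RULES.get(name)
    if rule is None:
        return [body]             # unknown macro: guess that the body is prose
    regions = []
    for opt in OPTION.finditer(options or ""):
        value = opt.group(2) if opt.group(2) is not None else opt.group(3)
        if opt.group(1) in rule.get("options", ()):
            regions.append(value.strip())
    if rule["body"] == "c":
        regions.extend(c_text(body))
    elif rule["body"] == "text":
        regions.append(body)
    return regions
```

With these rules, \code{foo += bar} yields nothing to check and the
\codefile snippet yields only its caption.  The fallback for unknown
macros is exactly the guess described above: it is right for something
like \emph{wibble} and wrong for any user-defined command whose
argument is a file name or a key.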

The other disadvantage of this approach is that it's easy to turn  
LaTeX into PDF, but very hard to go the other way.  Even with  
pdfsync, clicking on the PDF will only take you within a paragraph or  
two of the markup that generated a word.

I said earlier that there was a clean separation between semantic  
markup and presentation code, but this isn't really enforced.  TeX  
doesn't do it, LaTeX tries to.  In my own documents, I write using  
semantic markup and then define commands elsewhere that describe how  
to display these, but it requires some effort and discipline to  
maintain this proper level of abstraction.

_The Ugly_

TeX.  TeX is a really hideous language.  Writing TeX feels like  
writing assembly code for a '70s stack-based 8-bit architecture.   
Some ideas from this era died an untimely death, but this was one
that deserved to be staked, decapitated, and buried under a
crossroads with a communion wafer in its mouth to ensure it never
arose again.  Any time you need to go below the surface of LaTeX, you are
in TeX-land, and cursing the lack of anything that even looks a bit  
like a high-level language.

_What We Can Do Better_

The first thing to remove is the in-band signalling that UNIX-users  
love so much.  Because TeX is plain text containing markup, you need  
to insert all of the formatting commands into the text.  We can avoid  
this by placing them in the NSDictionary provided by NSAttributedString.

Doing this also allows us to completely separate markup and  
transformations.  We should define some basic keys (e.g. heading  
depth), and users could add others if they wanted.  We then define  
bundles which perform the transform from the set of semantic tags to  
a set of syntactic ones (we can include things like pulling in code  
from a file and performing syntax highlighting on it as more complex  
examples).
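
A minimal sketch of this split, written in Python rather than
Objective-C for brevity.  Each run of text carries a dictionary of
semantic attributes, in the way NSAttributedString associates a
dictionary with a range of characters, and a 'bundle' is modelled as a
plain function from semantic attributes to presentation attributes.
The Run class, the key names (heading_depth, code_language, font, size)
and print_bundle are all assumptions made for the example, not part of
any existing API.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """A span of text plus its out-of-band attribute dictionary."""
    text: str
    attributes: dict = field(default_factory=dict)


def print_bundle(semantic):
    """Hypothetical bundle: map semantic attributes to presentation ones."""
    if "heading_depth" in semantic:
        depth = semantic["heading_depth"]
        # Deeper headings get smaller type, but never below 12pt.
        return {"font": "sans-bold", "size": max(12, 24 - 4 * depth)}
    if "code_language" in semantic:
        return {"font": "monospace", "size": 10}
    return {"font": "serif", "size": 11}


def transform(document, bundle):
    """Apply a bundle to every run, leaving the semantic document untouched."""
    return [Run(run.text, bundle(run.attributes)) for run in document]


# The text itself contains no markup; everything lives in the dictionaries.
document = [
    Run("The Good", {"heading_depth": 1}),
    Run("There are a few really nice things about LaTeX."),
    Run("foo += bar", {"code_language": "c"}),
]
styled = transform(document, print_bundle)
```

Because the transform returns new runs, the same semantic document can
be passed through several bundles (screen, print, and so on) without
any of them seeing the others' output.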

Since we have this nice abstraction, our editor can work much more  
closely with the structure of the document.  We can switch directly  
between different views (semantic / presentation) and have something  
like an outline view for the structure while maintaining things like  
references as links.
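
To illustrate, here is a small Python sketch of how an outline and
live references could be derived from the attributes alone, without
parsing any markup.  Runs are modelled as (text, attributes) pairs, and
the keys heading_depth, label and reference are hypothetical names
chosen for the example; the sample headings are made up as well.

```python
def outline(runs):
    """Return heading titles, indented two spaces per level below the top."""
    lines = []
    for text, attributes in runs:
        depth = attributes.get("heading_depth")
        if depth is not None:
            lines.append("  " * (depth - 1) + text)
    return lines


def resolve_references(runs):
    """Map the index of each referring run to the index of its target run."""
    labels = {
        attributes["label"]: index
        for index, (_, attributes) in enumerate(runs)
        if "label" in attributes
    }
    return {
        index: labels[attributes["reference"]]
        for index, (_, attributes) in enumerate(runs)
        if attributes.get("reference") in labels
    }


document = [
    ("The Good", {"heading_depth": 1, "label": "good"}),
    ("Finally, LaTeX produces really gorgeous output.", {}),
    ("The Ugly", {"heading_depth": 1}),
    ("TeX", {"heading_depth": 2}),
    ("see The Good", {"reference": "good"}),
]

for line in outline(document):
    print(line)
```

Since a reference stores the label rather than a position, it keeps
pointing at the right run when text is inserted or moved, which is what
lets the editor present it as a link.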

There are probably some other things.  Comments?  Suggestions?   
Ramblings?  Flames?

_______________________________________________
Etoile-discuss mailing list
[email protected]
https://mail.gna.org/listinfo/etoile-discuss
