FYI, Markus Kuhn sent the following comments to the authors of PDUTR #25. They are of sufficient general interest to warrant discussion on the list.
A./

PS: I've refreshed the HTML, so people should have fewer problems reading the equations in section 5 on Win2K or XP.

>X-Mailer: exmh version 2.3+CL 01/14/2001 with nmh-1.0.4
>To: Barbara Beeton <[EMAIL PROTECTED]>, Asmus Freytag <[EMAIL PROTECTED]>,
>    Murray Sargent III <[EMAIL PROTECTED]>
>cc: [EMAIL PROTECTED] (linux-utf8)
>Subject: PDUTR #25: Unicode Support for Mathematics
>X-URL: http://www.cl.cam.ac.uk/~mgk25/
>Date: Thu, 03 Jan 2002 17:11:50 +0000
>From: Markus Kuhn <[EMAIL PROTECTED]>
>
>Dear Unicode Maths team,
>
>I've read with enthusiasm your draft document
>
>  http://www.unicode.org/unicode/reports/tr25/
>
>and have great hopes that this project for "Unicode Plain Text Encoding
>of Mathematics" will progress well and be widely implemented once it is
>finished!
>
>I thought (from comp.text.sgml discussions in the early 1990s) that it
>was in general widely accepted that SGML is in practice far too
>inconvenient for entering mathematical text and that any math DTD will
>not lead naturally to intuitive and consistent keyboard entry
>techniques, which is why I always considered MathML more an academic
>exercise than anything that I would ever really want to use to get work
>done. MathML has never been anywhere near being a potential competitor
>for TeX.
>
>I therefore observe with great interest that Unicode plans to treat
>mathematics as just yet another complex script (like Indic, etc.), in a
>way such that finally authors of SGML/XML document type definitions and
>style sheets will not have to make much further provision for support
>of mathematics than, for example, defining a single element for marking
>a displayed equation. Also the prospect of being able to search for
>mathematical formula fragments with web search engines is exciting.
>
>A few comments on the current draft:
>
> - It is not yet clear how white space is to be handled.
>   In TeX, the math mode has a lot of heuristics for adding white space
>   where mathematical typographic tradition finds it convenient, for
>   example around every operator. It has often been observed that
>   scientific papers written in Word often have far inferior
>   mathematical spacing than papers written in TeX, because TeX's
>   heuristic algorithms do far better than an inexperienced author.
>   However, these heuristics fail frequently, and more often than
>   desirable, TeX users have to manually override the math spacing
>   with \, and the like.
>
>   Your current text does not yet make it clear whether the additional
>   white space used around mathematical operators will be added by the
>   rendering engine and font (as in TeX) or will be encoded in the plain
>   text. I suspect encoding the white space in the plain text is
>   ultimately preferable, as it will ensure more control in a portable
>   way, even though that means that typographic beginners will be more
>   likely to produce ugly formulas. Heuristics like TeX's would have to
>   become part of the keyboard entry and style checking mechanisms of
>   the editor (like the Word spell checker), not of the rendering engine.
>   This should hopefully make results more predictable across a wide
>   range of rendering engines.
>
> - On section 5.1 "Recognizing Mathematical Expressions":
>   With intra-formula white space being encoded in the plain text, and
>   variables typically being written in the Plane 1 math characters,
>   there should never be a need to explicitly delimit mathematical
>   formulas from "normal text", as for the rendering engine, they would
>   just be normal text. In other words, it would be desirable if your
>   proposal didn't make section 5.1 necessary.
>
> - What is missing at the moment is a mechanism for handling matrices,
>   commutative diagrams and similar tabular arrangements of inline
>   formulas.
>   Most markup languages and rendering engines already have very
>   sophisticated mechanisms for the layout of tables. I think the
>   best approach would be to simply use or slightly extend the
>   already available table mechanism to encode matrices. All that Unicode
>   has to add is a combining modifier corresponding to TeX's \left and
>   \right commands that instructs a delimiter glyph to grow with the
>   height of the text in between, which could include an inline table
>   with centered alignment. Don't duplicate what the existing table
>   engines already provide. In that light, I would reconsider the need
>   for the briefly mentioned align-over operator.
>
>   Using the table mechanism of the higher markup language has numerous
>   advantages:
>
>     - the DTD keeps control over where matrices are allowed (e.g., only
>       in displayed equations, but not inline and not in headings or
>       keyword lists)
>
>     - layout and cut&paste selection in tables is a very complex
>       process, you really don't want to have to implement that twice
>
>   It is true that plain-text Unicode matrices would simplify the
>   cut&paste of matrices as well, but that is probably not worth the cost
>   of blurring the currently quite clear interface between a paragraph
>   rendering engine and a page/table layout engine. Dramatically
>   simplified versions of MathML on top of plain-text Unicode math can
>   still be used to encode matrices in a portable and reusable way.
>
> - A stylistic comment: I think it would suit the text better not
>   to spend so much time criticizing TeX and MathML.
>   Knowledgeable readers will be well familiar with TeX and will
>   discover for themselves the advantages of your approach over
>   existing practice, and the inadequacies of MathML are obvious to
>   anyone who has had even a brief look at the entire idea of encoding
>   formulas in XML.
>
>The proposal is certainly still in an early stage, but it is heading in
>the right direction and I will follow its progress with great interest!
>
>Markus
>
>--
>Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
>Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
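
To make the white-space comment above a little more concrete, here is a rough sketch of what an editor-side spacing heuristic could look like if the spaces are stored in the plain text rather than added by the renderer. Nothing in it comes from PDUTR #25 or from TeX: the function name, the operator set, and the choice of U+205F MEDIUM MATHEMATICAL SPACE as the default spacing character are all assumptions made for illustration only.

```python
# Hypothetical editor-side helper; not part of PDUTR #25 and not TeX's algorithm.

# Assumption: U+205F MEDIUM MATHEMATICAL SPACE is the space stored in the text.
MEDIUM_MATH_SPACE = "\u205F"

# Assumption: a small illustrative set of binary operators and relations
# (+, -, =, <, >, MINUS SIGN, MULTIPLICATION SIGN, <=, >=).
BINARY_OPERATORS = set("+-=<>\u2212\u00D7\u2264\u2265")


def add_operator_spacing(formula: str, space: str = MEDIUM_MATH_SPACE) -> str:
    """Return the formula with an explicit space on each side of binary operators."""
    # Discard existing white space so the result does not depend on what was typed.
    chars = [c for c in formula if not c.isspace()]
    out = []
    for i, ch in enumerate(chars):
        is_binary = (
            ch in BINARY_OPERATORS
            and 0 < i < len(chars) - 1                # needs an operand on both sides
            and chars[i - 1] not in BINARY_OPERATORS  # after an operator it is a sign
            and chars[i - 1] != "("                   # after "(" it is a sign as well
        )
        out.append(space + ch + space if is_binary else ch)
    return "".join(out)


if __name__ == "__main__":
    # An ASCII space is passed here only to make the result visible in a terminal.
    print(add_operator_spacing("a+b=c", " "))
```

Because the spaces end up in the text itself, any renderer would show the same spacing, and an author could still delete or change an individual space by hand where the heuristic guesses wrong.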