Dear Unicode Maths team,

I've read your draft document

  http://www.unicode.org/unicode/reports/tr25/

with enthusiasm, and I have great hopes that this project for "Unicode
Plain Text Encoding of Mathematics" will progress well and be widely
implemented once it is finished!

I thought (from comp.text.sgml discussions in the early 1990s) that it
was widely accepted that SGML is in practice far too inconvenient for
entering mathematical text, and that any math DTD will not lead
naturally to intuitive and consistent keyboard entry techniques. That
is why I have always considered MathML more an academic exercise than
anything that I would ever really want to use to get work done. MathML
has never been anywhere near being a potential competitor for TeX.

I therefore observe with great interest that Unicode plans to treat
mathematics as just another complex script (like Indic, etc.), in such
a way that authors of SGML/XML document type definitions and style
sheets will finally not have to make much further provision for
mathematics than, for example, defining a single element for marking a
displayed equation. The prospect of being able to search for
mathematical formula fragments with web search engines is also
exciting.

A few comments on the current draft:

- It is not yet clear how white space is to be handled. In TeX, math
  mode has a lot of heuristics for adding white space where
  mathematical typographic tradition finds it convenient, for example
  around every operator. It has often been observed that scientific
  papers written in Word have mathematical spacing far inferior to
  that of papers written in TeX, because TeX's heuristic algorithms
  are far better than an inexperienced author. However, these
  heuristics fail frequently, and more often than is desirable TeX
  users have to override the math spacing manually with \, and the
  like.

  Your current text does not yet make it clear whether the additional
  white space used around mathematical operators will be added by the
  rendering engine and font (as in TeX) or will be encoded in the
  plain text. I suspect that encoding the white space in the plain
  text is ultimately preferable, as it will ensure more control in a
  portable way, even though that means that typographic beginners will
  be more likely to produce ugly formulas. Heuristics like TeX's would
  have to become part of the keyboard entry and style checking
  mechanisms of the editor (like the Word spell checker), not of the
  rendering engine. This should hopefully make results more
  predictable across a wide range of rendering engines.

- On section 5.1 "Recognizing Mathematical Expressions": With
  intra-formula white space being encoded in the plain text, and
  variables typically being written in the Plane 1 math characters,
  there should never be a need to delimit mathematical formulas
  explicitly from "normal text", since for the rendering engine they
  would just be normal text. In other words, it would be desirable if
  your proposal made section 5.1 unnecessary.

- What is missing at the moment is a mechanism for handling matrices,
  commutative diagrams and similar tabular arrangements of inline
  formulas. Most markup languages and rendering engines already have
  very sophisticated mechanisms for the layout of tables. I think the
  best approach would be simply to use, or slightly extend, the
  already available table mechanism to encode matrices. All that
  Unicode has to add is a combining modifier corresponding to TeX's
  \left and \right commands that instructs a delimiter glyph to grow
  with the height of the text in between, which could include an
  inline table with centered alignment. Don't duplicate what the
  existing table engines already provide. In that light, I would
  reconsider the need for the briefly mentioned align-over operator.

  Using the table mechanism of the higher-level markup language has
  numerous advantages:

  - the DTD keeps control over where matrices are allowed (e.g., only
    in displayed equations, but not inline and not in headings or
    keyword lists)

  - layout and cut&paste selection in tables is a very complex
    process; you really don't want to have to implement that twice

  It is true that plaintext Unicode matrices would simplify the
  cut&paste of matrices as well, but that is probably not worth the
  cost of blurring the currently quite clear interface between a
  paragraph rendering engine and a page/table layout engine.
  Dramatically simplified versions of MathML on top of plaintext
  Unicode math can still be used to encode matrices in a portable and
  reusable way.

- A stylistic comment: I think it would suit the text better not to
  spend so much time criticising TeX and MathML. Knowledgeable readers
  will be well familiar with TeX and will discover for themselves the
  advantages of your approach over existing practice, and the
  inadequacies of MathML are obvious to anyone who has had even a
  brief look at the entire idea of encoding formulas in XML.

The proposal is certainly still at an early stage, but it is heading
in the right direction and I will follow its progress with great
interest!

Markus

--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
