On 12 Jul 2012, at 15:54, Julian Bradfield wrote:

> On 2012-07-12, Hans Aberg wrote:
>>> There are many characters that TeX users use that are not in
>>> Unicode.
>>
>> All standard characters from TeX, LaTeX, and AMSTeX should be there,
>
> What's a standard character? There's no such thing.
> To take a random entry from the LaTeX Symbol Guide, where is the
> \nrightspoon symbol from the MnSymbol package? (A negated multimap
> symbol.)
>
> Not to mention the symbols I've used from time to time, because

You tell me, because I posted a request for missing characters in different forums. Perhaps you invented it after the standardization was made?

>> them. In math, you can always invent your own characters and styles,
>
> people do.

You and others who know about those characters must make proposals if you want to see them become part of Unicode.

>> in fact you could do that with any script, but it is not possible
>> for Unicode to cover that. There is though a public use area, where
>> one can add one's own characters.
>
> You mean "private use". Crazy thing to do, because then you have to
> worry about whether your PUA code point clashes with some other
> author's PUA code point.

There is some system for avoiding that. Perhaps someone else here can fill in the details.

>>> Because TeX is agnostic about such matters, one can set up any
>>> convenient encoding for the input data (which is really the source
>>> code of a program). For example, I have written documents in ASCII,
>>> Latin-1, Big5, GB, UTF-8 and probably others. This is very convenient;
>>> but it's only a convenience.
>>
>> UTF-8 only is simplest for the programmer that has to implement it.
>
> Some of us are more concerned with users than programmers.

Well, if the programmers don't implement it, you are left out in the cold.

> Beside, all
> the work for the "legacy" encodings has already been done. I wouldn't
> ever want to go back to "ISO alphabet soup" for Latin etc., but for
> CJK, the legacy codings are still sometimes convenient - for example,
> if I write in Big5, I don't have to worry about telling my editor to
> find a traditional Chinese font rather than a simplified or Japanese
> font. It uses a Big5 font, and that's it.

Before UTF-8, in the 1990s, some Russians used multi-encoded text files with TeX/LaTeX, but I doubt they do that anymore. Use whatever you like.

>> LuaTeX and the older XeTeX support UTF-8. They are available in TeX Live.
>> http://www.tug.org/texlive/
>
> They aren't TeX.

Clearly not, since TeX is not developed anymore.

> Neither working mathematicians nor publishers nor
> typesetters like dealing with constantly changing extensions and
> variations on TeX - one of the biggest selling points of TeX is
> stability. (Defeated somewhat by the instability of LaTeX and its
> thousands of packages, but that's another story.)
> If I need to write complex - or even bidi - scripts routinely, I'd
> probably be forced into one of them; but the typical mathematician
> doesn't.

I do not see your point here.

>>> One problem, of course, is that there is no MATHEMATICAL ROMAN set of
>>> characters. This is one of the biggest botches in the whole
>>> mathematical alphanumerical symbol botch.
>>
>> This was discussed here before; the LaTeX unicode-math package has options
>> to control that (see its manual). For example, one gets a literal
>> interpretation by:
>
> Exactly. TeX can do what it likes.

No. TeX cannot handle UTF-8, and as I recall, LaTeX's ability to emulate it was limited.

> But you said it was an incompatibility
> with Unicode that TeX sets plain ASCII math letters as italic,
> implying that TeX should not be allowed to do what it likes.

In LuaTeX or XeTeX, it is obviously relative to the original TeX definitions, the ones most people are used to.

>>> If you encode semantic font
>>> distinctions without requiring the use of higher-level markup, then
>>> you need to encode also letters that are semantically distinctively
>>> roman upright.
>>
>> It has already been encoded as mathematical style, see the "Mathematical
>> Alphanumeric Symbols" here:
>> http://www.unicode.org/charts/
>
> *You* look. The plain upright style is unified with the BMP characters.

Yes, that is why the Unicode paradigm departs from the TeX one.

>>> A more general problem is that which font styles are meaningful,
>>> depends on the document. For example, I give lectures and talks, and I
>>> set my slides in sans-serif.
>>> As I don't (usually) use distinctive
>>> sans-serif symbols in my work, the maths is all in sans-serif
>>> too: form, not content. But what then should I see if I type a Unicode
>>> mathematical italic symbol in my slides? Serif, or sans-serif?
>>
>> It is up to you. The unicode-math package, mentioned above, has options to
>> control that.
>
> Of course it's up to me. I'm glad you agree. So why say that it's an
> incompatibility with Unicode that TeX (by default) displays ASCII as
> italic in maths? Are you changing your mind on that? I welcome that if
> so, as that was what I found surprising.

You yourself noted that, for consistent Unicode use, the BMP characters must be used for upright. That is incompatible with TeX, which sets them as italic.

> (And, of course, it's much easier to use the established TeX
> mechanisms for controlling these things, than to learn more options
> for a package to allow me to use symbols that are hard to type and
> even harder to distinguish clearly on screen.)

That is because there are currently no convenient input methods, as also mentioned earlier in this thread.

>> It is traditional in pure math, and also in the physics books I have looked
>> into, to always use serif. Possibly sans-serif belongs to another technical
>> style. Unicode makes it possible to mix these styles on the character level,
>> if you so will.
>
> It's also traditional, for mostly good reasons to do with the limited
> resolution of projectors, to use sans-serif in presentations. The only
> reason that most people still have serifed maths is that they don't
> know how to do otherwise (\usepackage{cmbright} is enough for most
> people, if only they knew).

Yes, low resolution is a motivation for using sans-serif, but that may change with the new high-resolution displays arriving around now.

Hans
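
P.S. For anyone who wants to check the unification point for themselves, here is a minimal sketch. It assumes Python 3 and its standard unicodedata module, neither of which comes from this thread, and the helper name describe is made up for illustration. It looks up the mathematical italic small a, shows that its compatibility (NFKC) form is the plain BMP letter, and shows that a private use code point reports the general category "Co".

```python
import unicodedata

# U+1D44E lies in the Mathematical Alphanumeric Symbols block.
math_italic_a = "\U0001D44E"


def describe(ch):
    """Return (code point, Unicode name, NFKC form) for a single character.

    Hypothetical helper, written only for this illustration.
    """
    code_point = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch)
    # NFKC folds compatibility characters onto their plain counterparts.
    folded = unicodedata.normalize("NFKC", ch)
    return (code_point, name, folded)


# The italic letter has its own code point and name ...
print(describe(math_italic_a))
# ... while the upright letter is just the ordinary BMP character.
print(describe("a"))

# U+E000 is the first code point of the BMP Private Use Area;
# private use characters carry the general category "Co".
print(unicodedata.category("\uE000"))
```

The NFKC result only shows how the character database relates the styled letter to the plain one. It says nothing about how TeX or any particular font will render either of them.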

