Edward Cherlin wrote on 2002-01-10 21:28 UTC:

> Unicode language tags are heavily deprecated. Language tagging is
> markup, and there is no point pretending you have plain text when you
> mark languages.

That's debatable, and the community around a language called
"mathematics" is heading in the exact opposite direction for good
reasons: they are trying to flee the horrid world of XML and other
markup languages and to get 90% of what they need into plaintext,
which will for the foreseeable future remain the only common
denominator data format that can transcend individual applications
well.

I see markup (in the sense of SGML et al.) as information that is
highly specific to a document type, as in "this shall be field 47 of a
patent application (inventor's mail address)". Everything more generic
than that (such as what language this is, perhaps even generic forms of
emphasis such as <EM>...</EM>) probably does indeed have a fair place
in a plaintext standard such as Unicode, because you definitely want
language tags (and simple generic emphasis) to survive a cut&paste from
a patent application in XML into a web page or a company memo in a
wordprocessing file. Otherwise, all markup languages would need the
exact same language tagging (we had that already with character sets,
remember? that's how we got to Unicode ...).

> If you want tagging in plain text, use a standard. As far as I can
> tell, the best available standard for such things is XML, which
> defines Unicode as its preferred character set.

I think this is rapidly becoming an old-fashioned and technically
hard-to-justify view these days, as it seems clear now that no single
standard markup language (in particular not XML with its baroque
syntax) has the potential to become as ubiquitous as some have
ho^Hyped 5 years ago. Plaintext remains a far more powerful concept.
XML is mostly a markup mechanism designed to overcome deficiencies in
ASCII, and it appears rather clumsy in a pure Unicode plaintext world.

Discuss.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
