I think that we are perhaps getting a little off-topic now, but Unicode
will clearly help move computing forward, so perhaps the thread can continue
for a few more postings. :-)
At 17:45 +0100 97/11/10, Kent Karlsson [EMAIL PROTECTED] wrote:
> Let me reiterate:
> Unicode is ***NOT*** a glyph encoding!
...
>and never will be. The same character can be displayed as
>a variety of glyphs, depending not only of the font/style,
>but also, and this is the important point, on the characters
>surrounding a particular instance of the character. Also,
>a sequence of characters can be displayed as a single glyph,
>and a character can be displayed as a sequence of glyphs.
>Which will be the case, is often font dependent.
According to my Merriam-Webster's dictionary, a glyph is "a symbol that
conveys nonverbal information". So I am not sure what is wrong with
viewing Unicode as a collection of glyphs. :-)
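To make the character/glyph distinction above concrete, here is a small
present-day Python sketch (the code and the particular code points are my own
illustration, not anything from this discussion): the single rendered glyph
"e with acute" can be encoded either as one precomposed character or as a base
letter followed by a combining accent, and normalization is what relates the two.

  import unicodedata

  precomposed = "\u00E9"   # LATIN SMALL LETTER E WITH ACUTE (one character)
  combining = "e\u0301"    # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT (two characters)

  print(len(precomposed), len(combining))  # 1 2 -- different character counts, same glyph on screen
  print(precomposed == combining)          # False -- the character sequences differ
  # Canonical (NFC) normalization maps the combining sequence to the precomposed character:
  print(unicodedata.normalize("NFC", combining) == precomposed)  # True

So "glyph" in the dictionary sense and "character" in the Unicode sense really
do come apart: two different character sequences, one visual symbol.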
>I would be interested in knowing why you think
>"the idea of it as a character encoding thoroughly
>breaks down in a mathematical context". Deciding
>what gets encoded as a character is more an
>international social process than a mathematical
>process...
>PPS I don't know what you mean by "semantics of glyphs"
A symbol (or character) is, roughly, a graphical entity used to convey
semantic information; this is different from an illustration, which is a
graphical entity that takes certain semantic information as input, but from
which that semantic information may not be (fully) extractable.
For example, in Unicode, the Latin letters with diacritical marks are
classified as characters; this is reasonable, because European languages have
fixed sets of letter symbols. But in mathematics, this is usually
semantically wrong: a diacritical mark is usually an alteration of the symbol
to which it applies. As another example, Unicode has superscript digits
"^1", ..., "^0", but using them is mathematically wrong, because 10^1^2 (as
you have to write it in Unicode) is not the same thing as 10^{12}. In
natural languages, changing the style of a glyph usually does not alter its
semantic information, but in math, it usually does. Sometimes the Unicode
mathematical glyphs are classified as graphical entities, sometimes as
mathematical quantities.
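To spell out the exponent example, here is how the distinction looks in LaTeX
markup, where grouping is explicit (this markup is my own illustration, not
part of the original posting):

  % ten to the twelfth power: one exponent, explicitly grouped
  $10^{12}$
  % what the character string 10^1^2 actually encodes: a superscript 1
  % followed by a separate superscript 2, with no grouping information
  $10^{1}{}^{2}$
  % reading it as $(10^{1})^{2} = 100$ gives something quite different
  % from $10^{12}$

The grouping lives in the markup, not in the characters themselves; a purely
character-level encoding of superscripts throws that structure away.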
So Unicode is not sufficiently coherent and comprehensive to serve as a
math symbol encoding. (I do not claim the problem is easy to solve.)
In fact, there are some other projects underway studying this problem,
which may interact with Unicode in the future: one is MathML, another is
the LaTeX3 math encoding project. It is quite difficult to find good
classifications of math glyphs; I have great respect for those working on
that.
Hans Aberg
* Email: Hans Aberg <mailto:[EMAIL PROTECTED]>
* AMS member listing: <http://www.ams.org/cml/>