On 3/22/2013 4:16 AM, Philippe Verdy wrote:
2013/3/22 Asmus Freytag <[email protected]>:
The number of conventions that can be applicable to certain punctuation
characters is truly staggering, and it seems unlikely that Unicode is the
right place to
a) discover all of them or
b) standardize an expression for them.
My intent is certainly not to discover and encode all of them. But
existing characters are well known for having very common distinct
semantics which merit separate encodings.

This claim would have to be scrutinized, and, to be accepted, would require very detailed evidence. Also, on what principles would you base the requirement to make a distinction in encoding?

And this notably includes their use as numeric grouping separators or decimal separators.

Well, the standard currently rules that such use does not warrant separate encoding - and the standard has been consistent about that for the entire 20+ years of its existence.

Further, all other character encoding standards have encoded these characters as unified with ordinary punctuation. This is very different from the ANO TELEIA discussion, where an argument could be made that *before* Unicode, the character occurred only in *specific* character sets - and that was a distinction that was lost when these character sets were mapped to Unicode.
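For reference, U+0387 GREEK ANO TELEIA did make it into the standard, but only with a canonical decomposition to the ordinary MIDDLE DOT, so the distinction evaporates under normalization anyway. A quick check (plain Python, standard unicodedata calls; the snippet itself is only my illustration):

    import unicodedata

    ano_teleia = "\u0387"   # GREEK ANO TELEIA
    middle_dot = "\u00B7"   # MIDDLE DOT

    # Singleton canonical decomposition: ANO TELEIA folds to MIDDLE DOT
    # under both NFD and NFC.
    print(unicodedata.decomposition(ano_teleia))                   # 00B7
    print(unicodedata.normalize("NFC", ano_teleia) == middle_dot)  # True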

No such argument exists for either middle dot or raised decimal point (except insofar as you could possibly claim that raised decimal point had never been encoded properly before, but then you'd have to show some evidence for that position).

Such common semantic modifiers would be easier to support than
encoding many new special variants of characters (that won't even be
rendered by most applications, and thus won't be used).

That might be the case - except that they would introduce a number of problems. Any "modifier" that has no appearance of its own can get separated from the base character during editing.

The huge base of installed software is not prepared to handle an entirely different *kind* of character code, whereas support for simple character additions is something that will eventually percolate through most systems - that fact makes disunifications a much more straightforward process.

Some examples: the invisible multiplication sign, the invisible
function sign,

Nah, these are not modifiers. They stand on their own. Their "invisibility" is not ideal, but not any worse than "word joiner" or "zwsp". All of these characters are separators - with the difference that the nature of the separator was determined to be crucial enough to encode explicitly. (And of course, reasonable people can disagree on each case).

Note that Unicode cloned several characters based on their word-break (or non-break) behavior, which is not a novel idea (earlier character encodings did the same with "no break space"). Already at that stage the train of having a "word break attribute character" (what you call a modifier) had left the station.
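To make that concrete: these are ordinary, standalone code points in the UCD, not attributes attached to a neighbouring character. A quick illustration (plain Python, standard unicodedata module - the snippet is only my illustration, not anything from the standard):

    import unicodedata

    # Each of these "invisible" characters is a character in its own
    # right, with its own code point, name and properties.
    for ch in ("\u2062",    # INVISIBLE TIMES
               "\u2061",    # FUNCTION APPLICATION
               "\u2060",    # WORD JOINER
               "\u200B",    # ZERO WIDTH SPACE
               "\u00A0"):   # NO-BREAK SPACE (the older precedent)
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")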

The only way to handle these issues, for better or for worse, is by disunification (where that can be justified in exceptional circumstances).

and even the Latin/Greek mathematical letter-symbols,
which were only encoded to capture style differences that carry
occasional but rare semantic distinctions. For me, adding those
variants was really pseudo-coding, breaking the fundamental encoding
model, complicating the task for font creators and renderer designers,
and greatly increasing the size and complexity of collation tables.

Many of these character variants could have been expressed as a base
character and some modifier (whose distinct rendering was only
optional), allowing a much easier integration and better use. Because
of that the UCD is full of added variants that are almost never
used, and we have to live with encoded texts that persist in using
ambiguous characters for the most common possible distinctions.

No, for the math alphabetics you would have had to have a modifier that was *not* optional, breaking the variation selector model.

There was certainly discussion of a "combining bold" or "combining italic" at the time.

One of the major reasons this was rejected was the desire to prevent the creation of such "operators" that could be applied to *every* character in the standard.

And, of course, the desire to allow ordinary software to do the right thing in displaying these - the whole infrastructure to handle such modifiers would have been lacking.

Further, when you use an italic "a" in math, you do not need most (or all) software to be aware that this relates to an ordinary "a" in any way. It doesn't, really, except in text-to-speech conversion or similar, highly specialized tasks. So, unlike variation selectors, there would have been no benefit in using a modifier.
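Where such specialized processing does need the relationship - a screen reader, say - it can recover it through compatibility normalization. A minimal Python sketch to illustrate (standard unicodedata calls only):

    import unicodedata

    italic_a = "\U0001D44E"            # MATHEMATICAL ITALIC SMALL A
    print(unicodedata.name(italic_a))

    # The styled letter is a code point of its own; only NFKC
    # compatibility normalization folds it back to the plain "a".
    print(unicodedata.normalize("NFKC", italic_a))   # a
    print(italic_a == "a")                           # False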

A./
