Victor Gaultney wrote,

> If however, we say that this "does not adequately consider the harm done
> to the text-processing model that underlies Unicode", then that exposes a
> weakness in that model. That may be a weakness that we have to accept for
> a variety of reasons (technical difficulty, burden on developers, UI impact,
> cost, maturity).

Unicode's character encoding principles and underlying text-processing model remain robust.  They are the foundation of modern computer text processing.  The goal of 𝑛𝑒 𝑝𝑙𝑢𝑠 𝑢𝑙𝑡𝑟𝑎¹ needs to accommodate both the best expectations of the end users and the fact that the model's consistent approach eases the burden on software developers: an effective programming solution that supports one subset or range of characters can be applied to the other subsets of the Unicode repertoire, and those solutions can be shared with other developers in a standard fashion.

Assigning properties to characters gives any conformant application clear instructions as to what exactly is expected as the app encounters each character in a string.  In simpler times, the only expectation was that the application would splat a glyph onto a screen (and/or sheet of paper) and store a binary string for later retrieval.  We've moved forward.
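
As a rough illustration of what "properties" means in practice, here is a minimal sketch using Python's standard unicodedata module.  The describe() helper and the sample characters are my own choices for illustration, and the values reported depend on the version of the Unicode Character Database bundled with the interpreter.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect a few character properties an application might consult.

    Hypothetical helper for illustration; not part of any standard API.
    """
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),          # General_Category
        "bidi_class": unicodedata.bidirectional(ch),   # Bidi_Class
        "combining_class": unicodedata.combining(ch),  # Canonical_Combining_Class
    }

# A Latin capital, a combining accent, a Hebrew letter, and a digit.
for ch in ("A", "\u0301", "\u05D0", "5"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

The same lookup works unchanged for any character in the repertoire, which is the point about solutions carrying over from one range to another.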

'Unicode encodes characters, not glyphs' is a core principle.  There's a legitimate concern whenever anyone is perceived as heading in the general direction of turning the character encoding into a glyph registry, as it suggests a possible step backwards and might lead to a slippery slope.  For example, if italics are encoded, why not fraktur and Gaelic?²
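
For what it's worth, the standard already records that the mathematical italic letters are font variants of ordinary Latin letters.  The sketch below, again assuming Python's unicodedata module, shows the compatibility decomposition and what NFKC normalization does to a styled string; the variable names are mine, and this is only an illustration, not a recommendation for how applications should handle such text.

```python
import unicodedata

# "ne plus ultra" spelled with MATHEMATICAL ITALIC SMALL letters (U+1D44E..).
styled = (
    "\U0001D45B\U0001D452 "
    "\U0001D45D\U0001D459\U0001D462\U0001D460 "
    "\U0001D462\U0001D459\U0001D461\U0001D45F\U0001D44E"
)

# The character database tags each letter as a <font> variant of a base letter.
decomp = unicodedata.decomposition("\U0001D45B")
print(decomp)

# Compatibility normalization folds the styled letters to plain Latin letters.
folded = unicodedata.normalize("NFKC", styled)
print(folded)
```

In other words, a process that applies NFKC sees the same characters either way; the italic shape is treated as presentation.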

The notion that any given system can't be improved is static.³ ("System" refers to Unicode's repertoire and coverage rather than its core principles.  Core principles are rock solid by nature.)

¹ /ne plus ultra/
² "Conversely, significant differences in writing style for the same script may be reflected in the bibliographical classification—for example, Fraktur or Gaelic styles for the Latin script. Such stylistic distinctions are ignored in the Unicode Standard, which treats them as presentation styles of the Latin script."  Ken Whistler, http://unicode.org/reports/tr24/
³ "Static" can be interpreted as either virtually catatonic or radio noise.  Either is applicable here.
