Re: Tag characters and in-line graphics (from Tag characters)

Asmus Freytag (t) Fri, 05 Jun 2015 03:49:30 -0700

On 6/4/2015 17:03 , "Chris" wrote:

This whole discussion is about the fact that it would be technically possible to have private character sets and private agreements that your OS downloads without the user being aware of it.

The sticky issues are not the questions of how to make available fonts or images for use by the OS.

Instead, they concern the fact that any such model violates some pretty basic guarantees of plain text that the entire net infrastructure relies on.

There are very obvious security issues. They start with tracking; every time you access a custom code point, that fact potentially results in a trackable interaction. This problem affects even the "sticker" solution that people are hoping for with emoji. (On my system, no external resources are displayed when I first open any message, and there is a reason for that.)

Beyond tracking, and beyond stickers (that is, pictures that look like pictures), a generalized custom character set would allow "text" that is no longer really stable. You would be able to deliver identical e-mails to people that display differently, because when you serve the custom fonts, you would be able to customize what you deliver under the same custom character set designator.

While this would be a wonderful way to circumvent censorship (other than the "man in the middle" version), you would likewise seriously undermine the ability to filter unwanted or undesirable texts, because the custom character set engine might recognize when a request comes from a filter and not the end user. (Just the other day, I came across a hacked website that responded differently to search engines than to live users, making the hack effective for one and invisible to the other. Custom character sets would seem to just add to the hackers' arsenal here.)

Finally, custom character sets sound like a great idea when thinking of an extension of an existing character set. But that's not where the issues are. The issues come in when you use the same technology to provide aliases for existing code points or for other custom characters.

Aliasing undermines the ability to do search (or any other content-focused processing, from sorting to spell-check).
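
To make the aliasing problem concrete, here is a minimal sketch in Python. It assumes a purely hypothetical private agreement under which the Private Use Area code point U+E000 is displayed as "a"; the alias table and the helper name are invented for illustration and do not correspond to any real mechanism.

```python
# Hypothetical private agreement: U+E000 (Private Use Area) is displayed as "a".
# This table stands in for whatever a custom character set would deliver.
CUSTOM_ALIASES = {"\ue000": "a"}


def resolve_aliases(text, aliases):
    """Replace each aliased code point with the character it stands for."""
    return "".join(aliases.get(ch, ch) for ch in text)


# Under the assumed agreement this displays as "plain text".
message = "pl\ue000in text"

# A search that sees only the code points misses the aliased spelling.
found_raw = "plain" in message

# The search succeeds only if the processor also holds the alias table.
found_resolved = "plain" in resolve_aliases(message, CUSTOM_ALIASES)

print(found_raw, found_resolved)
```

The same gap would presumably apply to sorting or spell-checking: any process that does not hold the same table as the display engine sees different text than the reader does.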


At that point, the circle closes.

When Unicode was created, the alternative then was ISO 2022, which was a standard that addressed the issue of how to switch among (albeit pre-defined) character sets to achieve, in principle, coverage equal to the union of these character sets.

Unicode was created to address two main deficiencies of that situation. Unification addressed the aliasing issue, so that code points were no longer "opaque" but could be interpreted by software (other than display); that opacity was the second big drawback of the patchwork of character sets. A processing model for opaque code points is possible to define, but it isn't very practical, and in the late eighties people had had enough and were glad to be quit of it.

Seen from this perspective, the discussion about custom character sets presents itself as a giant step backward, undermining the very advances that underlie the rapid acceptance and spread of Unicode.

A./
