Re: Unicode, SMS and year 2012

Doug Ewell Sat, 28 Apr 2012 12:53:35 -0700

<anbu at peoplestring dot com> wrote:

What are some of the reasons a new encoding will face challenges?

The main challenge to a new encoding is that UTF-8 is already present in numerous applications and operating systems, and that any encoding intended to serve as an alternative to UTF-8, let alone a replacement for it, must be "better enough" to justify re-engineering these systems.

Some people are simply opposed to additional encoding schemes. The HTML5 specification explicitly forbids the use of UTF-32, SCSU, and BOCU-1 (while allowing many non-Unicode legacy encodings and quietly mapping others to Windows encodings); one committee member was quoted as saying that other encodings of Unicode "waste developer time."

Any encoding that does not align code point boundaries with byte boundaries will be criticized for requiring excessive processing. Others will make the argument I made: if it is necessary to process bit by bit, one might as well use a general-purpose compression algorithm. It is popular to present gzip as the ideal compression approach, since it is widely available, especially on Linux-type systems, and publicly documented (and not IP-encumbered).
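
To make the byte-alignment point concrete, here is a rough Python sketch, not a benchmark. It shows that code point boundaries in UTF-8 can be found by looking at whole bytes only, and that the same bytes can be handed to gzip unchanged. The helper name utf8_boundaries and the Tamil sample string are my own choices for illustration, and repetitive text like this compresses far better than real messages would.

```python
import gzip


def utf8_boundaries(data: bytes) -> list[int]:
    """Return the byte offsets at which a code point starts in UTF-8 data."""
    # Continuation bytes have the form 0b10xxxxxx; every other byte
    # begins a new code point. No bit-level stream parsing is needed.
    return [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]


# Arbitrary sample text (hypothetical input, repeated to give gzip
# something to work with).
sample = "வணக்கம் " * 20

encoded = sample.encode("utf-8")
starts = utf8_boundaries(encoded)

# General-purpose compression applied on top of the byte-aligned encoding.
compressed = gzip.compress(encoded)
restored = gzip.decompress(compressed).decode("utf-8")

print("code points:", len(sample))
print("UTF-8 bytes:", len(encoded))
print("gzip bytes: ", len(compressed))
```

The size comparison depends entirely on the input: for very short strings, the fixed gzip header and trailer can outweigh any savings.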


I may have missed some other objections.

--
Doug Ewell | Thornton, Colorado, USA

http://www.ewellic.org | @DougEwell
