<anbu at peoplestring dot com> wrote:
> What are some of the reasons a new encoding will face challenges?
The main challenge to a new encoding is that UTF-8 is already present in
numerous applications and operating systems, and any encoding intended
to serve as an alternative to UTF-8, let alone a replacement for it,
must be "better enough" to justify re-engineering these systems.
Some people are simply opposed to additional encoding schemes. The HTML5
specification explicitly forbids the use of UTF-32, SCSU, and BOCU-1
(while allowing many non-Unicode legacy encodings and quietly mapping
others to Windows encodings); one committee member was quoted as saying
that other encodings of Unicode "waste developer time."
Any encoding that does not align code point boundaries along byte
boundaries will be criticized for requiring excessive processing. Others
will make the same argument that I made: if it is necessary to process
bit by bit, one might as well use a general-purpose compression
algorithm. It is popular to present gzip as the ideal compression
approach, since it is widely available, especially on Linux-type
systems, and publicly documented (and not IP-encumbered).
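As a rough illustration of the byte-boundary point, the Python sketch below
finds code point boundaries in UTF-8 by looking only at whole bytes, then
hands a block of text to gzip through the standard library. The helper name
code_point_starts and the sample strings are made up for this example; it is
a sketch under those assumptions, not a benchmark or a proposal.

```python
import gzip


def code_point_starts(data: bytes) -> list[int]:
    """Return the byte offsets where a code point begins in valid UTF-8.

    Hypothetical helper for illustration only. UTF-8 continuation bytes
    have the form 0b10xxxxxx; any other byte starts a new code point, so
    boundaries can be found without bit-by-bit stream processing.
    """
    return [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]


# Sample text (an assumption for the example): ASCII plus 2- and 3-byte characters.
sample = "na\u00efve caf\u00e9 \u20ac5"
encoded = sample.encode("utf-8")
starts = code_point_starts(encoded)

# One boundary per code point, found by inspecting single bytes.
assert len(starts) == len(sample)

# A general-purpose compressor works on the same byte stream and needs no
# knowledge of the encoding at all.
text = ("The quick brown fox jumps over the lazy dog. " * 50).encode("utf-8")
compressed = gzip.compress(text)
assert gzip.decompress(compressed) == text
print(len(text), len(compressed))
```

An encoding that packs code points across byte boundaries would need a
bit-level reader in place of the single mask test above, which is the
processing cost the objection is about.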
I may have missed some other objections.
--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell