<anbu at peoplestring dot com> wrote:

What are some of the reasons a new encoding will face challenges?

The main challenge to a new encoding is that UTF-8 is already present in numerous applications and operating systems, and that any encoding intended to serve as an alternative to UTF-8, let alone a replacement for it, must be "better enough" to justify re-engineering these systems.

Some people are simply opposed to additional encoding schemes. The HTML5 specification explicitly forbids the use of UTF-32, SCSU, and BOCU-1 (while allowing many non-Unicode legacy encodings and quietly mapping others to Windows encodings); one committee member was quoted as saying that other encodings of Unicode "waste developer time."

Any encoding that does not align code point boundaries with byte boundaries will be criticized for requiring excessive processing. Others will repeat the argument I made: if it is necessary to process text bit by bit, one might as well use a general-purpose compression algorithm. It is popular to present gzip as the ideal compression approach, since it is widely available, especially on Linux-type systems, and publicly documented (and not IP-encumbered).
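
To make the byte-boundary point concrete, here is a rough Python sketch, not a benchmark. It shows that code point boundaries in UTF-8 can be found by testing one whole byte at a time, and that gzip can be layered on top of UTF-8 with a standard library call. The sample string and the helper name code_point_starts are my own illustrative choices, not part of any proposal discussed here.

```python
import gzip

def code_point_starts(data: bytes) -> list[int]:
    """Return the byte offsets at which a code point begins in UTF-8 data."""
    # A UTF-8 continuation byte has the bit pattern 10xxxxxx.
    # Every other byte begins a new code point, so no bit shifting
    # across byte boundaries is needed to find the boundaries.
    return [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]

# Illustrative sample: Latin text with two accented letters and two CJK characters.
sample = "na\u00efve caf\u00e9, \u6771\u4eac"
utf8 = sample.encode("utf-8")
starts = code_point_starts(utf8)

# General-purpose compression applied on top of UTF-8.
compressed = gzip.compress(utf8)
roundtrip = gzip.decompress(compressed).decode("utf-8")

print(len(sample), "code points in", len(utf8), "bytes; starts at", starts)
print("gzip output:", len(compressed), "bytes")
```

For a string this short the gzip output will typically be larger than the input because of the fixed header and trailer, so any real comparison would need text of realistic length.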

I may have missed some other objections.

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell
