A small point, and just because it hasn't been mentioned yet (as far as I have seen):
I guess that the main reason that RFC 2277, etc. point to UTF-8 is that both the Internet protocols (v4 and v6) and most hardware work with 8-bit bytes, and there is absolutely no indication that this will change soon.

Regards,   Martin.

At 11:15 02/03/29 +0100, Keld Jørn Simonsen wrote:
>On Fri, Mar 29, 2002 at 12:40:41PM +0900, Bruce Thomson wrote:
>
> > The question is really why 8/16/32 bit Unicode is better than 5bit (ACE)?
> >
> > ACE and UTF-8 are just compression algorithms that squeeze larger
> > Unicode characters. ACE is more efficient than UTF-8, although more
> > complex.
> >
> > But the claim to fame that UTF-8 has is that it is a standard that
> > idn can reference, and it is coming into widespread use elsewhere.
> >
> > So moving to UTF-8 long term seems like such an obvious choice
> > that it surprises me that it even gets debated.
>
>I think UTF-8 is the way to go forward too, and I actually think
>this is also IESG policy, viz. RFC 2277 and RFC 2130. The UTF-8 RFC 2279 is
>the only standards-track RFC on charsets for the same reason. Citing
>from RFC 2277, the IESG policy on character sets and languages:
>
> "Protocols MUST be able to use the UTF-8 charset, which consists of
> the ISO 10646 coded character set combined with the UTF-8 character
> encoding scheme, as defined in [10646] Annex R (published in
> Amendment 2), for all text."
>
>I don't mind having an ASCII fallback, as we also have it in email, but
>going into the other encoding forms of ISO 10646 is discouraged by the
>IESG, as they do not want to see a lot of encodings for the same
>character set, with the possible problems of interoperability.
>So UCS-2, UCS-4, UTF-16, UTF-16-LE, UTF-16-BE etc. are discouraged.
>
>Best regards
>Keld Simonsen
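[As a concrete illustration of the two encodings the thread is comparing (not part of the original messages), Python's built-in codecs can show both side by side: UTF-8 maps a label to a plain sequence of 8-bit bytes, while the ACE form is the "xn--" punycode encoding later standardized for IDN. The label "bücher" is just a sample; any non-ASCII label works.]

```python
# A sketch using Python's standard codecs; "bücher" is an example label,
# not taken from the thread.
label = "bücher"

# UTF-8: an octet-oriented encoding, which is why 8-bit bytes in
# protocols and hardware make it a natural fit.
utf8 = label.encode("utf-8")   # b'b\xc3\xbccher' -- 7 octets

# ACE: an ASCII Compatible Encoding (punycode with the xn-- prefix),
# squeezing the same characters into the letters-digits-hyphen subset.
ace = label.encode("idna")     # b'xn--bcher-kva' -- ASCII only

print(utf8)
print(ace)
```

Both outputs are byte strings; the difference is only in which octets are allowed, which is the sense in which the thread calls them "just compression algorithms" over the same ISO 10646 character set.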
