It is important to distinguish two cases: (a) which UTF one should emit in web pages, and (b) which UTF one should use for internal processing. There is a tech note about this at http://www.unicode.org/notes/tn12/
Mark
__________________________________
http://www.macchiato.com

----- Original Message -----
From: "John Cowan" <[EMAIL PROTECTED]>
To: "steve" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Mon, 2004 Feb 23 04:50
Subject: Re: unicode format

> steve scripsit:
>
> > Could someone please clarify the difference between UTF8 and UFT16
> > please? If it is possible to encode everything in UTF8 and it is more
> > efficient what is the need for UTF16?
>
> The short version is that in UTF-8, characters can occupy 1, 2, 3, or
> (very rarely) 4 bytes; in UTF-16, characters can occupy 2 or (very
> rarely) 4 bytes. Either encoding can be used with any textual content.
>
> UTF-8 is typically more compact than UTF-16 for English and other
> Latin-alphabet languages, slightly more compact for Greek, Cyrillic,
> Armenian, Hebrew, and Arabic alphabets, and almost 50% less compact
> for everything else.
>
> --
> John Cowan [EMAIL PROTECTED] http://www.ccil.org/~cowan
> O beautiful for patriot's dream that sees beyond the years
> Thine alabaster cities gleam undimmed by human tears!
> America! America! God mend thine every flaw,
> Confirm thy soul in self-control, thy liberty in law!
> -- one of the verses not usually taught in U.S. schools
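
The byte counts in the quoted message are easy to check for yourself. Here is a minimal Python sketch. The sample characters and the helper name encoded_sizes are my own picks for illustration and are not from the thread. It encodes one character from each size class and reports how many bytes each encoding form needs.

```python
# Sample characters chosen for illustration (an assumption, not from the thread).
samples = {
    "Latin capital A (U+0041)": "A",
    "Greek small alpha (U+03B1)": "\u03b1",
    "CJK ideograph (U+4E2D)": "\u4e2d",
    "Gothic letter ahsa (U+10330)": "\U00010330",
}


def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for the given string."""
    # "utf-16-le" is used so that no byte order mark is added to the count;
    # the plain "utf-16" codec would prepend a 2-byte BOM.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for label, ch in samples.items():
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

The Greek letter comes out at 2 bytes in both forms, so the "slightly more compact" result for those alphabets in real text would come from spaces, digits, and punctuation, which take 1 byte in UTF-8 and 2 in UTF-16. The ideograph takes 3 bytes in UTF-8 against 2 in UTF-16, which is the case behind the "almost 50% less compact" figure.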

