It is important to distinguish two cases: (a) which UTF one should emit in
web pages, and (b) which UTF one should use for internal processing. There
is a tech note about this at http://www.unicode.org/notes/tn12/

Mark
__________________________________
http://www.macchiato.com

----- Original Message ----- 
From: "John Cowan" <[EMAIL PROTECTED]>
To: "steve" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Mon, 2004 Feb 23 04:50
Subject: Re: unicode format


> steve scripsit:
>
> > Could someone please clarify the difference between UTF8 and UTF16
> > please?  If it is possible to encode everything in UTF8 and it is more
> > efficient what is the need for UTF16?
>
> The short version is that in UTF-8, characters can occupy 1, 2, 3, or
> (very rarely) 4 bytes; in UTF-16, characters can occupy 2 or (very
> rarely) 4 bytes.   Either encoding can be used with any textual content.
>
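> UTF-8 is typically more compact than UTF-16 for English and other
> Latin-alphabet languages, slightly more compact for Greek, Cyrillic,
> Armenian, Hebrew, and Arabic alphabets (2 bytes per letter in both, but
> only 1 byte in UTF-8 for spaces, digits, and ASCII punctuation), and
> almost 50% less compact for everything else (typically 3 bytes per
> character in UTF-8 against 2 in UTF-16).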
>
> -- 
> John Cowan  [EMAIL PROTECTED]  http://www.ccil.org/~cowan
> O beautiful for patriot's dream that sees beyond the years
> Thine alabaster cities gleam undimmed by human tears!
> America! America!  God mend thine every flaw,
> Confirm thy soul in self-control, thy liberty in law!
>         -- one of the verses not usually taught in U.S. schools
>
>
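
The byte counts quoted above can be checked with a small sketch like the
one below. It is only an illustration, not part of the original exchange:
the helper name encoded_sizes and the sample strings are hypothetical, and
sizes are measured without a byte order mark (hence "utf-16-le").

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for text, no BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Single characters: 1, 2, 3, or 4 bytes in UTF-8; 2 or 4 bytes in UTF-16.
# These sample characters are assumptions picked for illustration.
single_chars = {
    "U+0041 (Latin A)": "\u0041",
    "U+00E9 (e with acute)": "\u00e9",
    "U+4E2D (CJK ideograph)": "\u4e2d",
    "U+1D11E (musical G clef)": "\U0001d11e",
}

# Short texts in different scripts, written as escapes to stay ASCII-safe.
texts = {
    "English": "hello world",
    # Russian "privet mir": 9 Cyrillic letters and one space
    "Cyrillic": "\u043f\u0440\u0438\u0432\u0435\u0442 \u043c\u0438\u0440",
    # Chinese "ni hao shi jie": 4 CJK ideographs
    "Chinese": "\u4f60\u597d\u4e16\u754c",
}

for label, sample in {**single_chars, **texts}.items():
    utf8_len, utf16_len = encoded_sizes(sample)
    print(f"{label}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

For these samples the English text is half the size in UTF-8 (11 bytes
against 22), the Cyrillic text is one byte smaller (19 against 20) because
only the space shrinks, and the Chinese text is 50% larger (12 against 8).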

