Re: Unicode in VFAT file system

Mark Davis Fri, 21 Jul 2000 07:48:47 -0700

Unicode has changed and evolved over the years. At this point, UCS-2 is a funny
beast, because it shares precisely the same encoding space as UTF-16. That is,
in code units there is absolutely no difference between them. The only real
difference is whether you interpret the code units in the range D800..DFFF
as surrogates. (Interpret them correctly, of course!)
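
To make the point about D800..DFFF concrete, here is a minimal Python sketch. The function name decode_utf16_units is made up for this illustration, and it assumes the code units are already available as a list of integers. A UCS-2 reading takes each unit as one character; a UTF-16 reading combines a high/low surrogate pair into one supplementary code point.

```python
def decode_utf16_units(units):
    """Hypothetical helper: read 16-bit code units as UTF-16 code points."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Combine the surrogate pair into one supplementary code point.
            high = unit - 0xD800
            low = units[i + 1] - 0xDC00
            code_points.append(0x10000 + (high << 10) + low)
            i += 2
        else:
            # BMP code unit (or an unpaired surrogate, passed through as-is).
            code_points.append(unit)
            i += 1
    return code_points


units = [0x0041, 0xD835, 0xDC5B]          # the same code units in both views
ucs2_view = list(units)                   # UCS-2: three 16-bit values
utf16_view = decode_utf16_units(units)    # UTF-16: 'A' plus U+1D45B

print([hex(u) for u in ucs2_view])
print([hex(cp) for cp in utf16_view])
```

The input is identical in both cases; only the interpretation of the two units in D800..DFFF changes.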

As a serialization, UTF-16 has three forms: UTF-16, UTF-16BE, and UTF-16LE. The
first is with (optionally) a BOM, and the others without. Since UCS-2 shares the
same coding space, and thus serialization, it's not really a good idea to speak
of UCS-2LE etc.; much better to just use the UTF-16 names.
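
The three serialization names can be tried out with Python's built-in codecs. This is only an illustration with an arbitrary sample string; note that Python's "utf-16" encoder always writes a BOM and uses the machine's native byte order, which is one permitted choice rather than the only one.

```python
text = "A\u0416"  # Latin A followed by Cyrillic ZHE (U+0416)

big_endian = text.encode("utf-16-be")     # no BOM, most significant byte first
little_endian = text.encode("utf-16-le")  # no BOM, least significant byte first
with_bom = text.encode("utf-16")          # BOM first, then native-order code units

print(big_endian.hex())      # 00410416
print(little_endian.hex())   # 41001604
print(with_bom[:2].hex())    # the BOM: fffe or feff, depending on the platform

# All three serializations carry the same code units and decode to the same text.
decoded = [
    big_endian.decode("utf-16-be"),
    little_endian.decode("utf-16-le"),
    with_bom.decode("utf-16"),
]
print(all(d == text for d in decoded))
```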

The best way I find to think of UCS-2 at this point is *not* as another
encoding, but rather simply as a shorthand
for a particular supported subset of UTF-16. In that way, it is like other
subsets: for example, I can talk about the Cyrillic-block repertoire in UTF-16.
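
Seen this way, "is this UCS-2?" becomes a repertoire check on UTF-16 text, just like "is this all in the Cyrillic block?". A rough Python sketch follows; both function names are invented for this example, and it works on Python strings, which are sequences of code points rather than code units.

```python
def in_ucs2_subset(s):
    """Hypothetical check: every character fits in one 16-bit code unit."""
    return all(ord(ch) <= 0xFFFF for ch in s)


def in_cyrillic_block(s):
    """Hypothetical check: every character lies in the Cyrillic block U+0400..U+04FF."""
    return all(0x0400 <= ord(ch) <= 0x04FF for ch in s)


samples = ["\u0416\u0438", "A\u0416", "A\U0001D45B"]
for s in samples:
    print(ascii(s), in_ucs2_subset(s), in_cyrillic_block(s))
```

The last sample contains U+1D45B, which needs a surrogate pair in UTF-16, so it falls outside the UCS-2 subset even though it is perfectly good UTF-16.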

Mark
