On 15/11/08 13:51, Bram Moolenaar wrote:
>
> Dominique Pelle wrote:
>
>> In the second paragraph of ":help unicode", I read:
>>
>> <<
>> Unicode can be encoded in several ways. The two
>> most popular ones are UCS-2, which uses 16-bit
>> words and UTF-8, which uses one or more bytes
>> for each character.
>> >>
>>
>> I think this should be updated. The two most
>> popular ones are UTF-16 and UTF-8. From wikipedia,
>> "UCS-2 is an obsolete character encoding which is
>> a predecessor to UTF-16".
>>
>> Furthermore, strictly speaking, UCS-2 is not a
>> Unicode encoding, since it can only represent
>> characters in the Unicode page 0.
>>
>> UTF-16 and UCS-2 are too often confused with
>> each other. Let's not add to the confusion in Vim
>> help files.
>>
>> How about changing it to something like this...
>
> How about this:
>
> Unicode can be encoded in several ways. The most popular one is
> UTF-8, which uses one or more bytes for each character and is
> backwards compatible with ASCII. On MS-Windows UTF-16 is also
> used (previously UCS-2), which uses 16-bit words. Vim can
> support all of these encodings, but always uses UTF-8
> internally.
>
> UTF-16 and UCS-2 are confusing, mostly because on MS-Windows many
> programs only support UCS-2 and don't handle UTF-16 properly. Not many
> people in USA and Europe even know about the characters that UTF-16 adds
> or ever test with those. I suspect some code in Vim also doesn't work
> properly with UTF-16.
>
Well, at least most of Vim's Unicode code (I think all of it, except for
reading from and writing to disk when 'fileencoding' is UTF-16, big- or
little-endian) uses UTF-8, which is immune to any problems at the
U+FFFF / U+10000 boundary: UTF-8 needs no surrogate pairs, so code points
above U+FFFF are simply encoded as longer byte sequences.
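
To make the boundary concrete, here is a minimal Python sketch (not taken
from Vim's source; the helper name `describe` is made up for this example).
It counts UTF-8 bytes and UTF-16 code units for code points on either side
of U+FFFF / U+10000:

```python
def describe(code_point):
    """Return (UTF-8 byte count, UTF-16 code unit count) for one code point."""
    ch = chr(code_point)
    utf8 = ch.encode("utf-8")
    # "utf-16-be" avoids the byte order mark that plain "utf-16" prepends.
    utf16 = ch.encode("utf-16-be")
    return len(utf8), len(utf16) // 2


for cp in (0x41, 0xFFFF, 0x10000):
    n8, n16 = describe(cp)
    print(f"U+{cp:04X}: {n8} UTF-8 byte(s), {n16} UTF-16 code unit(s)")

# U+10000 is the first code point that UTF-16 must split into a
# surrogate pair; UCS-2 has no way to represent it at all.
print(chr(0x10000).encode("utf-16-be").hex())
```

Code that assumes one 16-bit word per character breaks at U+10000, while in
UTF-8 the same character is just one more multi-byte sequence.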

Best regards,
Tony.
--
Ignorance is the Mother of Devotion.
-- Robert Burton