On 18/10/08 10:41, JiaYanwei wrote:
> For a 2 byte BOM "FF FE", "ucs-2le" is used, which doesn't work for
> little-endian UTF-16 text.
> Like the patch 7.1.261, the only difference is the byte order.
> And I have also written a patch for Vim-7.2.025:

I confirm that Vim 7.2.025, with 'fencs' starting with "ucs-bom", identifies
UTF-16le files with a BOM as UCS-2le even when codepoints above U+FFFF
are present, which is an error. For instance, U+20025 is read back as
<d840><dc25> (two surrogates shown as distinct characters) instead of as
one double-wide character.
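
To illustrate where <d840><dc25> comes from, here is a small sketch in
Python (outside Vim, purely to show the encoding arithmetic; the helper
names are made up for this example and have nothing to do with Vim's
source). A UCS-2 reader takes each 16-bit unit as a character of its
own, while a UTF-16 reader combines a high and a low surrogate into one
codepoint:

```python
import struct


def utf16le_units(data):
    """Split little-endian bytes into 16-bit units, as a UCS-2 reader would."""
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into a single codepoint."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+20025 encoded as UTF-16le, without the BOM
data = "\U00020025".encode("utf-16-le")

units = utf16le_units(data)
print(" ".join("%04x" % u for u in units))    # the two surrogate units
print("U+%X" % combine_surrogates(*units))    # the single codepoint
print(len(data.decode("utf-16-le")))          # characters seen by a UTF-16 reader
```

Treating the same four bytes as UCS-2 therefore yields two characters
where UTF-16 yields one.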

Bram, there's work for you when you're back from holiday :-). I'm not 
competent to check the proposed patch by eyeball but I hope it does what 
is needed.

Yanwei, in the meantime I suggest the following autocommand (untested) 
as a workaround which doesn't need recompilation:

        au BufReadPost * if (&fenc == 'ucs-2le')  &&  &bomb
                \ | e ++enc=utf-16le | endif


Best regards,
Tony.
-- 
O give me a home,
Where the buffalo roam,
Where the deer and the antelope play,
Where seldom is heard
A discouraging word,
'Cause what can an antelope say?

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
