Re: Having trouble opening utf-16le files.

Brian G. Shacklett Fri, 18 May 2012 14:15:49 -0700

On Friday, May 18, 2012 3:03:21 PM UTC-4, Tony Mechelynck wrote:
> On 18/05/12 19:08, Brian G. Shacklett wrote:
> > Many Windows 7 administrative tools seem to use utf-16le as their default 
> > file output. The tree command and the DNS administrative console are two 
> > examples. These files are generated with the proper BOM (<FF><FE>), but 
> > $encoding is empty when the file is opened, thus the text is unreadable.
> >
> > I checked the value of 'fileencodings' and it is set to "ucs-bom" in a fresh 
> > window.  Here is my _vimrc:
> >
> > set nocompatible
> > source $VIMRUNTIME/vimrc_example.vim
> >
> > set diffexpr=MyDiff()
> > function MyDiff()
> >    let opt = '-a --binary '
> >    if &diffopt =~ 'icase' | let opt = opt . '-i ' | endif
> >    if &diffopt =~ 'iwhite' | let opt = opt . '-b ' | endif
> >    let arg1 = v:fname_in
> >    if arg1 =~ ' ' | let arg1 = '"' . arg1 . '"' | endif
> >    let arg2 = v:fname_new
> >    if arg2 =~ ' ' | let arg2 = '"' . arg2 . '"' | endif
> >    let arg3 = v:fname_out
> >    if arg3 =~ ' ' | let arg3 = '"' . arg3 . '"' | endif
> >    let eq = ''
> >    if $VIMRUNTIME =~ ' '
> >      if &sh =~ '\<cmd'
> >        let cmd = '""' . $VIMRUNTIME . '\diff"'
> >        let eq = '"'
> >      else
> >        let cmd = substitute($VIMRUNTIME, ' ', '" ', '') . '\diff"'
> >      endif
> >    else
> >      let cmd = $VIMRUNTIME . '\diff'
> >    endif
> >    silent execute '!' . cmd . ' ' . opt . arg1 . ' ' . arg2 . ' > ' . arg3 . eq
> > endfunction
> >
> > set gfn=Consolas:h9:cANSI
> > syntax on
> > set tabstop=4
> > set shiftwidth=4
> >
> > set nobackup
> > set nowritebackup
> >
> 
> 
> In order to be able to read Unicode files, it is your responsibility to 
> set 'encoding' to some Unicode-compatible encoding in your vimrc. You 
> may have to set some other options too.
> 
> - Typical 'encoding' value for Unicode: utf-8
> - If 'encoding' is set to one of the following, Vim will actually use 
> UTF-8 internally to avoid null bytes as part of multi-byte characters:
>       ucs-2 aka ucs-2be
>       ucs-2le
>       utf-16 aka utf-16be
>       utf-16le
>       ucs-4 aka ucs-4be aka utf-32 aka utf-32be
>       ucs-4le aka utf-32le
> - For the ucs-4 aka utf-32 family of encodings, additional "mixed 
> endiannesses" (byte-orders) 3412 and 2143 are recognised by the 
> standards but I don't know if Vim +iconv +multi_byte knows about them. 
> The answer may depend on your version of iconv. If supported, they 
> should also be represented as UTF-8 internally.
> - The Chinese-oriented encoding GB18030 (which uses typically 1 byte for 
> ASCII, 2 bytes for ordinary Chinese characters, 4 bytes for other 
> Unicode codepoints including some "rare" CJK characters) also allows 
> representation of any Unicode codepoint, but unlike the above-mentioned 
> encodings (whose conversion into each other is algorithmic) conversion 
> between GB18030 and other Unicode encodings requires the use of bulky 
> tables. I recommend it only for texts containing mostly Chinese 
> characters with the possibility of other Unicode codepoints appearing 
> occasionally.
> 
> For details, see http://vim.wikia.com/wiki/Working_with_Unicode
> 
> 
> Best regards,
> Tony.
> -- 
> Good leaders being scarce, following yourself is allowed.





Tony, setting encoding to utf-8 seems to have done the trick. Files open 
properly without my intervention now. Forgive my ignorance, but why is this not 
the default at this point? 
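
For reference, a minimal sketch of the kind of _vimrc lines involved. Only the 'encoding' setting comes from this thread; the explicit 'fileencodings' list is an assumption on my part (it roughly matches what Vim picks on its own once 'encoding' is a Unicode value), and the file name in the last comment is hypothetical.

```vim
" Put this near the top of _vimrc, before anything that depends on it.
" Use UTF-8 internally so any Unicode file can be represented.
set encoding=utf-8

" Assumed detection order: check for a BOM first (this is what picks up
" the <FF><FE> of utf-16le files), then try UTF-8, then fall back to latin1.
set fileencodings=ucs-bom,utf-8,latin1

" After opening a file, this shows which encoding was detected:
"   :set fileencoding?
" To force an encoding for a single file without changing the vimrc:
"   :e ++enc=utf-16le tree-output.txt
```

With 'encoding' left at its 8-bit Windows default, the "ucs-bom" entry alone does not appear to be enough, which would match what I saw before the change.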

-- 
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
