2016-07-01 16:11 GMT+03:00 Gabriel Barta <[email protected]>:
> <...>
>> Which OS?
> It is the same on linux, mac and windows.
>
> To see it happen, try:
> new|exe 'norm iTest'|set fenc=utf-16le|set bomb|exe '%!xxd'|%!xxd -r
>
> You will get a --No lines in buffer-- message from vim, and an empty buffer
> where it should say Test. Removing the last command shows that it was
> fine up until trying to convert from hex back to text.
>
>>
>> On a Unix-like OS, you might try (assuming a bash-like shell)
>>
>> LC_CTYPE=en_US.UTF-8 LC_ALL= xxd
>
> I don't think changing the locale for xxd can help - xxd converts between
> 7-bit ascii and untranslated binary.
>
> My previous workaround with using latin1 only happened to work because
> the file I was playing with didn't contain any unicode codepoints. Just a
> case of windows using utf-16 for the joy of it.
>
> I might have something that works for unicode buffers with BOM (see below),
> but I think the real answer might be that it is reasonable to view text files
> in hex mode, but it is a bit silly to want to be able to edit them.
>
> fun! s:unhexify()
> let l:unibomb = !&binary && (&fileencoding=~#'^u') && &bomb
> if (l:unibomb)
> let l:fenc = &fileencoding
> let l:enc = &encoding
> let l:tenc = &termencoding
> setlocal fenc=utf-8
> setlocal enc=utf-8
You are not supposed to alter &encoding after startup, ever: changing
&encoding automatically corrupts all strings. Also note that `setlocal`
here is misleading: `&encoding` is a purely global option.
Consider your code running on a system with EBCDIC support and
&encoding set to ebcdic (I am not sure that encoding=utf8 will work
there, though). AFAIK this is the only case in which &encoding is not
ASCII-compatible, i.e. the only case where it might need to be set
*at all*, assuming xxd on such systems does not expect EBCDIC:
0. Before the function runs, all strings are in EBCDIC, and &encoding
is EBCDIC as well.
1. You set &fileencoding to UTF-8 from whatever it was (e.g.
UTF-16LE). This changes nothing so far, because it only affects which
encoding the buffer is converted to before writing, filtering, etc.
2. You set &encoding to UTF-8 from EBCDIC. This makes Vim think that
all internal strings are UTF-8 *without re-encoding those strings*.
(The function may well stop executing at this point: Vim does not keep
an AST, it keeps the function lines as-is and re-parses them on each
run, including e.g. each iteration of a loop. But assume it did not.)
3. You set &termencoding to nothing. This is an absolutely useless
action which can only result in a corrupted display, if it has any
effect at all.
4. Now you run `silent %!xxd -r`. &fileencoding is UTF-8 and
&encoding is UTF-8, but the actual text is still EBCDIC. Vim passes
the EBCDIC text to `xxd` unchanged, since UTF-8 (&encoding) does not
need to be converted to UTF-8 (&fileencoding).
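To make steps 0-4 concrete outside of Vim, here is a rough Python
sketch. It is only a toy model, not Vim's actual code: the function
name `bytes_sent_to_filter` is made up for illustration, and cp037 is
just one EBCDIC code page picked as an assumption. It shows that
relabelling both encodings as UTF-8 leaves the stored bytes untouched.

```python
def bytes_sent_to_filter(stored: bytes, encoding: str, fileencoding: str) -> bytes:
    """Toy model: convert buffer text from `encoding` to `fileencoding` for a filter."""
    if encoding == fileencoding:
        # Same label on both sides: nothing to convert, bytes pass through as-is.
        return stored
    return stored.decode(encoding).encode(fileencoding)


# Step 0: the buffer text really is EBCDIC (cp037 is an assumed code page).
stored = "Test".encode("cp037")

# Steps 1-2: both options are relabelled as UTF-8; `stored` is never re-encoded.
sent = bytes_sent_to_filter(stored, encoding="utf-8", fileencoding="utf-8")

# Step 4: the filter still receives the EBCDIC bytes, not ASCII "Test".
print(sent == stored)
print(sent == "Test".encode("utf-8"))

# Leaving the real &encoding alone and changing only &fileencoding converts properly.
ok = bytes_sent_to_filter(stored, encoding="cp037", fileencoding="utf-8")
print(ok)
```

In this model the only call that produces ASCII-compatible bytes for
the filter is the last one, where the internal encoding label still
matches the bytes that are actually stored.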
You may see the same thing with the following code:
LANG=C vim -u NONE -i NONE -N --cmd 'source /tmp/test.vim' --cmd cq 2>&1 | iconv -f latin1
" /tmp/test.vim
scriptencoding utf-8
function Corrupt()
  setlocal termencoding=latin1
  setlocal encoding=latin1
  setlocal fileencoding=latin1
  call setline('.', ["«»"])
  echomsg string(getline(1))
  %!hexdump -C
  echomsg string(getline(1))
  %delete _
  call setline('.', ["«»"])
  echomsg string(getline(1))
  setlocal termencoding=utf-8
  setlocal encoding=utf-8
  setlocal fileencoding=utf-8
  echomsg string(getline(1))
  %!hexdump -C
  echomsg string(getline(1))
  %delete _
  call setline('.', ["«»"])
endfunction
call Corrupt()
The output will be:
'«»'
'00000000 ab bb 0a |...|'
'«»'
'<ab><bb>'
'00000000 ab bb 0a |...|'
Note that although you changed &encoding from latin1 to utf-8, what
hexdump received did not change at all. All you got was a corrupted
view of `«»`.
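The same effect can be reproduced with plain byte strings. The Python
sketch below is an illustration only: `view` and the `hexescape` error
handler are names I made up, and rendering undecodable bytes as <xx>
merely imitates what the messages above show. The stored bytes stay
`ab bb` throughout; only their interpretation changes.

```python
import codecs


def _hex_escape(err):
    # Render each undecodable byte as <xx> and resume decoding after it.
    bad = err.object[err.start:err.end]
    return "".join(f"<{b:02x}>" for b in bad), err.end


codecs.register_error("hexescape", _hex_escape)


def view(stored: bytes, encoding: str) -> str:
    """Interpret the same stored bytes under a given encoding label."""
    return stored.decode(encoding, errors="hexescape")


# The two bytes that latin1 uses for the guillemets in the test script.
stored = bytes([0xAB, 0xBB])

print(view(stored, "latin-1"))  # readable while the label matches the bytes
print(view(stored, "utf-8"))    # same bytes, wrong label: corrupted view
print(stored.hex(" "))          # what an external filter would be handed
```

Nothing here re-encodes `stored`, which is the point: switching the
label changes the view, not the data handed to the external program.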
Basically, your function needs to alter *only* &fileencoding. It *must
not* alter &encoding, and it is *useless* to alter &termencoding. The
only reason it works is that, unless you compiled Vim with EBCDIC
support on an EBCDIC system, Vim only allows ASCII-compatible
&encoding values. &fileencoding has no such restriction, so the
function is still useful for your application.
> setlocal tenc=
> endif
> silent %!xxd -r
> if l:unibomb
> exe 'setlocal fenc='.l:fenc
> exe 'setlocal enc='.l:enc
> exe 'setlocal tenc='.l:tenc
> endif
> endfun
>
> --
> --
> You received this message from the "vim_dev" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups
> "vim_dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.