2016-07-01 16:11 GMT+03:00 Gabriel Barta <[email protected]>:
> <...>
>> Which OS?
> It is the same on linux, mac and windows.
>
> To see it happen, try:
>   new|exe 'norm iTest'|set fenc=utf-16le|set bomb|exe '%!xxd'|%!xxd -r
>
> You will get a --No lines in buffer-- message from vim, and an empty buffer
> where it should say Test.  Removing the last command shows that it was
> fine up until trying to convert from hex back to text.
>
>>
>> On a Unix-like OS, you might try (assuming a bash-like shell)
>>
>>         LC_CTYPE=en_US.UTF-8 LC_ALL=  xxd
>
> I don't think changing the locale for xxd can help - xxd converts between
> 7-bit ascii and untranslated binary.
>
> My previous workaround with using latin1 only happened to work because
> the file I was playing with didn't contain any unicode codepoints. Just a
> case of windows using utf-16 for the joy of it.
>
> I might have something that works for unicode buffers with BOM (see below),
> but I think the real answer might be that it is reasonable to view text
> files in hex mode, but it is a bit silly to want to be able to edit them.
>
> fun! s:unhexify()
>     let l:unibomb = !&binary && (&fileencoding=~#'^u') && &bomb
>     if (l:unibomb)
>         let l:fenc = &fileencoding
>         let l:enc = &encoding
>         let l:tenc = &termencoding
>         setlocal fenc=utf-8
>         setlocal enc=utf-8

You are never supposed to change &encoding after startup: doing so
corrupts all existing strings. Also note that `setlocal` is misleading
here: &encoding is a purely global option. Consider your code running
on a system with EBCDIC support and &encoding set to ebcdic (I am not
sure encoding=utf-8 even works there). AFAIK this is the only case in
which &encoding is not ASCII-compatible, i.e. the only case where it
might need to be changed *at all*, assuming xxd on such systems does
not expect EBCDIC:

0. Before the function runs, all strings are in EBCDIC. &encoding is
EBCDIC as well.
1. You set &fileencoding to UTF-8 from whatever it was (e.g.
UTF-16LE). This changes nothing so far, because it only affects which
encoding the buffer is converted to before writing, filtering, etc.
2. You set &encoding to UTF-8 from EBCDIC. This makes Vim think that
all internal strings are UTF-8 *without reencoding those strings*.
(The function may even stop executing at this point: Vim does not keep
an AST, it stores function lines as-is and reparses them on each
execution, including e.g. each iteration of a loop. But assume it
keeps running.)
3. You set &termencoding to nothing. This is an absolutely useless
action which can only result in a corrupted display, if it has any
effect at all.
4. Now you run `silent %!xxd -r`. Both &fileencoding and &encoding are
UTF-8, but the actual text is still EBCDIC. Vim passes the EBCDIC
bytes to `xxd` unchanged, because UTF-8 (&encoding) does not need to
be converted to UTF-8 (&fileencoding).

You may see the same thing with the following code:

    LANG=C vim -u NONE -i NONE -N --cmd 'source /tmp/test.vim' --cmd cq 2>&1 \
        | iconv -f latin1

    " /tmp/test.vim
    scriptencoding utf-8
    function Corrupt()
        setlocal termencoding=latin1
        setlocal encoding=latin1
        setlocal fileencoding=latin1
        call setline('.', ["«»"])
        echomsg string(getline(1))
        %!hexdump -C
        echomsg string(getline(1))
        %delete _
        call setline('.', ["«»"])
        echomsg string(getline(1))
        setlocal termencoding=utf-8
        setlocal encoding=utf-8
        setlocal fileencoding=utf-8
        echomsg string(getline(1))
        %!hexdump -C
        echomsg string(getline(1))
        %delete _
        call setline('.', ["«»"])
    endfunction
    call Corrupt()

The output will be

    '«»'
    '00000000  ab bb 0a                                          |...|'
    '«»'
    '<ab><bb>'
    '00000000  ab bb 0a                                          |...|'

Note that although you changed &encoding from latin1 to utf-8, what
hexdump received did not change at all. You only got a corrupted view
of `«»`.
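
The same effect can be seen at the byte level outside Vim. This is only
a minimal shell sketch, assuming a POSIX `printf`, `od` and `tr` plus an
`iconv` that knows the names latin1 and utf-8; the helpers `hex` and
`latin1_bytes` are made up for illustration. It compares relabelling
the latin1 bytes of `«»` with actually reencoding them:

```shell
#!/bin/sh
# hex: hypothetical helper, prints stdin as one contiguous hex string.
hex() { od -An -tx1 | tr -d ' \t\n'; }

# "«»" as stored while encoding=latin1: the two bytes 0xAB 0xBB.
latin1_bytes() { printf '\253\273'; }

# Relabelling changes nothing: the bytes stay the same whatever
# encoding name is attached to them.
relabelled=$(latin1_bytes | hex)

# A real latin1 -> utf-8 conversion produces different bytes.
reencoded=$(latin1_bytes | iconv -f latin1 -t utf-8 | hex)

# Read as UTF-8, the raw latin1 bytes are not a valid sequence,
# which is why Vim can only display them as <ab><bb>.
if latin1_bytes | iconv -f utf-8 -t utf-8 >/dev/null 2>&1; then
    valid_utf8=yes
else
    valid_utf8=no
fi

echo "relabelled:     $relabelled"
echo "reencoded:      $reencoded"
echo "valid as utf-8: $valid_utf8"
```

Changing &encoding corresponds to the "relabelled" case: hexdump keeps
receiving ab bb, and nothing ever produces the c2 ab c2 bb that a real
conversion would give.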

Basically, your function needs to alter *only* &fileencoding. It *must
not* alter &encoding, and altering &termencoding is *useless*. The only
reason it works is that Vim only allows ASCII-compatible &encoding
values unless it was compiled with EBCDIC support on an EBCDIC system.
&fileencoding has no such restriction, so the function is still useful
for your application.

>         setlocal tenc=
>     endif
>     silent %!xxd -r
>     if l:unibomb
>         exe 'setlocal fenc='.l:fenc
>         exe 'setlocal enc='.l:enc
>         exe 'setlocal tenc='.l:tenc
>     endif
> endfun
>
> --
> --
> You received this message from the "vim_dev" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups 
> "vim_dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.

