Re: How to open - and fileencodings for various code pages (P.S.)

Tony Mechelynck Tue, 21 Apr 2009 20:36:11 -0700

On 22/04/09 05:09, Tony Mechelynck wrote:
>
> On 21/04/09 21:07, David Fishburn wrote:
>>
>> I am finishing off some work getting Vim to work with Outlook.
>>
>> Things were looking good until I tried to edit a message from a
>> Japanese co-worker.
>>
>> The basic flow is this:
>> 1.  Open the message in Outlook.
>> 2.  Hit a toolbar icon which fires some VB code.
>> 3.  The VB code takes the body of the message and writes
>> it out to a file.
>> 4.  Then uses Vim OLE object to edit it by sending the actual command
>> you would type:
>>            :e ++enc=utf-16 myfile.txt
>>
>> When we write out the file (from VB) we can tell it to create a unicode file.
>>
>> In VB, I can look up the code page of the email.
>> It returns the last set of digits below, so for my Japanese email the
>> following is returned by VB: 50220 so using the chart below, I see
>> this line:
>>        Japanese (JIS) iso-2022-jp 50220
>>
>> So, now that I have written out a unicode file, I am trying to open it in 
>> Vim.
>> I can't figure out what the actual :e line is to open this file.
>>
>>            :e ++enc=utf-16 myfile.txt
>>            :e ++enc=utf-8 myfile.txt
>>
>> Both leave the file unreadable in Vim (upside down question marks fill
>> the screen).
>
> Windows "Unicode" files are typically little-endian UTF-16, while in
> Vim, if you don't specify the endianness for 16- or 32-bit Unicode you
> get big-endian. So if I were you I'd try
>
>       :e ++enc=utf-16le myfile.txt
>
> If the file (as displayed in Vim) then starts with <feff>, it means that
> it has a BOM (Windows UTF-16 files often do), and in that case Vim will
> detect the encoding automatically provided that 'fileencodings' (plural)
> starts with ucs-bom (and, of course, that you don't force a specific
> encoding with ++enc). If that's the case,
>
>       :e myfile.txt
>
> would be enough (and Vim will remember that the file has a BOM by
> setting 'bomb' locally for that file).
>
>
> Best regards,
> Tony.
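
To find out which of the two cases above applies, one could open the
file without ++enc and ask Vim what it detected. This is only a minimal
sketch, assuming the file is called myfile.txt as in the original
question and that 'fileencodings' starts with ucs-bom:

```vim
" Let 'fileencodings' do the detection (no ++enc, so nothing is forced)
edit myfile.txt
" Show what Vim decided: the detected encoding and whether a BOM was found
setlocal fileencoding? bomb?
" If no BOM was found and the text is garbled, re-read the current file
" as little-endian UTF-16 by uncommenting the next line
" edit ++enc=utf-16le
```

If the second command reports nobomb, the file has no BOM and the
explicit ++enc=utf-16le form is the one to try.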


P.S. Another possibility is that 'encoding' is not set to UTF-8, in 
which case Vim cannot represent in memory any characters of the file 
which don't exist in your current 'encoding'. In that case, you can 
add the following near the top of your vimrc:

if has('multi_byte')    " can we handle Unicode?
        " if already Unicode, no need to change it
        if &enc !~? '^u'
                " avoid misunderstandings with the keyboard
                " (and the display if in console mode)
                if &tenc == ""
                        let &tenc = &enc
                endif
                set enc=utf-8
        endif
        " set encoding autodetection
        set fencs=ucs-bom,utf-8,latin1
        " replace 'latin1' above by something else
        " if you want a different 'fileencoding' default
        " for non-Unicode files
else
        echohl Error
        " use :echomsg rather than just :echo
        " so it can be recalled by :messages
        echomsg 'Warning: Cannot use multibyte encodings'
        echohl Normal
endif

Since the characters of any encoding can be represented in Unicode, this 
shouldn't have any harmful effects, provided that (if you use 
non-Unicode encodings other than Latin1) the iconv or libiconv library 
is either linked statically with Vim, or linked dynamically and 
installed where Vim can reach it. You can check it with

        :echo has('iconv')

which should return a nonzero value (normally 1).
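
To look at all the settings involved in one go, something like the
following could be pasted into a scratch script and sourced after opening
the file. It is only a sketch; the labels are just for display and are
not anything Vim itself prints:

```vim
" Report the options that decide how a file is read and displayed
echo 'encoding      = ' . &encoding
echo 'termencoding  = ' . &termencoding
echo 'fileencodings = ' . &fileencodings
" 'fileencoding' and 'bomb' are local to the current buffer
echo 'fileencoding  = ' . &fileencoding . (&bomb ? ' (with BOM)' : '')
" nonzero means conversion via iconv is available
echo 'iconv         = ' . has('iconv')
```

If 'encoding' is not a Unicode encoding, or has('iconv') gives 0 while
the file is in something other than Unicode or Latin1, that would
explain the upside-down question marks.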


-- 
hundred-and-one symptoms of being an internet addict:
16. You step out of your room and realize that your parents have moved
     and you don't have a clue when it happened.

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
