peter juuls wrote:
--- "A.J.Mechelynck" <[EMAIL PROTECTED]> wrote:
If you have some files using a Dos charset, and other ones using a
Windows charset, the way to do it is file-by-file. Here are a few
sections you should read in the help:
Thanks, Tony, for a thorough walkthrough of the
character set encoding options in vim, not only
regarding Windows-to-DOS switching, but in general.
My primary needs, for now, are to be able, on W32, to
open, display, edit and save files in 3 formats:
- DOS files with Danish letters (CHCP tells me cp850
is my current codepage, and :set encoding=cp850 solves
my switching problem)
- Notepad files with Danish letters (works out of the
box, as console vim7.0/W32 uses this as its default; I
guess it is the Windows-1252 character set. Besides, I can
use :set encoding=latin1 or :set encoding=latin9 in
vim if I need to switch back from some other
encoding)
- Unicode files, like exports from Registry Editor,
with or without Danish letters (works out of the box
in vim7.0/W32, which informs me that the file has been
converted when opened, and vim also saves the
modified file in Unicode)
Thanks for your comprehensive reply, I will save it,
in case I run into problems with odd character sets
and file encodings.
Thanks
Peter
In addition to what I said in my earlier post, I might add that most
Unicode files produced by Windows are in UTF-16 little-endian with BOM.
These files will be automagically recognised by Vim, and displayed
correctly, if your 'encoding' is set to UTF-8 and your 'fileencodings'
heuristic starts with "ucs-bom" (as in the example code snippet in my
previous post). In that case, ":setlocal fileencoding? bomb?" on such a
file should answer " fileencoding=ucs-2le" and " bomb".
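For reference, here is a minimal sketch of the kind of vimrc settings this relies on. The exact 'fileencodings' list, and latin1 as the last-resort fallback, are assumptions on my part; put whatever 8-bit charset you use most at the end.

```vim
" Sketch only: adjust the fallback charset to your own files.
if has("multi_byte")
  if &encoding !~? '^u'
    " Remember how the console/keyboard is encoded before switching.
    if &termencoding == ''
      let &termencoding = &encoding
    endif
    set encoding=utf-8
  endif
  " Try a BOM first, then strict UTF-8, then an 8-bit fallback.
  set fileencodings=ucs-bom,utf-8,latin1
endif
```

After restarting Vim and opening a Registry Editor export, ":setlocal fileencoding? bomb?" shows what was actually detected.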
I have found it useful to display each file's encoding on its status
line. Here is how I set the 'statusline' option; you may use it as a
source of inspiration if you want (see ":help 'statusline'" to decipher
it). If you want to use it, start by copy-pasting it into your vimrc and
then edit it to your heart's liking:
if has("statusline")
  set statusline=%<%f\ %h%m%r%=%k[%{(&fenc\ ==\ \"\"?&enc:&fenc).(&bomb?\",BOM\":\"\")}]\ %-14.(%l,%c%V%)\ %P
endif
It's one long line, bracketed in an ":if" statement to avoid an error on
Vim versions which cannot set a user-defined status line. If your mailer
or mine "beautifies" the :set line by adding extra line breaks, it will
probably break the line (once or more) at a backslash-escaped space.
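If the escaping gets in the way, one possible variation is to move the expression into a function, which keeps the :set line shorter and leaves fewer backslash-escaped spaces for a mailer to break at. This is only a sketch: the function name FencLabel is made up for the example, and 'laststatus' is set on the assumption that you want the status line shown even with a single window.

```vim
if has("statusline")
  " Hypothetical helper: 'fileencoding' (or 'encoding' if it is empty),
  " followed by ",BOM" when the buffer has a byte order mark.
  function! FencLabel()
    return (&fenc == '' ? &enc : &fenc) . (&bomb ? ',BOM' : '')
  endfunction

  set statusline=%<%f\ %h%m%r%=%k[%{FencLabel()}]\ %-14.(%l,%c%V%)\ %P
  " Always show the status line, even with only one window.
  set laststatus=2
endif
```

The displayed result should be the same as with the one-line version; only the place where the expression lives changes.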
Note that WordPad can read UTF-8 files if they have a BOM (i.e. if they
were written with ":setlocal bomb"), but it will write them as UTF-16le
(which Vim may report as ucs-2le), a different Unicode encoding which,
for Latin-alphabet text, usually uses more disk space. The BOM (short
for "byte order mark") is the Unicode codepoint U+FEFF "zero-width
no-break space" when it appears at the very start of a file. That
codepoint has a different representation in each of the 5 basic Unicode
encodings (UTF-8, and UTF-16 and UTF-32 in either byte order), and its
representation in each of them is "illegal" in all the others (assuming
that a UTF-16le file won't have a NUL character right after the BOM,
since the UTF-32le BOM is the UTF-16le BOM followed by two zero bytes).
It is therefore used to discriminate between Unicode encodings. See,
among others, ":help Unicode" and http://www.unicode.org/ for more info
on Unicode.
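To make this concrete, here are the BOM byte sequences, followed by a few commands one might type to check or change a buffer's encoding. Take it as a rough sketch: the commands act on whatever file is current, and cp850 is used only because it is the codepage mentioned earlier in this thread.

```vim
" BOM bytes at the very start of a file:
"   UTF-8     EF BB BF
"   UTF-16be  FE FF
"   UTF-16le  FF FE
"   UTF-32be  00 00 FE FF
"   UTF-32le  FF FE 00 00

" Show what Vim detected for the current buffer.
:setlocal fileencoding? bomb?

" Save the current buffer as UTF-8 with a BOM (readable by WordPad).
:setlocal fileencoding=utf-8 bomb
:write

" Re-read the current file as cp850 if the detection guessed wrong.
:edit ++enc=cp850
```

Changing 'fileencoding' only affects how the buffer is written to disk; what you see on screen still depends on 'encoding'.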
Best regards,
Tony.