Re: Unexpected behavior loading cp1252 file as latin1

Vlad Irnov Fri, 28 Jan 2011 19:54:37 -0800


On Jan 20, 11:03 pm, Ben Fritz <[email protected]> wrote:
> I have a file which if read with the Windows-1252 encoding (cp1252 in
> Vim) has an en dash character (encoded as byte 150). When I load this
> file in a Vim with enc=latin1, and leave fenc blank, I would expect to
> see a "no character" block in place of the en dash. However, I see the
> en dash as if I loaded with enc/fenc set to cp1252.
>
> If I set encoding to utf-8, and load the same file with default
> fileencodings, it detects as latin1 and I see the "no character" glyph
> as expected. If I do :e ++enc=cp1252, or if I modify my fileencodings
> option to include cp1252 instead of latin1, I see the en dash, again
> as expected.
>
> Is this behavior intentional? It certainly could be considered
> helpful, but it was very unexpected.


It's not just the en dash. It also happens with adjacent cp1252
characters: the fat middle dot (decimal 149) and the fancy quotes.

Vim apparently uses cp1252 instead of latin-1 for &enc. My
understanding is that the only difference between them is that cp1252
assigns printable characters to most of the bytes 128-159, while
latin-1 uses that whole range for control characters.
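
To make the difference concrete, here is a minimal Python sketch (not Vim
code) that decodes the bytes mentioned in this thread under both encodings
and lists every byte value where the two disagree. The helper names
decode_both and differing_bytes are made up for illustration, and the
result reflects Python's own cp1252 codec, which treats a few bytes in
that range as unassigned.

```python
# Illustration only: compare how cp1252 and latin-1 decode single bytes.
# decode_both and differing_bytes are hypothetical helper names.

def decode_both(byte_value):
    """Decode one byte value as (cp1252, latin-1)."""
    raw = bytes([byte_value])
    return raw.decode("cp1252"), raw.decode("latin-1")


def differing_bytes():
    """Return the byte values whose cp1252 and latin-1 decodings differ."""
    result = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")  # latin-1 maps every byte to U+00xx
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            cp1252 = None  # byte is unassigned in Python's cp1252 codec
        if cp1252 != latin1:
            result.append(value)
    return result


# Bytes discussed in this thread: bullet, fancy quotes, en dash.
for value in (149, 147, 148, 150):
    as_cp1252, as_latin1 = decode_both(value)
    print(f"{value}: cp1252=U+{ord(as_cp1252):04X}  latin-1=U+{ord(as_latin1):04X}")

print(differing_bytes() == list(range(128, 160)))
```

Under latin-1 each of these bytes decodes to a C1 control character, which
is why a "no character" glyph would be the expected display.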

According to http://en.wikipedia.org/wiki/Windows-1252
"It is very common to mislabel Windows-1252 text data with the charset
label ISO-8859-1."

If you need these chars, why not use cp1252 or Unicode and forget about
latin1?
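
If moving to Unicode is an option, a one-off conversion outside Vim could
look like the sketch below. It assumes the data really is cp1252, and
cp1252_to_utf8 is a hypothetical helper name, not part of any tool
mentioned here.

```python
# Sketch: re-encode cp1252 bytes as UTF-8.
# Assumption: the input really is cp1252; bytes that Python's cp1252
# codec considers unassigned will raise UnicodeDecodeError.

def cp1252_to_utf8(data: bytes) -> bytes:
    """Decode bytes as cp1252 and re-encode the text as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


# Byte 150 (the en dash in cp1252) becomes a three-byte UTF-8 sequence.
converted = cp1252_to_utf8(b"pages 10\x9620")
print(converted)
```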

-- 
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
