>
> editing a _corrupted_ CP1252 file
> <http://groups.google.com/group/vim_use/t/d6874651567bc841?utm_source=digest&utm_medium=email>
>
> Erik Christiansen <[email protected]>: Jan 27 08:03PM +1100
>
> On 26.01.16 13:11, Kenneth Reid Beesley wrote:
> > setglobal fileencoding=utf-8
>
> > " when editing an existing file, try to read it in these encodings, use the
> > first that succeeds
> > set fileencodings=ucs-bom,utf-8,latin1
>
> It doesn't seem all that complicated. A quick test on a "simulated
> corrupted CP1252" file like yours both displayed <81>, and 8g8 worked
> here without any fiddling at all. I have:
>
> set encoding=utf-8 " Is default anyway, IIRC.
>
> fileencodings=ucs-bom,utf-8,default,latin1
>
> which seems to have latched on: fileencoding=latin1
> given the input file.
>
> That seems to confirm what ":h 8g8" says:
>
> This works in two situations:
> 1. when 'encoding' is any 8-bit encoding
> 2. when 'encoding' is "utf-8" and 'fileencoding' is any 8-bit encoding
>
> And I don't even have any "++bad=keep" anywhere.
>
> Erik
>
>
>
Hi Erik,
Thanks again. Here are the key points:
1. I know (omnisciently) that certain files should be cp1252, but that
   some of them are corrupted with bytes that are undefined in cp1252,
   like \x81.
2. I want to edit the file as cp1252, i.e., I want the fileencoding to be
   cp1252, because that’s what the file is _supposed_ to be.
3. I want to see and find any bad bytes so that I can manually correct
   them (but I’m not perfect, and I might miss some).
4. Crucially, when I write the buffer back to file, I want the
   fileencoding to be set to cp1252 so that gvim will refuse to write the
   file if I’ve missed any bad bytes like \x81.
gvim -c "e ++enc=cp1252 ++bad=keep" filethatshouldbecp1252.txt
gives me exactly what I want. The fileencoding is forced to be cp1252. I see
bad-for-cp1252 bytes like \x81 kept and displayed in blue as <81>, and I can
find them with 8g8. And if I neglect to fix a bad byte like \x81 and then try
to write the buffer back to file, I get an error message saying that
conversion failed.
That effectively forces me to fix the file, making it legal cp1252, before
the buffer is written back to file.
If the fileencoding defaults to latin1, gvim will happily write the
bad-for-cp1252 bytes back to the original file, which remains corrupted.
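As a rough illustration of why latin1 hides the problem while cp1252 exposes
it, here is a minimal Python sketch. It uses Python's built-in codecs rather
than Vim's own conversion, so it only approximates what gvim does; the
function names and the sample bytes are made up for the example.

```python
# Sketch only: relies on Python's cp1252 and latin1 codecs, not on Vim.

def undefined_in_cp1252():
    """Return the byte values that Python's cp1252 codec cannot decode."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            bad.append(value)
    return bad


def find_bad_bytes(data):
    """Return (offset, byte) pairs for bytes that are not valid cp1252."""
    bad = set(undefined_in_cp1252())
    return [(offset, byte) for offset, byte in enumerate(data) if byte in bad]


# Hypothetical corrupted content: valid cp1252 text plus one stray \x81.
sample = b"caf\xe9 \x81 ok"

print([hex(value) for value in undefined_in_cp1252()])
print(find_bad_bytes(sample))

# latin1 assigns a character to every byte value, so the stray byte
# survives a decode/encode round trip without any error.
assert sample.decode("latin1").encode("latin1") == sample
```

In this sketch the cp1252 codec rejects \x81, whereas latin1 passes it
through untouched, which matches the difference I see between the two
fileencoding settings.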
Best,
Ken