Markus Kuhn wrote:
> I just noticed that when I work in a UTF-8 locale (LC_CTYPE=en_GB.UTF-8),
> that vim 6.0 normally opens a UTF-8 file such as

Please use Vim 6.1 for this kind of testing.  With the released patches
if possible (using CVS is easiest).  Vim 6.0 is quite old now.

>   http://www.cl.cam.ac.uk/~mgk25/ucs/examples/lyrics-ipa.txt
>
> properly in UTF-8 mode, but it deactivates UTF-8 mode when you load
> instead a file that contains malformed sequences, such as
>
>   http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

Since this file contains byte sequences that are illegal in UTF-8, it is
converted to UTF-8 as if it were a latin1 file.  The converted text can
be edited normally.  When writing the file the conversion is done in
reverse, thus a read command followed by a write command produces an
identical file.

If you want to edit the file as if it were utf-8 you should first filter
out the illegal byte sequences.

To manually overrule the detection of the encoding use this command:

    :edit ++enc=utf-8 UTF-8-test.txt

This is unsafe though, because you edit the file with the illegal byte
sequences.

> Even worse, it also deactivates UTF-8 mode when you load a file that
> contains new Unicode 3.2 characters, such as
>
>   http://www.cl.cam.ac.uk/~mgk25/UTF-8-demo.txt

That should be:

    http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt

I can load this file without trouble with Vim 6.1.

> I live now on a planet were any other encoding than UTF-8 does not exist
> when I am in LC_CTYPE=en_GB.UTF-8. How do I tell vim 6.0 (and also
> emacs) to pick the encoding *strictly* based on the locale and look at
> absolutely nothing else? Falling back to ISO 8859-1 is not an option,
> because ISO 8859-1 is completely unknown on my planet.

If you only have UTF-8 files you don't need to do anything.  If you
communicate with other planets (and this message indicates you do :-)
you will have to be able to edit ISO-8859-1 files as well.
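
For readers who want to see the fallback described above in concrete terms,
here is a rough Python sketch of the idea.  It is not Vim's code; the
function names decode_with_fallback and encode_back and the sample byte
strings are made up for illustration.  It tries a strict UTF-8 decode,
falls back to latin1 when that fails, and shows that reading followed by
writing gives back the identical bytes in both cases.

```python
def decode_with_fallback(raw: bytes):
    """Decode as strict UTF-8; if that fails, treat the bytes as latin1.

    Returns the decoded text together with the encoding that was used,
    so the same encoding can be applied in reverse when writing.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # latin1 maps every byte value to exactly one character,
        # so this decode can never fail and loses no information.
        return raw.decode("latin-1"), "latin-1"


def encode_back(text: str, encoding: str) -> bytes:
    """Reverse the conversion that was done when reading."""
    return text.encode(encoding)


# Hypothetical sample data: one well-formed UTF-8 string and one byte
# string containing sequences that are illegal in UTF-8.
valid = "Grüße".encode("utf-8")
malformed = b"abc \xc0\xaf \xff def"

for raw in (valid, malformed):
    text, used = decode_with_fallback(raw)
    # Read followed by write produces an identical byte string.
    assert encode_back(text, used) == raw
    print(used, repr(text))
```

The point of the sketch is the last assertion: because latin1 covers all
256 byte values, the fallback never has to throw bytes away, which is why
the malformed file survives an unmodified read and write.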

> Trying to escape the horrors and pain of automatic encoding detection in
> a pure UTF-8 environment ...

I haven't seen this planet yet.  And as soon as I see it, I'll send a
Latin1 file to it :-).  Conclusion: this UTF-8 only planet does not
exist.

-- 
hundred-and-one symptoms of being an internet addict:
244. You use more than 20 passwords.

 ///  Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.moolenaar.net  \\\
///   Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim  \\\
\\\   Project leader for A-A-P -- http://www.a-a-p.org  ///
 \\\  Lord Of The Rings helps Uganda - http://iccf-holland.org/lotr.html  ///
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
