Re: get the umlauts right
Tobias Herp wrote:
> A.J.Mechelynck [EMAIL PROTECTED] wrote:
>> Tobias Herp wrote:
>>> I've been struggling for quite a while now to get the character
>>> encoding right;
>>
>> What does your Vim say on this file in reply to
>>     :verbose set enc? fenc? fencs?
>> ?
>
> encoding=latin1
> fileencoding=
> fileencodings=ucs-bom
>
>> -- To set 'fileencoding' to something other than what Vim would
>> normally expect, use the ++enc option to :edit, see :help ++opt.
>
> Doing a
>     :e ++enc=utf8 %
> helped, thanks! When opening the file from the command line,
>     gvim "+set enc=utf8" {filename}
> works (tested on Windows).
>
>> -- To force recognition of a file as Unicode (e.g., UTF-8), use
>> :setlocal bomb on it; then check that 'fileencoding' is setlocal'ed
>> to some Unicode encoding (such as utf-8) and save.
>
> This didn't work for me.
>
>> -- To force recognition of a file as not UTF-8 but Latin1 (assuming
>> 'fileencodings' [plural] is set to ucs-bom,utf-8,latin1), put a
>> number of upper-ASCII bytes (bytes > 127) near the beginning, maybe
>> in a comment. If the file is a text file, you can also use them as
>> weird underlining (e.g. underline your main title with a row of £
>> (pounds sterling) or of Danish Ø (slashed O's)); then
>> :setlocal fenc=latin1 and save. The following works well in one of
>> my text files:
>> -----
>> # zim: set fenc=latin1 nomod : £µ
>> # zim (not vim) above is intentional
>> -----
>
> I didn't understand this dirty little trick completely. Is the
> set fenc=latin1 nomod of any relevance, then, except as a reminder?

It's just a reminder: by changing zim to vim the line would become a Vim modeline, but as written Vim doesn't treat it as one. What does the trick is the rest of the comment: its bytes, as encoded in Latin1, are illegal in UTF-8 and thus trigger the reject side of Vim's UTF-8 encoding-recognition algorithm. Any string of repeated bytes in the range 128-255 would work just as well, IIUC.

I wrote a tip at vim-online a few days ago about this trick:
http://vim.sourceforge.net/tips/tip.php?tip_id=1288

see
    :help modeline
    :help 'fileencodings'
    :help 'fileencoding'
    :help 'encoding'
    :help encoding-table

> Anyway, I finally inserted a line
>     set fencs=ucs-bom,utf-8,latin1
> into my _vimrc file, and everything seems to work fine now.
> Thanks a lot!

My pleasure.

Best regards,
Tony.
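[Editor's illustration, not part of the original message.] The fallback described above (try each entry of 'fileencodings' in order and reject UTF-8 as soon as an illegal byte sequence turns up) can be sketched in Python. This is only a rough model under stated assumptions, not Vim's actual implementation: the function name `guess_encoding` and its `candidates` parameter are hypothetical, and only the UTF-8 BOM is checked for the ucs-bom entry.

```python
import codecs


def guess_encoding(data, candidates=("ucs-bom", "utf-8", "latin-1")):
    """Return the first candidate that accepts `data`.

    Hypothetical helper: loosely mimics a left-to-right fallback over a
    list such as ucs-bom,utf-8,latin1.
    """
    for name in candidates:
        if name == "ucs-bom":
            # Assumption: only the UTF-8 BOM is recognised in this sketch.
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8 (with BOM)"
            continue
        try:
            data.decode(name)
        except UnicodeDecodeError:
            # Illegal byte sequence for this candidate: try the next one.
            continue
        return name
    return "unknown"


# Sample inputs: the same text stored as Latin-1 and as UTF-8, plus the
# two "marker" bytes from the comment trick and a pure-ASCII file.
latin1_bytes = "Umlaute: äöü".encode("latin-1")
utf8_bytes = "Umlaute: äöü".encode("utf-8")
marker_bytes = "# £µ".encode("latin-1")
ascii_bytes = b"plain ASCII only"

for sample in (latin1_bytes, utf8_bytes, marker_bytes, ascii_bytes):
    print(sample, "->", guess_encoding(sample))
```

Note that the pure-ASCII sample is accepted as UTF-8, which is why a Latin1 file needs some bytes above 127 near the top for this kind of detection to tell the two apart. With only `("ucs-bom",)` as the candidate list, as in the reported settings, nothing in the sketch matches a BOM-less file.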
get the umlauts right
Hi, fellow vimmers,

I've been struggling for quite a while now to get the character encoding right; I'd like vim to guess right, or at least to know which magical comment I could use to force vim to use the correct encoding settings. This is an everyday problem for me, since I work on Windows (different encoding conventions for GUI and shell programs!) as well as on several Linux machines which are configured slightly differently.

Via our web-based bugtracker, I created an example file (attached) which contains German umlauts and their Javascript and HTML encodings and should look like this:

<snip>
ä  %E4  &auml;   (auml)
ö  %F6  &ouml;   (ouml)
ü  %FC  &uuml;   (uuml)
Ä  %C4  &Auml;   (Auml)
Ö  %D6  &Ouml;   (Ouml)
Ü  %DC  &Uuml;   (Uuml)
ß  %DF  &szlig;  (szlig)
</snip>

(To cover the case that the webmail interface scrambles the HTML entities, I repeated them in the 4th column without the & and ;)

The umlauts are displayed correctly when I open the file with WinXP's notepad (which in turn doesn't like the *IX line endings), but vim doesn't get them right (Bram's Vim 7.0 on a German WinXP prof, +multi_byte_ime/dyn).

Is there something I can do to make vim guess right, at the very least for this document?

Thanks a lot in advance!

--
Tobias
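[Editor's illustration, not part of the original message.] The %XX values in the table match the Latin-1 byte of each character, and the entity names are the standard HTML ones. A small Python sketch can reproduce both columns and, for comparison, show the two-byte UTF-8 form of each character. The helper name `describe` is hypothetical, and `bytes.hex(" ")` assumes Python 3.8 or later.

```python
from html.entities import codepoint2name
from urllib.parse import quote


def describe(ch):
    """Return (Latin-1 percent escape, HTML entity, UTF-8 bytes as hex)."""
    latin1_escape = quote(ch, safe="", encoding="latin-1")
    entity = "&" + codepoint2name[ord(ch)] + ";"
    utf8_hex = ch.encode("utf-8").hex(" ").upper()
    return latin1_escape, entity, utf8_hex


# Print one row per character from the example file.
for ch in "äöüÄÖÜß":
    print(ch, *describe(ch))
```

The last column shows why an editor that assumes the wrong encoding displays two odd characters per umlaut (or one wrong one): the same letter is one byte in Latin-1 and two bytes in UTF-8.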
Re: get the umlauts right
Tobias Herp wrote:
> Hi, fellow vimmers,
>
> I've been struggling for quite a while now to get the character
> encoding right; I'd like vim to guess right, or at least to know
> which magical comment I could use to force vim to use the correct
> encoding settings. This is an everyday problem for me, since I work
> on Windows (different encoding conventions for GUI and shell
> programs!) as well as on several Linux machines which are configured
> slightly differently.
>
> Via our web-based bugtracker, I created an example file (attached)
> which contains German umlauts and their Javascript and HTML encodings
> and should look like this:
>
> <snip>
> ä  %E4  &auml;   (auml)
> ö  %F6  &ouml;   (ouml)
> ü  %FC  &uuml;   (uuml)
> Ä  %C4  &Auml;   (Auml)
> Ö  %D6  &Ouml;   (Ouml)
> Ü  %DC  &Uuml;   (Uuml)
> ß  %DF  &szlig;  (szlig)
> </snip>
>
> (To cover the case that the webmail interface scrambles the HTML
> entities, I repeated them in the 4th column without the & and ;)
>
> The umlauts are displayed correctly when I open the file with WinXP's
> notepad (which in turn doesn't like the *IX line endings), but vim
> doesn't get them right (Bram's Vim 7.0 on a German WinXP prof,
> +multi_byte_ime/dyn).
>
> Is there something I can do to make vim guess right, at the very
> least for this document?
>
> Thanks a lot in advance!

After saving the attachment and loading it in gvim, I see it all right. I am using:

VIM - Vi IMproved 7.0 (2006 May 7, compiled Jul 23 2006 22:50:51)
Included patches: 1-42
Compiled by [EMAIL PROTECTED]
Huge version with GTK2-GNOME GUI. Features included (+) or not (-):
[etc.]

'encoding' is set to utf-8 and the file opening heuristic also sets 'fileencoding' to utf-8 without BOM. This is weird, since the attachment header says
    Content-Type: text/plain; charset=iso-8859-1
I wonder if Thunderbird converted it to UTF-8 or what.

What does your Vim say on this file in reply to
    :verbose set enc? fenc? fencs?
?

Notes:

-- To set 'fileencoding' to something other than what Vim would normally expect, use the ++enc option to :edit, see :help ++opt.

-- To force recognition of a file as Unicode (e.g., UTF-8), use :setlocal bomb on it; then check that 'fileencoding' is setlocal'ed to some Unicode encoding (such as utf-8) and save.

-- To force recognition of a file as not UTF-8 but Latin1 (assuming 'fileencodings' [plural] is set to ucs-bom,utf-8,latin1), put a number of upper-ASCII bytes (bytes > 127) near the beginning, maybe in a comment. If the file is a text file, you can also use them as weird underlining (e.g. underline your main title with a row of £ (pounds sterling) or of Danish Ø (slashed O's)); then :setlocal fenc=latin1 and save. The following works well in one of my text files:

-----
# zim: set fenc=latin1 nomod : £µ
# zim (not vim) above is intentional
-----

Best regards,
Tony.
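[Editor's illustration, not part of the original message.] The :setlocal bomb note relies on a byte order mark at the start of the file. As a rough sketch of what that marker looks like for UTF-8, Python's `utf-8-sig` codec writes and strips the same three bytes; the variable names here are illustrative only.

```python
import codecs

text = "Grüße"

# "utf-8-sig" prepends the UTF-8 byte order mark (EF BB BF) when encoding.
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")

has_bom = with_bom.startswith(codecs.BOM_UTF8)
payload_identical = with_bom[len(codecs.BOM_UTF8):] == without_bom
round_trip = with_bom.decode("utf-8-sig")

print(codecs.BOM_UTF8.hex(" ").upper(), has_bom, payload_identical, round_trip)
```

A detector that sees these leading bytes can settle on a Unicode encoding before looking at the rest of the file, which is what the ucs-bom entry at the front of 'fileencodings' is for.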