Re: [gentoo-user] Re: recompiling vim linked to libncursesw
On Wed, Jul 27, 2005 at 09:04:29AM +0300, Moshe Kaminsky wrote:
> * Fernando Canizo [EMAIL PROTECTED] [27/07/05 07:15]:
>> Hi all. I'm having trouble with my encoding using mutt + vim + utf-8;
>> basically my emails are sent with the wrong encoding when *replying*.
>> I've tracked the problem, searched, read FAQs, and found that maybe my
>> problem is this: while mutt is linked to libncursesw (the wide library),
>> vim is linked to libncurses (the normal one). This is the output of ldd:
>
> I find it hard to believe that this is the problem. You say that you can
> use utf8 when you are composing (or writing some other stuff), right?
> What are the values of 'encoding' and 'fileencoding' in vim when replying?
>
> Moshe

Like I said to Richard, maybe you're right. I mean: I can write a utf-8 file from scratch using vim alone, so why wouldn't it work when vim is invoked from mutt? Maybe mutt is telling vim something incorrect when they communicate.

Well, I'll give more information, but this is going to grow large ;)

Reading the mutt FAQ (http://wiki.mutt.org/index.cgi?MuttFaq/Charset) and checking that everything is OK:

The locale seems to be OK:

    ~$ locale
    LANG=es_AR.utf-8
    LC_CTYPE=es_AR.utf-8
    LC_NUMERIC=es_AR.utf-8
    LC_TIME=es_AR.utf-8
    LC_COLLATE=es_AR.utf-8
    LC_MONETARY=es_AR.utf-8
    LC_MESSAGES=es_AR.utf-8
    LC_PAPER=es_AR.utf-8
    LC_NAME=es_AR.utf-8
    LC_ADDRESS=es_AR.utf-8
    LC_TELEPHONE=es_AR.utf-8
    LC_MEASUREMENT=es_AR.utf-8
    LC_IDENTIFICATION=es_AR.utf-8
    LC_ALL=es_AR.utf-8

The locale settings are supported:

    ~$ locale -a
    C
    es_AR
    es_AR.utf8
    POSIX

Checking whether the locales work with perl:

    ~$ perl -e

OK, it doesn't show anything.

Checking whether perl is doing the right thing by setting an erroneous locale:

    ~$ env LC_ALL=nocharset perl -e
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = es_AR.utf-8,
        LC_ALL = nocharset,
        LC_CTYPE = es_AR.utf-8,
        LANG = es_AR.utf-8
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale (C).

OK, it complains, so it's working.

My ~/.signature is in utf-8, and my ~/.alias is too.

I have this in my ~/.vimrc:

    set encoding=utf-8
    set fileencoding=utf-8
    set termencoding=utf-8

and when mutt invokes vim I re-check that these are in effect, and they are (I mean I check at runtime and vim obeys the configuration).

I have this in my ~/.muttrc:

    set send_charset=us-ascii:utf-8
    set charset=utf-8
    set locale=es_AR.utf8

From the mutt man page I know these settings should not be necessary, since the system is configured correctly, but I tried with and without them and saw no difference.

OK, that's all concerning configuration. Now let me describe how the problem shows up: in mutt, if I compose a mail from scratch, without anything, not even a signature, and put in a LATIN SMALL LETTER A WITH ACUTE (I got that name from the unicode chart), and then send it to myself and to a friend, my friend sees it OK and so do I. But if I now reply to this same mail, when vim comes up with the quoted text that mutt passes to it, I see garbage.

So mutt is fine seeing and sending utf-8, and vim is fine writing and reading utf-8, but when the two cooperate, things get screwed up.

I investigated what was in the files, so I saved a copy (using the 'C' command from mutt) of the first message (the one I received from myself), and file says: 'UTF-8 Unicode mail text'. I checked what's inside with hexedit and saw that LATIN SMALL LETTER A WITH ACUTE is encoded with this hex: C3 A1 (which is not 00 E1 from the unicode chart at http://www.unicode.org/charts/).

Then I took that same mail and pressed 'r' in mutt to reply. Vim came up with garbled text, and without touching anything I saved it under some other name and then cancelled the message. Running file on this saved text gives: 'UTF-8 Unicode text', but when I look inside with hexedit, I get this hex for the same letter: C3 83 C2 A1, so now I have 4 bytes instead of the two from before. So vim-mutt (?) is re-encoding the stuff.

Like I said before, checking enc and fenc in vim when it is called from mutt gives utf-8, as it should.

I created a file with vim to check these differences against the unicode chart and I got C3 A1 too, so maybe the problem is with vim; it should put 00 E1 for LATIN SMALL LETTER A WITH ACUTE.

Well, that's all I remember having done for this problem. I've been at this for about 4 months now, I think: I pick up the problem, get tired of searching and testing stuff, leave it for a month, come back to it again, and so on. But now I really want to solve it.

I think I'm going to crosspost this to the vim and mutt mailing lists. But if someone here knows how to solve this, I'd appreciate any help, tip, or direction to look or search in, or maybe praying would do the job.

Linking vim to libncursesw after the build, the way Richard described a couple of mails ago, didn't solve it. If there is a way to tell emerge to link vim to this library from the beginning, I would like to try it.

Besides, if anyone knows how to find out which programs are using a library (I have done it before, but at this time my
Re: [gentoo-user] Re: recompiling vim linked to libncursesw
On 7/27/05, Moshe Kaminsky [EMAIL PROTECTED] wrote:
> Hi,
>
> * Fernando Canizo [EMAIL PROTECTED] [27/07/05 14:14]:
> snip
>> I investigate what was in the archives, so i saved a copy (using 'C'
>> command from mutt) of the first message (the one i receive from me) and
>> file says: 'UTF-8 Unicode mail text', check what's inside with hexedit
>> and see that LATIN SMALL LETTER A WITH ACUTE is encoded with this hex:
>> C3 A1 (which is not 00 E1 from unicode chart from
>> http://www.unicode.org/charts/)
>
> I think this is just the way these characters are represented in utf-8.

Yes, it is. 00E1 hex is '000 1110 0001' in binary. When this is encoded as UTF-8, the value is stored in two bytes. The last byte begins with '10', followed by the last 6 bits of data: '10 100001' binary, or 'A1' in hex. The first byte begins with '110', to indicate that it is a two-byte character, followed by the remaining significant data: '110 00011' binary, or 'C3' hex. This is correct.

The problem seems to be that mutt(?) takes this UTF-8 encoded data and encodes it as UTF-8 again, as if the data were two 8-bit characters. 'C3' then becomes 'C3 83' and 'A1' becomes 'C2 A1'.

/Andreas
--
gentoo-user@gentoo.org mailing list
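The bit packing Andreas describes can be checked with a few lines of Python. This is only an illustrative sketch: the function name encode_two_byte_utf8 is made up for this example, and it handles nothing beyond the two-byte range (U+0080 to U+07FF) discussed here.

```python
def encode_two_byte_utf8(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes by hand."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only code points U+0080..U+07FF take two bytes")
    # First byte: marker bits '110' followed by the top 5 of the 11 data bits.
    lead = 0b11000000 | (code_point >> 6)
    # Last byte: marker bits '10' followed by the low 6 data bits.
    trail = 0b10000000 | (code_point & 0b00111111)
    return bytes([lead, trail])


encoded = encode_two_byte_utf8(0x00E1)  # LATIN SMALL LETTER A WITH ACUTE
print(encoded.hex(" ").upper())         # C3 A1
# Cross-check against Python's built-in UTF-8 encoder.
print(encoded == "\u00e1".encode("utf-8"))  # True
```

So C3 A1 in the saved mail is exactly what a correct UTF-8 encoder produces for U+00E1.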
Re: [gentoo-user] Re: recompiling vim linked to libncursesw
On Wed, Jul 27, 2005 at 02:45:27PM +0300, Moshe Kaminsky wrote:
>> i got this in my ~/.vimrc:
>> set encoding=utf-8
>> set fileencoding=utf-8
>
> Please try removing this setting, then check the value after vim reads
> the file (when you reply). Vim sets this option when editing an existing
> file according to what it thinks the encoding of the file to be. Also,
> you might want to try something like

Aha! I removed those settings from my ~/.vimrc, and when replying to a mail that contains utf-8 (some mail that I sent myself) I get:

    encoding=utf-8

but:

    fileencoding=latin1

So mutt displays a utf-8 mail correctly, but when it creates the quoted mail for the reply in /tmp, it is giving vim a recoded file. I think... is that it? So, what's happening here?

> :e ++enc=utf-8 file

I don't know how to use this when vim is called from mutt.

> This will force vim to read this file as a utf-8 file. Also, what is the
> value of 'fileencodings'?

    fileencodings=ucs-bom,utf-8,latin1,default

I have already tried with fileencodings=utf-8 only, and the problem persists.

>> i got this in my ~/.muttrc:
>> set send_charset=us-ascii:utf-8
>
> Might want to try just utf-8, but I don't think it will matter.

Oh, no, I already tried that. I tried these before:

    set send_charset=utf-8
    set send_charset=us-ascii:iso-8859-1:utf-8

and the problem persists too.

>> Ok, that's all concerning configuration. Now i tell you how the problem
>> works: in mutt, if i compose a mail from scratch, without anything, not
>> even signature, and put a LATIN SMALL LETTER A WITH ACUTE (got that name
>> from unicode chart), and then send it to myself, and to a friend, my
>> friend sees it ok and i too. But if now i reply to this same mail, when
>> vim comes with the quoted text that mutt passes to it y see garbage.
>
> Can you include one of this characters in your reply?

Yes! In the next line, a LATIN SMALL LETTER A WITH ACUTE:

�

You should see that one OK. But if I reply to that message when it reaches me through the list, I will send this stuff:

á

which seems to be (I'm not sure about this) the two bytes of the LATIN SMALL LETTER A WITH ACUTE interpreted as latin1. Why is that? I don't know.

>> I investigate what was in the archives, so i saved a copy (using 'C'
>> command from mutt) of the first message (the one i receive from me) and
>> file says: 'UTF-8 Unicode mail text', check what's inside with hexedit
>> and see that LATIN SMALL LETTER A WITH ACUTE is encoded with this hex:
>> C3 A1 (which is not 00 E1 from unicode chart from
>> http://www.unicode.org/charts/)
>
> I think this is just the way these characters are represented in utf-8.

Yes, it is. I was looking at the wrong charts; I was told the same on the vim list.

--
Fernando Canizo - LUGMen: www.lugmen.org.ar - A8N: a8n.lugmen.org.ar
Jacquin's Postulate on Democratic Government: No man's life, liberty, or property are safe while the legislature is in session.
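To help interpret the fileencoding=latin1 result above, here is a deliberately simplified Python model of how a 'fileencodings' list such as ucs-bom,utf-8,latin1 could be walked: take the first candidate that fits the bytes. It is an assumption-laden sketch, not Vim's actual implementation; the function guess_fileencoding is hypothetical, and the ucs-bom step only checks for a UTF-8 byte order mark.

```python
import codecs


def guess_fileencoding(data: bytes,
                       candidates=("ucs-bom", "utf-8", "latin1")) -> str:
    """Return the first candidate encoding that accepts the bytes.

    Simplified model: 'ucs-bom' matches only when the data starts with a
    UTF-8 BOM, and every other name is tried with a strict decode.
    """
    for name in candidates:
        if name == "ucs-bom":
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8"
            continue
        try:
            data.decode(name)  # strict: raises on any invalid sequence
        except UnicodeDecodeError:
            continue
        return name
    return "default"


print(guess_fileencoding(b"\xc3\xa1"))          # utf-8 (valid UTF-8 for U+00E1)
print(guess_fileencoding(b"\xc3\x83\xc2\xa1"))  # utf-8 (double-encoded, still valid)
print(guess_fileencoding(b"\xe1"))              # latin1 (a lone E1 is not valid UTF-8)
```

Under this model, falling through to latin1 would mean the temporary file contained at least one byte sequence that is not valid UTF-8, since both the correct and the double-encoded forms would still be accepted as utf-8.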
Re: [gentoo-user] Re: recompiling vim linked to libncursesw
On Wed, Jul 27, 2005 at 02:39:58PM +0200, Andreas Claesson wrote:
> The problem seem to be that mutt(?) takes this UTF-8 encoded data and
> encodes as UTF-8 again as if the data was two 8 bit characters.
> 'C3' then becomes 'C3 83' and 'A1' becomes 'C2 A1'

Yes! Yes! Something like that is happening (maybe it is re-encoding as if the data were latin1, but you seem to be the master of encodings, so I'm surely wrong about this), but I don't know what to do about it.

--
Fernando Canizo - LUGMen: www.lugmen.org.ar - A8N: a8n.lugmen.org.ar
See you in hell, candy boys!! -- Homer Simpson Homer Badman
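The two guesses in this exchange (treating each byte as an 8-bit character, and treating the data as latin1) describe the same operation, and it can be reproduced in Python. This sketch only demonstrates the byte arithmetic; it does not show which program in the mutt/vim chain actually performs the extra step.

```python
original = "\u00e1"  # LATIN SMALL LETTER A WITH ACUTE

# Correct UTF-8 encoding: the two bytes seen in the saved mail.
utf8_once = original.encode("utf-8")
print(utf8_once.hex(" ").upper())   # C3 A1

# Misread those two bytes as two latin1 characters...
misread = utf8_once.decode("latin-1")

# ...and encode the result as UTF-8 again: the four bytes seen in the reply.
utf8_twice = misread.encode("utf-8")
print(utf8_twice.hex(" ").upper())  # C3 83 C2 A1

# Reversing the two steps recovers the original character.
repaired = utf8_twice.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired == original)         # True
```

The four bytes C3 83 C2 A1 from the hexedit dump match this double encoding exactly.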
Re: [gentoo-user] Re: recompiling vim linked to libncursesw
Fernando Canizo wrote:
> Yes! In the next line a LATIN SMALL LETTER A WITH ACUTE:
>
> �

This doesn't seem to be UTF-8 (I get a box in UTF-8 mode), but it looks OK in latin-1.

> But you should see that ok, now if i reply to that message, when it
> reaches me through the list, i will send this stuff:
>
> á

Now this looks like UTF-8.

> which seems to be (not sure about this) the codification of the two
> bytes of the LATIN SMALL LETTER A WITH ACUTE but in latin1.

--
[Name      ] :: [Matan I. Peled]
[Location  ] :: [Israel]
[Public Key] :: [0xD6F42CA5]
[Keyserver ] :: [keyserver.kjsl.com]
encrypted/signed plain text preferred
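What Matan reports (a box in UTF-8 mode, a correct letter in latin-1) is consistent with the first character having been sent as the single latin1 byte E1. The Python sketch below illustrates that reading; the assumption that the message carried a bare E1 byte is an inference from his description, not something the thread confirms with a hex dump.

```python
latin1_bytes = "\u00e1".encode("latin-1")
print(latin1_bytes.hex().upper())  # E1

# Read as latin-1, the byte is the intended letter.
print(latin1_bytes.decode("latin-1") == "\u00e1")  # True

# Read as UTF-8, a lone E1 is an incomplete sequence and cannot be decoded.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(strict_ok)  # False

# A lenient reader substitutes U+FFFD, commonly rendered as a box or '?'.
as_box = latin1_bytes.decode("utf-8", errors="replace")
print(as_box == "\ufffd")  # True

# The second sample is the reverse case: valid UTF-8 bytes shown as latin-1.
mojibake = "\u00e1".encode("utf-8").decode("latin-1")
print(mojibake)  # Ã¡
```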
Re: [gentoo-user] Re: recompiling vim linked to libncursesw
On Wed, Jul 27, 2005 at 03:50:14PM +0300, Matan Peled wrote:
> Fernando Canizo wrote:
>> Yes! In the next line a LATIN SMALL LETTER A WITH ACUTE:
>>
>> �
>
> This doesn't seem to be UTF-8 (I get a box in UTF-8 mode,) but it looks
> OK in latin-1.
>
>> But you should see that ok, now if i reply to that message, when it
>> reaches me through the list, i will send this stuff:
>>
>> á
>
> Now this looks like UTF-8.
>
>> which seems to be (not sure about this) the codification of the two
>> bytes of the LATIN SMALL LETTER A WITH ACUTE but in latin1.

Yes, I saw the same thing. I forgot that I was replying, and forgot to check fenc and enc in vim. Now I will force it by setting fenc and enc to utf-8 ...

Now I write a LATIN SMALL LETTER A WITH ACUTE: á

--
Fernando Canizo - LUGMen: www.lugmen.org.ar - A8N: a8n.lugmen.org.ar
I had no shoes and I pitied myself. Then I met a man who had no feet, so I took his shoes. -- Dave Barry