Re: [gentoo-user] Re: recompiling vim linked to libncursesw

2005-07-27 Thread Fernando Canizo
On Wed, Jul 27, 2005 at 09:04:29AM +0300, Moshe Kaminsky wrote:
 * Fernando Canizo [EMAIL PROTECTED] [27/07/05 07:15]:
  
  Hi all.
  
  I'm having trouble with my encoding using mutt + vim + utf-8:
  basically my emails are sent with the wrong encoding when *replying*. I've
  tracked the problem, searched, read FAQs, and found that maybe my
  problem is this: while mutt is linked to libncursesw (the wide
  library), vim is linked to libncurses (normal). This is the output of ldd:
 
 I find it hard to believe that this is the problem. You say that you can 
 use utf8 when you are composing (or writing some other stuff), right? 
 What are the values of 'encoding' and 'fileencoding' in vim when 
 replying?
 Moshe

Like I said to Richard, maybe you're right. I mean: I can write a
utf-8 file from scratch using vim alone, so why wouldn't it work when
invoking vim from mutt? Maybe mutt is telling vim something
incorrect when they communicate.

Well, I'll give more information, but this is going to grow large ;)

Reading the mutt FAQ (http://wiki.mutt.org/index.cgi?MuttFaq/Charset)
and checking everything is ok:

locale seems to be ok:
~$ locale
LANG=es_AR.utf-8
LC_CTYPE=es_AR.utf-8
LC_NUMERIC=es_AR.utf-8
LC_TIME=es_AR.utf-8
LC_COLLATE=es_AR.utf-8
LC_MONETARY=es_AR.utf-8
LC_MESSAGES=es_AR.utf-8
LC_PAPER=es_AR.utf-8
LC_NAME=es_AR.utf-8
LC_ADDRESS=es_AR.utf-8
LC_TELEPHONE=es_AR.utf-8
LC_MEASUREMENT=es_AR.utf-8
LC_IDENTIFICATION=es_AR.utf-8
LC_ALL=es_AR.utf-8

the locale settings are supported:
~$ locale -a
C
es_AR
es_AR.utf8
POSIX

checking if the locales work with perl:
~$ perl -e 
ok, it doesn't show anything

checking if perl is doing the right things by setting an erroneous
locale:
~$ env LC_ALL=nocharset perl -e 
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = es_AR.utf-8,
LC_ALL = nocharset,
LC_CTYPE = es_AR.utf-8,
LANG = es_AR.utf-8
are supported and installed on your system.
perl: warning: Falling back to the standard locale (C).

ok, it cries, so it's working ok
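
The same kind of check can be sketched without perl. The snippet below is only an illustration, not part of the mutt FAQ: it asks the C library whether a locale name is accepted, and the function name `locale_is_supported` is made up for this example. What it reports for a given name depends on the system it runs on.

```python
import locale


def locale_is_supported(name: str) -> bool:
    """Return True if the C library accepts `name` as a value for LC_ALL."""
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        # Put the process back the way it was found.
        locale.setlocale(locale.LC_ALL, saved)


# "C" is always available; the other two names are taken from this thread.
for candidate in ("C", "es_AR.utf8", "nocharset"):
    print(candidate, locale_is_supported(candidate))
```
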

my ~/.signature is in utf-8, my ~/.alias is too

I have this in my ~/.vimrc:
set encoding=utf-8
set fileencoding=utf-8
set termencoding=utf-8

and when mutt invokes vim I re-check that this is ok, and it is (I
mean I check at runtime and it obeys the configuration)

I have this in my ~/.muttrc:
set send_charset=us-ascii:utf-8
set charset=utf-8
set locale=es_AR.utf8

From the mutt man page I know these settings should not be necessary, since
the system is configured correctly, but I tried with and without them and
got no difference.

OK, that's all concerning configuration. Now I'll tell you how the
problem shows up: in mutt, if I compose a mail from scratch, without
anything, not even a signature, and put in a LATIN SMALL LETTER A WITH
ACUTE (I got that name from the unicode chart), and then send it to myself
and to a friend, my friend sees it fine and so do I.

But if I now reply to this same mail, when vim comes up with the quoted
text that mutt passes to it, I see garbage.

So mutt is ok seeing and sending utf-8, vim is ok writing and reading
utf-8, but when both cooperate, things get screwed.

I investigated what was in the files, so I saved a copy (using the 'C'
command from mutt) of the first message (the one I received from myself).
file says: 'UTF-8 Unicode mail text'. I checked what's inside with
hexedit and saw that LATIN SMALL LETTER A WITH ACUTE is encoded with
this hex: C3 A1 (which is not 00 E1 from the unicode chart at
http://www.unicode.org/charts/)

Then I took that same mail and pressed 'r' in mutt to reply. Vim came up
with garbled text, and without touching anything I saved it under
some other name and then cancelled the message. Running file on this saved
text gives me: 'UTF-8 Unicode text', but when I look inside with hexedit, I
get this hex for the same letter: C3 83 C2 A1, so now I have 4 bytes
instead of the two before. So vim-mutt (?) is re-encoding the stuff.
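
The four bytes can be reproduced outside mutt and vim. The sketch below is an illustration only: it assumes that something in the chain read the UTF-8 bytes as latin1 (one character per byte) and then wrote them out as UTF-8 again, which the thread suspects but has not confirmed. The helper name `undo_double_encoding` is made up for the example.

```python
original = "\u00e1"                     # LATIN SMALL LETTER A WITH ACUTE
utf8_bytes = original.encode("utf-8")   # the two bytes seen in the saved mail

# Misread the UTF-8 bytes as latin1, then encode the result as UTF-8 again.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")


def undo_double_encoding(data: bytes) -> bytes:
    """Reverse one round of 'read as latin1, written as UTF-8'."""
    return data.decode("utf-8").encode("latin-1")


print(utf8_bytes.hex())                            # c3a1
print(double_encoded.hex())                        # c383c2a1
print(undo_double_encoding(double_encoded).hex())  # c3a1
```
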

Like I said before, checking enc and fenc in vim when it is called from
mutt gives utf-8, like it should.

I created a file with vim to check these differences against the unicode
chart and I got C3 A1 too, so maybe the problem is with vim: it should
put 00 E1 for LATIN SMALL LETTER A WITH ACUTE.

Well, that's all I remember having done for this problem. I've been at
this for 4 months now, I think: I take up the problem, get tired of
searching and testing stuff, leave it for a month, come back to it again,
and so on. But now I really want to solve it.

I think I'm going to crosspost this to the vim and mutt mailing lists. But
if someone here knows how to solve this, I'd appreciate any help, tip, or
direction to look or search in, or maybe praying would do the job.

Linking vim to libncursesw after the build, the way Richard said a couple of
mails ago, didn't solve it. If there is a way to tell emerge to
link vim to this library from the beginning, I would like to try it.

Besides if anyone knows how to get which programs are using a library
(i have done it before, but at this time my 

Re: [gentoo-user] Re: recompiling vim linked to libncursesw

2005-07-27 Thread Andreas Claesson
On 7/27/05, Moshe Kaminsky [EMAIL PROTECTED] wrote:
 Hi,
 * Fernando Canizo [EMAIL PROTECTED] [27/07/05 14:14]:
snip
  I investigate what was in the archives, so i saved a copy (using 'C'
  command from mutt) of the first message (the one i receive from me)
  and file says: 'UTF-8 Unicode mail text', check what's inside with
  hexedit and see that LATIN SMALL LETTER A WITH ACUTE is encoded with
  this hex: C3 A1 (which is not 00 E1 from unicode chart from
  http://www.unicode.org/charts/)
 
 I think this is just the way these characters are represented in utf-8.

Yes, it is.

00E1 hex is '00000000 11100001' in binary.

When encoding this as UTF-8 this value is stored in two bytes.

The last byte will begin with '10' followed by the last 6 bits of data. 

'10 100001' binary or 'A1' in hex.

The first byte will begin with '110' to indicate that it is a two byte 
character followed by the remaining significant data. 

'110 00011' binary or 'C3' hex.

This is correct.

The problem seems to be that mutt(?) takes this UTF-8 encoded data
and encodes it as UTF-8 again, as if the data were two 8-bit characters.
 
'C3' then becomes 'C3 83' and 'A1' becomes 'C2 A1' 
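
The bit layout described above can be checked with a few lines of Python. This is a minimal sketch covering only the two-byte case (code points U+0080 to U+07FF); the function name `encode_two_byte_utf8` is made up for the example.

```python
def encode_two_byte_utf8(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only U+0080..U+07FF are encoded in exactly two bytes")
    first = 0b11000000 | (code_point >> 6)          # '110' + the upper 5 bits
    second = 0b10000000 | (code_point & 0b111111)   # '10' + the lower 6 bits
    return bytes([first, second])


# U+00E1 becomes the two bytes C3 A1.
once = encode_two_byte_utf8(0x00E1)
print(once.hex())    # c3a1

# Treating each of those bytes as a character of its own and encoding it
# again gives the four bytes seen in the garbled reply.
twice = b"".join(encode_two_byte_utf8(byte) for byte in once)
print(twice.hex())   # c383c2a1
```
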


/Andreas

-- 
gentoo-user@gentoo.org mailing list



Re: [gentoo-user] Re: recompiling vim linked to libncursesw

2005-07-27 Thread Fernando Canizo
On Wed, Jul 27, 2005 at 02:45:27PM +0300, Moshe Kaminsky wrote:
  i got this in my ~/.vimrc:
  set encoding=utf-8
  set fileencoding=utf-8
 
 Please try removing this setting, then check the value after vim reads 
 the file (when you reply). Vim sets this option when editing an existing 
 file according to what it thinks the encoding of the file to be. Also, 
 you might want to try something like

Aha! I removed those settings from my ~/.vimrc, and when replying to a
mail that contains utf-8 (some mail that I sent myself) I get:
encoding=utf-8
but:
fileencoding=latin1

So mutt is correctly showing a utf-8 mail, but when it creates the
quoted mail for the reply in /tmp, it is giving vim a recoded file.

I think... is that it? So, what's happening here?

 :e ++enc=utf-8 file

I don't know how to use this when vim is called from mutt.

 This will force vim to read this file as a utf-8 file. Also, what is the 
 value of 'fileencodings'?

fileencodings=ucs-bom,utf-8,latin1,default

I have already tried with
fileencodings=utf-8
only, and the problem stays.
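
For reference, vim tries the entries of 'fileencodings' in order and uses the first one that reads the file without a conversion error. The sketch below is a simplified model of that fallback, not vim's actual code: it ignores ucs-bom and default, and the function name `detect_encoding` is made up. The sample bytes rest on an assumption that the thread does not confirm, namely that the temporary file contains a single latin1 byte somewhere (for example in the attribution line) next to the UTF-8 text.

```python
def detect_encoding(data: bytes, candidates=("utf-8", "latin-1")) -> str:
    """Return the first candidate encoding that decodes `data` without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("no candidate encoding could decode the data")


# A file that is UTF-8 throughout is detected as utf-8.
clean = "> \u00e1\n".encode("utf-8")
print(detect_encoding(clean))

# Hypothetical temp file: one latin1 byte (0xED, i with acute) in the
# attribution line, followed by a UTF-8 encoded quote (C3 A1).
mixed = b"me dec\xeda:\n> \xc3\xa1\n"
chosen = detect_encoding(mixed)
print(chosen)

# Writing the text back out as UTF-8 then doubles up the quoted character.
saved = mixed.decode(chosen).encode("utf-8")
print(b"\xc3\x83\xc2\xa1" in saved)
```

In this model a single byte that is not valid UTF-8 is enough to make the whole file fall through to latin1, which would match the fileencoding=latin1 reported above.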

  i got this in my ~/.muttrc:
  set send_charset=us-ascii:utf-8
 
 Might want to try just utf-8, but I don't think it will matter.

Oh, no, I already tried it. Before, I used:
set send_charset=utf-8
set send_charset=us-ascii:iso-8859-1:utf-8
and the problem stays too.

  Ok, that's all concerning configuration. Now i tell you how the
  problem works: in mutt, if i compose a mail from scratch, without
  anything, not even signature, and put a LATIN SMALL LETTER A WITH
  ACUTE (got that name from unicode chart), and then send it to myself,
  and to a friend, my friend sees it ok and i too.
  
  But if now i reply to this same mail, when vim comes with the quoted
  text that mutt passes to it, I see garbage.
 
 Can you include one of these characters in your reply?

Yes!
In the next line a LATIN SMALL LETTER A WITH ACUTE:
�
You should see that one fine. But if I reply to that message when it
reaches me through the list, I will send this instead:
á
which seems to be (I'm not sure about this) the encoding of the two
bytes of the LATIN SMALL LETTER A WITH ACUTE, but in latin1.

Why is that? I don't know.

  I investigate what was in the archives, so i saved a copy (using 'C'
  command from mutt) of the first message (the one i receive from me)
  and file says: 'UTF-8 Unicode mail text', check what's inside with
  hexedit and see that LATIN SMALL LETTER A WITH ACUTE is encoded with
  this hex: C3 A1 (which is not 00 E1 from unicode chart from
  http://www.unicode.org/charts/)
 
 I think this is just the way these characters are represented in utf-8.

Yes, it is. I was looking at the wrong charts; I was told the same on
the vim list.
-- 
Fernando Canizo - LUGMen: www.lugmen.org.ar - A8N: a8n.lugmen.org.ar
Jacquin's Postulate on Democratic Government:
No man's life, liberty, or property are safe while the
legislature is in session.



Re: [gentoo-user] Re: recompiling vim linked to libncursesw

2005-07-27 Thread Fernando Canizo
On Wed, Jul 27, 2005 at 02:39:58PM +0200, Andreas Claesson wrote:
 The problem seem to be that mutt(?) takes this UTF-8 encoded data
 and encodes as UTF-8 again as if the data was two 8 bit characters.
  
 'C3' then becomes 'C3 83' and 'A1' becomes 'C2 A1' 

Yes, yes! Something like that is happening (maybe it is re-encoding as
latin1, but you seem to be the master of encodings, so I'm surely wrong
about this), but I don't know what to do about it.

-- 
Fernando Canizo - LUGMen: www.lugmen.org.ar - A8N: a8n.lugmen.org.ar
See you in hell, candy boys!!

-- Homer Simpson
   Homer Badman



Re: [gentoo-user] Re: recompiling vim linked to libncursesw

2005-07-27 Thread Matan Peled
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Fernando Canizo wrote:
 Yes!
 In the next line a LATIN SMALL LETTER A WITH ACUTE:
 �
This doesn't seem to be UTF-8 (I get a box in UTF-8 mode,)
but it looks OK in latin-1.
 But you should see that ok, now if i reply to that message, when it
 reaches me through the list, i will send this stuff:
 á
Now this looks like UTF-8.
 which seems to be (not sure about this) the codification of the two
 bytes of the LATIN SMALL LETTER A WITH ACUTE but in latin1.

- --
[Name  ]   ::  [Matan I. Peled]
[Location  ]   ::  [Israel]
[Public Key]   ::  [0xD6F42CA5]
[Keyserver ]   ::  [keyserver.kjsl.com]
encrypted/signed  plain text  preferred

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFC54MGA7Qvptb0LKURAknKAJ9a97ZLS6r16JmNYUjauD5yoepWWgCePt8g
cPcyu0FqLM9xtTO5eUZU8sU=
=guht
-END PGP SIGNATURE-



Re: [gentoo-user] Re: recompiling vim linked to libncursesw

2005-07-27 Thread Fernando Canizo
On Wed, Jul 27, 2005 at 03:50:14PM +0300, Matan Peled wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 Fernando Canizo wrote:
  Yes!
  In the next line a LATIN SMALL LETTER A WITH ACUTE:
  �
 This doesn't seem to be UTF-8 (I get a box in UTF-8 mode,)
 but it looks OK in latin-1.
  But you should see that ok, now if i reply to that message, when it
  reaches me through the list, i will send this stuff:
  á
 Now this looks like UTF-8.
  which seems to be (not sure about this) the codification of the two
  bytes of the LATIN SMALL LETTER A WITH ACUTE but in latin1.

Yes, I saw the same. I forgot I was replying and forgot to check
fenc and enc in vim.

Now i will force the stuff:
setting fenc and enc to utf-8
...
Now i write a LATIN SMALL LETTER A WITH ACUTE:
á

-- 
Fernando Canizo - LUGMen: www.lugmen.org.ar - A8N: a8n.lugmen.org.ar
I had no shoes and I pitied myself.  Then I met a man who had no feet,
so I took his shoes.
-- Dave Barry