Re: [gentoo-user] Re: charset iso/utf

2007-08-13 Thread Philip Webb
070813 Philip Webb wrote:
> I now have (via a line in  .bashrc ):
>
>   LANG=
>   LC_CTYPE=en_US.UTF-8  ... snip ...
>   LC_ALL=en_US.UTF-8
>
> There's no difference in the headers.
> It occurs to me that I'm running Mutt via 'konsole -e mutt',
> which is restarted automatically by KDE .
> I did restart X & thereby KDE & Konsole+Mutt ,
> but just possibly that won't use  .bashrc : any thoughts ?

Yes, I know, I should have started there (grimace):
I checked the Gentoo dox re 'locale'
& was told to log in again before restarting X ,
which I did with the following result :

  MIME-Version: 1.0
  Content-Type: text/plain; charset=iso-8859-1
  Content-Disposition: inline
  Content-Transfer-Encoding: 8bit

Moreover, the e-acute now appears correctly in Most ,
which I use as pager with Mutt !  So progress, but comments still welcome.

-- 
,,
SUPPORT ___//___,  Philip Webb : [EMAIL PROTECTED]
ELECTRIC   /] [] [] [] [] []|  Centre for Urban  Community Studies
TRANSIT`-O--O---'  University of Toronto
-- 
[EMAIL PROTECTED] mailing list



Re: [gentoo-user] Re: charset iso/utf

2007-08-13 Thread Benno Schulenberg
Philip Webb wrote:
> 070813 Philip Webb wrote:
> > I now have (via a line in  .bashrc ):
> >
> >   LANG=
> >   LC_CTYPE=en_US.UTF-8  ... snip ...
> >   LC_ALL=en_US.UTF-8

Ideally LANG should be set and LC_ALL unset.  Any individual LC_*
variable that is not set itself takes its value from LANG when
LC_ALL is unset.  This has the advantage that you can override the
individual variables, which is not possible when LC_ALL is set.

In /etc/env.d/02locale I have just this:

LANG=en_GB.utf8
LC_TIME=POSIX
LC_COLLATE=POSIX

and 'locale' produces:

LANG=en_GB.utf8
LC_CTYPE=en_GB.utf8
LC_NUMERIC=en_GB.utf8
LC_TIME=POSIX
LC_COLLATE=POSIX
LC_MONETARY=en_GB.utf8
LC_MESSAGES=en_GB.utf8
LC_PAPER=en_GB.utf8
LC_NAME=en_GB.utf8
LC_ADDRESS=en_GB.utf8
LC_TELEPHONE=en_GB.utf8
LC_MEASUREMENT=en_GB.utf8
LC_IDENTIFICATION=en_GB.utf8
LC_ALL=
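
The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the usual POSIX precedence (LC_ALL, then the category's own variable, then LANG), not how any libc actually implements it; the function name `effective_locale` and the dictionaries are made up for the example, and the final fallback is assumed to be "POSIX".

```python
from typing import Mapping


def effective_locale(category: str, env: Mapping[str, str]) -> str:
    """Return the locale that applies to one category such as LC_TIME."""
    # Precedence: LC_ALL beats the category's own variable, which beats LANG.
    # An empty value counts the same as an unset one.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    # Nothing set at all: assume the implementation default is POSIX.
    return "POSIX"


# The three settings from /etc/env.d/02locale, written as a dict.
benno = {"LANG": "en_GB.utf8", "LC_TIME": "POSIX", "LC_COLLATE": "POSIX"}
print(effective_locale("LC_CTYPE", benno))    # en_GB.utf8
print(effective_locale("LC_TIME", benno))     # POSIX

# Once LC_ALL is set, the per-category override no longer has any effect.
with_all = dict(benno, LC_ALL="en_US.UTF-8")
print(effective_locale("LC_TIME", with_all))  # en_US.UTF-8
```

The last call shows why setting LC_ALL in .bashrc is the blunt option: it hides whatever the individual variables say.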

Benno



Re: [gentoo-user] Re: charset iso/utf

2007-08-12 Thread Philip Webb
070811 Benno Schulenberg wrote:
> For the one you are actually using on the console (whether VT or xterm),
> look at the output of 'locale'.

purslow: system locale
  LANG=
  LC_CTYPE=POSIX
  LC_NUMERIC=POSIX
  LC_TIME=POSIX
  LC_COLLATE=POSIX
  LC_MONETARY=POSIX
  LC_MESSAGES=POSIX
  LC_PAPER=POSIX
  LC_NAME=POSIX
  LC_ADDRESS=POSIX
  LC_TELEPHONE=POSIX
  LC_MEASUREMENT=POSIX
  LC_IDENTIFICATION=POSIX
  LC_ALL=

> Philip Webb also wrote:
> > In  .muttrc  I have: 'set charset=iso-8859-1'
> Maybe comment this line out?

I have, with this result in a couple of test e-mails to myself:

  MIME-Version: 1.0
  Content-Type: text/plain; charset=unknown-8bit
  Content-Disposition: inline
  Content-Transfer-Encoding: 8bit
  User-Agent: Mutt/1.5.16 (2007-06-09)

> Why Gvim produces ISO-8859-1 when you run it from the command line,
> and produces UTF-8 when run from mutt is weird.  Maybe you have utf-8
> as the first entry in 'assumed_charset' in your .muttrc?

There's no entry with 'assumed' in  .muttrc .




Re: [gentoo-user] Re: charset iso/utf

2007-08-12 Thread Benno Schulenberg
Philip Webb wrote:
> purslow: system locale
>   LANG=
>   LC_CTYPE=POSIX

You will have to use a UTF-8 locale if you want mutt/gvim to be able 
to handle anything beyond ASCII.  When setting a POSIX locale, I 
also get this:

   Content-Type: text/plain; charset=unknown-8bit

When using a UTF-8 locale, mutt correctly determines whether the 
produced message fits in us-ascii, iso-8859-1, or requires utf-8.
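
As a rough sketch of that selection step (not mutt's actual code): try each candidate charset in order and keep the first one that can represent the whole message. The candidate list mirrors the three charsets named above, which as far as I know is also the default of mutt's `send_charset` setting; the function name `pick_send_charset` is invented for the example.

```python
def pick_send_charset(text, candidates=("us-ascii", "iso-8859-1", "utf-8")):
    """Return the first charset in `candidates` that can encode all of `text`."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            # At least one character does not exist in this charset.
            continue
        return charset
    raise ValueError("no candidate charset can encode this text")


print(pick_send_charset("plain text"))   # us-ascii
print(pick_send_charset("LeCarré"))      # iso-8859-1
print(pick_send_charset("ĉu vi?"))       # utf-8 (Esperanto c-circumflex)
```

This only works when the program knows which characters the bytes from the editor stand for, which is exactly what the locale tells it; under a POSIX locale it cannot know, hence unknown-8bit.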

If just setting the better locale doesn't help, then also try with 
an empty .muttrc.

Benno



Re: [gentoo-user] Re: charset iso/utf

2007-08-12 Thread Philip Webb
070812 Benno Schulenberg wrote:
> You will have to use a UTF-8 locale
> if you want mutt/gvim to be able to handle anything beyond ASCII.
> When using a UTF-8 locale, mutt correctly determines
> whether the produced message fits in us-ascii, iso-8859-1 or needs utf-8.

I now have (via a line in  .bashrc ):

  purslow: ~ locale
LANG=
LC_CTYPE=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=en_US.UTF-8

There's no difference in the headers.
It occurs to me that I'm running Mutt via 'konsole -e mutt',
which is restarted automatically by KDE .
I did restart X & thereby KDE & Konsole+Mutt ,
but just possibly that won't use  .bashrc : any thoughts ? 

> If just setting the better locale doesn't help,
> then also try with an empty .muttrc.

That sounds rather extreme (smile): are there specific lines to comment out ?




Re: [gentoo-user] Re: charset iso/utf

2007-08-11 Thread Benno Schulenberg
Philip Webb wrote:
> 070810 Alexander Skwar wrote:
> > You wrote: (I've just been reading LeCarré). Notice the
> > letters é. This looks quite a lot like UTF-8 to me.
> > In your header, you are saying, that you don't use UTF-8,
> > though.
>
> I write e-mails with Gvim called up by Mutt (as now).
>   [...]
>   termencoding -- character encoding used by the terminal
>   set tenc=utf-8

This suggests you are using a UTF-8 locale.  In such an environment, 
gvim produces UTF-8 encoded files.  Try with 'gvim text', enter just 
your Ctrl-V 233, save the file, and look at it with 'xxd text'.  If 
it shows c3a9, it's UTF-8.  If gvim should produce ISO-8859-1, then 
make sure to call it with LC_ALL=C.
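
The same check can be done without reading hex by eye. The following is a small Python sketch, with a made-up helper name `classify`, that looks at raw bytes and says whether they are pure ASCII, valid UTF-8, or some other 8-bit encoding (it cannot tell ISO-8859-1 apart from other single-byte charsets):

```python
def classify(data: bytes) -> str:
    """Roughly classify the encoding of some raw file contents."""
    if all(byte < 0x80 for byte in data):
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # High bytes that do not form valid UTF-8 sequences.
        return "8-bit (e.g. iso-8859-1)"
    return "utf-8"


# Character 233 (e-acute) in the two encodings under discussion.
print(chr(233).encode("utf-8").hex())        # c3a9
print(chr(233).encode("iso-8859-1").hex())   # e9

print(classify(b"\xc3\xa9\n"))   # utf-8
print(classify(b"\xe9\n"))       # 8-bit (e.g. iso-8859-1)
```

A lone e9 byte followed by a newline is not valid UTF-8, so a file that xxd shows as e90a was written in an 8-bit charset.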

(But that does not solve the actual bug: mutt should not advertise 
charset=iso-8859-1 when the message contains UTF-8.)
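
To see why the mislabelling matters, here is a short Python illustration (the variable names are arbitrary) of what a mail reader that trusts the header will do with such a message: it decodes UTF-8 bytes as ISO-8859-1 and shows two characters where one was typed.

```python
text = "LeCarré"

# What the editor wrote into the message body: UTF-8 bytes.
wire = text.encode("utf-8")

# What a reader sees if it believes "charset=iso-8859-1".
shown = wire.decode("iso-8859-1")
print(shown)                   # LeCarrÃ©
print(len(text), len(shown))   # 7 8

# The damage is reversible as long as the bytes themselves are untouched.
repaired = shown.encode("iso-8859-1").decode("utf-8")
print(repaired == text)        # True
```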

Benno



Re: [gentoo-user] Re: charset iso/utf

2007-08-11 Thread Philip Webb
070811 Benno Schulenberg wrote:
> Philip Webb wrote:
> > I write e-mails with Gvim called up by Mutt (as now).
> >   [...]
> >   termencoding -- character encoding used by the terminal
> >   set tenc=utf-8
> This suggests you are using a UTF-8 locale.

In  /etc/locale.gen  I have

  en_US ISO-8859-1
  en_US.UTF-8 UTF-8

For everyday purposes, I have no use for anything beyond ASCII,
ie English + French German Spanish accents,
but I do want to be able to handle Esperanto & Ancient Greek with Vim.

> In such an environment, Gvim produces UTF-8 encoded files.
> Try with 'gvim text', enter just your Ctrl-V 233, save the file
> and look at it with 'xxd text'.  If it shows c3a9, it's UTF-8.

  purslow: tmp gvim test.d1
  [perform edit as described]
  purslow: tmp xxd test.d1
000: e90a ..

> If gvim should produce ISO-8859-1, make sure to call it with LC_ALL=C.

Could you clarify in light of my test ?
Eg do you mean I should alias Gvim in  .bashrc ?

> That does not solve the actual bug:
> Mutt should not advertise  charset=iso-8859-1 ,
> when the message contains UTF-8.

If this is a real issue, albeit small, not simply a quibble,
I'm willing to investigate further.  It is a bit irritating
that when I look at my e-mails with Most, e-acute doesn't display correctly.




Re: [gentoo-user] Re: charset iso/utf : PS

2007-08-11 Thread Philip Webb
070811 Philip Webb wrote:
> 070811 Benno Schulenberg wrote:
> > That does not solve the actual bug:
> > Mutt should not advertise  charset=iso-8859-1 ,
> > when the message contains UTF-8.

In  .muttrc  I have:

  set charset=iso-8859-1

Perhaps this should be changed to correspond with what Vim is doing ?




Re: [gentoo-user] Re: charset iso/utf

2007-08-11 Thread Benno Schulenberg
Philip Webb wrote:
> 070811 Benno Schulenberg wrote:
> > This suggests you are using a UTF-8 locale.
>
> In  /etc/locale.gen  I have
>
>   en_US ISO-8859-1
>   en_US.UTF-8 UTF-8

Well, that just shows which locales you have available, not which 
one you are actually using on the console (whether VT or xterm).  
For the latter look at the output of 'locale'.

> For everyday purposes, I have no use for anything beyond ASCII,
> ie English + French German Spanish accents,

Strictly speaking, French, German, and Spanish accented characters
_are_ beyond ASCII; they are found only in the extended 8-bit
character sets such as ISO-8859-1.

Philip Webb also wrote:
> In  .muttrc  I have:
>
>   set charset=iso-8859-1

Maybe comment this line out?  Mutt will then probably determine by
itself which character set each message you produce needs, and
automatically convert it to the smallest one that fits.

Why gvim produces ISO-8859-1 when you run it from the command line, 
and produces UTF-8 when run from mutt is... weird.  Maybe you have 
utf-8 as the first entry in 'assumed_charset' in your .muttrc?

> > If gvim should produce ISO-8859-1, make sure to call it with
> > LC_ALL=C.
>
> Could you clarify in light of my test ?
> Eg do you mean I should alias Gvim in  .bashrc ?

No, just change how mutt calls it, as it apparently works fine when
run from the console.  Fiddle with the editor setting in .muttrc (if
still necessary, because removing the 'set charset=iso-8859-1'
line may be all that is required).

Benno



Re: [gentoo-user] Re: charset iso/utf

2007-08-10 Thread Philip Webb
070810 Alexander Skwar wrote:
> You wrote: (I've just been reading LeCarré). Notice the letters é.
> This looks quite a lot like UTF-8 to me.
> In your header, you are saying, that you don't use UTF-8, though.

I write e-mails with Gvim called up by Mutt (as now).
My Gvim settings (probably dating from some work with Esperanto) are

  25 multi-byte characters

  encoding -- character encoding used in Vim: latin1, utf-8
              euc-jp, big5, etc.
      set enc=latin1
  fileencoding -- character encoding for the current file
              (local to buffer)
      set fenc=latin1
  fileencodings -- automatically detected character encodings
      set fencs=ucs-bom,utf-8,default
  termencoding -- character encoding used by the terminal
      set tenc=utf-8

I entered the e-acute using 'ctl-v 233'.

It doesn't make much difference to me, but suggestions are always welcome.
