Re: Composing in utf8 from latin1 terminal

2018-10-25 Thread Derek Martin
On Thu, Oct 25, 2018 at 06:14:18PM +0100, Nuno Silva wrote:
> > Also, depending on exactly why you're doing this multi-locale stuff,
> > an even better solution may be to let Mutt's send_charset handle it
> > for you.  
[...]
> But, if I understood the purpose of send_charset correctly, it only
> affects the encoding of the outgoing message. It won't change the
> encoding with which mutt reads the temporary file after I close the text
> editor, will it?

No, but why would you care?  It's just an intermediate data encoding.
What I'm saying is, forget about your latin1 terminal, do EVERYTHING
in Unicode, and let Mutt send the data out as latin1 if appropriate.
This may not be appropriate for your use case--it comes down to your
reason for using a latin1 terminal to run Mutt in.  I can't think of a
good reason to do that any longer... so my assumption is your end goal
is to produce an outgoing message in latin1 so some recipients who
can't yet handle Unicode properly can read it.

If your purpose is something else, that may not work, but like I said
I can't think of a reason why you'd need to do that.  A Unicode
terminal should have no trouble letting you enter Portuguese (or
whatever other language) so long as your locale is correct (and
consistent across all programs) and your IME is configured properly.
Then Mutt can convert to latin1 to send.

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail due to spam prevention.  Sorry for the inconvenience.





Re: Composing in utf8 from latin1 terminal

2018-10-25 Thread nunojsilva
On 2018-10-25, Derek Martin wrote:

> On Thu, Oct 25, 2018 at 11:11:52AM -0500, Derek Martin wrote:
>> On Thu, Oct 25, 2018 at 12:53:43PM +0100, Nuno Silva wrote:
>> > When I use emacsclient, the interface locale is not broken: the terminal
>> > I/O encoding is correctly set from the locale. The only difference (that
>> > I know of) is that Emacs will use utf8 to read/write files. If this
>> > should match the terminal encoding, then it *is* broken.
>
> Also, depending on exactly why you're doing this multi-locale stuff,
> an even better solution may be to let Mutt's send_charset handle it
> for you.  If set properly, you should be able to compose your messages
> in UTF-8, and as long as you don't use any non-latin1 characters,
> send_charset *should* make sure the message goes out encoded as
> latin1.
>
> The biggest drawback to this approach is you have to be very careful
> to not use any non-latin1 characters, or else the message will be sent
> as UTF-8.  Otherwise, you'd need to check every message before you
> send it to make sure Mutt will send it as the desired encoding.  I
> have a vague notion that certain other message transforms, like PGP,
> may also interfere with this, but I'm not 100% sure.

But, if I understood the purpose of send_charset correctly, it only
affects the encoding of the outgoing message. It won't change the
encoding with which mutt reads the temporary file after I close the text
editor, will it?

-- 
Nuno Silva



Re: Composing in utf8 from latin1 terminal

2018-10-25 Thread Derek Martin

On Thu, Oct 25, 2018 at 11:11:52AM -0500, Derek Martin wrote:
> On Thu, Oct 25, 2018 at 12:53:43PM +0100, Nuno Silva wrote:
> > When I use emacsclient, the interface locale is not broken: the terminal
> > I/O encoding is correctly set from the locale. The only difference (that
> > I know of) is that Emacs will use utf8 to read/write files. If this
> > should match the terminal encoding, then it *is* broken.

Also, depending on exactly why you're doing this multi-locale stuff,
an even better solution may be to let Mutt's send_charset handle it
for you.  If set properly, you should be able to compose your messages
in UTF-8, and as long as you don't use any non-latin1 characters,
send_charset *should* make sure the message goes out encoded as
latin1.

The biggest drawback to this approach is you have to be very careful
to not use any non-latin1 characters, or else the message will be sent
as UTF-8.  Otherwise, you'd need to check every message before you
send it to make sure Mutt will send it as the desired encoding.  I
have a vague notion that certain other message transforms, like PGP,
may also interfere with this, but I'm not 100% sure.
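
The "check every message" step could be automated.  What follows is
only a rough Python sketch, not anything Mutt provides; the function
name non_latin1_characters is made up for illustration.  It relies on
the fact that ISO-8859-1 covers exactly the code points U+0000 to
U+00FF:

```python
def non_latin1_characters(body):
    """Return (line, column, character) for each character latin1 cannot encode."""
    found = []
    for line_no, line in enumerate(body.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # ISO-8859-1 maps exactly the code points U+0000..U+00FF.
            if ord(ch) > 0xFF:
                found.append((line_no, col, ch))
    return found
```

An empty result means the draft can go out as latin1; anything else
points at the characters that would push it to UTF-8.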

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02





Re: Composing in utf8 from latin1 terminal

2018-10-25 Thread Derek Martin
On Thu, Oct 25, 2018 at 12:53:43PM +0100, Nuno Silva wrote:
> When I use emacsclient, the interface locale is not broken: the terminal
> I/O encoding is correctly set from the locale. The only difference (that
> I know of) is that Emacs will use utf8 to read/write files. If this
> should match the terminal encoding, then it *is* broken.

I suspected as much.  The issue, as far as I can tell based on your
description of things, is that the emacs *server* is running with a
UTF-8 locale, which is why it is editing files in that locale, and
you're connecting to it from a client that's running a different
locale.  That's definitely broken, as would be any such locale
mismatch.  Your hack may well work around it... but you should be
very clear that you're doing something unexpected (and generally
undesirable) which may well break in other ways later.  That's what's
called a gross hack.

A likely better solution would be to run two instances of emacs
server, one in each locale, and make mutt connect to the right one.
But either way it comes down to the choice of running two different
instances of emacs, or using a gross hack to avoid that.
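
The "connect to the right one" part could live in a small wrapper.
Here is a rough Python sketch, assuming two Emacs daemons with the
hypothetical names latin1 and utf8, each started under the matching
locale (for example with emacs --daemon=latin1 from a latin1
environment).  The mapping and the function name are assumptions made
for illustration:

```python
# Hypothetical server names; each daemon is assumed to have been started
# under the matching locale, e.g. `emacs --daemon=latin1` from a latin1 shell.
SERVER_FOR_CODESET = {
    "iso-8859-1": "latin1",
    "utf-8": "utf8",
}


def emacsclient_command(terminal_codeset, path):
    """Build the emacsclient command line for the server matching the codeset."""
    name = SERVER_FOR_CODESET.get(terminal_codeset.lower())
    if name is None:
        raise ValueError("no Emacs server configured for " + terminal_codeset)
    # -t opens the file on the current terminal, -s selects the named server.
    return ["emacsclient", "-t", "-s", name, path]
```

A script set as Mutt's editor could run the returned command with
subprocess.call(), taking the codeset from the environment it was
started in.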

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02





Re: Composing in utf8 from latin1 terminal

2018-10-25 Thread nunojsilva
On 2018-10-24, Derek Martin wrote:

> On Tue, Oct 23, 2018 at 10:31:45PM +0100, Nuno Silva wrote:
>> On 2017-10-12, nunojsi...@ist.utl.pt wrote:
>> 
>> > Recently, I have tried to use mutt on a non-utf8 terminal.  Everything
>> > works as expected in a utf8 environment, but when I compose new e-mails
>> > in a latin1/ISO-8859-1 terminal, mutt will expect the file to be in the
>> > same encoding as the terminal, while my text editor will save the file
>> > in utf8. The result is that non-ASCII characters get misinterpreted,
>> > which can affect the message headers as well (e.g. real names in To: and
>> > Cc:).
>> [...]
>> > Is there some way to configure mutt so that it always uses utf8 to read
>> > the new message after I exit the editor? Or a way to enable some
>> > encoding autodetection that can tell utf8 apart from latin1?
>
> The bottom line is that your environment is misconfigured.  If you
> want this to work, you need to have LANG set properly at every point
> along the execution path.  Your terminal, terminal font, editor, and
> Mutt all need to know that you're using latin1 instead of Unicode, by
> having been started with a latin1 LANG setting.  You may need to
> configure your terminal to use the correct font, although with many
> modern terminals (like gnome-term, kterm, etc.) this should be
> unnecessary.
>
> If you are launching the latin1 terminal from a shell that has its
> LANG set to UTF-8, it could break (an example of this is starting
> hanterm, a terminal program expressly for Korean input with EUC-KR,
> with a UTF-8 locale--won't work).  If the shell running inside the
> terminal has LANG set to UTF-8, both Mutt and your editor could break.
> If you have manual settings on any of these programs to override the
> locale defined by the environment, it could break.  If you don't have
> all of these things set the same way, it could, and almost certainly
> will break.  Sometimes the breakage is subtle, e.g. if you dump the
> right characters to a terminal (say, with the cat command) that has the
> right font, it will generally display them correctly, even if the
> locale is wrong.  But using them with programs that need to know the
> locale will still break.
>
> If you're using a Mutt setting to connect to an existing emacs
> instance (via emacsclient or similar) that's already running in a
> UTF-8 locale, that's broken.  You need to start a new instance of
> emacs whose locale is latin1.

I hadn't noticed this before, but there *is* indeed a difference when
starting a fresh new Emacs instance instead of connecting to an existing
one using emacsclient: the new instance does use latin1 to read/write
files. (That is, the behaviour expected by mutt.)

When I use emacsclient, the interface locale is not broken: the terminal
I/O encoding is correctly set from the locale. The only difference (that
I know of) is that Emacs will use utf8 to read/write files. If this
should match the terminal encoding, then it *is* broken.

I might be happy with the way things are now (as my files are usually in
utf8, and mutt is the only context where I need the file encoding to
match the terminal), but I won't claim it isn't broken if it is.

> Lastly, you may need to adjust send_charset in Mutt.  It can list
> multiple character sets, and Mutt will pick the first one that your
> message can be converted to exactly.  For example, mine is:
>
>   set send_charset="iso-8859-1:utf-8"
>
> If my e-mail contains no characters that need UTF-8, Mutt will choose
> to send the message as iso-8859-1, but otherwise as UTF-8.
>
> If you do those things, it should "just work" and if you don't it
> won't, at least without jumping through pointless hoops to force it,
> which will most likely just break other things.

send_charset appears to be working correctly here; I checked it a
couple of days ago. It isn't even set in any configuration file, so I
suppose it is using the default setting.

For now, I will leave the Emacs hack in place, as I prefer to use the
Emacs "server instance" instead of creating a new one. Everything else
is hopefully correctly set, as this has been the only encoding problem
I've had in the past months. (Now that I've said this, I will probably
discover a new one tomorrow...)

-- 
Nuno Silva



Re: Composing in utf8 from latin1 terminal

2018-10-25 Thread nunojsilva
On 2018-10-24, Ian Zimmerman wrote:

> On 2018-10-23 22:31, Nuno Silva wrote:
>
>> So far I did not find a way to change this on the mutt side, but I made
>> a new major mode for mutt messages in Emacs (the editor I use with
>> mutt), with a hook that changes the file encoding to latin1 if the file
>> was opened in a latin1 terminal and Emacs cannot detect a non-ASCII file
>> encoding.
>> 
>> It appears to work here. I'm sure someone who is more versed in Emacs
>> than me would be able to come up with a more elegant solution, but I'm
>> sharing mine here just in case it is useful to somebody else someday:
>> 
>> (define-derived-mode my-mutt-message-mode message-mode "MuttMSG")
>> (add-hook
>>  'my-mutt-message-mode-hook
>>  (lambda ()
>>    (when (equal (terminal-coding-system) 'iso-latin-1-unix)
>>      (let ((encoding (detect-coding-region (point-min) (point-max))))
>>        (when (or
>>               (equal encoding '(undecided-unix))
>>               (equal encoding '(undecided)))
>>          (setq buffer-file-coding-system 'iso-latin-1-unix))))))
>> (add-to-list 'auto-mode-alist '("/mutt" . my-mutt-message-mode))
>
> You could just hook message-mode-hook with a function that checks
> buffer-file-name; I think that would be a bit more straightforward than
> adding a new mode.

Yes, it would be. In my case, I am also redefining a keybinding (C-c
C-c, and possibly more in the future), so the new mode is a way to keep
these mutt-related changes together.

> Other possibilities: you could handle this still in Emacs, but after you
> finish writing, at the point you save the temporary file (with one of
> the write hooks).  Or you can write a script that runs Emacs and then
> recodes the file outside of Emacs, using something like iconv(1).

-- 
Nuno Silva



Re: Composing in utf8 from latin1 terminal

2018-10-24 Thread Derek Martin
On Tue, Oct 23, 2018 at 10:31:45PM +0100, Nuno Silva wrote:
> On 2017-10-12, nunojsi...@ist.utl.pt wrote:
> 
> > Recently, I have tried to use mutt on a non-utf8 terminal.  Everything
> > works as expected in a utf8 environment, but when I compose new e-mails
> > in a latin1/ISO-8859-1 terminal, mutt will expect the file to be in the
> > same encoding as the terminal, while my text editor will save the file
> > in utf8. The result is that non-ASCII characters get misinterpreted,
> > which can affect the message headers as well (e.g. real names in To: and
> > Cc:).
> [...]
> > Is there some way to configure mutt so that it always uses utf8 to read
> > the new message after I exit the editor? Or a way to enable some
> > encoding autodetection that can tell utf8 apart from latin1?

The bottom line is that your environment is misconfigured.  If you
want this to work, you need to have LANG set properly at every point
along the execution path.  Your terminal, terminal font, editor, and
Mutt all need to know that you're using latin1 instead of Unicode, by
having been started with a latin1 LANG setting.  You may need to
configure your terminal to use the correct font, although with many
modern terminals (like gnome-term, kterm, etc.) this should be
unnecessary.

If you are launching the latin1 terminal from a shell that has its
LANG set to UTF-8, it could break (an example of this is starting
hanterm, a terminal program expressly for Korean input with EUC-KR,
with a UTF-8 locale--won't work).  If the shell running inside the
terminal has LANG set to UTF-8, both Mutt and your editor could break.
If you have manual settings on any of these programs to override the
locale defined by the environment, it could break.  If you don't have
all of these things set the same way, it could, and almost certainly
will break.  Sometimes the breakage is subtle, e.g. if you dump the
right characters to a terminal (say, with the cat command) that has the
right font, it will generally display them correctly, even if the
locale is wrong.  But using them with programs that need to know the
locale will still break.
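
One way to spot such a mismatch is to compare the codeset part of the
locale each program was started with.  The following is a minimal
Python sketch under that assumption; codeset_of and mismatches are
hypothetical helper names, and the locale values in the usage note
are examples only:

```python
import codecs


def codeset_of(locale_value):
    """Extract a normalised codeset from a LANG-style value like 'pt_PT.UTF-8'."""
    # Drop an optional @modifier, then keep what follows the dot.
    base = locale_value.split("@", 1)[0]
    if "." not in base:
        return "unknown"  # e.g. "C" or "pt_PT": the codeset is not stated
    name = base.split(".", 1)[1]
    try:
        # Let Python fold aliases ("utf8", "UTF-8") to one canonical name.
        return codecs.lookup(name).name
    except LookupError:
        return name.lower()


def mismatches(settings):
    """Return the entries whose codeset differs from the first one listed."""
    codesets = {who: codeset_of(value) for who, value in settings.items()}
    if not codesets:
        return {}
    reference = next(iter(codesets.values()))
    return {who: cs for who, cs in codesets.items() if cs != reference}
```

Called with something like {"terminal": "pt_PT.ISO-8859-1", "emacs
server": "pt_PT.UTF-8"}, it reports the emacs server entry as the odd
one out.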

If you're using a Mutt setting to connect to an existing emacs
instance (via emacsclient or similar) that's already running in a
UTF-8 locale, that's broken.  You need to start a new instance of
emacs whose locale is latin1.

Lastly, you may need to adjust send_charset in Mutt.  It can list
multiple character sets, and Mutt will pick the first one that your
message can be converted to exactly.  For example, mine is:

  set send_charset="iso-8859-1:utf-8"

If my e-mail contains no characters that need UTF-8, Mutt will choose
to send the message as iso-8859-1, but otherwise as UTF-8.
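
To illustrate that selection rule, here is a minimal Python sketch.
It is not Mutt's code: the function name pick_send_charset is made
up, and the UTF-8 fallback at the end is an assumption of the sketch
rather than a statement about what Mutt does when nothing in the list
fits:

```python
def pick_send_charset(body, send_charset="iso-8859-1:utf-8"):
    """Return the first charset in the colon-separated list that can encode body."""
    for name in send_charset.split(":"):
        try:
            # Strict encoding raises if any character is unrepresentable.
            body.encode(name)
        except UnicodeEncodeError:
            continue
        return name
    # Assumption of this sketch only: fall back to UTF-8.
    return "utf-8"
```

A Portuguese message with only latin1 characters would come out as
iso-8859-1, while a single euro sign or CJK character pushes the
whole message to utf-8.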

If you do those things, it should "just work" and if you don't it
won't, at least without jumping through pointless hoops to force it,
which will most likely just break other things.

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02





Re: Composing in utf8 from latin1 terminal

2018-10-24 Thread Ian Zimmerman
On 2018-10-23 22:31, Nuno Silva wrote:

> So far I did not find a way to change this on the mutt side, but I made
> a new major mode for mutt messages in Emacs (the editor I use with
> mutt), with a hook that changes the file encoding to latin1 if the file
> was opened in a latin1 terminal and Emacs cannot detect a non-ASCII file
> encoding.
> 
> It appears to work here. I'm sure someone who is more versed in Emacs
> than me would be able to come up with a more elegant solution, but I'm
> sharing mine here just in case it is useful to somebody else someday:
> 
> (define-derived-mode my-mutt-message-mode message-mode "MuttMSG")
> (add-hook
>  'my-mutt-message-mode-hook
>  (lambda ()
>    (when (equal (terminal-coding-system) 'iso-latin-1-unix)
>      (let ((encoding (detect-coding-region (point-min) (point-max))))
>        (when (or
>               (equal encoding '(undecided-unix))
>               (equal encoding '(undecided)))
>          (setq buffer-file-coding-system 'iso-latin-1-unix))))))
> (add-to-list 'auto-mode-alist '("/mutt" . my-mutt-message-mode))

You could just hook message-mode-hook with a function that checks
buffer-file-name; I think that would be a bit more straightforward than
adding a new mode.

Other possibilities: you could handle this still in Emacs, but after you
finish writing, at the point you save the temporary file (with one of
the write hooks).  Or you can write a script that runs Emacs and then
recodes the file outside of Emacs, using something like iconv(1).
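
The recoding step of such a script might look like the Python sketch
below, used here in place of iconv(1).  The function name recode_file
and its defaults are assumptions for illustration.  It leaves the
file untouched when the text cannot be represented in the target
encoding:

```python
from pathlib import Path


def recode_file(path, source="utf-8", target="iso-8859-1"):
    """Rewrite path from source to target encoding.

    Returns False and leaves the file unchanged if the contents are not
    valid in the source encoding or cannot be encoded in the target.
    """
    raw = Path(path).read_bytes()
    try:
        recoded = raw.decode(source).encode(target)
    except UnicodeError:
        # Covers both decode and encode failures.
        return False
    Path(path).write_bytes(recoded)
    return True
```

The wrapper would run the editor on the temporary file first and call
recode_file() on it once the editor exits.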

-- 
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
To reply privately _only_ on Usenet and on broken lists
which rewrite From, fetch the TXT record for no-use.mooo.com.


Re: Composing in utf8 from latin1 terminal

2018-10-23 Thread nunojsilva
On 2017-10-12, nunojsi...@ist.utl.pt wrote:

> Recently, I have tried to use mutt on a non-utf8 terminal.  Everything
> works as expected in a utf8 environment, but when I compose new e-mails
> in a latin1/ISO-8859-1 terminal, mutt will expect the file to be in the
> same encoding as the terminal, while my text editor will save the file
> in utf8. The result is that non-ASCII characters get misinterpreted,
> which can affect the message headers as well (e.g. real names in To: and
> Cc:).
[...]
> Is there some way to configure mutt so that it always uses utf8 to read
> the new message after I exit the editor? Or a way to enable some
> encoding autodetection that can tell utf8 apart from latin1?
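
To make the failure and the autodetection idea concrete, here is a
small Python sketch.  It is not something mutt offers; decode_draft
is a hypothetical name.  The heuristic is the usual one: treat the
bytes as UTF-8 if they are valid UTF-8, otherwise as latin1.  It is
not foolproof, since some latin1 byte sequences also happen to be
valid UTF-8:

```python
def decode_draft(raw):
    """Decode raw bytes as UTF-8 when valid, otherwise as latin1.

    Returns a (text, encoding_used) pair.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # latin1 maps every byte to a character, so this cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"


# What goes wrong today: the editor saves UTF-8, the reader assumes latin1.
saved_by_editor = "Jo\u00e3o".encode("utf-8")      # b'Jo\xc3\xa3o'
misread = saved_by_editor.decode("iso-8859-1")    # two characters per accent
```

Here misread contains the familiar two-character garbage in place of
the accented letter, which is the kind of damage described for the
To: and Cc: headers.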

So far I did not find a way to change this on the mutt side, but I made
a new major mode for mutt messages in Emacs (the editor I use with
mutt), with a hook that changes the file encoding to latin1 if the file
was opened in a latin1 terminal and Emacs cannot detect a non-ASCII file
encoding.

It appears to work here. I'm sure someone who is more versed in Emacs
than me would be able to come up with a more elegant solution, but I'm
sharing mine here just in case it is useful to somebody else someday:


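;; Major mode for the drafts that mutt hands to Emacs.
(define-derived-mode my-mutt-message-mode message-mode "MuttMSG")
(add-hook
 'my-mutt-message-mode-hook
 (lambda ()
   ;; Only act when the terminal is latin1.
   (when (equal (terminal-coding-system) 'iso-latin-1-unix)
     (let ((encoding (detect-coding-region (point-min) (point-max))))
       ;; `undecided' means Emacs found nothing non-ASCII to detect from.
       (when (or
              (equal encoding '(undecided-unix))
              (equal encoding '(undecided)))
         (setq buffer-file-coding-system 'iso-latin-1-unix))))))
;; Use the mode for files whose path contains "/mutt".
(add-to-list 'auto-mode-alist '("/mutt" . my-mutt-message-mode))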

-- 
Nuno Silva