Re: [VM] vm-url-decode-buffer and UTF-8

2016-11-15 Thread Uday Reddy
Hi Göran, is this now taken care of by the bug fix you have filed on
Launchpad?

Cheers,
Uday

Göran Uddeborg writes:

> I'm occasionally calling
> 
> emacsclient ... vm-mail-to-mailto-url ...
> 
> from scripts.  It works fine for simple mailto:user@domain URLs.  But
> if the URL also includes some non-ASCII, e.g. in a subject= argument,
> then this string gets mangled.  More exactly, the UTF-8 encoded
> character in the string gets treated as multiple single characters.
> 
> If I understand things correctly, this boils down to
> vm-url-decode-buffer doing the decoding in that way.  It looks for one
> %XX escape, and does insert-char on the decoded value.  That makes a
> UTF-8 sequence consisting of more than one byte become several
> characters.
> 
> Has anyone else been hit by this or something similar?  And, in
> particular, has anyone any idea how to fix the problem?
> 
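The behaviour described above can be reproduced outside Emacs. The following is a minimal Python sketch, not VM's actual code: `decode_per_escape` is a hypothetical stand-in for the one-character-per-%XX approach attributed to vm-url-decode-buffer, and `decode_as_utf8` shows the alternative of collecting the escaped bytes first and decoding the whole sequence as UTF-8. The sample string is an assumption chosen for illustration.

```python
import re
from urllib.parse import unquote_to_bytes


def decode_per_escape(text):
    """Mimic the reported behaviour: every %XX escape becomes one character."""
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


def decode_as_utf8(text):
    """Turn all %XX escapes into bytes first, then decode the bytes as UTF-8."""
    return unquote_to_bytes(text).decode("utf-8")


url_part = "subject=G%C3%B6ran"       # "ö" is the two UTF-8 bytes C3 B6
print(decode_per_escape(url_part))    # two separate characters replace "ö"
print(decode_as_utf8(url_part))       # the two bytes are combined into "ö"
```

The point of the second function is only the ordering: unescape to raw bytes, then decode once. Whether the fix filed on Launchpad takes this route is not stated in this thread.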



Re: [VM] Incorrectly encoded non-ASCII headers

2016-11-15 Thread Uday Reddy
This is an important issue. As per the existing "standards", all message
headers have to be in ASCII (with other character sets duly encoded).

However, there is RFC 6532, from February 2012, which is widely expected
to become a standard.

  http://tools.ietf.org/html/rfc6532

I think Thunderbird is implementing it already. Since Thunderbird is the
de facto standard mail client now, more and more senders will end up using
utf-8.

I wonder if people can look through the RFC and figure out what changes we
need to make. Can we just process the entire incoming mail using utf-8?  I
am thinking that, since utf-8 is compatible with ASCII, any old MIME should
still get correctly handled.
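
The compatibility argument can be checked with a small Python sketch. It is an illustration only, and `decode_header_bytes` is a hypothetical helper, not part of VM: a header that follows the existing rules is pure ASCII (RFC 2047 encoded-words included), so decoding it as utf-8 yields exactly what decoding it as ASCII would.

```python
def decode_header_bytes(raw: bytes) -> str:
    """Hypothetical helper: treat incoming header bytes as UTF-8."""
    return raw.decode("utf-8")


# An old-style header: the RFC 2047 encoded-word is itself pure ASCII.
old_style = b"Subject: =?iso-8859-1?q?G=F6ran?="
# A new-style header carrying raw UTF-8, as RFC 6532 permits.
new_style = "Subject: G\u00f6ran".encode("utf-8")

# ASCII input is unchanged by the UTF-8 decoder.
assert decode_header_bytes(old_style) == old_style.decode("ascii")
print(decode_header_bytes(new_style))
```

The encoded-word in the first header still needs the usual RFC 2047 decoding afterwards; the sketch only shows that reading the bytes as utf-8 does not disturb it.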

Cheers,
Uday


Yeechang Lee writes:

> (Disclaimer: I am on Emacs 23 and VM 8.1.2.)
> 
> I sometimes receive messages in which headers--usually the subject
> line--use non-ASCII characters without quoting them as per RFC
> 2047. (Wikipedia's article-of-the-day mailing list is a frequent
> offender.)
> 
> I realize the best solution is to have the sender change its ways and
> emit standards-adhering messages, but in the meanwhile, could VM gain
> the ability to assume that the body's encoding style in a message also
> applies to the headers? An alternative would be to assume that headers
> are 8-bit clean unless RFC 2047-style quoting appears. (Can either be
> done on our own with some elisp in the meanwhile, I wonder? I wouldn't
> want the message itself modified; just the presentation buffer.)
>
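
The fallback suggested above can be sketched in Python as follows. This is an illustration under stated assumptions rather than anything VM does: `display_header` and its `body_charset` parameter are hypothetical names, and the rule implemented is "if the header is pure ASCII, apply normal RFC 2047 decoding; if it contains raw 8-bit bytes, assume they use the body's charset". Only the string returned for presentation is affected; the raw bytes are left untouched.

```python
from email.header import decode_header, make_header


def display_header(raw: bytes, body_charset: str = "utf-8") -> str:
    """Return a presentation string for a raw header value.

    Hypothetical helper: the stored message is never modified.
    """
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        # Raw 8-bit data with no RFC 2047 quoting: assume the body's
        # charset and replace any bytes that do not fit it.
        return raw.decode(body_charset, errors="replace")
    # Pure ASCII: decode RFC 2047 encoded-words, if there are any.
    return str(make_header(decode_header(text)))


# A standards-adhering subject and a non-conforming 8-bit one.
print(display_header(b"=?utf-8?q?G=C3=B6ran?="))
print(display_header("G\u00f6ran".encode("iso-8859-1"), body_charset="iso-8859-1"))
```

A header that mixes encoded-words with raw 8-bit bytes is treated as raw here, which is a simplification.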