Me too... only raw bytes are accepted by the SMTP or POP3 protocols. This does not mean that within URLs they cannot (or should not) be escaped!
Of course they should be escaped, because raw bytes can't be used reliably if they can be transformed depending on how the URL is encoded (or the IRI, if the domain name part is internationalized and written in possibly unescaped form using IDNA). Note that IDNA is also NOT usable at all for the local part.

However, this is still not specified in any standard for URLs, meaning that you cannot safely embed an email address in **any** plain-text document if the local part contains non-ASCII byte values. (I say "byte values" and not "characters" because we absolutely don't know whether these bytes represent characters or not, and we can't break them into elementary subsequences each representing a single abstract character.)

For such applications, where these byte values (from 0x80 to 0xFF inclusive) are used in the local part of an email address whose binary encoding must be preserved (even if the container plain-text document is reencoded), I see no other solution than using escaping. Note that no escaping is needed for printable ASCII bytes, even if they are reencoded by the container document (e.g. in EBCDIC): to get back the correct ASCII encoding expected by SMTP and POP3, you have to convert this container encoding back to ASCII (this will preserve the escaping of the other byte values).

Another way to allow the encoding to be preserved, while still allowing the local part to be readable, would be to use "quoted-printable" encoding with a prefix specifying the encoding expected by the target SMTP server.

E.g. suppose you want to write to "café@example.net", whose SMTP server expects the non-ASCII "é" to be encoded with 1 byte = 0xE9 (because it expects usernames to have been created in ISO-8859-1 or windows-1252). Then in a URL or in any plain-text document it should be escaped: <?Q?windows-1252?café[email protected]> or <a href="mailto:?Q?ISO-8859-1?café[email protected]">, **even** if the container document is encoded in the same specified encoding.
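To make the byte-level point concrete, here is a minimal Python sketch of percent-escaping the raw bytes of a local part and recovering them afterwards. It is only an illustration of the byte arithmetic, not an agreed convention and not the prefix scheme proposed above; the helper names `escape_local_part` and `unescape_local_part` are hypothetical, and the charset is assumed to be known out of band.

```python
from urllib.parse import quote, unquote_to_bytes


def escape_local_part(local: str, charset: str) -> str:
    """Percent-escape the bytes of `local` as encoded in `charset`.

    Hypothetical helper: the charset is whatever the target SMTP
    server is assumed to expect for its usernames.
    """
    raw = local.encode(charset)
    # ASCII letters, digits and "_.-~" are left alone; every other
    # byte, including 0x80..0xFF, becomes %XX.
    return quote(raw, safe="")


def unescape_local_part(escaped: str) -> bytes:
    """Recover the exact bytes to hand to the mail server."""
    return unquote_to_bytes(escaped)


latin = escape_local_part("café", "windows-1252")  # one byte for "é"
utf8 = escape_local_part("café", "utf-8")          # two bytes for "é"

print(latin, utf8)
print(unescape_local_part(latin))
```

The same visible "café" gives two different escaped forms depending on the charset, which is why the escaped form alone survives any reencoding of the container document, while an unescaped "é" does not tell you which bytes the server wants.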
If the text document is reencoded to some UTF, the "é" will be preserved, just like the quoted-printable prefix indicator specifying the expected target encoding. In that document the "é" may be in UTF-8 in the URL as well, but converting that URL back to an address usable in SMTP will require converting this UTF-8 encoding back to the original encoding.

If the text document is converted to ASCII only, quoted-printable will need to be replaced by base-64, but the encoding will remain in the prefix "?B?ISO-8859-1?".

A mailto URL or embedded email address that does not specify the target encoding (in "quoted-printable" or base-64, like in MIME) is NOT safe to use if it contains ANY non-ASCII character.

2013/11/2 Buck Golemon <[email protected]>

> On Fri, Nov 1, 2013 at 8:40 AM, Markus Scherer <[email protected]> wrote:
>
>> On Fri, Nov 1, 2013 at 1:37 AM, Mark Davis ☕ <[email protected]> wrote:
>>
>>> That being true, I wish that industry could come to consensus about
>>> requiring everything outside of a well-defined, backwards-compatible set of
>>> characters to be expressed as UTF-8 percent-escaped characters in these
>>> fields when they are expressed as plaintext.
>>
>> If there is not already a convention for percent-escaped UTF-8 in email
>> addresses, then please let's not add one like that but rather escape *code
>> points*.
>>
>> markus
>
> In my own trials, percent-escaped utf-8 does not work for the local part
> of the email.
> I found that only raw bytes (utf8 in my case) work acceptably.

