I want to digress on one aspect here: SMTP/EAI and Unicode normalization.

The general EAI approach is to avoid having the problem, i.e. to define the SMTP/email extensions such that the problems become other people's problems.

Homoglyphs aren't an SMTP problem. Two codepoints may look the same, but an SMTP server doesn't have to think about which of the two domains is legitimate and which is the impostor. All that is the registry's headache.

Decomposition and composition are pushed to the DNS. The SMTP part just says: convert to IDNA A-labels in order to do the MX lookup, and otherwise don't mess with the bytes you received. (My patch uses ICU to convert to A-labels.)
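
To illustrate the "convert only for the lookup" rule, here is a minimal sketch. It is not the patch: it uses Python's built-in "idna" codec (which implements IDNA 2003) where the patch uses ICU, and the function name mx_lookup_name is made up for the example. The address itself is left untouched; only the name handed to the resolver is converted.

```python
def mx_lookup_name(address: str) -> str:
    """Return the A-label (ASCII) form of an address's domain, for the MX query only."""
    # Split at the last "@"; the localpart is deliberately ignored and never modified.
    _localpart, sep, domain = address.rpartition("@")
    if not sep or not domain:
        raise ValueError("address has no domain part")
    # Assumption: Python's stdlib codec (IDNA 2003) stands in for ICU here.
    return domain.encode("idna").decode("ascii")


address = "jürgen@bücher.example"
print(mx_lookup_name(address))  # name to use in the MX lookup
print(address)                  # what goes on the wire stays as received
```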

That does leave a little trouble, mostly dealing with localparts, but also with local domains. Some of it will be tricky, e.g. doing Unicode-aware PCRE matching on a system that doesn't use a Unicode locale.
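
A small sketch of why localparts are the leftover trouble: the same visible name can arrive in composed or decomposed form, and the two differ as bytes. Whether a site should treat them as the same mailbox is local policy, so same_mailbox below is a hypothetical helper that normalizes only a local comparison key and never rewrites what was received.

```python
import unicodedata

composed = "ren\u00e9"      # "é" as the single codepoint U+00E9
decomposed = "rene\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_mailbox(a: str, b: str) -> bool:
    """Compare two localparts after NFC normalization (assumed local policy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Byte-wise the two forms are different, so a naive lookup sees two mailboxes.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))
# With a normalized comparison key they match.
print(same_mailbox(composed, decomposed))
```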

Arnt
