I strongly agree with John's comments below. A few additions
inline...

--On Saturday, May 30, 2026 14:33 -0400 John Levine
<[email protected]> wrote:

> It appears that Bob Traverz <[email protected]> said:
>> - Normalization Rules: Before banning dot-removal or plus-tag
>> stripping, I recommend checking the current practices of Gmail
>> and Outlook. This will ensure your specification caters to the
>> majority of email IDs.
>
> Hi, mail person here. Our RFCs are quite clear that the local part
> of an address is completely opaque. Any assumptions you make about
> normalizing local parts are wrong, both in theory and in practice.
> This is a very hard problem, many people have tried to solve it in
> the past, and they've all failed.

This even applies to things as seemingly obvious as
case-insensitivity for all-ASCII addresses. The practice of
assuming that [email protected] and [email protected] are the same
address is common enough that the standards warn against treating
them as different, but the actual rule is that even mapping one of
those into the other prior to final delivery may result in
mishandled messages.

> Gmail ignores dots so [email protected] and [email protected]
> are equivalent, but I do not know any other system that does that.
> Sometimes [email protected] and [email protected] are the
> same, usually they are not. Sometimes a system manually adds
> dotted and undotted variants to their alias tables, so it's only as
> consistent as the maintainer makes it.
>
> Some systems allow plus tags, some don't. Some use other characters
> than plus. It's not even consistent on a single system -- on mine,
> some foo-bar addresses go to the same places as foo, some don't.

And some systems that allow plus "tags" think they are noise and
remove them, i.e., they treat [email protected] as [email protected].
Others are sure they represent subaddresses (or, in their
vocabulary, a special form of alias). Still others, for any sort
of authentication or comparison purposes, consider only
[email protected], with [email protected] and [email protected]
equivalent to it and to each other.
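
To make the disagreement concrete, here is a rough Python sketch. It
is not taken from any RFC or from any real mail system: the function
names, the policy names, and the example.com addresses are all made
up for illustration. It keeps the local part untouched, lowercases
only the domain (assuming an ASCII or A-label domain), and then shows
that four plausible local-part policies give four different answers
for the same string.

```python
def split_address(addr: str) -> tuple[str, str]:
    """Split at the last '@'; the local part is returned untouched."""
    local, sep, domain = addr.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a usable address: {addr!r}")
    # Only the domain is lowercased (assumes an ASCII or A-label domain).
    return local, domain.lower()


def same_mailbox_conservative(a: str, b: str) -> bool:
    """Equal only if local parts match exactly and domains match ignoring case."""
    return split_address(a) == split_address(b)


# Hypothetical local-part policies, one per imagined mail system.
POLICIES = {
    "opaque": lambda local: local,
    "ignore_dots": lambda local: local.replace(".", ""),
    "strip_plus_tag": lambda local: local.split("+", 1)[0],
    "lowercase": lambda local: local.lower(),
}


def canonical_forms(local: str) -> dict[str, str]:
    """Apply every hypothetical policy to the same local part."""
    return {name: rule(local) for name, rule in POLICIES.items()}


if __name__ == "__main__":
    local, domain = split_address("J.Smith+lists@Example.COM")
    for name, form in canonical_forms(local).items():
        print(f"{name:15} {form}@{domain}")
```

A client that picks any one of these policies, other than leaving the
local part alone, is guessing at what the destination system does.
The conservative comparison never merges two mailboxes that the
destination treats as distinct, at the cost of missing equivalences
that only the destination knows about.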
And, as John suggests, some believe that "-", and maybe "%" or "/",
the latter of which used to be meaningful for mail routing and may
still be in some places, are equivalent to "+".

> Since we added EAI addresses which allow UTF-8 characters in
> addresses, you can't even do case folding reliably since it's
> language specific. As a famous example lower case i and upper case
> I are equivalent in most European languages, but not in Turkish
> where dotted and undotted i are different characters with upper and
> lower case versions of both. There are lots of other variations
> like ö which in German is considered a short version of oe but in
> Scandinavian languages is not.

See above about case folding even with all-ASCII addresses. And,
since John picked on German, consider "ß" (U+00DF), which is
equivalent to "ss" except where it isn't.

> People have tried to come up with rule sets about mail systems'
> normalizations. It doesn't work for two reasons. One is that in most
> cases you don't know what a system's rules really are (what does
> Gmail do with two dots in a row?), the other is that there are more
> mail systems than you know, and you'll never have a complete set.

Exactly.

> The usual approach is to pretend none of this matters and to invent
> a scheme that will fail randomly on addresses that don't make the
> same assumptions you do. In this case you can do better, since you
> can pass the unmodified local part to the keyserver and it can do
> whatever the local policy is to the address and return the actual
> key.

As long as the keyserver actually knows what the conventions of the
destination system are. But please remember that the price of
inventing a system like this and getting it wrong is essentially to
tell users or organizations using other systems that their systems
(some of which may have been widely deployed and used for decades)
are illegitimate and do not deserve to send or receive mail.
To suggest that doing so would be a bad idea, and an even worse
thing for the IETF to consider saying, would be an understatement.

   john

_______________________________________________
DNSOP mailing list -- [email protected]
To unsubscribe send an email to [email protected]
