Arnt Gulbrandsen:
> On Wednesday, June 4, 2014 12:55:18 PM CEST, Wietse Venema wrote:
> > I have looked at parts of the patch in my copious time.
>
> I hoped someone else would ;) I do feel a little guilty about imposing on
> you alone.
>
> > First, Postfix behavior must not change unless mail is flagged as
> > EAI, regardless of whether it contains 8-bit headers or envelopes.
>
> Are you sure?

Yes. We must maintain compatibility with existing practice. Postfix
has passed 8-bit headers and envelopes (localparts) for the past 15
years. It would be an unacceptable compatibility break if, for
example, a corporate perimeter MTA were to start bouncing inbound
mail just because 1) some up-stream client is changed to flag that
email as SMTPUTF8, but 2) some down-stream internal server doesn't
announce SMTPUTF8.

> > Thus, the SMTP client, cleanup daemon, and other daemon programs
> > MUST NOT engage into any EAI-related stuff unless a message is
> > flagged as EAI-enabled. I will add a guard around that code.
>
> The smtputf8 flag in the queue file acts as such a guard.

No, it doesn't. Example: ORCPT handling in the cleanup Milter client
and in the SMTP client is not conditional on the smtputf8 flag.

However, given that UTF8 addresses use a special encoding, I suspect
that it is better to decode them properly (the alternative would
be to not decode them at all and just pass them on, but that requires
some extra code to handle existing queue files that contain decoded
attributes).

[configurable EAI detection in the Postfix sendmail command]

> +#if !defined(NO_EAI)
...
> +#endif

That is not what I call configurable. That is what I call compiled-in
hard-coded behavior.

> > Have you given any thought of what happens when a company installs
> > Postfix-EAI on the perimeter, and WANTS TO FORWARD THE MAIL TO THEIR
> > INTERNAL SYSTEMS that may or may not have EAI support?
>
> Yes.
...
> Outgoing mail from that company to unicode addresses may begin to work,
> depending on whether the internal origin server supports EAI.

Incorrect. This does not require any EAI support in the SMTP client.
The SMTP client simply hands the mail to the gateway without any
transformation of the recipient domain.

> Incoming mail to that company from unicode addresses still doesn't work.

This has worked for 15 years, at least with UTF8 localparts.
We must maintain compatibility with existing practice. It would be
an unacceptable compatibility break if Postfix were to suddenly
start rejecting such mail.

> > I haven't looked yet at the interface with database systems.
> > At this interface we can expect characterset issues.
>
> I changed printable(), so e.g. any log systems that only accepted ASCII may
> get problems. Things like recipient table lookups always had to accept
> 8-bit localparts, now they have to accept 8-bit domain side too.

Is there a possibility that the same domain name may exist as a
UTF8 string in some contexts and as xn--mumble elsewhere? If this
is a problem then it will affect many database lookups.

How do UTF8 domain names interact with DNS RHSBL lists? Do they
expect the UTF8 form or the xn--mumble form?

How do UTF8 domain names interact with reject_unknown_sender_domain,
reject_unknown_recipient_domain, etc.? It looks like you are passing
the UTF8 domain name in DNS queries.

> Other issues should be unlikely, since the eightbittery is passed along
> without any actual changes. No upcasing, downcasing, charset conversions or
> other complications. The only conversion the code does is just before it
> does an MX lookup.

First, all Postfix table lookups are case-insensitive by default.
You may have missed that.

Second, not all lookup tables may support UTF8. What does the POSIX
standard have to say about this for regular expressions? This affects
the regexp: table.

Third, in database queries, strings that contain UTF8 may require
special treatment when the default locale is not unicode-based.

We must maintain compatibility with existing practice: Postfix
currently passes 8bit strings as if they are in the default locale.
It would be an unacceptable compatibility break if Postfix were to
suddenly fail those queries just because they aren't well-formed
UTF8.

So it looks like all the work on the database interface still needs
to be done.
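
To make the UTF8 versus xn--mumble concern concrete, here is a
minimal Python sketch of one possible approach: fold every domain
to a single ASCII, lower-case key before a table lookup. The helper
name domain_lookup_key and the example table are hypothetical, not
Postfix code, and Python's built-in "idna" codec implements the
older IDNA 2003 rules, which may differ from what an EAI
implementation would want.

```python
def domain_lookup_key(domain: str) -> str:
    """Hypothetical helper: map a domain to one canonical lookup key.

    Non-ASCII labels are converted to their xn-- form by Python's
    built-in "idna" codec (IDNA 2003); labels that are already ASCII
    pass through unchanged and are lower-cased afterwards.
    """
    ascii_form = domain.encode("idna").decode("ascii")
    return ascii_form.lower()


# Illustrative table that stores only the xn-- form of the domain.
table = {"xn--bcher-kva.example": "OK"}

# The same domain, once as a UTF8 string and once as an upper-case
# xn-- string, resolves to the same table entry.
utf8_hit = table.get(domain_lookup_key("b\u00fccher.example"))
ascii_hit = table.get(domain_lookup_key("XN--BCHER-KVA.EXAMPLE"))
print(utf8_hit, ascii_hit)
```

Without some such folding step, a table keyed on one form would
silently miss queries made in the other form. Whether the folding
belongs in the caller or in the table driver is a separate design
question that this sketch does not answer.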

Finally, you appear to have broken the valid_hostname(3) abstraction.
This module enforces RFC rules for hostnames (and domain names) in
calls of infrastructure functions such as getaddrinfo(), getnameinfo()
and functions at lower levels in the stack.

Unless the EAI RFCs say otherwise, the hostname in HELO commands
cannot be a UTF8 string; therefore it cannot be treated as if it
is a recipient domain. Recipient domains require a validator that
is specific for recipient domains, and that validator does not
belong in the valid_hostname(3) module.

I think this also requires a different version of the host_port()
function that is specific for recipient addresses and that has a
flag indicating whether or not UTF8 functionality is enabled.

More later, after I have reviewed the rest of the code, and after
I have checked it against the RFCs for compliance and completeness.

	Wietse