On Wednesday, June 11, 2014 2:59:12 PM CEST, Wietse Venema wrote:
In that case, don't bother writing code. Instead, share/discuss
your design decisions/recommendations. Does it make sense to fold
case with table lookups?

Of course. Email addresses are case insensitive in dozens of languages. It was slack of me not to catch that mistake.

How do we deal with table lookups when the
same domain can show up in different forms at different stages of
email handling (client or server name, mail from/rcpt to domain,
mail headers)?

Autodetection is needed. Happily it is also possible.

My preferred approach:

Store UTF8 in the tables and use UTF8 in table lookups. I say this because making pgsql_table work well with UTF8 in the localparts and xn--mumble in the domains is bothersome. It seems to me that the reasons for the bother are general, not specific to Postgres. Unicode is very widely used.

Add two new files/functions in util, one to convert from UTF8 to xn--mumble and one to convert the other way. Refactor the code in smtp/smtp*.c to call them. (That refactoring is the main reason why I want to wait.)
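To make the pair of conversion functions concrete, here is a minimal sketch in Python rather than in the C that Postfix uses. The names domain_to_ascii and domain_to_utf8 are hypothetical, and the sketch assumes Python's built-in "idna" codec (which implements IDNA 2003) is an acceptable stand-in for whatever library the real code would call.

```python
def domain_to_ascii(domain: str) -> str:
    """Convert a Unicode domain to its xn-- form (hypothetical helper name)."""
    # The "idna" codec converts label by label and leaves ASCII labels alone.
    return domain.encode("idna").decode("ascii")


def domain_to_utf8(domain: str) -> str:
    """Convert an all-ASCII xn-- domain to its Unicode form (hypothetical helper name)."""
    # Strict on purpose: non-ASCII input raises UnicodeError here.
    return domain.encode("ascii").decode("idna")


print(domain_to_ascii("bücher.example"))
print(domain_to_utf8("xn--bcher-kva.example"))
```

Keeping both directions in one small module means the callers in smtp and elsewhere share a single definition of what each format looks like.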

Next, make many locations call the toutf8 function, so that postqueue, ETRN etc. accept both formats on input. This autodetection only breaks if someone has used "xn--" for some other purpose in an internal subdomain, but that's a risk I am prepared to accept. The web browsers also autodetect, and AFAICT it hasn't caused any problems.
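The autodetection on input could look roughly like the following Python sketch. The function name to_utf8 and the choice to leave undecodable "xn--" labels untouched are assumptions made for illustration; the actual patch may treat such labels differently.

```python
ACE_PREFIX = "xn--"


def to_utf8(domain: str) -> str:
    """Accept a domain in either format and return the Unicode form.

    Hypothetical helper: labels that start with "xn--" are decoded,
    everything else is passed through unchanged.
    """
    labels = []
    for label in domain.split("."):
        if label.lower().startswith(ACE_PREFIX):
            try:
                label = label.lower().encode("ascii").decode("idna")
            except UnicodeError:
                # Not valid punycode, e.g. an internal name that merely
                # starts with "xn--": keep the label as it was given.
                pass
        labels.append(label)
    return ".".join(labels)


print(to_utf8("xn--bcher-kva.example"))
print(to_utf8("bücher.example"))
```

Working label by label means that input which is already Unicode, or a mix of both formats, goes through the same code path.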

Finally, make somewhat fewer locations call one of the conversion functions to generate the appropriate format for e.g. postqueue output and the EHLO argument. (I still think using a unicode myhostname is a trouble magnet. IIRC my patch disallows it, and I would at least warn against it on startup.)
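A startup warning for a non-ASCII myhostname might be sketched like this in Python. The function name and the wording of the message are made up for illustration and are not what the patch or Postfix prints.

```python
from typing import Optional


def myhostname_warning(myhostname: str) -> Optional[str]:
    """Return a warning for a non-ASCII hostname, or None if it is fine.

    Hypothetical check; the message text is invented for this sketch.
    """
    if myhostname.isascii():
        return None
    # Suggest the xn-- form, which is always safe as an EHLO argument.
    ascii_form = myhostname.encode("idna").decode("ascii")
    return (f"warning: myhostname {myhostname!r} is not ASCII; "
            f"consider using {ascii_form!r}")


print(myhostname_warning("mail.example.com"))
print(myhostname_warning("bücher.example"))
```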

Once the right callers are there, the table lookups should just work, at any stage.
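As an illustration of why the lookups then work at any stage, here is a small Python sketch in which a dictionary stands in for a Postfix table. The helper names, the sample address and the decision to case-fold the localpart as well as the domain are assumptions of the sketch, not a description of the patch.

```python
import unicodedata


def domain_to_utf8(domain: str) -> str:
    """Decode any xn-- labels; leave other labels alone (hypothetical helper)."""
    labels = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            try:
                label = label.lower().encode("ascii").decode("idna")
            except UnicodeError:
                pass  # not valid punycode: keep the label as given
        labels.append(label)
    return ".".join(labels)


def lookup_key(address: str) -> str:
    """Normalise an address or bare domain to one UTF8, case-folded key."""
    local, at, domain = address.rpartition("@")
    key = local + at + domain_to_utf8(domain)
    return unicodedata.normalize("NFC", key.casefold())


# The table stores UTF8 only; every caller normalises before looking up.
table = {lookup_key("jörg@bücher.example"): "jorg-mailbox"}


def lookup(address: str):
    return table.get(lookup_key(address))


print(lookup("Jörg@xn--bcher-kva.example"))
print(lookup("JÖRG@BÜCHER.EXAMPLE"))
```

Both spellings reach the same table entry because the normalisation happens in the caller, so the table itself never needs to know about xn--mumble.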

How would a Russian system admin effectively configure and
manage tables/logging/queue management with domains in Chinese or
Hebrew script?

This is several questions. Two or three, I think.

One answer is that if an ISP wants to sell service to Chinese, the staff who talk to the Chinese realistically have to know Chinese. Having a Russian monoglot answer support requests from Chinese will not be effective. EAI just adds one more communication problem.

The fashion these days is to add self service. An ISP may employ a Russian postmaster, but also Chinese sales staff and have web forms written in polite Chinese. In that case, the core of the problem is to make the forms, database and batch processes UTF8-clean.

The postmaster may end up with a support request written in a language he does not understand. EAI means that the support request may include a domain name the postmaster cannot understand too, which IMO is not a significant extra problem.

The other part is: what about queues/mail/tables involving the domains of strangers on the net? For example, you may have to add a separate queue for a particular domain because mail to it disturbs others.

Since autodetecting on input is possible, I think that's how it has to be. That will cater to postmaster preferences, to some degree.

I have no very strong opinion on the default format used for output. I know my preference as user (look at LANG and use UTF8 if the locale uses it), but I also know that as maintainer, I'd go with whatever causes less support mail. Since you plan to use -DNO_EAI by default for one release cycle you'll have enough time to decide.
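My user-side preference (look at the locale, use UTF8 if the locale uses it) can be sketched in Python as follows. The function names are hypothetical, the environment is passed in as a plain dictionary so the logic is easy to test, and only the usual POSIX precedence of LC_ALL, LC_CTYPE and LANG is considered.

```python
def locale_uses_utf8(env: dict) -> bool:
    """Guess from LC_ALL / LC_CTYPE / LANG whether the locale is UTF8."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            # "ru_RU.UTF-8@modifier" -> codeset "UTF-8"
            codeset = value.partition(".")[2].partition("@")[0]
            return codeset.lower().replace("-", "") == "utf8"
    return False


def format_domain(domain_utf8: str, env: dict) -> str:
    """Pick the output format for a domain that is stored as UTF8."""
    if locale_uses_utf8(env):
        return domain_utf8
    return domain_utf8.encode("idna").decode("ascii")


print(format_domain("bücher.example", {"LANG": "en_US.UTF-8"}))
print(format_domain("bücher.example", {"LANG": "C"}))
```

In real use the dictionary would be os.environ. Whether this or a fixed default causes less support mail is exactly the open question.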

I will be working bits and pieces of the patch into Postfix over
the remainder of 2014. This is invasive stuff and it needs to be
done right.

Yes. In a sense that's why I wanted to defer the autoconfiguration/format choice; that might well involve merge conflicts, and those would be even worse in this change than usual.

Arnt
