On Wednesday, June 11, 2014 2:59:12 PM CEST, Wietse Venema wrote:
> In that case, don't bother writing code. Instead, share/discuss
> your design decisions/recommendations. Does it make sense to fold
> case with table lookups?
Of course. Email addresses are case insensitive in dozens of languages. It
was slack of me not to catch that mistake.
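
As an illustration of folding case on table lookups, here is a minimal Python sketch. The helper names and the table contents are hypothetical, and a real lookup would probably need Unicode normalization in addition to case folding.

```python
def fold_key(address: str) -> str:
    """Return a case-folded lookup key (hypothetical helper)."""
    # str.casefold() is a Unicode-aware, more aggressive variant of lower().
    return address.casefold()

# Toy table with made-up entries; keys are folded once, when the table is built.
table = {
    fold_key("Info@Example.ORG"): "mailbox-1",
    fold_key("Jörg@Example.org"): "mailbox-2",
}

def lookup(address: str):
    """Fold the query the same way as the stored keys, then look it up."""
    return table.get(fold_key(address))
```

With this, `lookup("JÖRG@EXAMPLE.ORG")` and `lookup("jörg@example.org")` reach the same entry.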
> How do we deal with table lookups when the
> same domain can show up in different forms at different stages of
> email handling (client or server name, mail from/rcpt to domain,
> mail headers)?
Autodetection is needed. Happily it is also possible.
My preferred approach:
Store UTF8 in the tables and use UTF8 in table lookups. I say this because
making pgsql_table work well with utf8 on the localparts and xn--mumble on
the domains is bothersome. It seems to me that the reasons for the bother
are general, not specific to Postgres. Unicode is very widely used.
Add two new files/functions in util, one to convert from utf8 to xn--mumble
and one to convert the other way. Refactor the code in smtp/smtp*.c to call
that (that refactoring is the main reason why I want to wait.)
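
The two conversion functions could look roughly like the following Python sketch. It is only an illustration of the idea, not the Postfix code: the function names are made up, and it relies on Python's built-in `idna` codec, which implements the older IDNA 2003 rules.

```python
def domain_to_ascii(domain: str) -> str:
    """UTF-8 form -> xn-- form (hypothetical name). Works label by label."""
    return domain.encode("idna").decode("ascii")

def domain_to_utf8(domain: str) -> str:
    """xn-- form -> UTF-8 form (hypothetical name).

    Labels without the xn-- prefix pass through unchanged. Input that is
    not pure ASCII raises UnicodeEncodeError in this simple version.
    """
    return domain.encode("ascii").decode("idna")

print(domain_to_ascii("bücher.example"))         # xn--bcher-kva.example
print(domain_to_utf8("xn--bcher-kva.example"))   # bücher.example
```

Keeping both directions in one small module means that every caller converts in the same way, which is the point of the refactoring.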
Next, make many locations call the toutf8 function, so that postqueue, ETRN
etc. accept both formats on input. This autodetection only breaks if
someone has used "xn--" for some other purpose in an internal subdomain,
but that's a risk I am prepared to accept. The web browsers also
autodetect, and AFAICT it hasn't caused any problems.
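
A minimal Python sketch of the autodetection described above, assuming a hypothetical `to_utf8` helper and a made-up transport table. Any label that starts with "xn--" is treated as an encoded label; a label that fails to decode is left untouched, so an internal subdomain that happens to use the prefix is not rejected.

```python
ACE_PREFIX = "xn--"

def to_utf8(domain: str) -> str:
    """Return the domain with every decodable xn-- label converted to UTF-8."""
    out = []
    for label in domain.split("."):
        if label.lower().startswith(ACE_PREFIX):
            try:
                # Python's built-in idna codec (IDNA 2003) decodes one label here.
                label = label.encode("ascii").decode("idna")
            except UnicodeError:
                pass  # not a valid encoded label: keep it as typed
        out.append(label)
    return ".".join(out)

# Made-up table that stores the UTF-8 form only.
transport = {"bücher.example": "slow-queue"}

def lookup(domain: str):
    """Normalize the query to the UTF-8 form, then consult the table."""
    return transport.get(to_utf8(domain))
```

Because the input is normalized before the lookup, `lookup("xn--bcher-kva.example")` and `lookup("bücher.example")` find the same entry.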
Finally, make somewhat fewer locations call one of the conversion
functions to generate the appropriate format for e.g. postqueue output and
the EHLO argument. (I still think using a unicode myhostname is a trouble
magnet. IIRC my patch disallows it, and I would at least warn against it on
startup.)
Once the right callers are there, the table lookups should just work, at
any stage.
> How would a Russian system admin effectively configure and
> manage tables/logging/queue management with domains in Chinese or
> Hebrew script?
This is several questions. Two or three, I think.
One answer is that if an ISP wants to sell service to Chinese, the staff
who talk to the Chinese realistically have to know Chinese. Having a
Russian monoglot answer support requests from Chinese will not be
effective. EAI just adds one more communication problem.
The fashion these days is to add self service. An ISP may employ a Russian
postmaster, but also Chinese sales staff and have web forms written in
polite Chinese. In that case, the core of the problem is to make the forms,
database and batch processes UTF8-clean.
The postmaster may end up with a support request written in a language he
does not understand. EAI means that the support request may also include a
domain name the postmaster cannot understand, which IMO is not a
significant extra problem.
The other part is: What about queues/mail/tables involving the domains of
strangers on the net? For example, you may have to add a separate queue for
a particular domain because mail to it disturbs others.
Since autodetecting on input is possible, I think that's how it has to be.
That will cater to postmaster preferences, to some degree.
I have no very strong opinion on the default format used for output. I know
my preference as user (look at LANG and use UTF8 if the locale uses it),
but I also know that as maintainer, I'd go with whatever causes less
support mail. Since you plan to use -DNO_EAI by default for one release
cycle you'll have enough time to decide.
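
The "look at LANG" preference could be sketched as follows in Python. This is an assumption-laden illustration: the function names are hypothetical, and checking LC_ALL and LC_CTYPE before LANG follows the usual POSIX precedence rather than anything stated above.

```python
def preferred_output_form(environ) -> str:
    """Return "utf8" if the locale variables name a UTF-8 codeset, else "ascii"."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):   # first non-empty variable wins
        value = environ.get(var)
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier.
            codeset = value.partition(".")[2].partition("@")[0]
            is_utf8 = codeset.lower().replace("-", "") == "utf8"
            return "utf8" if is_utf8 else "ascii"
    return "ascii"                               # no locale set: be conservative

def format_domain(domain_utf8: str, environ) -> str:
    """Render a UTF-8 domain for output in the form the locale suggests."""
    if preferred_output_form(environ) == "utf8":
        return domain_utf8
    # Fall back to the xn-- form using Python's built-in idna codec.
    return domain_utf8.encode("idna").decode("ascii")
```

Called as `format_domain(name, os.environ)`, this prints the UTF-8 form under a locale such as `en_US.UTF-8` and the xn-- form under `C`.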
> I will be working bits and pieces of the patch into Postfix over
> the remainder of 2014. This is invasive stuff and it needs to be
> done right.
Yes. In a sense that's why I wanted to defer the autoconfiguration/format
choice; that might well involve merge conflicts, and those would be even
worse in this change than usual.
Arnt