On Wednesday, June 4, 2014 11:16:51 PM CEST, Wietse Venema wrote:
* Postfix table queries are case-insensitive. I don't see any attempt
to implement that for UTF8 addresses. This leaves an ambiguity.
I looked at this now.
As I read the code, tables mostly map to lower case and then do a binary
comparison. The mysql and pgsql tables may additionally use the database
server's ilike operation. Finally, lowercase() maps U to u, but leaves 0xC0
as 0xC0, even if the Postfix server runs in a locale where the lowercase
form of that is 0xE0.
Is that correct?
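
To make the question concrete, here is a minimal Python sketch of the behaviour described above, as I understand it. It is not Postfix code: ascii_lowercase(), lookup() and the sample table are hypothetical names invented for illustration, and the table is just a dict standing in for a Postfix map.

```python
def ascii_lowercase(key: bytes) -> bytes:
    """Lowercase A-Z only; every byte >= 0x80 is left untouched."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in key)

# Hypothetical table, stored with lowercased keys.
table = {ascii_lowercase(b"User@Example.COM"): b"mailbox1"}

def lookup(key: bytes):
    """Lowercase the query, then do a plain binary comparison."""
    return table.get(ascii_lowercase(key))

print(ascii_lowercase(b"U"))          # b'u'
print(ascii_lowercase(b"\xc0"))       # b'\xc0', not b'\xe0'
print(lookup(b"USER@example.com"))    # b'mailbox1'

# The same applies to UTF-8: the two encodings below differ only in case,
# yet they stay distinct keys after ASCII-only lowercasing.
upper = "\u00c0@example.com".encode("utf-8")
lower = "\u00e0@example.com".encode("utf-8")
print(ascii_lowercase(upper) == ascii_lowercase(lower))  # False
```

The last line is the ambiguity in question: ASCII addresses match regardless of case, UTF-8 addresses do not.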
I can provide a supplementary patch that provides case insensitivity for
Unicode. It's easy, but there are several ways to do it, and I don't know
which you prefer.
1. Toupper/tolower in Postfix, with the usual table. This adds the bulk of
a table and is language-independent but imperfect. The well-known problem
is i/ı. (The lowercase("I") equivalent is "ı" in Turkish and a handful of
other locales.)
2a. Toupper/tolower that call out to ICU if EAI is enabled and there's any
non-ASCII in the argument. This slows down toupper()/tolower() but
Postfix escapes having the table and ICU devotes considerable effort to
correctness. It's easy to compose the string, too (composition means to use
å instead of "a"+"ring above").
2b. Ditto, but calling a language-sensitive function in ICU, so that i is
equal to İ if the Postfix server runs in one of those locales. I'm unhappy
about this alternative — a Swiss service provider may well service both
Kazakh and Korean users, and how should the service provider's Postfix be
configured?
3. Switching to titlecase. A bigger change. Titlecase is a form in which
case differences are erased and in principle it's neither equal to
uppercase nor to lowercase. It's only usable for implementing
case-insensitive comparison/lookup using fast binary comparison.
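
A rough Python sketch of the language-independent variants (1 and 2a) may help when comparing the options. Everything here is an assumption for illustration: fold_key() and its eai_enabled flag are hypothetical names, Python's locale-independent str.lower() stands in for whatever table or ICU call would do the mapping, and unicodedata.normalize() stands in for the composition step.

```python
import unicodedata

def ascii_lower(key: str) -> str:
    """Lowercase A-Z only, leaving all other characters alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in key)

def fold_key(key: str, eai_enabled: bool = True) -> str:
    """Locale-independent lookup key: ASCII fast path, else lowercase + NFC."""
    if key.isascii() or not eai_enabled:
        # Cheap path: no Unicode machinery is involved.
        return ascii_lower(key)
    # Slow path: full Unicode lowercase, then compose (a + ring above -> å).
    return unicodedata.normalize("NFC", key.lower())

# Decomposed and precomposed spellings end up as the same key.
print(fold_key("A\u030aSE@example.com") == fold_key("\u00e5se@EXAMPLE.com"))  # True

# The i/ı imperfection: "I" always becomes "i", never "ı", so these two
# spellings of the same Turkish word give different keys.
print(fold_key("IŞIK") == fold_key("ışık"))  # False
```

Option 2b would make the second comparison true on a server running in a Turkish locale, at the price of making lookups depend on where the server runs.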
In my opinion the change to titlecase isn't worth it. There aren't enough
problems with lowercase() to justify such a sweeping change. Also, keeping
lower case allows compiled tables to survive upgrades/downgrades.
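
The compatibility point can be illustrated with a short Python sketch. It assumes that the case-erasing form of option 3 behaves like Unicode case folding, for which Python's str.casefold() is used as a stand-in; old_table and the two lookup functions are hypothetical and only model a table whose keys were compiled in lower case.

```python
# Hypothetical table compiled by an older version that stores lowercase keys.
old_table = {"Straße@example.com".lower(): "mailbox1"}

def lookup_lower(addr: str):
    """Query with the same lowercase form the table was built with."""
    return old_table.get(addr.lower())

def lookup_folded(addr: str):
    """Query with a case-folded form instead (assumed stand-in for option 3)."""
    return old_table.get(addr.casefold())

print(lookup_lower("STRAßE@Example.COM"))   # mailbox1
print(lookup_folded("STRAßE@Example.COM"))  # None: "ß" folds to "ss"
```

For ASCII keys the two forms agree, so only tables with affected non-ASCII keys would need rebuilding after such a switch.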
I'm neutral regarding 1 and 2a. If you tell me which you prefer, I'll
write a patch and test that it matches another implementation.
Arnt