Re: [HACKERS] Locale-dependent case conversion in {identifier}

Nicolai Tufar Fri, 29 Nov 2002 23:57:46 -0800

By no means I would try to convince that your reading of
the SQL standards is wrong. What I am trying to tell is
that Turkish alphabet is broken beyond repair. And since
there is absolutely no way to change our alphabet, we
may can code a workaround in the code.


So i do not claim that your code is wrong. It is
behaviang according to specification. But unfortunately
folks at SQL99 probably were not aware of the woes
of Turkish "I".

The very special case of letter "I" in Turkish is not
only PostgreSQL's problem. Many java programs have
failed miserably trying to open files with "I"s in
pathnames.

So basically, there are two letters "I" in Trukish.
The wone is with dot on top and another is without.
The with dot on top walways has the dot and the one
without never has it. Simple. The problem is
with the standard Latin "I". So why small "i" does
have a dot and capital "I" does not?

Standard conversion is
Lower: "I" -> "y'" and "Y'" -> "i".
Upper: "y'"  -> "I" and "i" -> "Y'".
(font may not be displayed correctly in your mail reader)

Historically programs that operate in Turkish locale have
chosen to hardcode the capitalisation of "i" in system
messages and identifier names like this:

Lower: "I" -> "i" and "Y'" -> "i".
Upper: "y'"  -> "I" and "i" -> "I".

With this, no matter what kind of "I" you used in names,
it is always going to end up a valid ASCII character.

Would it be acceptable if I submit a path that applies this
special logic in src/backend/parser/scan.l if the locale is "tr_TR"?

Because for many folks setting locale to Turkish would
render their database unusable. For, god forbid, if your
sql has a column name written in capitlas including "I".
It is not working. So I deeply believe that PostgreSQL community
have to provide a workaround for this problem.

So what should I do?

Best regards,
Nick




Tom Lane wrote:

"Nicolai Tufar" <[EMAIL PROTECTED]> writes:

So I have changed lower-case conversion code in scan.l to make it purely
ASCII-based.
as in keywords.c. Mini-patch is given below.


Rather than offering a patch, you need to convince us why our reading of
the SQL standard is wrong.  ("Oracle does it that way" is not an
argument that will carry a lot of weight.)

SQL99 states that identifier case conversions are done on the basis of
the Unicode upper/lower case equivalences, so it seems clear that they
intend more than ASCII-only conversion for identifiers.  Locale-based
conversion might not be an exact implementation of the spec, but it's
surely closer than ASCII-only.

			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly




---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [HACKERS] Locale-dependent case conversion in {identifier}

Reply via email to