Re: [HACKERS] Locale-dependent case conversion in {identifier}

2002-11-30 Thread Hannu Krosing
On Sat, 2002-11-30 at 01:40, Nicolai Tufar wrote:
 And I happen to have bad luck to use PostgreSQL with Turkish locale. And, as
 you
 may know our I is not your I:
 
 pgsql=# create table a(x char(1));
 CREATE TABLE
 pgsql=# grant SELECT ON a to PUBLIC;
 ERROR:  user public does not exist
 pgsql=#
 
 Oracle, the second best database I have does seem to convert relation names
 in
 locale-dependent fassion:
 
SQL alter session set NLS_LANGUAGE='TURKISH';
Session altered.
SQL create table a(x char(1));
Table created.
SQL grant select on a to PUBLIC;
Grant succeeded.

could it just be that we store identifiers in lower case, whereas most others
(including SQL spec IIRC)have them in upper case ?

Could you try the grant in both databases also in lower case ?

i.e.:

grant select on a to public;

--
Hannu



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Locale-dependent case conversion in {identifier}

2002-11-30 Thread Hannu Krosing
On Sat, 2002-11-30 at 07:57, Nicolai Tufar wrote:
 With this, no matter what kind of I you used in names,
 it is always going to end up a valid ASCII character.
 
 Would it be acceptable if I submit a path that applies this
 special logic in src/backend/parser/scan.l if the locale is tr_TR?
 
 Because for many folks setting locale to Turkish would
 render their database unusable. For, god forbid, if your
 sql has a column name written in capitlas including I.
 It is not working. So I deeply believe that PostgreSQL community
 have to provide a workaround for this problem.
 
 So what should I do?

In SQL in general and in postgreSQL in particular, you can always use
quoted names and thus escape the stupidities of case conversion:

grant SELECT ON a to public;

should work everywhere (except Oracle and other DB's where it should be
 grant SELECT ON A to PUBLIC;
)

I can't help you on Win32/VMS filenames ;)

---
Hannu


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Locale-dependent case conversion in {identifier}

2002-11-30 Thread Tom Lane
Nicolai Tufar [EMAIL PROTECTED] writes:
 Historically programs that operate in Turkish locale have
 chosen to hardcode the capitalisation of i in system
 messages and identifier names like this:

 Lower: I - i and Y' - i.
 Upper: y'  - I and i - I.

If that's the behavior you want, why don't you set up a variant locale
definition that does it that way?  That would fix *all* your locale-
dependent programs, not just Postgres ...

 Would it be acceptable if I submit a path that applies this
 special logic in src/backend/parser/scan.l if the locale is tr_TR?

It really seems like an inappropriate wart to me :-(

 Because for many folks setting locale to Turkish would
 render their database unusable. For, god forbid, if your
 sql has a column name written in capitlas including I.
 It is not working.

I am not seeing why this is any worse than the universal problems of
using upper-case letters without double-quoting 'em.  If you
consistently spell the name the same way, you will not have a problem;
if you don't, you might have a problem, but why is it worse than
anyone else's?

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Locale-dependent case conversion in {identifier}

2002-11-30 Thread Nicolai Tufar
- Original Message -
From: Hannu Krosing [EMAIL PROTECTED]
To: Nicolai Tufar [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Saturday, November 30, 2002 5:41 PM
Subject: Re: [HACKERS] Locale-dependent case conversion in {identifier}


 [ ... ]

 could it just be that we store identifiers in lower case, whereas most
others
 (including SQL spec IIRC)have them in upper case ?

That seem to be the case. All the databases I used, automaticaly convert
identifiers to upper case.
And they all do it using ASCII-only conversion.


 Could you try the grant in both databases also in lower case ?

 i.e.:

 grant select on a to public;

The statement works in both databases. But the problem is that it was
pg_dumpall who created SQL statements with PUBLIC. Why
pg_dumpall does not enclose identifiers in quotes, like:

REVOKE ALL ON TABLE tamara2 FROM public;
  insted of
REVOKE ALL ON TABLE tamara2 FROM PUBLIC;
as it does now.

I will make an attempt to modify pg_dump accordingly, and will send a
patch to the list.


Regards,
Nic


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Locale-dependent case conversion in {identifier}

2002-11-29 Thread Tom Lane
Nicolai Tufar [EMAIL PROTECTED] writes:
 So I have changed lower-case conversion code in scan.l to make it purely
 ASCII-based.
 as in keywords.c. Mini-patch is given below.

Rather than offering a patch, you need to convince us why our reading of
the SQL standard is wrong.  (Oracle does it that way is not an
argument that will carry a lot of weight.)

SQL99 states that identifier case conversions are done on the basis of
the Unicode upper/lower case equivalences, so it seems clear that they
intend more than ASCII-only conversion for identifiers.  Locale-based
conversion might not be an exact implementation of the spec, but it's
surely closer than ASCII-only.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Locale-dependent case conversion in {identifier}

2002-11-29 Thread Nicolai Tufar
By no means I would try to convince that your reading of
the SQL standards is wrong. What I am trying to tell is
that Turkish alphabet is broken beyond repair. And since
there is absolutely no way to change our alphabet, we
may can code a workaround in the code.

So i do not claim that your code is wrong. It is
behaviang according to specification. But unfortunately
folks at SQL99 probably were not aware of the woes
of Turkish I.

The very special case of letter I in Turkish is not
only PostgreSQL's problem. Many java programs have
failed miserably trying to open files with Is in
pathnames.

So basically, there are two letters I in Trukish.
The wone is with dot on top and another is without.
The with dot on top walways has the dot and the one
without never has it. Simple. The problem is
with the standard Latin I. So why small i does
have a dot and capital I does not?

Standard conversion is
Lower: I - y' and Y' - i.
Upper: y'  - I and i - Y'.
(font may not be displayed correctly in your mail reader)

Historically programs that operate in Turkish locale have
chosen to hardcode the capitalisation of i in system
messages and identifier names like this:

Lower: I - i and Y' - i.
Upper: y'  - I and i - I.

With this, no matter what kind of I you used in names,
it is always going to end up a valid ASCII character.

Would it be acceptable if I submit a path that applies this
special logic in src/backend/parser/scan.l if the locale is tr_TR?

Because for many folks setting locale to Turkish would
render their database unusable. For, god forbid, if your
sql has a column name written in capitlas including I.
It is not working. So I deeply believe that PostgreSQL community
have to provide a workaround for this problem.

So what should I do?

Best regards,
Nick




Tom Lane wrote:

Nicolai Tufar [EMAIL PROTECTED] writes:


So I have changed lower-case conversion code in scan.l to make it purely
ASCII-based.
as in keywords.c. Mini-patch is given below.



Rather than offering a patch, you need to convince us why our reading of
the SQL standard is wrong.  (Oracle does it that way is not an
argument that will carry a lot of weight.)

SQL99 states that identifier case conversions are done on the basis of
the Unicode upper/lower case equivalences, so it seems clear that they
intend more than ASCII-only conversion for identifiers.  Locale-based
conversion might not be an exact implementation of the spec, but it's
surely closer than ASCII-only.

			regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly





---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]