subject:"Re\: \[HACKERS\] \[SQL\] Case Preservation disregarding case"

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-11-15 Thread Andrew Dunstan


Jim Nasby wrote:

On Nov 14, 2006, at 2:42 PM, Simon Riggs wrote:

On Thu, 2006-11-02 at 10:51 -0500, Tom Lane wrote:

"Simon Riggs" <[EMAIL PROTECTED]> writes:
We have namespaces to differentiate between two sources of object 
names,
so anybody who creates a schema where MyColumn is not the same 
thing as

myColumn is not following sensible rules for conceptual distance.


I'd agree that that is not a good design practice, but the fact remains
that they *are* different per spec.


Would be better to make this behaviour a userset
switchable between the exactly compliant and the more intuitive.


That's certainly not happening --- if you make any changes in the
semantics of equality of type name, it would have to be frozen no
later than initdb time, for exactly the same reasons we freeze
locale then (hint: index ordering).


[Re-read all of this after Bruce's post got me thinking.]

My summary of the thread, with TODO items noted:

1. PostgreSQL doesn't follow the spec, but almost does, with regard to
comparison of unquoted and quoted identifiers. DB2 does this per spec.

2. TODO: We could follow the spec, but it would need an initdb option;
some non-SQL:2003 standard PostgreSQL programs would not work as they do
now. This is considered a minor, low priority item, though.

3. TODO: We could set column headers better if we wanted to (rather
than ?column? we could use e.g. Sum_ColumnName etc)


Did the idea of preserving the original case and using that for output 
column names, /d, etc. get shot down? I thought it would be a useful 
addition...



I think there is broad agreement that we need to provide optional 
minimally spec compliant behaviour (fold unquoted to upper case, 
otherwise as now). I am not sure how invasive either the "case preserved 
+ case insensitive comparison" or the "case preserved + case sensitive 
comparison" option would be. I don't know that anything has been ruled 
out. Until someone produces a patch or a definite design and analysis we 
are a bit in the dark. Personally, I would like to see all of these as 
options ;-)


I think Simon's item 3 above is a separate issue.

cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-11-15 Thread Jim Nasby


On Nov 14, 2006, at 2:42 PM, Simon Riggs wrote:

On Thu, 2006-11-02 at 10:51 -0500, Tom Lane wrote:

"Simon Riggs" <[EMAIL PROTECTED]> writes:
We have namespaces to differentiate between two sources of object  
names,
so anybody who creates a schema where MyColumn is not the same  
thing as

myColumn is not following sensible rules for conceptual distance.


I'd agree that that is not a good design practice, but the fact  
remains

that they *are* different per spec.


Would be better to make this behaviour a userset
switchable between the exactly compliant and the more intuitive.


That's certainly not happening --- if you make any changes in the
semantics of equality of type name, it would have to be frozen no
later than initdb time, for exactly the same reasons we freeze
locale then (hint: index ordering).


[Re-read all of this after Bruce's post got me thinking.]

My summary of the thread, with TODO items noted:

1. PostgreSQL doesn't follow the spec, but almost does, with regard to
comparison of unquoted and quoted identifiers. DB2 does this per spec.

2. TODO: We could follow the spec, but it would need an initdb option;
some non-SQL:2003 standard PostgreSQL programs would not work as  
they do

now. This is considered a minor, low priority item, though.

3. TODO: We could set column headers better if we wanted to (rather
than ?column? we could use e.g. Sum_ColumnName etc)


Did the idea of preserving the original case and using that for  
output column names, /d, etc. get shot down? I thought it would be a  
useful addition...

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-11-14 Thread Simon Riggs

On Thu, 2006-11-02 at 10:51 -0500, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > We have namespaces to differentiate between two sources of object names,
> > so anybody who creates a schema where MyColumn is not the same thing as
> > myColumn is not following sensible rules for conceptual distance.
> 
> I'd agree that that is not a good design practice, but the fact remains
> that they *are* different per spec.
> 
> > Would be better to make this behaviour a userset
> > switchable between the exactly compliant and the more intuitive.
> 
> That's certainly not happening --- if you make any changes in the
> semantics of equality of type name, it would have to be frozen no
> later than initdb time, for exactly the same reasons we freeze
> locale then (hint: index ordering).

[Re-read all of this after Bruce's post got me thinking.]

My summary of the thread, with TODO items noted:

1. PostgreSQL doesn't follow the spec, but almost does, with regard to
comparison of unquoted and quoted identifiers. DB2 does this per spec.

2. TODO: We could follow the spec, but it would need an initdb option;
some non-SQL:2003 standard PostgreSQL programs would not work as they do
now. This is considered a minor, low priority item, though.

3. TODO: We could set column headers better if we wanted to (rather
than ?column? we could use e.g. Sum_ColumnName etc)

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-11-14 Thread Bruce Momjian

beau hargis wrote:
> Having installed DB2 Enterprise today and taking it for a spin, it does 
> indeed 
> behave in a similar manner. However, after reading through both 
> specifications, it seems that DB2 follows more of the spec than PostgreSQL. 
> The specifications state that for purpose of comparing identifiers, both 
> shall be converted to upper-case. DB2 displays all identifiers in upper-case 
> whereas PostgreSQL displays all identifiers in lower-case. This alone would 
> be a deviation from the specification. 

True.  We lowercase because historically we have, and because
all-upper-case is hard to read.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-11-02 Thread Tom Lane

"Simon Riggs" <[EMAIL PROTECTED]> writes:
> We have namespaces to differentiate between two sources of object names,
> so anybody who creates a schema where MyColumn is not the same thing as
> myColumn is not following sensible rules for conceptual distance.

I'd agree that that is not a good design practice, but the fact remains
that they *are* different per spec.

> Would be better to make this behaviour a userset
> switchable between the exactly compliant and the more intuitive.

That's certainly not happening --- if you make any changes in the
semantics of equality of type name, it would have to be frozen no
later than initdb time, for exactly the same reasons we freeze
locale then (hint: index ordering).

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-11-02 Thread Simon Riggs

On Wed, 2006-11-01 at 11:31 -0500, Chuck McDevitt wrote:

> But, stepping back from all that, what is it the users want?
> 
> 1)  When re-creating a CREATE TABLE statement from whatever catalog
> info, they'd like the names to come back exactly as then entered them.
>   If I do:
>  CREATE TABLE BobsTable (WeeklySales numeric(10,2),
> "SomeStrangeName" int);
> 
>   They'd like to see exactly that when the CREATE TABLE gets
> re-created, not what we do now:
> 
>  CREATE TABLE bobstable (weeklysales numeric(10,2),
> "SomeStrangeName" int);

This would be very good indEEd.

It can be very annoying trying to locate a table when the user swears
they called it one thing and actually the case or quotation is
different. Current behaviour isn't useful, even if it is "onspec" (or is
that OnSpec?). Would be better to make this behaviour a userset
switchable between the exactly compliant and the more intuitive.

We have namespaces to differentiate between two sources of object names,
so anybody who creates a schema where MyColumn is not the same thing as
myColumn is not following sensible rules for conceptual distance. It's
certainly an error of best practice, even if its not actually a bug.

> 2)  When doing reports, they'd like the name as entered to be the title
> of the column:
>   Select * from bobstable;  
> 
>   Would be nice if they saw this:
>   WeeklySalesSomeStrangeName
>   ------
...

> Producing "?column?" or somesuch to use in the report, it could return a
> title like "sum(WeeklySales)"

That would be just great. I'm not sure the spec says what the titles
should be, does it?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-11-01 Thread Chuck McDevitt

Sorry, my last mail wasn't well thought out.  Yes, the
information_schema needs the case-folded name (although it might be ok
to add additional columns to the information_schema for extra
information).

But, stepping back from all that, what is it the users want?

1)  When re-creating a CREATE TABLE statement from whatever catalog
info, they'd like the names to come back exactly as then entered them.
If I do:
 CREATE TABLE BobsTable (WeeklySales numeric(10,2),
"SomeStrangeName" int);

  They'd like to see exactly that when the CREATE TABLE gets
re-created, not what we do now:

   CREATE TABLE bobstable (weeklysales numeric(10,2),
"SomeStrangeName" int);

2)  When doing reports, they'd like the name as entered to be the title
of the column:
Select * from bobstable;  

  Would be nice if they saw this:
  WeeklySalesSomeStrangeName
  ------

For compatibility with existing apps and spec compliance, you'd still
want PQfname() to return the case-folded name.
But there isn't any reason you couldn't also return a "suggested title"
field (PQftitle?) which preserves the user's case.

You could also extend the concept of a PQftitle to make nicer titles for
expressions.  Instead of 
SELECT sum(WeeklySales) from BobsTable;

Producing "?column?" or somesuch to use in the report, it could return a
title like "sum(WeeklySales)"

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 31, 2006 10:38 PM
To: Chuck McDevitt
Cc: Stephan Szabo; beau hargis; pgsql-sql@postgresql.org;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case

"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
> Equivalent, yes.  But I can interpret that clause it mean I can show
> either the case folded or non-case-folded value in the information
> schema, as they are equivalent.

Well, that's an interesting bit of specs-lawyering, but I don't see
how you can defend it against these rules in SQL99 5.2:

21) For every  IB there is exactly one
corresponding case-normal form CNF. CNF is an 
derived from IB as follows.

Let n be the number of characters in IB. For i ranging from
1
(one) to n, the i-th character M(i) of IB is translated into
the
corresponding character or characters of CNF as follows.

Case:

a) If M(i) is a lower case character or a title case
character
  for which an equivalent upper case sequence U is defined
by
  Unicode, then let j be the number of characters in U; the
  next j characters of CNF are U.

b) Otherwise, the next character of CNF is M(i).

22) The case-normal form of the  of a  is used for purposes such as and including
determination of identifier equivalence, representation in
the Definition and Information Schemas, and representation
in
diagnostics areas.

NOTE 44 - Any lower-case letters for which there are no
upper-
case equivalents are left in their lower-case form.

Again, obviously we are not compliant because we fold to lower rather
than upper case, but I do not see how you can read (22) as not requiring
the information schema to show the upper-cased form.  The output of
functions such as PQfname() might be considered closer to diagnostics
info than information schema, but that's covered too.

But the really serious problem with what you propose is that it would
allow two table columns with names that the system considers distinct
to show as the same string in the information schema and diagnostic
outputs.  That can't be acceptable --- it's going to break any
application that does any nontrivial analysis of what it sees there,
not to mention that it violates various primary key constraints in
the information schema specification.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-31 Thread Tom Lane

"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
> Equivalent, yes.  But I can interpret that clause it mean I can show
> either the case folded or non-case-folded value in the information
> schema, as they are equivalent.

Well, that's an interesting bit of specs-lawyering, but I don't see
how you can defend it against these rules in SQL99 5.2:

21) For every  IB there is exactly one
corresponding case-normal form CNF. CNF is an 
derived from IB as follows.

Let n be the number of characters in IB. For i ranging from 1
(one) to n, the i-th character M(i) of IB is translated into the
corresponding character or characters of CNF as follows.

Case:

a) If M(i) is a lower case character or a title case character
  for which an equivalent upper case sequence U is defined by
  Unicode, then let j be the number of characters in U; the
  next j characters of CNF are U.

b) Otherwise, the next character of CNF is M(i).

22) The case-normal form of the  of a  is used for purposes such as and including
determination of identifier equivalence, representation in
the Definition and Information Schemas, and representation in
diagnostics areas.

NOTE 44 - Any lower-case letters for which there are no upper-
case equivalents are left in their lower-case form.

Again, obviously we are not compliant because we fold to lower rather
than upper case, but I do not see how you can read (22) as not requiring
the information schema to show the upper-cased form.  The output of
functions such as PQfname() might be considered closer to diagnostics
info than information schema, but that's covered too.

But the really serious problem with what you propose is that it would
allow two table columns with names that the system considers distinct
to show as the same string in the information schema and diagnostic
outputs.  That can't be acceptable --- it's going to break any
application that does any nontrivial analysis of what it sees there,
not to mention that it violates various primary key constraints in
the information schema specification.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-31 Thread Chuck McDevitt

-Original Message-
From: Stephan Szabo [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 31, 2006 10:23 AM
To: Chuck McDevitt
Cc: Tom Lane; beau hargis; pgsql-sql@postgresql.org;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case

On Tue, 31 Oct 2006, Chuck McDevitt wrote:

> We treated quoted identifiers as case-specific, as the spec requires.
>
> In the catalog, we stored TWO columns... The column name with case
> converted as appropriate (as PostgreSQL already does), used for
looking
> up the attribute,
> And a second column, which was the column name with the case exactly
as
> entered by the user.

Wouldn't using that second column's value tend to often violate 5.2SR10
(at least that's the reference item in SQL92)? AFAICT, that rule
basically
says that the regular identifier is equivalent to the case-folded one
for
purposes of information and definition schema and similar purposes which
seems like it would be intended to include things like column labeling
for
output. There's a little bit of flexibility there on both similar
purposes
and equivalence, though.

Equivalent, yes.  But I can interpret that clause it mean I can show
either the case folded or non-case-folded value in the information
schema, as they are equivalent.

Anyway, we have many things that are "enhancements" beyond the spec, and
this could also be considered an enhancement.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-31 Thread Stephan Szabo

On Tue, 31 Oct 2006, Chuck McDevitt wrote:

> We treated quoted identifiers as case-specific, as the spec requires.
>
> In the catalog, we stored TWO columns... The column name with case
> converted as appropriate (as PostgreSQL already does), used for looking
> up the attribute,
> And a second column, which was the column name with the case exactly as
> entered by the user.

Wouldn't using that second column's value tend to often violate 5.2SR10
(at least that's the reference item in SQL92)? AFAICT, that rule basically
says that the regular identifier is equivalent to the case-folded one for
purposes of information and definition schema and similar purposes which
seems like it would be intended to include things like column labeling for
output. There's a little bit of flexibility there on both similar purposes
and equivalence, though.

10) The  of a  is equivalent
to an  in which every letter that is a lower-
case letter is replaced by the equivalent upper-case letter
or letters. This treatment includes determination of equiva-
lence, representation in the Information and Definition Schemas,
representation in the diagnostics area, and similar uses.

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-31 Thread Andrew Dunstan



There was some discussion a couple of years ago on the -hackers list 
about it, so you might like to review the archives. The consensus seemed 
to be that behaviour would need to be set no later than createdb time.  
The options I thought of were:


. current postgres behaviour (we need to do this for legacy reasons, of 
course, as well as to keep happy the legions who hate using upper case 
for anything)
. strictly spec compliant (same as current behaviour, but folding to 
upper case for unquoted identifiers rather than lower)
. fully case sensitive even for unquoted identifiers (not spec compliant 
at all, but nevertheless possibly attractive especially for people 
migrating from MS SQLServer, where it is an option, IIRC).


To this you propose, as I understand it, to have a fourth possibility 
which would be spec compliant for comparison purposes but would label 
result set columns with the case preserved name originally used (or 
would you use the casing used in the query?).


These could be accomplished I think with a second catalog column like 
you suggest, in a number of places, but making sure all the code paths 
were covered might be somewhat laborious. We could probably add the 
second option without being nearly so invasive, though, and some people 
might feel that that would be sufficient.


cheers

andrew

Chuck McDevitt wrote:

We treated quoted identifiers as case-specific, as the spec requires.

In the catalog, we stored TWO columns... The column name with case
converted as appropriate (as PostgreSQL already does), used for looking
up the attribute,
And a second column, which was the column name with the case exactly as
entered by the user.

So, your example would work just fine.


-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Monday, October 30, 2006 10:35 PM

To: Chuck McDevitt
Cc: beau hargis; pgsql-sql@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case
sensitivity?

"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
  

At Teradata, we certainly interpreted the spec to allow


case-preserving,
  

but case-insensitive, identifiers.



Really?

As I see it, the controlling parts of the SQL spec are (SQL99 sec 5.2)

26) A  and a  are
equivalent if the  of the 
(with every letter that is a lower-case letter replaced by
the
corresponding upper-case letter or letters) and the
 of the  (with all
occurrences of  replaced by  and all
occurrences of  replaced by ),
considered as the repetition of a 
that specifies a  of
SQL_IDENTIFIER
and an implementation-defined collation that is sensitive to
case, compare equally according to the comparison rules in
Subclause 8.2, "".

27) Two s are equivalent if their
s, considered as the repetition of a
 that specifies a 
of SQL_IDENTIFIER and an implementation-defined collation
that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, "".

Note well the "sensitive to case" bits there.  Now consider

CREATE TABLE tab (
"foobar" int,
"FooBar" timestamp,
"FOOBAR" varchar(3)
);

We can *not* reject this as containing duplicate column names, else we
have certainly violated rule 27.  Now what will you do with

SELECT fooBar FROM tab;

?  The spec is unquestionably on the side of "you selected the varchar
column"; historical Postgres practice is on the side of "you selected
the int column".  AFAICS a case-insensitive approach would have to
fail with some "I can't identify which column you mean" error.  I am
interested to see where you find support for that in the spec...





---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-31 Thread Chuck McDevitt

Oh... And Microsoft SQLServer does something similar.

At Greenplum, we've already gotten complaints from customers about this
when they were switching from MSSQL to GP's PostgreSQL-based database.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Monday, October 30, 2006 10:35 PM
To: Chuck McDevitt
Cc: beau hargis; pgsql-sql@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case
sensitivity?

"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
> At Teradata, we certainly interpreted the spec to allow
case-preserving,
> but case-insensitive, identifiers.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-31 Thread Chuck McDevitt

We treated quoted identifiers as case-specific, as the spec requires.

In the catalog, we stored TWO columns... The column name with case
converted as appropriate (as PostgreSQL already does), used for looking
up the attribute,
And a second column, which was the column name with the case exactly as
entered by the user.

So, your example would work just fine.

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Monday, October 30, 2006 10:35 PM
To: Chuck McDevitt
Cc: beau hargis; pgsql-sql@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case
sensitivity?

"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
> At Teradata, we certainly interpreted the spec to allow
case-preserving,
> but case-insensitive, identifiers.

Really?

As I see it, the controlling parts of the SQL spec are (SQL99 sec 5.2)

26) A  and a  are
equivalent if the  of the 
(with every letter that is a lower-case letter replaced by
the
corresponding upper-case letter or letters) and the
 of the  (with all
occurrences of  replaced by  and all
occurrences of  replaced by ),
considered as the repetition of a 
that specifies a  of
SQL_IDENTIFIER
and an implementation-defined collation that is sensitive to
case, compare equally according to the comparison rules in
Subclause 8.2, "".

27) Two s are equivalent if their
s, considered as the repetition of a
 that specifies a 
of SQL_IDENTIFIER and an implementation-defined collation
that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, "".

Note well the "sensitive to case" bits there.  Now consider

CREATE TABLE tab (
"foobar" int,
"FooBar" timestamp,
"FOOBAR" varchar(3)
);

We can *not* reject this as containing duplicate column names, else we
have certainly violated rule 27.  Now what will you do with

SELECT fooBar FROM tab;

?  The spec is unquestionably on the side of "you selected the varchar
column"; historical Postgres practice is on the side of "you selected
the int column".  AFAICS a case-insensitive approach would have to
fail with some "I can't identify which column you mean" error.  I am
interested to see where you find support for that in the spec...

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-31 Thread Martijn van Oosterhout

On Tue, Oct 31, 2006 at 12:55:46PM -0500, Andrew Dunstan wrote:
> To this you propose, as I understand it, to have a fourth possibility 
> which would be spec compliant for comparison purposes but would label 
> result set columns with the case preserved name originally used (or 
> would you use the casing used in the query?).

The big issue I see with this is that it would break PQfname on the
client end, since that's case sensetive too. Most client languages are,
so you really are between a rock and a hard place.

Making PQfname case-insensetive also screws up in Tom's example.

One way to appraoch this is to consider this a setting of the collation
of the name datatype. If a case-insensetive collation is selected at
initdb time, then Tom's example would indeed fail, but that's a choice
someone made. Problem being, you'd have to export that choice to
clients to make PQfname work, and that's going to messy.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.

signature.asc
Description: Digital signature

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread Tom Lane

"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
> At Teradata, we certainly interpreted the spec to allow case-preserving,
> but case-insensitive, identifiers.

Really?

As I see it, the controlling parts of the SQL spec are (SQL99 sec 5.2)

26) A  and a  are
equivalent if the  of the 
(with every letter that is a lower-case letter replaced by the
corresponding upper-case letter or letters) and the  of the  (with all
occurrences of  replaced by  and all
occurrences of  replaced by ),
considered as the repetition of a 
that specifies a  of SQL_IDENTIFIER
and an implementation-defined collation that is sensitive to
case, compare equally according to the comparison rules in
Subclause 8.2, "".

27) Two s are equivalent if their s, considered as the repetition of a  that specifies a 
of SQL_IDENTIFIER and an implementation-defined collation
that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, "".

Note well the "sensitive to case" bits there.  Now consider

CREATE TABLE tab (
"foobar" int,
"FooBar" timestamp,
"FOOBAR" varchar(3)
);

We can *not* reject this as containing duplicate column names, else we
have certainly violated rule 27.  Now what will you do with

SELECT fooBar FROM tab;

?  The spec is unquestionably on the side of "you selected the varchar
column"; historical Postgres practice is on the side of "you selected
the int column".  AFAICS a case-insensitive approach would have to
fail with some "I can't identify which column you mean" error.  I am
interested to see where you find support for that in the spec...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-10-30 Thread Chuck McDevitt

At Teradata, we certainly interpreted the spec to allow case-preserving,
but case-insensitive, identifiers.
Users really liked it that way:  If you re-created a CREATE TABLE
statement from the catalog, you could get back exactly the case the user
had entered, but people using the table didn't need to worry about case.
And column titles in reports would have the nice case preserving
information.
Sort of like how Windows systems treat file names... The case is
preserved, but you don't need to know it to access the file.

I know UNIX users usually think "case-preserving with case-insensitive"
a foreign concept, but that doesn't mean the average user feels the
same.

If I want my column named "WeeklyTotalSales", I really don't want to
have to always quote it and type in the exact case.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Monday, October 30, 2006 7:24 PM
To: beau hargis
Cc: pgsql-sql@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case
sensitivity?

beau hargis <[EMAIL PROTECTED]> writes:
> Considering the differences that already exist between database
systems and 
> their varying compliance with SQL and the various extensions that have
been 
> created, I do not consider that the preservation of case for
identifiers 
> would violate any SQL standard.

That's not how I read the spec.  It is true that we are not 100% spec
compliant, but that isn't a good argument for moving further away from
spec.  Not to mention breaking backwards compatibility with our
historical behavior.  The change you propose would fix your application
at the cost of breaking other people's applications.   Perhaps you
should consider fixing your app instead.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread Tom Lane

beau hargis <[EMAIL PROTECTED]> writes:
> Considering the differences that already exist between database systems and 
> their varying compliance with SQL and the various extensions that have been 
> created, I do not consider that the preservation of case for identifiers 
> would violate any SQL standard.

That's not how I read the spec.  It is true that we are not 100% spec
compliant, but that isn't a good argument for moving further away from
spec.  Not to mention breaking backwards compatibility with our
historical behavior.  The change you propose would fix your application
at the cost of breaking other people's applications.   Perhaps you
should consider fixing your app instead.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread beau hargis

On Friday 27 October 2006 19:38, Joe wrote:
> Hi Beau,
>
> On Fri, 2006-10-27 at 16:23 -0700, beau hargis wrote:
> > I am hoping that there is an easy way to obtain case-preservation with
> > case-insensitivity, or at the very least, case-preservation and complete
> > case-sensitivity, or case-preservation and a consistant case-conversion
> > strategy.
> >
> > The case of the column names need to be preserved because that is the way
> > the schema is designed and most importantly (VERY, VERY IMPORTANT),
> > column names are used in apps as hash values, or as named references
> > which are case sensitive and as such need to be delivered to the client
> > in exactly in the manner specified at the time of table creation.
> >
>
> I went through the same issue in my conversion from MySQL to Postgres
> and (since I had a small application) I ended up changing up all my
> tables and columns "UserProfile" to user_profile.
>
> I'm afraid however, that it's MySQL that is the odd man out.  I haven't
> researched this completely but I believe PG follows either the FIPS-127
> or SQL-92 standard with respect to what are called "delimited
> identifiers".  Basically, this says if you want case sensitivity in
> identifier names, you have to use double quotes wherever you refer to
> the identifier.  Without the double quotes, the SQL implementor can
> either use UPPERCASE (as I believe Oracle and DB2 do) or lowercase (as
> PG does) when it displays those identifiers.
>
> Joe

Again, I am at the same point I was at when searching and searching for 
information on the problem, and I am not sure what the SQL standard has to 
say about it: I do not, and I dont think that anyone else who is struggling 
to deal with the problem does, care about case-sensitivity. I am interested 
in case preservation of column names. I do, indeed, want identifiers treated 
in a case insensitive way, but I want the case PRESERVED in the table 
definitions and I want that case, as preserved, to be reflected in the field 
names as returned by the server to any client library that connects and 
initiates a query. 

Case-preservation is not the same as case-sensitivity; nor is 
case-normalization the same as case-insensitivity. What PostgreSQL is doing 
is converting any, and all, identifiers to a lower case and then matching 
those against the identifiers (as stored in the table definition) in a 
case-sensitive manner. It 'normalizes' the case of the identifiers so that it 
has a common internal representation; the desires of the programmer and 
database architect be damned. 

Referenced specification details:
From FIPS-127:
===
3. Delimited identifiers. In the previous ANSI SQL specification, it was not 
possible for an application to specify identifiers with spaces or other 
special symbols. Also, it was not possible to protect against future assaults 
on the name space for (identifier) by additions to the (reserved word) list. 
The new facility for (delimited identifier) allows a user to enclose all 
identifiers in double-quotation marks, thereby ensuring that the name defined 
or referenced may contain spaces or other special symbols and will not be 
impacted by future additions to the (reserved word) list.
===

From SQL-92/Sec. 5.2:
===
10)The  of a  is equivalent
to an  in which every letter that is a lower-
case letter is replaced by the equivalent upper-case letter
or letters. This treatment includes determination of equiva-
lence, representation in the Information and Definition Schemas,
representation in the diagnostics area, and similar uses.

11)The  of a  (with every
letter that is a lower-case letter replaced by the equivalent
upper-case letter or letters), treated as the repetition of
a  that specifies a  of SQL_TEXT, shall not be equal, according to
the comparison rules in Subclause 8.2, "",
to any  (with every letter that is a lower-case
letter replaced by the equivalent upper-case letter or letters),
treated as the repetition of a  that
specifies a  of SQL_TEXT.

12)Two s are equivalent if their s, considered as the repetition of a  that specifies a  of
SQL_TEXT, compare equally according to the comparison rules
in Subclause 8.2, "".

13)A  and a  are equiva-
lent if the  of the  (with
every letter that is a lower-case letter replaced by the equiva-
lent upper-case letter or letters) and the  of the  (with all occurrences of
 replaced by  and all occurrences of  replaced by ), considered as
the repetition of a  that specifies a
 of SQL_TEXT and an implementation-
defined collation that is sensitive to case, compare equally
according to the comparison rules in Subclause 8.2, "".

14)Two s are equivalent if their s (with all occurrences of  replaced
by  and all occurrences of 
replaced by ), considered as the repetition of a
 that specifies a  of SQL_TEXT and an implementation-defined collation
that is sensitive to case, compare

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

Re: [HACKERS] [SQL] Case Preservation disregarding case

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

18 matches

Site Navigation

Mail list logo

Footer information