Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-11-14 Thread Bruce Momjian
beau hargis wrote:
> Having installed DB2 Enterprise today and taking it for a spin, it does 
> indeed 
> behave in a similar manner. However, after reading through both 
> specifications, it seems that DB2 follows more of the spec than PostgreSQL. 
> The specifications state that for purpose of comparing identifiers, both 
> shall be converted to upper-case. DB2 displays all identifiers in upper-case 
> whereas PostgreSQL displays all identifiers in lower-case. This alone would 
> be a deviation from the specification. 

True.  We lowercase because historically we have, and because
all-upper-case is hard to read.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread Tom Lane
"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
> At Teradata, we certainly interpreted the spec to allow case-preserving,
> but case-insensitive, identifiers.

Really?

As I see it, the controlling parts of the SQL spec are (SQL99 sec 5.2)

26) A  and a  are
equivalent if the  of the 
(with every letter that is a lower-case letter replaced by the
corresponding upper-case letter or letters) and the  of the  (with all
occurrences of  replaced by  and all
occurrences of  replaced by ),
considered as the repetition of a 
that specifies a  of SQL_IDENTIFIER
and an implementation-defined collation that is sensitive to
case, compare equally according to the comparison rules in
Subclause 8.2, "".

27) Two s are equivalent if their s, considered as the repetition of a  that specifies a 
of SQL_IDENTIFIER and an implementation-defined collation
that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, "".

Note well the "sensitive to case" bits there.  Now consider

CREATE TABLE tab (
"foobar" int,
"FooBar" timestamp,
"FOOBAR" varchar(3)
);

We can *not* reject this as containing duplicate column names, else we
have certainly violated rule 27.  Now what will you do with

SELECT fooBar FROM tab;

?  The spec is unquestionably on the side of "you selected the varchar
column"; historical Postgres practice is on the side of "you selected
the int column".  AFAICS a case-insensitive approach would have to
fail with some "I can't identify which column you mean" error.  I am
interested to see where you find support for that in the spec...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread Tom Lane
beau hargis <[EMAIL PROTECTED]> writes:
> Considering the differences that already exist between database systems and 
> their varying compliance with SQL and the various extensions that have been 
> created, I do not consider that the preservation of case for identifiers 
> would violate any SQL standard.

That's not how I read the spec.  It is true that we are not 100% spec
compliant, but that isn't a good argument for moving further away from
spec.  Not to mention breaking backwards compatibility with our
historical behavior.  The change you propose would fix your application
at the cost of breaking other people's applications.   Perhaps you
should consider fixing your app instead.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [SQL] Case Preservation disregarding case sensitivity?

2006-10-30 Thread beau hargis
On Friday 27 October 2006 19:38, Joe wrote:
> Hi Beau,
>
> On Fri, 2006-10-27 at 16:23 -0700, beau hargis wrote:
> > I am hoping that there is an easy way to obtain case-preservation with
> > case-insensitivity, or at the very least, case-preservation and complete
> > case-sensitivity, or case-preservation and a consistant case-conversion
> > strategy.
> >
> > The case of the column names need to be preserved because that is the way
> > the schema is designed and most importantly (VERY, VERY IMPORTANT),
> > column names are used in apps as hash values, or as named references
> > which are case sensitive and as such need to be delivered to the client
> > in exactly in the manner specified at the time of table creation.
> >
>
> I went through the same issue in my conversion from MySQL to Postgres
> and (since I had a small application) I ended up changing up all my
> tables and columns "UserProfile" to user_profile.
>
> I'm afraid however, that it's MySQL that is the odd man out.  I haven't
> researched this completely but I believe PG follows either the FIPS-127
> or SQL-92 standard with respect to what are called "delimited
> identifiers".  Basically, this says if you want case sensitivity in
> identifier names, you have to use double quotes wherever you refer to
> the identifier.  Without the double quotes, the SQL implementor can
> either use UPPERCASE (as I believe Oracle and DB2 do) or lowercase (as
> PG does) when it displays those identifiers.
>
> Joe

Again, I am at the same point I was at when searching and searching for 
information on the problem, and I am not sure what the SQL standard has to 
say about it: I do not, and I dont think that anyone else who is struggling 
to deal with the problem does, care about case-sensitivity. I am interested 
in case preservation of column names. I do, indeed, want identifiers treated 
in a case insensitive way, but I want the case PRESERVED in the table 
definitions and I want that case, as preserved, to be reflected in the field 
names as returned by the server to any client library that connects and 
initiates a query. 

Case-preservation is not the same as case-sensitivity; nor is 
case-normalization the same as case-insensitivity. What PostgreSQL is doing 
is converting any, and all, identifiers to a lower case and then matching 
those against the identifiers (as stored in the table definition) in a 
case-sensitive manner. It 'normalizes' the case of the identifiers so that it 
has a common internal representation; the desires of the programmer and 
database architect be damned. 

Referenced specification details:
From FIPS-127:
===
3. Delimited identifiers. In the previous ANSI SQL specification, it was not 
possible for an application to specify identifiers with spaces or other 
special symbols. Also, it was not possible to protect against future assaults 
on the name space for (identifier) by additions to the (reserved word) list. 
The new facility for (delimited identifier) allows a user to enclose all 
identifiers in double-quotation marks, thereby ensuring that the name defined 
or referenced may contain spaces or other special symbols and will not be 
impacted by future additions to the (reserved word) list.
===

From SQL-92/Sec. 5.2:
===
10)The  of a  is equivalent
to an  in which every letter that is a lower-
case letter is replaced by the equivalent upper-case letter
or letters. This treatment includes determination of equiva-
lence, representation in the Information and Definition Schemas,
representation in the diagnostics area, and similar uses.

11)The  of a  (with every
letter that is a lower-case letter replaced by the equivalent
upper-case letter or letters), treated as the repetition of
a  that specifies a  of SQL_TEXT, shall not be equal, according to
the comparison rules in Subclause 8.2, "",
to any  (with every letter that is a lower-case
letter replaced by the equivalent upper-case letter or letters),
treated as the repetition of a  that
specifies a  of SQL_TEXT.

12)Two s are equivalent if their s, considered as the repetition of a  that specifies a  of
SQL_TEXT, compare equally according to the comparison rules
in Subclause 8.2, "".

13)A  and a  are equiva-
lent if the  of the  (with
every letter that is a lower-case letter replaced by the equiva-
lent upper-case letter or letters) and the  of the  (with all occurrences of
 replaced by  and all occurrences of  replaced by ), considered as
the repetition of a  that specifies a
 of SQL_TEXT and an implementation-
defined collation that is sensitive to case, compare equally
according to the comparison rules in Subclause 8.2, "".

14)Two s are equivalent if their s (with all occurrences of  replaced
by  and all occurrences of 
replaced by ), considered as the repetition of a
 that specifies a  of SQL_TEXT and an implementation-defined collation
that is sensitive to case, compare