Re: [HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-17 Thread Tom Lane
I wrote:
 Anybody have DB2, or something else that might be thought to be pretty
 close to spec-compliant?

Remarkably enough, the DB2 10.1 manuals at www.ibm.com say that it
doesn't support ON UPDATE SET NULL or ON UPDATE SET DEFAULT.  I'm
disappointed in them :-(.  But anyway it seems that we'll not get
that much guidance from looking at other SQL implementations, and
what precedents there are suggest that people are using the set-all-
the-columns interpretation.

After reflection it seems clear to me that set-all-the-columns is
in fact an improvement for the SET DEFAULT case, regardless of match
style.  If we set only some of them, you get a mishmash of old and
new column values which is rather unlikely to match any row of the
referenced table.  If we always set all of them, then (at least for
constant default values) only one fallback entry is required in
the referenced table.  This can be seen in my example script upthread,
where I had to make a bogus referenceable entry 11, 0 to prevent
an RI failure on the MATCH SIMPLE update.  Having just the one fallback
entry 0, 0 definitely seems saner from an application standpoint.

I'm less sold on set-all-the-columns for the MATCH SIMPLE SET NULL
case.  In this match style, setting any referencing column to null
is sufficient to prevent an RI failure, and it could be argued that
zapping all of them discards data that might be useful.  But it does
have the advantage of predictability.

From an implementation standpoint, set-all-the-columns is definitely
easier to deal with: we won't need ri_OneKeyEqual at all any more,
and RI_FKey_setnull_upd no longer has the problem of having to deal
with variant plans depending on which columns it needs to zap.
So I'm attracted to it on that basis, but I don't want to let
implementation concerns drive the decision.

On balance I think we ought to switch to set-all-the-columns, though
only in 9.3+ --- a back-patched behavioral change doesn't seem like a
good idea.

Any objections, or anyone want to do more research before we decide?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-17 Thread Tom Lane
I wrote:
 On balance I think we ought to switch to set-all-the-columns, though
 only in 9.3+ --- a back-patched behavioral change doesn't seem like a
 good idea.

And here is a draft patch for that.  I was interested to find that the
documentation already claims that all columns are set in the relevant
cases (so the docs changes here are just wordsmithing and
clarification).

regards, tom lane

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index ea840fb8468f7d1d65b572d9880d74dd0178e143..013dc7c4dacdc56bbdfd001f6a0b615ecbead84b 100644
*** a/doc/src/sgml/ddl.sgml
--- b/doc/src/sgml/ddl.sgml
*** CREATE TABLE t1 (
*** 735,741 
 /para
  
 para
! A table can contain more than one foreign key constraint.  This is
  used to implement many-to-many relationships between tables.  Say
  you have tables about products and orders, but now you want to
  allow one order to contain possibly many products (which the
--- 735,741 
 /para
  
 para
! A table can have more than one foreign key constraint.  This is
  used to implement many-to-many relationships between tables.  Say
  you have tables about products and orders, but now you want to
  allow one order to contain possibly many products (which the
*** CREATE TABLE order_items (
*** 827,837 
  row(s) referencing it should be automatically deleted as well.
  There are two other options:
  literalSET NULL/literal and literalSET DEFAULT/literal.
! These cause the referencing columns to be set to nulls or default
  values, respectively, when the referenced row is deleted.
  Note that these do not excuse you from observing any constraints.
  For example, if an action specifies literalSET DEFAULT/literal
! but the default value would not satisfy the foreign key, the
  operation will fail.
 /para
  
--- 827,838 
  row(s) referencing it should be automatically deleted as well.
  There are two other options:
  literalSET NULL/literal and literalSET DEFAULT/literal.
! These cause the referencing column(s) in the referencing row(s)
! to be set to nulls or their default
  values, respectively, when the referenced row is deleted.
  Note that these do not excuse you from observing any constraints.
  For example, if an action specifies literalSET DEFAULT/literal
! but the default value would not satisfy the foreign key constraint, the
  operation will fail.
 /para
  
*** CREATE TABLE order_items (
*** 839,851 
  Analogous to literalON DELETE/literal there is also
  literalON UPDATE/literal which is invoked when a referenced
  column is changed (updated).  The possible actions are the same.
 /para
  
 para
  Since a commandDELETE/command of a row from the referenced table
  or an commandUPDATE/command of a referenced column will require
  a scan of the referencing table for rows matching the old value, it
! is often a good idea to index the referencing columns.  Because this
  is not always needed, and there are many choices available on how
  to index, declaration of a foreign key constraint does not
  automatically create an index on the referencing columns.
--- 840,869 
  Analogous to literalON DELETE/literal there is also
  literalON UPDATE/literal which is invoked when a referenced
  column is changed (updated).  The possible actions are the same.
+ In this case, literalCASCADE/ means that the updated values of the
+ referenced column(s) should be copied into the referencing row(s).
 /para
  
 para
+ Normally, a referencing row need not satisfy the foreign key constraint
+ if any of its referencing columns are null.  If literalMATCH FULL/
+ is added to the foreign key declaration, a referencing row escapes
+ satisfying the constraint only if all its referencing columns are null
+ (so a mix of null and non-null values is guaranteed to fail a
+ literalMATCH FULL/ constraint).  If you don't want referencing rows
+ to be able to avoid satisfying the foreign key constraint, declare the
+ referencing column(s) as literalNOT NULL/.
+/para
+ 
+para
+ A foreign key must reference columns that either are a primary key or
+ form a unique constraint.  This means that the referenced columns always
+ have an index (the one underlying the primary key or unique constraint);
+ so checks on whether a referencing row has a match will be efficient.
  Since a commandDELETE/command of a row from the referenced table
  or an commandUPDATE/command of a referenced column will require
  a scan of the referencing table for rows matching the old value, it
! is often a good idea to index the referencing columns too.  Because this
  is not always needed, and there are many choices available on how
  to index, declaration of a 

[HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-16 Thread Tom Lane
While looking at Vik Reykja's pending patch to improve the FK triggers
by skipping processing when a NULL column didn't change, I started to
wonder whether that really had no user-visible semantic effect.
In particular, in ON UPDATE SET NULL/SET DEFAULT cases, it seemed like
this could change the set of referencing columns that get set to NULL
or to their defaults.  So the next question was which behavior actually
conforms to the SQL standard, and the answer to that is ... disturbing.

The code in ri_triggers.c was written against SQL92's definition of ON
UPDATE SET NULL/SET DEFAULT, which is (ignoring the MATCH PARTIAL case,
which we don't implement):

 6) If an update rule is specified and a non-null value of a ref-
erenced column in the referenced table is updated to a value
that is distinct from the current value of that column, then

a) If match type is not specified or if FULL is specified,
  then

 ii) If the update rule specifies SET NULL, then

 Case:

 1) If match type is not specified, then in all matching
   rows the referencing column that corresponds with the
   referenced column is set to the null value.

 2) If match type specifies FULL, then in all matching
   rows each referencing column is set to the null value.

iii) If the update rule specifies SET DEFAULT, then in all
 matching rows the referencing column that corresponds with
 the referenced column is set to the default value specified
 in the General Rules of Subclause 11.5, default clause.

Note that only in the MATCH FULL + SET NULL case does it say to set
*all* the referencing columns in each matching row.  Otherwise, you are
only supposed to change columns that correspond to referenced columns
that were changed.  It's notable that SET NULL and SET DEFAULT have
different behaviors here.

On the other hand, in SQL:2008 I find (some boilerplate text omitted):

10) If a non-null value of a referenced column RC in the
referenced table is updated to a value that is distinct from the
current value of RC, then, for every member F of the subtable
family of the referencing table:

Case:

  a) If M specifies SIMPLE or FULL, then

Case:

ii) If UR specifies SET NULL, then

  Case:

1) If M specifies SIMPLE, then each matching row
MR in F is paired with the candidate replacement
row NMR, formed by copying MR and setting each
referencing column in the copy to the null
value. MR is identified for replacement by NMR
in F.

2) If M specifies FULL, then each matching row
MR in F is paired with the candidate replacement
row NMR, formed by copying MR and setting each
referencing column in the copy to the null
value. MR is identified for replacement by NMR
in F.

iii) If UR specifies SET DEFAULT, then each matching row
MR in F is paired with the candidate replacement row
NMR, formed by copying MR and setting each referencing
column in the copy to the default value specified in the
General Rules of Subclause 11.5, default clause. MR
is identified for replacement by NMR in F.

So far as I can see, this says to set *all* referencing columns to
nulls or their defaults, in all four cases, whether the corresponding
referenced column was one that changed or not.  This is very clearly
different from what SQL92 says.  It's also rather curious that they
distinguish two cases for SET NULL when the texts are exactly alike.

It looks to me like this change occurred in SQL:2003, although
SQL:1999's version of the text is such badly written pseudo-mathematical
gobbledygook that it's a bit hard to tell which behavior they meant.
However, neither of those specs list any change in referential
constraint behavior as being an acknowledged incompatibility with the
prior standard.  Have the SQL committee simply failed to notice that in
whacking this text around they changed the meaning?  Which behavior is
actually implemented by other RDBMSes?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-16 Thread Tom Lane
I wrote:
 Have the SQL committee simply failed to notice that in
 whacking this text around they changed the meaning?  Which behavior is
 actually implemented by other RDBMSes?

If anyone is up for actually trying this, here is a script to test the
behavior in question:

create table pp (f1 int, f2 int, primary key (f1,f2));
create table cmssn (f1 int, f2 int,
foreign key (f1,f2) references pp(f1,f2) on update set null);
create table cmfsn (f1 int, f2 int,
foreign key (f1,f2) references pp(f1,f2) match full on update set null);
create table cmssd (f1 int default 0, f2 int default 0,
foreign key (f1,f2) references pp(f1,f2) on update set default);
create table cmfsd (f1 int default 0, f2 int default 0,
foreign key (f1,f2) references pp(f1,f2) match full on update set default);

insert into pp values (11, 22);
insert into pp values (11, 0);
insert into pp values (0, 0);

insert into cmssn values (11, 22);
insert into cmfsn values (11, 22);
insert into cmssd values (11, 22);
insert into cmfsd values (11, 22);

update pp set f2 = f2 + 1 where f2  0;

select * from cmssn;
select * from cmfsn;
select * from cmssd;
select * from cmfsd;

In Postgres this produces

 f1 | f2 
+
 11 |   
(1 row)

 f1 | f2 
+
|   
(1 row)

 f1 | f2 
+
 11 |  0
(1 row)

 f1 | f2 
+
  0 |  0
(1 row)

which shows that we are self-consistent but not actually compliant with
either old or new wordings of the spec :-(

The only other SQL DB I have handy is mysql 5.5.24, which shows up
pretty unimpressively: it gives a syntax error on the cmssd definition,
which would be all right because the manual says the innodb storage
engine doesn't support SET DEFAULT, except it *doesn't* give a syntax
error for creating cmfsd.  Then, the update fails claiming that cmfsn's
FK constraint is violated, so they evidently don't implement that case
correctly.  After removing cmfsn, the update fails again claiming that
cmfsd's FK constraint is violated, so yeah they are telling the truth
when they say SET DEFAULT doesn't work.  The upshot is that only the
MATCH SIMPLE SET NULL case works at all in current mysql, and that
produces the result

mysql select * from cmssn;
+--+--+
| f1   | f2   |
+--+--+
| NULL | NULL |
+--+--+
1 row in set (0.00 sec)

so they are nulling all the referencing columns in this case, which
matches the more recent specs but is clearly contrary to SQL92.

Anybody have DB2, or something else that might be thought to be pretty
close to spec-compliant?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-16 Thread Dean Rasheed
On 16 June 2012 21:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Anybody have DB2, or something else that might be thought to be pretty
 close to spec-compliant?


I have an Oracle DB, but they're not exactly known for spec
compliance. In fact they dodge this entire issue by not supporting ON
UPDATE actions at all :-)

Regards,
Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

2012-06-16 Thread Dickson S. Guedes
2012/6/16 Tom Lane t...@sss.pgh.pa.us:
 I wrote:
 Have the SQL committee simply failed to notice that in
 whacking this text around they changed the meaning?  Which behavior is
 actually implemented by other RDBMSes?

 If anyone is up for actually trying this, here is a script to test the
 behavior in question:

 create table pp (f1 int, f2 int, primary key (f1,f2));
 create table cmssn (f1 int, f2 int,
    foreign key (f1,f2) references pp(f1,f2) on update set null);
 create table cmfsn (f1 int, f2 int,
    foreign key (f1,f2) references pp(f1,f2) match full on update set null);
 create table cmssd (f1 int default 0, f2 int default 0,
    foreign key (f1,f2) references pp(f1,f2) on update set default);
 create table cmfsd (f1 int default 0, f2 int default 0,
    foreign key (f1,f2) references pp(f1,f2) match full on update set default);

 insert into pp values (11, 22);
 insert into pp values (11, 0);
 insert into pp values (0, 0);

 insert into cmssn values (11, 22);
 insert into cmfsn values (11, 22);
 insert into cmssd values (11, 22);
 insert into cmfsd values (11, 22);

 update pp set f2 = f2 + 1 where f2  0;

 select * from cmssn;
 select * from cmfsn;
 select * from cmssd;
 select * from cmfsd;

 In Postgres this produces

  f1 | f2
 +
  11 |
 (1 row)

  f1 | f2
 +
    |
 (1 row)

  f1 | f2
 +
  11 |  0
 (1 row)

  f1 | f2
 +
  0 |  0
 (1 row)

 which shows that we are self-consistent but not actually compliant with
 either old or new wordings of the spec :-(

 The only other SQL DB I have handy is mysql 5.5.24, which shows up
 pretty unimpressively: it gives a syntax error on the cmssd definition,
 which would be all right because the manual says the innodb storage
 engine doesn't support SET DEFAULT, except it *doesn't* give a syntax
 error for creating cmfsd.  Then, the update fails claiming that cmfsn's
 FK constraint is violated, so they evidently don't implement that case
 correctly.  After removing cmfsn, the update fails again claiming that
 cmfsd's FK constraint is violated, so yeah they are telling the truth
 when they say SET DEFAULT doesn't work.  The upshot is that only the
 MATCH SIMPLE SET NULL case works at all in current mysql, and that
 produces the result

 mysql select * from cmssn;
 +--+--+
 | f1   | f2   |
 +--+--+
 | NULL | NULL |
 +--+--+
 1 row in set (0.00 sec)

 so they are nulling all the referencing columns in this case, which
 matches the more recent specs but is clearly contrary to SQL92.

 Anybody have DB2, or something else that might be thought to be pretty
 close to spec-compliant?


I tryed in a MS SQL Server 2012 via SQLFiddle [1]. I could only create
'cmssn' and 'cmssd' tables because as I can see in [2] MS SQL Server
2012 doesn't supports MATCH syntax.

The result was:

select * from cmssn;
F1 |  F2
(null)   |  (null)

select * from cmssd;
F1 |F2
0   |   0

The test is in [3], and there you can try other RDBMS, just create the
schema on the left panel and testing selects on the right.

[1] http://sqlfiddle.com
[2] http://msdn.microsoft.com/en-us/library/ms174979.aspx
[3] http://sqlfiddle.com/#!6/ac7db/1

Regards.
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers