Forwarded to -patches because of the attachment.

-------- Eredeti üzenet --------
Tárgy:  Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Dátum:  Tue, 08 May 2007 12:38:32 +0200
Feladó:         Zoltan Boszormenyi <[EMAIL PROTECTED]>
Címzett:        Tom Lane <[EMAIL PROTECTED]>
Hivatkozások:   <[EMAIL PROTECTED]>

Tom Lane írta:
I've been studying the SQL spec in a bit more detail and I'm suddenly
thinking that we've got the behavior all wrong in the current
GENERATED/IDENTITY patch.  In particular, it looks to me like we've
been implementing GENERATED ALWAYS AS (expr) according to the rules
that the spec in fact lays down only for GENERATED ALWAYS AS IDENTITY.
You'd think the two constructs would be pretty closely related but
the behaviors specified by the spec are light-years apart.  If you
look closely, a "generated column" in the meaning of section 4.14.8
is one that has GENERATED ALWAYS AS (expr), and identity columns are
*not* in this class.


It looks to me like the behavior the spec intends for a generated column
is actually that it can be implemented as a "virtual column" occupying
no space on disk and instead computed on-the-fly when retrieved.

I think the keyword here is 'can be'. But having it stored gives a nice
speedup in SELECTs vs. no speedup if you generate it on the fly.

Identity columns can have their values overridden by the
user (it's a little harder if GENERATED ALWAYS, but still possible),
and they don't change during an UPDATE unless specifically forced to.

Yes, I implemented it this way.

In contrast, generated columns cannot be overridden by
assignment, and are recomputed from their base columns during updates.

I see, I incorrectly made OVERRIDING SYSTEM VALUE
to have an effect on generated columns. Now I don't need
to pass around the list of the modified fields from
rewriteTargetList(), I simply have to blindly update all of them
both in INSERT and UPDATE. Fixed. And now I don't have to
invent something to discover what fields were modified by
BEFORE triggers.

This realization also explains the following, otherwise rather strange,

* There is no GENERATED BY DEFAULT AS (expr) in the spec.

Yes, and because it already exists and called DEFAULT.

* GENERATED expressions are specifically disallowed from containing
  subselects, calling functions that access any SQL-data, or being
  nondeterministic; hence their values depend solely on the regular
  columns in the same row.

The sanity checks for the DEFAULT expression already
handle subselect in PostgreSQL:

db=# create table t1 (id float generated always as identity, t text,
g text generated always as (case when t is null then '' else t end || (select max(id) from t1)); NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for serial column ""
ERROR:  cannot use subquery in default expression

But how do you check a function in general?
Especially when it's not written in plpgsql?
E.g. a C function can use SPI and SELECTs.

Also, SQL:2003 doesn't allow any functions for the DEFAULT clause
besides timestamp functions, e.g. NOW().
But PostgreSQL already allows non-IMMUTABLE functions used in
DEFAULT clauses. Would you want to restrict it?

* While identity columns are updated (if needed) before execution of
  BEFORE triggers, generated columns are updated after BEFORE triggers;
  hence a BEFORE trigger can override the value in one case and not the
  other.  (The current patch gets this wrong, btw.)

Where do you see that? Which version were you looking at?
Identity columns are generated in rewriteTargetList(), way before any triggers.
Generated column are computed in ExecInsert() and ExecUpdate(),
certainly after applying BEFORE triggers in both cases and
before CheckConstraint(). There was one bug in the UPDATE case, though,
as UPDATE loops if it couldn't do its job in one go because of serialization.
I fixed this.

* Generated columns are forcibly updated when their base columns change
  as a result of FK constraints (such as ON UPDATE CASCADE).

Isn't ExecUpdate() called on the referring table's row during such FK event?
It seems yes to me:

db=# create table t1 (id serial generated always as identity primary key,
t text generated always as (id || '_1'));
NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for serial column "" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
db=# create table t2 (id serial generated always as identity primary key,
id_t1 integer not null references t1(id) on update cascade,
g text generated always as (id_t1 || '_1'));
NOTICE: CREATE TABLE will create implicit sequence "t2_id_seq" for serial column "" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for table "t2"
db=# insert into t1 values (default, 'a1');
db=# insert into t1 values (default, 'b2');
db=# insert into t1 values (default, 'c3');
db=# insert into t1 values (default, 'd4');
db=# select * from t1;
id | t ----+-----
 1 | 1_1
 2 | 2_1
 3 | 3_1
 4 | 4_1
(4 rows)

db=# insert into t2 values (default, 2);
db=# select * from t2;
id | id_t1 | g ----+-------+-----
 1 |     2 | 2_1
(1 row)

db=# update t1 set id = 77 where id = 2;
db=# select * from t1;
id | t ----+------
 1 | 1_1
 3 | 3_1
 4 | 4_1
77 | 77_1    (<--- This was buggy, the generated value didn't change here,
as an oversight during the rewrite to make the generation behaviour a property of the DEFAULT clause. Now fixed.)
(4 rows)

zozo=# select * from t2;
id | id_t1 | g ----+-------+------
 1 |    77 | 77_1 (<---- It worked this way before, too.)
(1 row)

Anyway, a new patch is necessary it seems,
so it should apply cleanly to new CVS and have the
above bugs fixed. Attached.

It looks to me like a BEFORE trigger is actually the only place that can
(transiently) see values of a generated column that are different from
the result of applying the generation expression on the rest of the row.
It's unclear whether that's intentional or an oversight.

I thought it was intentional. BEFORE triggers can change the base columns
before storing them, but the generated columns should be consistent
with their generation expressions and their base columns after executing
the BEFORE triggers. If the triggers should see generated values
then another recomputation is necessary to refresh the generated
columns AND to make them unchangable by the triggers.
It seems to be a waste of cycles to me. It should be documented, though.

Is anyone familiar with a database that implements SQL-spec generated
columns?  Do they actually store the columns?

David Fuhry answered this question with the advertised intention.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?


Best regards,
Zoltán Böszörményi

Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH

Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH

Attachment: psql-serial-45.diff.gz
Description: Unix tar archive

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not

Reply via email to