Forwarded to -patches because of the attachment. -------- Eredeti üzenet -------- Tárgy: Re: [HACKERS] Behavior of GENERATED columns per SQL2003 Dátum: Tue, 08 May 2007 12:38:32 +0200 Feladó: Zoltan Boszormenyi <[EMAIL PROTECTED]> Címzett: Tom Lane <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] Hivatkozások: <[EMAIL PROTECTED]>
Tom Lane írta:
I've been studying the SQL spec in a bit more detail and I'm suddenly thinking that we've got the behavior all wrong in the current GENERATED/IDENTITY patch. In particular, it looks to me like we've been implementing GENERATED ALWAYS AS (expr) according to the rules that the spec in fact lays down only for GENERATED ALWAYS AS IDENTITY. You'd think the two constructs would be pretty closely related but the behaviors specified by the spec are light-years apart. If you look closely, a "generated column" in the meaning of section 4.14.8 is one that has GENERATED ALWAYS AS (expr), and identity columns are *not* in this class.
It looks to me like the behavior the spec intends for a generated column is actually that it can be implemented as a "virtual column" occupying no space on disk and instead computed on-the-fly when retrieved.
I think the keyword here is 'can be'. But having it stored gives a nice speedup in SELECTs vs. no speedup if you generate it on the fly.
Identity columns can have their values overridden by the user (it's a little harder if GENERATED ALWAYS, but still possible), and they don't change during an UPDATE unless specifically forced to.
Yes, I implemented it this way.
In contrast, generated columns cannot be overridden by assignment, and are recomputed from their base columns during updates.
I see, I incorrectly made OVERRIDING SYSTEM VALUE to have an effect on generated columns. Now I don't need to pass around the list of the modified fields from rewriteTargetList(), I simply have to blindly update all of them both in INSERT and UPDATE. Fixed. And now I don't have to invent something to discover what fields were modified by BEFORE triggers.
This realization also explains the following, otherwise rather strange, facts: * There is no GENERATED BY DEFAULT AS (expr) in the spec.
Yes, and because it already exists and called DEFAULT.
* GENERATED expressions are specifically disallowed from containing subselects, calling functions that access any SQL-data, or being nondeterministic; hence their values depend solely on the regular columns in the same row.
The sanity checks for the DEFAULT expression already handle subselect in PostgreSQL: db=# create table t1 (id float generated always as identity, t text,g text generated always as (case when t is null then '' else t end || (select max(id) from t1)); NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for serial column "t1.id"
ERROR: cannot use subquery in default expression But how do you check a function in general? Especially when it's not written in plpgsql? E.g. a C function can use SPI and SELECTs. Also, SQL:2003 doesn't allow any functions for the DEFAULT clause besides timestamp functions, e.g. NOW(). But PostgreSQL already allows non-IMMUTABLE functions used in DEFAULT clauses. Would you want to restrict it?
* While identity columns are updated (if needed) before execution of BEFORE triggers, generated columns are updated after BEFORE triggers; hence a BEFORE trigger can override the value in one case and not the other. (The current patch gets this wrong, btw.)
Where do you see that? Which version were you looking at?Identity columns are generated in rewriteTargetList(), way before any triggers.
Generated column are computed in ExecInsert() and ExecUpdate(), certainly after applying BEFORE triggers in both cases and before CheckConstraint(). There was one bug in the UPDATE case, though,as UPDATE loops if it couldn't do its job in one go because of serialization.
I fixed this.
* Generated columns are forcibly updated when their base columns change as a result of FK constraints (such as ON UPDATE CASCADE).
Isn't ExecUpdate() called on the referring table's row during such FK event? It seems yes to me: db=# create table t1 (id serial generated always as identity primary key, t text generated always as (id || '_1'));NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for serial column "t1.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey" for table "t1"
CREATE TABLE db=# create table t2 (id serial generated always as identity primary key, id_t1 integer not null references t1(id) on update cascade, g text generated always as (id_t1 || '_1'));NOTICE: CREATE TABLE will create implicit sequence "t2_id_seq" for serial column "t2.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey" for table "t2"
CREATE TABLE db=# insert into t1 values (default, 'a1'); INSERT 0 1 db=# insert into t1 values (default, 'b2'); INSERT 0 1 db=# insert into t1 values (default, 'c3'); INSERT 0 1 db=# insert into t1 values (default, 'd4'); INSERT 0 1 db=# select * from t1;id | t ----+-----
1 | 1_1 2 | 2_1 3 | 3_1 4 | 4_1 (4 rows) db=# insert into t2 values (default, 2); INSERT 0 1 db=# select * from t2;id | id_t1 | g ----+-------+-----
1 | 2 | 2_1 (1 row) db=# update t1 set id = 77 where id = 2; UPDATE 1 db=# select * from t1;id | t ----+------
1 | 1_1 3 | 3_1 4 | 4_1 77 | 77_1 (<--- This was buggy, the generated value didn't change here,as an oversight during the rewrite to make the generation behaviour a property of the DEFAULT clause. Now fixed.)
(4 rows) zozo=# select * from t2;id | id_t1 | g ----+-------+------
1 | 77 | 77_1 (<---- It worked this way before, too.) (1 row) Anyway, a new patch is necessary it seems, so it should apply cleanly to new CVS and have the above bugs fixed. Attached.
It looks to me like a BEFORE trigger is actually the only place that can (transiently) see values of a generated column that are different from the result of applying the generation expression on the rest of the row. It's unclear whether that's intentional or an oversight.
I thought it was intentional. BEFORE triggers can change the base columns before storing them, but the generated columns should be consistent with their generation expressions and their base columns after executing the BEFORE triggers. If the triggers should see generated values then another recomputation is necessary to refresh the generated columns AND to make them unchangable by the triggers. It seems to be a waste of cycles to me. It should be documented, though.
Is anyone familiar with a database that implements SQL-spec generated columns? Do they actually store the columns?
David Fuhry answered this question with the advertised intention.
regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Geschwinde & Schönig GmbH http://www.postgresql.at/ -- ---------------------------------- Zoltán Böszörményi Cybertec Geschwinde & Schönig GmbH http://www.postgresql.at/
Description: Unix tar archive
---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match