[SQL] Inheritance, plpgsql inserting, and revisions

Phil Endecott Wed, 16 Jun 2004 08:21:58 -0700

Dear Postgresql Experts,

A few days ago I posted a message asking about INSERTing a row based on a RECORD 
variable in a plpgsql function.  I haven't had any replies - maybe it got lost in the 
weekend's spam - so I thought I'd post again giving some more background to what I'm 
trying to do.


1. Keeping old revisions of data.

Say I have a basic table:

create table T (
  id    integer  primary key,
  x     text,
  ...
);

I can, of course, insert new rows, update existing rows, and delete rows.  But once 
I've made a change in one of those ways I have lost the previous information.  There 
is no way to "undo" or "revert" those changes.  (I'm not talking about transactions 
here.  Think longer-term data retention, more like UNDO in a wordprocessor or CVS 
history.)  In my application I want to be able to undo changes and/or to see the state 
of the data at some point in the past.

To achieve this, I never update or delete the table.  I always insert new rows.   Each 
row has an "edit date" field.  The current state of the data is found by selecting 
rows with the most recent edit date for each id.  Old rows can periodically be purged 
to keep the size of the database manageable, if necessary.  So I have this table:

create table T_d (
  id        integer,
  editdate  timestamp  default current_timestamp,
  x         text,
  ...
  primary key (id, editdate)
);

Then I have a view that gets just the current state:

create view T as
    select distinct on (id) id, x from T_d order by id, editdate desc;

When I select from T, I don't need to worry about the existence of this extra stuff, I 
can treat it like an "ordinary" table.  Using rules I can convert changes to T into 
inserts into T_d, again transparently:

create rule ins_T as on insert to T do instead
   insert into T_d (id, x) values (new.id, new.x);

create rule upd_T as on update to T do instead
   insert into T_d (id, x) values (new.id, new.x);

To support deletes I add an extra boolean field to T_d called deleted.  It defaults to 
false.  If the most recent revision of a row has deleted true, that row doesn't exist:

create table T_d (
  id        integer,
  editdate  timestamp default current_timestamp,
  deleted   boolean   default false,
  x         text,
  ...
  primary key (id, editdate)
);

The view that deals with this is:

create view T as
  select id, x from (
    select distinct on (id) * from T_d order by id, editdate desc
  ) as q where not deleted;

I can then create a rule for delete that inserts a row with deleted true:

create rule del_T as on delete to T do instead
   insert into T_d (id, deleted, x) values (old.id, true, old.x);

This all works well.  It should be applicable to any simple database design, and I'd 
recommend the approach to anyone faced with similar requirements.


2. Inheritance

In my application I'm also using inheritance.  So as well as the "base" table T that 
I've described above, I have "derived" tables M and N:

create table M_d (
  mm  text
) inherits (T_d);

create table N_d (
  nn integer
) inherits (T_d);

If I define similar views and rules, I can get the same "history recording" for these 
derived tables as well... almost.

It's fine when I access M and N themselves.  The problem is if I try to access data in 
M or N via the base table T.  Having inserted a row into M, I should be able to change 
the shared field x in that row by doing an update on the base table T.  With "real" 
tables this works fine.  With my rules I "do instead insert" a new row into T.  But 
this row is in T, not in the derived table M.  As far as M is concerned, this new row 
is invisible (and if it were visible, it would not have any values for M's local field 
mm).  A similar problem exists for deletes.

This is the problem that I have been trying to solve over the last few days.  So far, 
I have got this far:

- When I want to do an update or delete on the base table, I use this to find out what 
derived table the affected row actually belongs to:

select pc.relname from T_d td join pg_class pc on (td.tableoid=pc.oid)

- I then need to insert a new row into this table, based on the values from the most 
recent row with the selected id and any changes for an update, or with deleted set for 
a delete.

I don't think this can be done directly within a CREATE RULE statement's commands, so 
I've been trying to write a plpgsql function to do it.  I presume that I can then call 
the function from the rule, though I have yet to try this.

Considering just the delete for the now, this is what I have managed so far.  This 
gets called with the id of the row to be deleted:

create function del ( integer ) returns void as '
  declare
    del_id alias for $1;
    table text;
    r record;
  begin
    select pc.relname into table from T_d td join pg_class pc
      on (td.tableoid=pc.oid) where td.id=del_id;
    if not found then
      raise exception ''object % not found'', del_id;
    end if;
    -- following "loop" executes once.
    for r in execute ''select * from '' || table {without the _d}
                  || '' where id='' || del_id
        loop
      r.deleted := t;
      r.editdate := current_timestamp;
      insert into table r;  !!! Nope !!!
      exit;
    end loop;
  end;
' language plpgsql;


As you can see, I have got as far as reading the row from the derived table (M or N) 
into a record variable r, and have modified it.  Now I want to insert this value back 
into the table.  The syntax I was hoping to find is something like INSERT r INTO 
table, but it doesn't seem to exist.  So maybe I have to construct an explicit INSERT 
(...) VALUES (...) statement as a string.  That's OK, but is there an "introspection 
mechanism" that lets me iterate over the fields of a record, getting their names?

I feel I'm pretty close to having a neat solution to an interesting problem, but am 
stuck with this bit of plpgsql syntax.  Can anyone offer any suggestions?

Thank you for reading this far!

Regards,

--Phil Endecott.

p.s. My spellcheker wants to turn plpgsql into "popsicle"!  What a great idea on a 
sunny afternoon like today...

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

[SQL] Inheritance, plpgsql inserting, and revisions

Reply via email to