[SQL] Versionning (was: Whole-row comparison)

2007-06-01 Thread christian.roche.ext

Hi all,

first let me tell you that this nice "whole-row comparison" feature of
postgres 8.2 allowed me to create a versionned database model very
neatly.  The SQL statement that inserts in the destination table only
the one rows that are new or that have changed since last time is very
simply written:

INSERT INTO bsc_table
SELECT nextval('version_seq'), 
FROM load.bsc_table AS ld LEFT JOIN bsc_view AS nt USING (obj_id)
WHERE nt.obj_id IS NULL OR row(nt.*) <> row(ld.*);

bsc_view is a view that returns the latest version of each object in the
bsc table:

CREATE VIEW bsc_view AS
SELECT  
FROM bsc_table
WHERE (obj_id, ver_id) IN (SELECT obj_id, max(ver_id) FROM bsc_table
GROUP BY obj_id);

This is all nice as long as I only want to access the very last version
of the table.  However what should be done if I now need to access an
earlier version ?  The most elegant way would be to pass a ver_id
parameter to bsc_view, something like :

CREATE VIEW bsc_view(int) AS
SELECT  
FROM bsc_table
WHERE (obj_id, ver_id) IN
  (SELECT obj_id, max(ver_id)
 FROM bsc_table
   WHERE ver_id <= $1
 GROUP BY obj_id));

However postgres doesn't allow parameters in views as far as I know.  I
guess I could create a function returning a set of rows, but then I
would lose most advantages of rewritten views, especially optimization,
right ?

I've contemplated reusing an awful hack from my Access era, namely using
a single-rowed table to store the parameter and joining the view on it.
The parameter would be updated before the view is called; this would
work but would definitely be ugly.  Can someone think of a better way to
do that ?

Thanks a lot,
Christian

-Original Message-
 
I'm trying to implement a "versionned" storage for a datawarehouse
system, meaning I have tables which store daily information about
objects and I would like to have a row inserted (i.e. a new version)
only if it differs from the most recent existing row.  For instance
instead of storing
 
versionattribute1attribute2
1xy
2xy
3xy
4zy
5zy
6zt
 
we would only keep the diffs :
 
versionattribute1attribute2
1xy
4zy
6zt

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SQL] Versionning (was: Whole-row comparison)

2007-06-01 Thread Andrew Sullivan
On Fri, Jun 01, 2007 at 08:07:46PM +0300, [EMAIL PROTECTED] wrote:
> I've contemplated reusing an awful hack from my Access era, namely using
> a single-rowed table to store the parameter and joining the view on it.
> The parameter would be updated before the view is called; this would
> work but would definitely be ugly.  Can someone think of a better way to
> do that ?

I sort of don't see how that hack would be any different from a SRF. 
You'd lose the planner benefits anyway, I think, because you'd have
to plan for the generic case where the data could be anything, no?

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Users never remark, "Wow, this software may be buggy and hard 
to use, but at least there is a lot of code underneath."
--Damien Katz

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] Versionning (was: Whole-row comparison)

2007-06-01 Thread christian.roche.ext
 
Hi Andrew,

I must admit I don't really understand what you say.  I don't know what
SRF stand for, and what you say about generic case is not clear to me,
sorry.

My idea is that using a parameter table allows me to keep using a view,
which is optimized for instance when used against a WHERE condition.

For example, I could write :

CREATE VIEW bsc_view AS
SELECT  
FROM bsc_table
WHERE (obj_id, ver_id) IN
  (SELECT obj_id, max(ver_id) FROM bsc_table, param_table
 WHERE ver_id <= param_table.ver_id
   GROUP BY obj_id));

and the following statement would be optimized:

UPDATE param_table SET ver_id = xxx;
SELECT * FROM bsc_view WHERE obj_id = yyy; 

which would not be the case would I have used a multi-row function.

Does this make sense ?

Thanks a lot,
Christian

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of ext Andrew Sullivan
Sent: Friday, June 01, 2007 17:47
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Versionning (was: Whole-row comparison)

On Fri, Jun 01, 2007 at 08:07:46PM +0300, [EMAIL PROTECTED]
wrote:
> I've contemplated reusing an awful hack from my Access era, namely 
> using a single-rowed table to store the parameter and joining the view
on it.
> The parameter would be updated before the view is called; this would 
> work but would definitely be ugly.  Can someone think of a better way 
> to do that ?

I sort of don't see how that hack would be any different from a SRF. 
You'd lose the planner benefits anyway, I think, because you'd have to
plan for the generic case where the data could be anything, no?

A


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[SQL] schema propagation

2007-06-01 Thread chester c young
Hi

has anyone done any work on comparing schemas?

I'm trying to automatically propagate changes in a master schema to
child schemas.  All schema changes will originate at master.  Currently
child schemas are in the same database, but in the future might be in
different databases (clusters?).

thanks!


  

Fussy? Opinionated? Impossible to please? Perfect.  Join Yahoo!'s user panel 
and lay it on us. http://surveylink.yahoo.com/gmrs/yahoo_panel_invite.asp?a=7 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Versionning (was: Whole-row comparison)

2007-06-01 Thread Andrew Sullivan
On Fri, Jun 01, 2007 at 09:07:10PM +0300, [EMAIL PROTECTED] wrote:
>  
> Hi Andrew,
> 
> I must admit I don't really understand what you say.  I don't know what
> SRF stand for, and what you say about generic case is not clear to me,
> sorry.

Sorry, it stands for set returning function.  I thought someone
upthread suggested that instead of a view.

> and the following statement would be optimized:
> 
> UPDATE param_table SET ver_id = xxx;
> SELECT * FROM bsc_view WHERE obj_id = yyy; 
> 
> which would not be the case would I have used a multi-row function.
> 
> Does this make sense ?

Yes, but I don't think it's true.  Because you change the value of
ver_id all the time, the actual result can't be collapsed to a
constant, so you end up having to execute the query with the
additional value, and you still have to plan that.  The same thing is
true of a function, which will have its plan prepared the first time
you execute it.  (I could be wrong about this; I suppose the only way
would be to try it.)

A
-- 
Andrew Sullivan  | [EMAIL PROTECTED]
However important originality may be in some fields, restraint and 
adherence to procedure emerge as the more significant virtues in a 
great many others.   --Alain de Botton

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings