Re: [SQL] Still struggling with history tables
O Ken Winter έγραψε στις Jan 17, 2006 :
> Friends ~
>
> I'm still trying to implement a solution to the requirement to keep a
> complete history of data changes to a "person" table. (See earlier
> correspondence below.) I'm trying for a variant of the architecture
> suggested by Richard Huxton (also below). In my variant, I have this
> "identifier" table, carrying the id and invariant info about each person:
>
> /*==*/
> /* Table: person_i */
> /*==*/
> create table person_i (
> idi BIGSERIALnot null,
> date_of_birthDATE null,
> constraint PK_person_i_key_1 primary key (idi)
> )
> ;
>
> And then I have this "history" table, carrying the time-varying info on
> which I want to keep a complete history:
>
> /*==*/
> /* Table: person_h */
> /*==*/
> create table person_h (
> idh INT8 not null,
> start_date DATE not null default 'now()',
> end_date DATE null,
> name VARCHAR(255) null,
> constraint PK_person_h_key_1 primary key (idh, start_date),
> constraint fk_reference_6 foreign key (idh)
>references person_i (idi)
> on delete restrict on update restrict
> )
> ;
>
> Triggers are in place on the "person_h" table so that when an app does an
> update, the current h record is expired (with its old data) and a new record
> (wuth the updated data)is inserted and made effective "now". What I'm now
> trying to build is this view:
>
> /*==*/
> /* View: person */
> /*==*/
> create view person as
> select idi, date_of_birth, start_date, end_date, name
> from person_i i, person_h h
> where i.idi = h.idh;
>
> I want to enable users (and apps) who need to deal only with current data to
> be able to treat "person" as a real table (i.e. to write to it as well as
> read from it). Specifically, the requirements are that when a user does:
>
> . Insert - The system inserts a record into the i table and the first
> record in the h table.
> . Select - The system returns attributes of i and h tables (not
> duplicating the identifier columns).
> . Update - The system allows updating of i attributes
> (update-in-place, not creating a new history record) and h attributes
> (creating a new history record).
> . Delete - The system deletes the i record and all of its h records.
>
> I'm stuck on how to implement the "insert" action, which I thought would be
> simple. The problem is this: The i table id is of type BIGSERIAL, i.e.
> sequence-assigned. I've tried writing the following rule to get both the i
> record and the first h record inserted:
>
> CREATE RULE ru AS
> ON INSERT TO person
> DO INSTEAD (
> INSERT INTO person_i DEFAULT VALUES;
> INSERT INTO person_h (idh) VALUES (NEW.idi)
> );
How about
INSERT INTO person_i (idi,dateofbirth)
VALUES(nextval('public.person_i_idi_seq'::text),new.dateofbirth);
INSERT INTO person_h(idh)
VALUES(currval('public.person_i_idi_seq'::text));
However, i have treated similar problems with two different approaches:
Generally it is very hard to distinguish between two kind of UPDATES:
a) UPDATEs that mean real data updates and they should be recorded
to the history system.
b) UPDATEs that are just false data entry, and they should mean
just plain correction UPDATES, with no recording.
This distinguishability is very important, otherwise someone would
define a way to store historic data of changes to the historic data
themselves, and so on.
1) Is the history data frequently needed? Are there a lot of apps
hitting these historic data? Do we need the freedom to query current live
data as well as past data in a uniform manner?
Then i just use one table, with endtimestamp is null meaning
this is a current (alive) record.
Then i write triggers to enforce interval wise integrity to the table,
(e.g. No records A,B exist with A<>B,A,B for the same person, so that
(A.starttimestamp,coalesce(A.endtimestamp,now())) overlaps with
(B.starttimestamp,coalesce(B.endtimestamp,now()))
This way i give people the ability to do what they want with the table.
The triggers do the job of enforcing integrity.
In this case what we mean as historic is
"what users define and input as historic".
Users are in charge here, not the DB.
2) If on the other hand, historic data are just a convinient way
of accessing history data, instead of going to find the backup of this
past day
in the computer room,
then I keep one and
[SQL] bug with if ... then ... clause in views
While trying to create some views I stumbled on some problem with using the if-then clause. Here is a simple example : CREATE OR REPLACE VIEW public.SomeView as select d.id, if (true) then d.DocNumber endif from public.Z_Documents as d; I get the following error : syntax error at or near "then" at character 72 I don't have a clue what is going on here. Any suggestions? Thanks in advance, Emil __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] bug with if ... then ... clause in views
On Jan 18, 2006, at 18:18 , Emil Rachovsky wrote: CREATE OR REPLACE VIEW public.SomeView as select d.id, if (true) then d.DocNumber endif from public.Z_Documents as d; I get the following error : syntax error at or near "then" at character 72 Well, one problem is that IF ... ENDIF is the correct syntax. You can use it in PL/pgSQL, but that's a procedural language, and not what you're doing here. You're probably looking for CASE, e.g., CREATE OR REPLACE VIEW public."SomeView" AS SELECT d.id , CASE WHEN TRUE THEN d."DocNumber" ELSE something_else END as "DocNumber" FROM public."Z_Documents" as d; Note you need an ELSE clause: you can't have a variable number of columns in the view (just like you can't have a variable number of columns in a table). As an aside, you need to double-quote identifiers if you want them to be case-sensitive: otherwise they'll be down-cased. Hope this helps. Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [SQL] Still struggling with history tables
On Jan 18, 2006, at 19:23 , Achilleus Mantzios wrote: Generally it is very hard to distinguish between two kind of UPDATES: a) UPDATEs that mean real data updates and they should be recorded to the history system. b) UPDATEs that are just false data entry, and they should mean just plain correction UPDATES, with no recording. This distinguishability is very important, otherwise someone would define a way to store historic data of changes to the historic data themselves, and so on. Just a quick note: Sometimes the term "valid-time" is used to talk about the interval of data validity, and tables that include valid-time intervals are sometimes referred to as "state tables". If you're interested in tracking when corrections are made, this is referred to as "transaction-time". Correcting the is sometimes called a "nonsequenced" update, because it's not correcting the sequence of validity: it's just a correction. For more information, you can check out "Developing Time-Oriented Database Applications in SQL" by Richard Snodgrass (available as a free PDF download from his website[1]), or, for more theoretical information, "Temporal Data and the Relational Model" by CJ Date, Hugh Darwen, and Nikos Lorentzos.[2] Michael Glaesemann grzm myrealbox com [1](http://www.cs.arizona.edu/people/rts/tdbbook.pdf) [2](http://www.amazon.com/gp/product/1558608559/) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] Matching several rows
Hi, On Jan 18 05:43, Ivan Steganov wrote: > IDRIGHT > - > 201 > 202 > 205 > 2010 > 302 > 3010 > > Now I need to find out which IDs have, say rights 2 AND 5 AND 10. SELECT t.id FROM (SELECT id, sum(1) AS s FROM id_n_rights WHERE rights = ANY(ARRAY[2,5,10]) GROUP BY id) AS t WHERE t.s = 3; -- 3 = length(ARRAY[2,5,10]) HTH. Regards. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [SQL] Matching several rows
On Jan 18, 2006, at 20:55 , Volkan YAZICI wrote:
On Jan 18 05:43, Ivan Steganov wrote:
IDRIGHT
-
201
202
205
2010
302
3010
Now I need to find out which IDs have, say rights 2 AND 5 AND 10.
SELECT t.id
FROM (SELECT id, sum(1) AS s
FROM id_n_rights
WHERE rights = ANY(ARRAY[2,5,10])
GROUP BY id) AS t
WHERE t.s = 3; -- 3 = length(ARRAY[2,5,10])
Or how about just
select id
from urights
where uright in (2,5,10)
group by id
having count(id) = 3; -- number of urights in IN clause
Find below the query plans for the variations we've seen (as well as
DDL statements):
Michael Glaesemann
grzm myrealbox com
create table urights
(
id integer not null
, uright integer not null
);
copy urights (id, uright) from stdin;
20 1
20 2
20 5
20 10
30 2
30 10
\.
select id
from urights as right_2
join urights as right_5 using (id)
join urights as right_10 using (id)
where right_2.uright = 2
and right_5.uright = 5
and right_10.uright = 10;
QUERY PLAN
-
Nested Loop (cost=0.00..3.27 rows=1 width=4) (actual
time=0.053..0.069 rows=1 loops=1)
Join Filter: ("outer".id = "inner".id)
-> Nested Loop (cost=0.00..2.17 rows=1 width=8) (actual
time=0.041..0.052 rows=1 loops=1)
Join Filter: ("inner".id = "outer".id)
-> Seq Scan on urights right_5 (cost=0.00..1.07 rows=1
width=4) (actual time=0.023..0.027 rows=1 loops=1)
Filter: (uright = 5)
-> Seq Scan on urights right_2 (cost=0.00..1.07 rows=2
width=4) (actual time=0.005..0.011 rows=2 loops=1)
Filter: (uright = 2)
-> Seq Scan on urights right_10 (cost=0.00..1.07 rows=2
width=4) (actual time=0.007..0.011 rows=2 loops=1)
Filter: (uright = 10)
Total runtime: 0.337 ms
(11 rows)
select id
from (
select id
from urights
where uright = 2
) as right_2
join (
select id
from urights
where uright = 5
) as right_5 using (id)
join (
select id
from urights
where uright = 10
) as right_10 using (id);
QUERY PLAN
-
Nested Loop (cost=0.00..3.27 rows=1 width=4) (actual
time=0.062..0.079 rows=1 loops=1)
Join Filter: ("outer".id = "inner".id)
-> Nested Loop (cost=0.00..2.17 rows=1 width=8) (actual
time=0.048..0.060 rows=1 loops=1)
Join Filter: ("inner".id = "outer".id)
-> Seq Scan on urights (cost=0.00..1.07 rows=1 width=4)
(actual time=0.031..0.036 rows=1 loops=1)
Filter: (uright = 5)
-> Seq Scan on urights (cost=0.00..1.07 rows=2 width=4)
(actual time=0.005..0.011 rows=2 loops=1)
Filter: (uright = 2)
-> Seq Scan on urights (cost=0.00..1.07 rows=2 width=4) (actual
time=0.006..0.011 rows=2 loops=1)
Filter: (uright = 10)
Total runtime: 0.299 ms
(11 rows)
SELECT t.id
FROM (SELECT id, sum(1) AS s
FROM urights
WHERE uright = ANY(ARRAY[2,5,10])
GROUP BY id) AS t
WHERE t.s = 3; -- 3 = length(ARRAY[2,5,10])
QUERY PLAN
-
Subquery Scan t (cost=1.23..1.26 rows=1 width=4) (actual
time=0.102..0.106 rows=1 loops=1)
-> HashAggregate (cost=1.23..1.25 rows=1 width=4) (actual
time=0.095..0.098 rows=1 loops=1)
Filter: (sum(1) = 3)
-> Seq Scan on urights (cost=0.00..1.21 rows=3 width=4)
(actual time=0.038..0.052 rows=5 loops=1)
Filter: (uright = ANY ('{2,5,10}'::integer[]))
Total runtime: 2.521 ms
(6 rows)
select id
from urights
where uright in (2,5,10)
group by id
having count(id) = 3;
QUERY PLAN
---
HashAggregate (cost=1.12..1.15 rows=2 width=4) (actual
time=2.582..2.584 rows=1 loops=1)
Filter: (count(id) = 3)
-> Seq Scan on urights (cost=0.00..1.10 rows=4 width=4) (actual
time=0.042..2.502 rows=5 loops=1)
Filter: ((uright = 2) OR (uright = 5) OR (uright = 10))
Total runtime: 2.944 ms
(5 rows)
---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [SQL] bug with if ... then ... clause in views
On Jan 18, 2006, at 4:18 AM, Emil Rachovsky wrote: While trying to create some views I stumbled on some problem with using the if-then clause. Here is a simple example : CREATE OR REPLACE VIEW public.SomeView as select d.id, if (true) then d.DocNumber endif from public.Z_Documents as d; I get the following error : syntax error at or near "then" at character 72 I don't have a clue what is going on here. Any suggestions? I see 2 issues. 1) if *what* is true? 2) AFAIK, there is no IF conditional in SQL. Perhaps you want CASE? e.g. CREATE OR REPLACE VIEW SomeView as select d.id, case when condition = true then d.doc_number else 'Bad Doc' end from documents as d; go Thanks in advance, Emil -- Daryl "Stress rarely has a positive impact on our ability to think. Never, I'd guess." -- Ron Jeffries, 2005 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] Matching several rows
On Jan 18 09:33, Michael Glaesemann wrote: > On Jan 18, 2006, at 20:55 , Volkan YAZICI wrote: > >SELECT t.id > >FROM (SELECT id, sum(1) AS s > > FROM id_n_rights > > WHERE rights = ANY(ARRAY[2,5,10]) > > GROUP BY id) AS t > >WHERE t.s = 3; AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage. Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with «rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain. Regards. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [SQL] Matching several rows
On Jan 18, 2006, at 21:48 , Volkan YAZICI wrote: AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage. Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with «rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain. explain analyze SELECT id FROM ( SELECT id, sum(1) AS s FROM urights WHERE uright in (2,5,10) GROUP BY id) AS t WHERE s = 3; QUERY PLAN - Subquery Scan t (cost=1.14..1.19 rows=2 width=4) (actual time=0.106..0.108 rows=1 loops=1) -> HashAggregate (cost=1.14..1.17 rows=2 width=4) (actual time=0.103..0.105 rows=1 loops=1) Filter: (sum(1) = 3) -> Seq Scan on urights (cost=0.00..1.10 rows=4 width=4) (actual time=0.029..0.038 rows=5 loops=1) Filter: ((uright = 2) OR (uright = 5) OR (uright = 10)) Total runtime: 0.386 ms (6 rows) Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] Matching several rows
Thank you to everyone for the great help!I will evaluate all methods in our query (It is actually well complexer then this sample) and choose the best one.Is there any "scientific" name to this kind of "several rows match for one result" data selection? Ivan
Re: [SQL] Matching several rows
Michael Glaesemann wrote: On Jan 18, 2006, at 21:48 , Volkan YAZICI wrote: AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage. Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with «rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain. explain analyze SELECT id FROM ( SELECT id, sum(1) AS s FROM urights WHERE uright in (2,5,10) GROUP BY id) AS t WHERE s = 3; QUERY PLAN - Subquery Scan t (cost=1.14..1.19 rows=2 width=4) (actual time=0.106..0.108 rows=1 loops=1) -> HashAggregate (cost=1.14..1.17 rows=2 width=4) (actual time=0.103..0.105 rows=1 loops=1) Filter: (sum(1) = 3) -> Seq Scan on urights (cost=0.00..1.10 rows=4 width=4) (actual time=0.029..0.038 rows=5 loops=1) Filter: ((uright = 2) OR (uright = 5) OR (uright = 10)) Total runtime: 0.386 ms (6 rows) Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly why not use an having clause in the GROUP BY? HTH ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [SQL] [GENERAL] bug with if ... then ... clause in views
Emil Rachovsky wrote: While trying to create some views I stumbled on some problem with using the if-then clause. Here is a simple example : CREATE OR REPLACE VIEW public.SomeView as select d.id, if (true) then d.DocNumber endif from public.Z_Documents as d; I get the following error : syntax error at or near "then" at character 72 I don't have a clue what is going on here. Any suggestions? IF / THEN is not part of SQL. it is part of plpgsql. However you'll find that CASE can do the same thing. select d.id, case when true then d.docNumber else 'something else' end as blah, public.Z_documents as d ... -- Jeff Trout <[EMAIL PROTECTED]> http://www.jefftrout.com/ http://www.stuarthamm.net/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [SQL] Matching several rows
It is easier to think of this as SET INTERSECTION which leads to: SELECT id FROM urights WHERE right = 2 INTERSECT SELECT id FROM urights WHERE right = 5 INTERSECT SELECT id FROM urights WHERE right = 10 Ivan Steganov wrote: Thank you to everyone for the great help! I will evaluate all methods in our query (It is actually well complexer then this sample) and choose the best one. Is there any "scientific" name to this kind of "several rows match for one result" data selection? Ivan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[SQL] stored procedures for complex SELECTs
Are there performance advantages that can be achieved by wrapping a complex SELECT into a stored procedure? Alex ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] stored procedures for complex SELECTs
On Jan 19, 2006, at 8:17 , [EMAIL PROTECTED] wrote: Are there performance advantages that can be achieved by wrapping a complex SELECT into a stored procedure? I believe it depends on the procedural language. If it's SQL, I think it may be inlined, so you'd have overhead due to the stored procedure rather than the select itself would probably be minimal. In any other language, there would be additional parsing overhead, I believe, so I don't think it would be faster than the select itself. However, why don't you test it? Test each case in an EXPLAIN ANALYZE or run some other benchmark to see if there's a performance difference. Then you'll know for sure—and have numbers to back it up. Michael Glaesemann grzm myrealbox com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
