Re: [SQL] Still struggling with history tables

2006-01-18 Thread Achilleus Mantzios
O Ken Winter έγραψε στις Jan 17, 2006 :

> Friends ~
> 
> I'm still trying to implement a solution to the requirement to keep a
> complete history of data changes to a "person" table.  (See earlier
> correspondence below.)  I'm trying for a variant of the architecture
> suggested by Richard Huxton (also below).  In my variant, I have this
> "identifier" table, carrying the id and invariant info about each person:
> 
> /*==*/
> /* Table: person_i  */
> /*==*/
> create table person_i (
> idi  BIGSERIALnot null,
> date_of_birthDATE null,
> constraint PK_person_i_key_1 primary key (idi)
> )
> ;
> 
> And then I have this "history" table, carrying the time-varying info on
> which I want to keep a complete history:
> 
> /*==*/
> /* Table: person_h  */
> /*==*/
> create table person_h (
> idh  INT8 not null,
> start_date   DATE not null default 'now()',
> end_date DATE null,
> name VARCHAR(255) null,
> constraint PK_person_h_key_1 primary key (idh, start_date),
> constraint fk_reference_6 foreign key (idh)
>references person_i (idi)
>  on delete restrict on update restrict
> )
> ;
> 
> Triggers are in place on the "person_h" table so that when an app does an
> update, the current h record is expired (with its old data) and a new record
> (wuth the updated data)is inserted and made effective "now".  What I'm now
> trying to build is this view:
> 
> /*==*/
> /* View: person */
> /*==*/
> create view person as
> select idi, date_of_birth, start_date, end_date, name
> from person_i i, person_h h
> where i.idi = h.idh;
> 
> I want to enable users (and apps) who need to deal only with current data to
> be able to treat "person" as a real table (i.e. to write to it as well as
> read from it).  Specifically, the requirements are that when a user does:
> 
> . Insert - The system inserts a record into the i table and the first
> record in the h table.
> . Select - The system returns attributes of i and h tables (not
> duplicating the identifier columns).
> . Update - The system allows updating of i attributes
> (update-in-place, not creating a new history record) and h attributes
> (creating a new history record).
> . Delete - The system deletes the i record and all of its h records.
> 
> I'm stuck on how to implement the "insert" action, which I thought would be
> simple.  The problem is this:  The i table id is of type BIGSERIAL, i.e.
> sequence-assigned.  I've tried writing the following rule to get both the i
> record and the first h record inserted:
> 
> CREATE RULE ru AS 
> ON INSERT TO person 
> DO INSTEAD (
>   INSERT INTO person_i DEFAULT VALUES; 
>   INSERT INTO person_h (idh) VALUES (NEW.idi)
> );

How about

INSERT INTO person_i (idi,dateofbirth) 
VALUES(nextval('public.person_i_idi_seq'::text),new.dateofbirth);
INSERT INTO person_h(idh) 
VALUES(currval('public.person_i_idi_seq'::text));

However, i have treated similar problems with two different approaches:

Generally it is very hard to distinguish between two kind of UPDATES:

a) UPDATEs that mean real data updates and they should be recorded
to the history system.
b) UPDATEs that are just false data entry, and they should mean
just plain correction UPDATES, with no recording.

This distinguishability is very important, otherwise someone would
define a way to store historic data of changes to the historic data
themselves, and so on.

1) Is the history data frequently needed? Are there a lot of apps
hitting these historic data? Do we need the freedom to query current live
data as well as past data in a uniform manner?

Then i just use one table, with endtimestamp is null meaning
this is a current (alive) record.
Then i write triggers to enforce interval wise integrity to the table,
(e.g. No records A,B exist with A<>B,A,B for the same person, so that 
(A.starttimestamp,coalesce(A.endtimestamp,now())) overlaps with 
(B.starttimestamp,coalesce(B.endtimestamp,now()))

This way i give people the ability to do what they want with the table.
The triggers do the job of enforcing integrity.

In this case what we mean as historic is
"what users define and input as historic".
Users are in charge here, not the DB.

2) If on the other hand, historic data are just a convinient way
of accessing history data, instead of going to find the backup of this 
past day
in the computer room,
then I keep one and 

[SQL] bug with if ... then ... clause in views

2006-01-18 Thread Emil Rachovsky
While trying to create some views I stumbled on some
problem with using the if-then clause. Here is a
simple example : 

CREATE OR REPLACE VIEW public.SomeView
  as select d.id,
  if (true) then d.DocNumber endif from
public.Z_Documents as d;

I get the following error :
 syntax error at or near "then" at character 72

I don't have a clue what is going on here. Any
suggestions?

Thanks in advance,
Emil

__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SQL] bug with if ... then ... clause in views

2006-01-18 Thread Michael Glaesemann

On Jan 18, 2006, at 18:18 , Emil Rachovsky wrote:


CREATE OR REPLACE VIEW public.SomeView
  as select d.id,
  if (true) then d.DocNumber endif from
public.Z_Documents as d;

I get the following error :
 syntax error at or near "then" at character 72


Well, one problem is that IF ... ENDIF is the correct syntax. You can  
use it in PL/pgSQL, but that's a procedural language, and not what  
you're doing here. You're probably looking for CASE, e.g.,


CREATE OR REPLACE VIEW public."SomeView" AS
SELECT d.id
, CASE WHEN TRUE
THEN d."DocNumber"
ELSE something_else
END as "DocNumber"
FROM public."Z_Documents" as d;

Note you need an ELSE clause: you can't have a variable number of  
columns in the view (just like you can't have a variable number of  
columns in a table). As an aside, you need to double-quote  
identifiers if you want them to be case-sensitive: otherwise they'll  
be down-cased.


Hope this helps.

Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SQL] Still struggling with history tables

2006-01-18 Thread Michael Glaesemann


On Jan 18, 2006, at 19:23 , Achilleus Mantzios wrote:


Generally it is very hard to distinguish between two kind of UPDATES:

a) UPDATEs that mean real data updates and they should be recorded
to the history system.
b) UPDATEs that are just false data entry, and they should mean
just plain correction UPDATES, with no recording.

This distinguishability is very important, otherwise someone would
define a way to store historic data of changes to the historic data
themselves, and so on.


Just a quick note:

Sometimes the term "valid-time" is used to talk about the interval of  
data validity, and tables that include valid-time intervals are  
sometimes referred to as "state tables". If you're interested in  
tracking when corrections are made, this is referred to as  
"transaction-time". Correcting the is sometimes called a  
"nonsequenced" update, because it's not correcting the sequence of  
validity: it's just a correction.


For more information, you can check out "Developing Time-Oriented  
Database Applications in SQL" by Richard Snodgrass (available as a  
free PDF download from his website[1]), or, for more theoretical  
information, "Temporal Data and the Relational Model" by CJ Date,  
Hugh Darwen, and Nikos Lorentzos.[2]


Michael Glaesemann
grzm myrealbox com

[1](http://www.cs.arizona.edu/people/rts/tdbbook.pdf)
[2](http://www.amazon.com/gp/product/1558608559/)



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] Matching several rows

2006-01-18 Thread Volkan YAZICI
Hi,

On Jan 18 05:43, Ivan Steganov wrote:
> IDRIGHT
> -
> 201
> 202
> 205
> 2010
> 302
> 3010
> 
> Now I need to find out which IDs have, say rights 2 AND 5 AND 10.

SELECT t.id
FROM (SELECT id, sum(1) AS s
  FROM id_n_rights
  WHERE rights = ANY(ARRAY[2,5,10])
  GROUP BY id) AS t
WHERE t.s = 3;  -- 3 = length(ARRAY[2,5,10])


HTH.
Regards.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SQL] Matching several rows

2006-01-18 Thread Michael Glaesemann


On Jan 18, 2006, at 20:55 , Volkan YAZICI wrote:


On Jan 18 05:43, Ivan Steganov wrote:

IDRIGHT
-
201
202
205
2010
302
3010

Now I need to find out which IDs have, say rights 2 AND 5 AND 10.


SELECT t.id
FROM (SELECT id, sum(1) AS s
  FROM id_n_rights
  WHERE rights = ANY(ARRAY[2,5,10])
  GROUP BY id) AS t
WHERE t.s = 3;  -- 3 = length(ARRAY[2,5,10])



Or how about just

select id
from urights
where uright in (2,5,10)
group by id
having count(id) = 3; -- number of urights in IN clause

Find below the query plans for the variations we've seen (as well as  
DDL statements):


Michael Glaesemann
grzm myrealbox com

create table urights
(
id integer not null
, uright integer not null
);

copy urights (id, uright) from stdin;
20  1
20  2
20  5
20  10
30  2
30  10
\.


select id
from urights as right_2
join urights as right_5 using (id)
join urights as right_10 using (id)
where right_2.uright = 2
and right_5.uright = 5
and right_10.uright = 10;
 QUERY PLAN
 
-
Nested Loop  (cost=0.00..3.27 rows=1 width=4) (actual  
time=0.053..0.069 rows=1 loops=1)

   Join Filter: ("outer".id = "inner".id)
   ->  Nested Loop  (cost=0.00..2.17 rows=1 width=8) (actual  
time=0.041..0.052 rows=1 loops=1)

 Join Filter: ("inner".id = "outer".id)
 ->  Seq Scan on urights right_5  (cost=0.00..1.07 rows=1  
width=4) (actual time=0.023..0.027 rows=1 loops=1)

   Filter: (uright = 5)
 ->  Seq Scan on urights right_2  (cost=0.00..1.07 rows=2  
width=4) (actual time=0.005..0.011 rows=2 loops=1)

   Filter: (uright = 2)
   ->  Seq Scan on urights right_10  (cost=0.00..1.07 rows=2  
width=4) (actual time=0.007..0.011 rows=2 loops=1)

 Filter: (uright = 10)
Total runtime: 0.337 ms
(11 rows)


select id
from (
select id
from urights
where uright = 2
) as right_2
join (
select id
from urights
where uright = 5
) as right_5 using (id)
join (
select id
from urights
where uright = 10
) as right_10 using (id);
 QUERY PLAN
 
-
Nested Loop  (cost=0.00..3.27 rows=1 width=4) (actual  
time=0.062..0.079 rows=1 loops=1)

   Join Filter: ("outer".id = "inner".id)
   ->  Nested Loop  (cost=0.00..2.17 rows=1 width=8) (actual  
time=0.048..0.060 rows=1 loops=1)

 Join Filter: ("inner".id = "outer".id)
 ->  Seq Scan on urights  (cost=0.00..1.07 rows=1 width=4)  
(actual time=0.031..0.036 rows=1 loops=1)

   Filter: (uright = 5)
 ->  Seq Scan on urights  (cost=0.00..1.07 rows=2 width=4)  
(actual time=0.005..0.011 rows=2 loops=1)

   Filter: (uright = 2)
   ->  Seq Scan on urights  (cost=0.00..1.07 rows=2 width=4) (actual  
time=0.006..0.011 rows=2 loops=1)

 Filter: (uright = 10)
Total runtime: 0.299 ms
(11 rows)

SELECT t.id
FROM (SELECT id, sum(1) AS s
FROM urights
WHERE uright = ANY(ARRAY[2,5,10])
GROUP BY id) AS t
WHERE t.s = 3;  -- 3 = length(ARRAY[2,5,10])
 QUERY PLAN
 
-
Subquery Scan t  (cost=1.23..1.26 rows=1 width=4) (actual  
time=0.102..0.106 rows=1 loops=1)
   ->  HashAggregate  (cost=1.23..1.25 rows=1 width=4) (actual  
time=0.095..0.098 rows=1 loops=1)

 Filter: (sum(1) = 3)
 ->  Seq Scan on urights  (cost=0.00..1.21 rows=3 width=4)  
(actual time=0.038..0.052 rows=5 loops=1)

   Filter: (uright = ANY ('{2,5,10}'::integer[]))
Total runtime: 2.521 ms
(6 rows)


select id
from urights
where uright in (2,5,10)
group by id
having count(id) = 3;
  QUERY PLAN
 
---
HashAggregate  (cost=1.12..1.15 rows=2 width=4) (actual  
time=2.582..2.584 rows=1 loops=1)

   Filter: (count(id) = 3)
   ->  Seq Scan on urights  (cost=0.00..1.10 rows=4 width=4) (actual  
time=0.042..2.502 rows=5 loops=1)

 Filter: ((uright = 2) OR (uright = 5) OR (uright = 10))
Total runtime: 2.944 ms
(5 rows)





---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SQL] bug with if ... then ... clause in views

2006-01-18 Thread Daryl Richter


On Jan 18, 2006, at 4:18 AM, Emil Rachovsky wrote:


While trying to create some views I stumbled on some
problem with using the if-then clause. Here is a
simple example :

CREATE OR REPLACE VIEW public.SomeView
  as select d.id,
  if (true) then d.DocNumber endif from
public.Z_Documents as d;

I get the following error :
 syntax error at or near "then" at character 72

I don't have a clue what is going on here. Any
suggestions?


I see 2 issues.

1) if *what* is true?
2) AFAIK, there is no IF conditional in SQL.  Perhaps you want CASE?   
e.g.


CREATE OR REPLACE VIEW SomeView
as
  select
d.id,
case
  when condition = true then d.doc_number
  else 'Bad Doc'
end
  from
  documents as d;
go



Thanks in advance,
Emil



--
Daryl

"Stress rarely has a positive impact on our ability to think.
Never, I'd guess."

-- Ron Jeffries, 2005



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] Matching several rows

2006-01-18 Thread Volkan YAZICI
On Jan 18 09:33, Michael Glaesemann wrote:
> On Jan 18, 2006, at 20:55 , Volkan YAZICI wrote:
> >SELECT t.id
> >FROM (SELECT id, sum(1) AS s
> >   FROM id_n_rights
> >   WHERE rights = ANY(ARRAY[2,5,10])
> >   GROUP BY id) AS t
> >WHERE t.s = 3;

AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage.
Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with
«rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain.


Regards.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Matching several rows

2006-01-18 Thread Michael Glaesemann


On Jan 18, 2006, at 21:48 , Volkan YAZICI wrote:


AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage.
Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with
«rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain.


explain analyze
SELECT id
FROM (
SELECT id, sum(1) AS s
FROM urights
WHERE uright in (2,5,10)
GROUP BY id) AS t
WHERE s = 3;
 QUERY PLAN
 
-
Subquery Scan t  (cost=1.14..1.19 rows=2 width=4) (actual  
time=0.106..0.108 rows=1 loops=1)
   ->  HashAggregate  (cost=1.14..1.17 rows=2 width=4) (actual  
time=0.103..0.105 rows=1 loops=1)

 Filter: (sum(1) = 3)
 ->  Seq Scan on urights  (cost=0.00..1.10 rows=4 width=4)  
(actual time=0.029..0.038 rows=5 loops=1)

   Filter: ((uright = 2) OR (uright = 5) OR (uright = 10))
Total runtime: 0.386 ms
(6 rows)


Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] Matching several rows

2006-01-18 Thread Ivan Steganov
Thank you to everyone for the great help!I will evaluate all methods in our query (It is actually well complexer then this sample) and choose the best one.Is there any "scientific" name to this kind of "several rows match for one result" data selection?
Ivan


Re: [SQL] Matching several rows

2006-01-18 Thread Patrick JACQUOT

Michael Glaesemann wrote:



On Jan 18, 2006, at 21:48 , Volkan YAZICI wrote:


AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage.
Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with
«rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain.



explain analyze
SELECT id
FROM (
SELECT id, sum(1) AS s
FROM urights
WHERE uright in (2,5,10)
GROUP BY id) AS t
WHERE s = 3;
 QUERY PLAN
 
-
Subquery Scan t  (cost=1.14..1.19 rows=2 width=4) (actual  
time=0.106..0.108 rows=1 loops=1)
   ->  HashAggregate  (cost=1.14..1.17 rows=2 width=4) (actual  
time=0.103..0.105 rows=1 loops=1)

 Filter: (sum(1) = 3)
 ->  Seq Scan on urights  (cost=0.00..1.10 rows=4 width=4)  
(actual time=0.029..0.038 rows=5 loops=1)

   Filter: ((uright = 2) OR (uright = 5) OR (uright = 10))
Total runtime: 0.386 ms
(6 rows)


Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


why not use an having clause in the GROUP BY?
HTH

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] [GENERAL] bug with if ... then ... clause in views

2006-01-18 Thread Jeff

Emil Rachovsky wrote:


While trying to create some views I stumbled on some
problem with using the if-then clause. Here is a
simple example : 


CREATE OR REPLACE VIEW public.SomeView
 as select d.id,
 if (true) then d.DocNumber endif from
public.Z_Documents as d;

I get the following error :
syntax error at or near "then" at character 72

I don't have a clue what is going on here. Any
suggestions?

 


IF / THEN is not part of SQL. it is part of plpgsql.
However you'll find that CASE can do the same thing.
select d.id, case when true then d.docNumber else 'something else' end 
as blah, public.Z_documents as d ...


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] Matching several rows

2006-01-18 Thread Vraj Mohan

It is easier to think of this as SET INTERSECTION which leads to:

SELECT id FROM urights WHERE right = 2
INTERSECT
SELECT id FROM urights WHERE right = 5
INTERSECT
SELECT id FROM urights WHERE right = 10

Ivan Steganov wrote:

Thank you to everyone for the great help!

I will evaluate all methods in our query (It is actually well complexer then
this sample) and choose the best one.

Is there any "scientific" name to this kind of "several rows match for one
result" data selection?

Ivan




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[SQL] stored procedures for complex SELECTs

2006-01-18 Thread alex-lists-pgsql

Are there performance advantages that can be achieved by wrapping a 
complex SELECT into a stored procedure? 

Alex

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] stored procedures for complex SELECTs

2006-01-18 Thread Michael Glaesemann


On Jan 19, 2006, at 8:17 , [EMAIL PROTECTED] wrote:


Are there performance advantages that can be achieved by wrapping a
complex SELECT into a stored procedure?


I believe it depends on the procedural language. If it's SQL, I think  
it may be inlined, so you'd have overhead due to the stored procedure  
rather than the select itself would probably be minimal. In any other  
language, there would be additional parsing overhead, I believe, so I  
don't think it would be faster than the select itself.


However, why don't you test it? Test each case in an EXPLAIN ANALYZE  
or run some other benchmark to see if there's a performance  
difference. Then you'll know for sure—and have numbers to back it up.


Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster