[SQL] determining how many products are in how many categories

2004-02-08 Thread David Garamond
# product table (simplified):
create table p (
  id char(22) not null primary key,
  name text,
  "desc" text
);
# product category table (simpl.):
create table pc (
  id char(22) not null primary key,
  name text,
  "desc" text
);
# table that maps products into categories:
create table p_pc (
  id char(22) not null primary key,
  pid char(22) not null references p(id),
  pcid char(22) not null references pc(id)
);
create index i_ppc_pid on p_pc(pid);
create index i_ppc_pcid on p_pc(pcid);
create unique index i_ppc_pid_pcid on p_pc(pid, pcid);
There are about 50k rows in p, 40k in pc, and 140k in p_pc. Most 
products are placed in only 1 category, some in 2, fewer in 3, and fewer 
still in 4, 5, or 6 categories.

I want to know how many products are placed in 1 category, how many in 
2, and so on. Here's my query:

select count(pid), num_cat from (
  select pid, count(pcid) as num_cat
  from p_pc group by pid) as f
group by num_cat;
A sample output (4,000 products are placed in 5 different categories, 
4,998 in 4, and so on):

 count | num_cat
-------+---------
  4000 |       5
  4998 |       4
  7502 |       3
 10001 |       2
 17499 |       1
(5 rows)
However, this query is taking about 2.5 minutes. Any idea on how to make 
it faster?

--
dave
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html



Re: [SQL] surrogate key or not?

2004-07-24 Thread David Garamond
Josh Berkus wrote:
> Given:  Surrogate keys, by definition, represent no real data;
> Given:  Only items which represent real data have any place in
>   a data model
> Conclusion:  Surrogate keys have no place in the data model

But once a surrogate key is assigned to a row, doesn't it become
"real" data? For example, I have a bunch of invoices/receipts and I
write a unique number on each of them. Doesn't the unique number
become part of the information contained in the invoice/receipt itself
(at least as far as I'm concerned)? Another example is when Adam
(+Eve?) named each animal in Genesis. At the time, the name he gave
each animal was arbitrary ("surrogate"), but once assigned to a
species, it became part of the real data.

> 3) Mutability:  Most RDBMSs are very inefficient about CASCADE deletes and 

Change management, IMO, is perhaps the main reason for surrogate/artificial
keys. We often need a PK that _never_ needs to change (because making
such a change can be a royal PITA or downright impossible; the PK
might already be printed on a form/card/document, recorded in some
permanent database, tattooed on/embedded in someone's forehead, etc.).
Meanwhile, every other aspect of the data can change (e.g. a person can
change their name, sex, age, email, address, even date & place of birth).
Not to mention data entry mistakes. So it's impossible to use any
"real"/natural key in this case.

> Now, you're probably wondering "why does this guy regard surrogate keys as a 
> problem?"   I'll tell you:  I absolutely cannot count the number of "bad 
> databases" I've encountered which contained tables with a surrogate key, and 
> NO REAL KEY of any kind.   This makes data normalization impossible, and 
> cleanup of the database becomes a labor-intensive process requiring 
> hand-examination of each row.

Okay, so surrogate keys make it easy for careless people to design a
database that is prone to data duplication (because they don't install
enough unique constraints to prevent it). But I don't see how a
relation with a surrogate key is any harder to "normalize" (that is, to
have its duplicates removed) than a relation with no key at all. Compare:

 street
 ------------
 Green Street
 Green Street
 Green Street

versus:

 id    street
 ----  ------------
 2934  Green Street
 4555  Green Street
 5708  Green Street

They both contain duplicates and/or ambiguous data. They're both equally
hard to normalize/clean.

-- 
dave



[SQL] Displaying two tables side by side

2004-08-11 Thread David Garamond
How can you display two tables side by side? Example:
> select * from t1;
 a | b
---+---
 2 | 2
 3 | 5
 4 | 7
 9 | 0
> select * from t2;
 c | d
---+---
 4 | 5
 7 | 3
 3 | 2
 1 | 1
 2 | 0
Intended output:
 a | b | c | d
---+---+---+---
 2 | 2 | 4 | 5
 3 | 5 | 7 | 3
 4 | 7 | 3 | 2
 9 | 0 | 1 | 1
   |   | 2 | 0
Each table has no keys (and no OIDs). Order is not important, but each 
row from each table needs to be displayed exactly once.

--
dave


Re: [SQL] Displaying two tables side by side

2004-08-11 Thread David Garamond
Andreas Haumer wrote:
You could try to use PostgreSQL's ctid system column to join on like this:
test=# select *,ctid from t1;
 a | b | ctid
---+---+-------
 2 | 2 | (0,1)
 3 | 5 | (0,2)
 4 | 7 | (0,3)
 9 | 0 | (0,4)
test=# select *,ctid from t2;
 c | d | ctid
---+---+-------
 4 | 5 | (0,1)
 7 | 3 | (0,2)
 3 | 2 | (0,3)
 1 | 1 | (0,4)
 2 | 0 | (0,5)
test=# select * from t1 right outer join t2 on (t1.ctid=t2.ctid);
 a | b | c | d
---+---+---+---
 2 | 2 | 4 | 5
 3 | 5 | 7 | 3
 4 | 7 | 3 | 2
 9 | 0 | 1 | 1
   |   | 2 | 0
Note that this is of course very platform specific. On Oracle
you could use rownum, for example.
I don't have a more portable solution on hand right now.
HTH
Thanks, I just found out about ctid. I was actually thinking of a rownum 
equivalent too.

I guess a more portable solution would be to create a temporary table for 
each table, add a ctid/"row counter" equivalent, and then join on that.
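For illustration, a sketch of that temp-table approach against the t1/t2 tables above (SERIAL is PostgreSQL-specific here; substitute whatever row-numbering mechanism the target database offers):

```sql
-- add a row counter to a copy of each table, then join on it;
-- note that the row order of INSERT ... SELECT is not guaranteed,
-- which is acceptable here since order is not important
create temp table t1n (n serial, a int, b int);
create temp table t2n (n serial, c int, d int);
insert into t1n (a, b) select * from t1;
insert into t2n (c, d) select * from t2;

select a, b, c, d
from t1n full outer join t2n using (n)
order by n;
```

FULL OUTER JOIN covers the case where either table is longer than the other.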

--
dave


[SQL] olympics ranking query

2004-08-20 Thread David Garamond
See http://www.athens2004.com/en/OlympicMedals/medals?noc=MGL .
create table countrymedal (
  countryid CHAR(3) PRIMARY KEY,
  gold INT NOT NULL,
  silver INT NOT NULL,
  bronze INT NOT NULL);
COPY countrymedal (countryid, gold, silver, bronze) FROM stdin;
ITA 5   6   3
FRA 5   3   5
UKR 5   1   1
RUS 4   8   10
GER 4   4   7
TUR 3   0   1
KOR 2   7   3
NED 2   5   4
HUN 2   3   1
SVK 2   2   1
ROM 2   0   2
GRE 2   0   1
POL 1   2   1
BLR 1   1   2
SUI 1   0   1
UAE 1   0   0
GBR 0   4   4
AUT 0   3   0
PRK 0   2   1
ESP 0   2   0
CUB 0   1   5
CZE 0   1   2
ZIM 0   1   1
USA 15  11  10
CHN 15  9   8
JPN 9   4   2
AUS 7   5   8
GEO 1   1   0
RSA 1   1   0
BUL 1   0   2
THA 1   0   2
IND 0   1   0
INA 0   1   0
KAZ 0   1   0
POR 0   1   0
SCG 0   1   0
AZE 0   0   2
BEL 0   0   2
BRA 0   0   2
DEN 0   0   2
ARG 0   0   1
CAN 0   0   1
COL 0   0   1
CRO 0   0   1
ISR 0   0   1
MGL 0   0   1
SLO 0   0   1
TRI 0   0   1
\.
create sequence seq1;
create sequence seq2;
-- query #1: list of ranks by gold
select setval('seq1', 1);
select setval('seq2', 1);
select
  setval('seq1', currval('seq1')+setval('seq2',count(*)))-count(*) as rank,
  count(*) as numranker,
  gold, silver, bronze
from countrymedal
group by gold, silver, bronze
order by gold desc, silver desc, bronze desc;
-- result of query #1
 rank | numranker | gold | silver | bronze
------+-----------+------+--------+--------
    1 |         1 |   15 |     11 |     10
    2 |         1 |   15 |      9 |      8
    3 |         1 |    9 |      4 |      2
    4 |         1 |    7 |      5 |      8
    5 |         1 |    5 |      6 |      3
    6 |         1 |    5 |      3 |      5
    7 |         1 |    5 |      1 |      1
    8 |         1 |    4 |      8 |     10
    9 |         1 |    4 |      4 |      7
   10 |         1 |    3 |      0 |      1
   11 |         1 |    2 |      7 |      3
   12 |         1 |    2 |      5 |      4
   13 |         1 |    2 |      3 |      1
   14 |         1 |    2 |      2 |      1
   15 |         1 |    2 |      0 |      2
   16 |         1 |    2 |      0 |      1
   17 |         1 |    1 |      2 |      1
   18 |         1 |    1 |      1 |      2
   19 |         2 |    1 |      1 |      0
   21 |         2 |    1 |      0 |      2
   23 |         1 |    1 |      0 |      1
   24 |         1 |    1 |      0 |      0
   25 |         1 |    0 |      4 |      4
   26 |         1 |    0 |      3 |      0
   27 |         1 |    0 |      2 |      1
   28 |         1 |    0 |      2 |      0
   29 |         1 |    0 |      1 |      5
   30 |         1 |    0 |      1 |      2
   31 |         1 |    0 |      1 |      1
   32 |         5 |    0 |      1 |      0
   37 |         4 |    0 |      0 |      2
   41 |         8 |    0 |      0 |      1
(32 rows)
-- query #2: list of countries ordered by their ranks
select setval('seq1', 1);
select setval('seq2', 1);
select
  (case numranker when 1 then '' else '=' end) || rank as rank,
  countryid,
  cm.gold, cm.silver, cm.bronze
from countrymedal cm
left join
  (select
  setval('seq1',
 currval('seq1')+setval('seq2',count(*))
)-count(*) as rank,
  count(*) as numranker,
  gold, silver, bronze
  from countrymedal
  group by gold, silver, bronze
  order by gold desc, silver desc, bronze desc
  ) t1 on cm.gold=t1.gold and cm.silver=t1.silver and cm.bronze=t1.bronze
order by t1.rank;
-- result of query #2
 rank | countryid | gold | silver | bronze
------+-----------+------+--------+--------
 1    | USA       |   15 |     11 |     10
 2    | CHN       |   15 |      9 |      8
 3    | JPN       |    9 |      4 |      2
 4    | AUS       |    7 |      5 |      8
 5    | ITA       |    5 |      6 |      3
 6    | FRA       |    5 |      3 |      5
 7    | UKR       |    5 |      1 |      1
 8    | RUS       |    4 |      8 |     10
 9    | GER       |    4 |      4 |      7
 10   | TUR       |    3 |      0 |      1
 11   | KOR       |    2 |      7 |      3
 12   | NED       |    2 |      5 |      4
 13   | HUN       |    2 |      3 |      1
 14   | SVK       |    2 |      2 |      1
 15   | ROM       |    2 |      0 |      2
 16   | GRE       |    2 |      0 |      1
 17   | POL       |    1 |      2 |      1
 18   | BLR       |    1 |      1 |      2
 =19  | GEO       |    1 |      1 |      0
 =19  | RSA       |    1 |      1 |      0
 =21  | BUL       |    1 |      0 |      2
 =21  | THA       |    1 |      0 |      2

Re: [SQL] olympics ranking query

2004-08-20 Thread David Garamond
Tom Lane wrote:
Challenge question: is there a simpler way to do query #1 (without any 
PL, and if possible without sequences too)?
Can't without sequences AFAIK, but you certainly can do it simpler:
select setval('seq1', 0);
select nextval('seq1'), * from
(select count(*) as numranker,
 gold, silver, bronze
 from countrymedal
 group by gold, silver, bronze
 order by gold desc, silver desc, bronze desc) ss;
This is not quite the same. In my query the ranks are not simply 
sequential: they skip after ties, so that each rank matches the 
country's position among all participating countries.

--
dave


Re: [SQL] olympics ranking query

2004-08-20 Thread David Garamond
Bruno Wolff III wrote:
On Fri, Aug 20, 2004 at 23:40:08 +0700,
  David Garamond <[EMAIL PROTECTED]> wrote:
Challenge question: is there a simpler way to do query #1 (without any 
PL, and if possible without sequences too)?
You could use a subselect to count how many countries had a lower
medal ranking and add 1 to get the rank. This should be a lot more standard
than using sequences. It will probably be a little slower, but for tables
of that size it shouldn't be a big deal.
Thanks for the tip. This is what I came up with:
select
  (select count(*) from countrymedal c1 where
     c1.gold>c2.gold or
     (c1.gold=c2.gold and (c1.silver>c2.silver or
       (c1.silver=c2.silver and c1.bronze>c2.bronze))))+1 as rank,
  count(*) as numranker,
  gold, silver, bronze
from countrymedal c2
group by gold, silver, bronze
order by rank;
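For what it's worth, window functions (added in PostgreSQL 8.4, well after this thread) express both queries without sequences or correlated subqueries. A sketch against the same countrymedal table:

```sql
-- query #2 equivalent: rank() leaves gaps after ties,
-- matching the Olympic medal-table convention
select rank() over (order by gold desc, silver desc, bronze desc) as rank,
       countryid, gold, silver, bronze
from countrymedal;

-- query #1 equivalent: the cumulative country count, minus the size of
-- the current group, plus one, gives the skipping rank
select sum(count(*)) over (order by gold desc, silver desc, bronze desc)
         - count(*) + 1 as rank,
       count(*) as numranker, gold, silver, bronze
from countrymedal
group by gold, silver, bronze
order by gold desc, silver desc, bronze desc;
```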
--
dave


[SQL] Interpolation and extrapolation in SQL

2004-09-13 Thread David Garamond
On my first try, interpolation and extrapolation turned out to be pretty 
easy to do. In psql:


-- the "lookup" table
CREATE TABLE p (
  x DOUBLE PRECISION NOT NULL UNIQUE,
  y DOUBLE PRECISION NOT NULL
);
INSERT INTO p VALUES (1,1);
INSERT INTO p VALUES (2,5);
INSERT INTO p VALUES (5,14);
INSERT INTO p VALUES (10,21);

-- the table that contains our x values, the y values of which will be
-- looked up in the lookup table.
CREATE TABLE q (
  x DOUBLE PRECISION NOT NULL
);
INSERT INTO q VALUES (0);
INSERT INTO q VALUES (1);
INSERT INTO q VALUES (2);
INSERT INTO q VALUES (3.5);
INSERT INTO q VALUES (5.5);
INSERT INTO q VALUES (10);
INSERT INTO q VALUES (11);

-- query A. only handles interpolation
\set x1 '(SELECT p.x FROM p WHERE p.x <= q.x ORDER BY p.x DESC LIMIT 1)'
\set x2 '(SELECT p.x FROM p WHERE p.x >= q.x ORDER BY p.x ASC  LIMIT 1)'
\set y1 '(SELECT p.y FROM p WHERE p.x <= q.x ORDER BY p.x DESC LIMIT 1)'
\set y2 '(SELECT p.y FROM p WHERE p.x >= q.x ORDER BY p.x ASC  LIMIT 1)'
SELECT
  q.x,
  CASE
WHEN :x1 = :x2 THEN :y1
ELSE(:y1 + (q.x-:x1)/(:x2-:x1)*(:y2-:y1))
  END AS y
FROM q;
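As a quick sanity check of the formula, take q.x = 3.5, which falls between the points (2,5) and (5,14) in p:

```sql
-- y = y1 + (x - x1)/(x2 - x1) * (y2 - y1)
--   = 5 + (3.5 - 2)/(5 - 2) * (14 - 5)
--   = 5 + 0.5 * 9 = 9.5
SELECT 5 + (3.5 - 2)/(5 - 2) * (14 - 5) AS y;  -- 9.5
```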

-- query B. also handles extrapolation, and the note column tells us
-- whether a certain y value is directly taken from p, or interpolated,
-- or extrapolated.
\set x0 '(SELECT p.x FROM p WHERE p.x <= q.x ORDER BY p.x DESC LIMIT 1 OFFSET 1)'
\set x3 '(SELECT p.x FROM p WHERE p.x >= q.x ORDER BY p.x ASC  LIMIT 1 OFFSET 1)'
\set y0 '(SELECT p.y FROM p WHERE p.x <= q.x ORDER BY p.x DESC LIMIT 1 OFFSET 1)'
\set y3 '(SELECT p.y FROM p WHERE p.x >= q.x ORDER BY p.x ASC  LIMIT 1 OFFSET 1)'

SELECT
  q.x,
  CASE
WHEN :x1 = :x2 THEN   :y1
WHEN :x1 IS NULL THEN (:y2 + (q.x-:x2)/(:x3-:x2)*(:y3-:y2))
WHEN :x2 IS NULL THEN (:y0 + (q.x-:x0)/(:x1-:x0)*(:y1-:y0))
ELSE  (:y1 + (q.x-:x1)/(:x2-:x1)*(:y2-:y1))
  END AS y,
  CASE
WHEN :x1 = :x2 THEN   'direct'
WHEN :x1 IS NULL THEN 'extrapolated to the left'
WHEN :x2 IS NULL THEN 'extrapolated to the right'
ELSE  'interpolated'
  END AS note
FROM q;

-- C. to handle the case where x in p is not unique, replace the
-- x0..x3 and y0..y3 templates with:

\set x1 '(SELECT p2.x FROM (SELECT DISTINCT x FROM p) p2 WHERE p2.x <= q.x ORDER BY p2.x DESC LIMIT 1)'
\set x2 '(SELECT p2.x FROM (SELECT DISTINCT x FROM p) p2 WHERE p2.x >= q.x ORDER BY p2.x ASC  LIMIT 1)'
\set y1 '(SELECT p.y  FROM (SELECT DISTINCT x FROM p) p2 LEFT JOIN p ON p.x=p2.x WHERE p2.x <= q.x ORDER BY p2.x DESC LIMIT 1)'
\set y2 '(SELECT p.y  FROM (SELECT DISTINCT x FROM p) p2 LEFT JOIN p ON p.x=p2.x WHERE p2.x >= q.x ORDER BY p2.x ASC  LIMIT 1)'
\set x0 '(SELECT p2.x FROM (SELECT DISTINCT x FROM p) p2 WHERE p2.x <= q.x ORDER BY p2.x DESC LIMIT 1 OFFSET 1)'
\set x3 '(SELECT p2.x FROM (SELECT DISTINCT x FROM p) p2 WHERE p2.x >= q.x ORDER BY p2.x ASC  LIMIT 1 OFFSET 1)'
\set y0 '(SELECT p.y  FROM (SELECT DISTINCT x FROM p) p2 LEFT JOIN p ON p.x=p2.x WHERE p2.x <= q.x ORDER BY p2.x DESC LIMIT 1 OFFSET 1)'
\set y3 '(SELECT p.y  FROM (SELECT DISTINCT x FROM p) p2 LEFT JOIN p ON p.x=p2.x WHERE p2.x >= q.x ORDER BY p2.x ASC  LIMIT 1 OFFSET 1)'


Questions:

1) Is the above correct? (It gives me correct results, but I'm not 100% 
sure.)

2) Does the optimizer cache the results of identical subqueries (e.g. :x1 
or :x2, which are mentioned several times in the query)? If yes, how do I 
verify this?

3) Is there another (simpler, more elegant, more efficient) way to do 
interpolation/extrapolation in SQL?

--
dave


[SQL] order and order line item

2005-12-27 Thread David Garamond
Suppose we have the usual order & order line item entities:

 CREATE TABLE "order" (
   id INT PRIMARY KEY,
   date DATE NOT NULL
 );

 CREATE TABLE orderlineitem (
   id INT PRIMARY KEY,
   orderid INT REFERENCES "order"(id),
   seq INT NOT NULL,
   CONSTRAINT con1 UNIQUE (orderid, seq),
   productid INT NOT NULL,
   quantity INT NOT NULL,
   price NUMERIC(18,4) NOT NULL
 );

how do we enforce things like, for example: a) each order must contain
at least 1 line item; or b) each order must have a non-zero total value;
without making INSERT painful? I'm new to triggers, so I'm thinking of
something like a "before COMMIT" trigger, but apparently there's no such
thing.
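For what it's worth, PostgreSQL's closest equivalent to a "before COMMIT" check is a deferred constraint trigger, which fires at commit time rather than per statement. A rough sketch covering only rule (a), against the tables above:

```sql
create or replace function check_order_has_items() returns trigger as $$
begin
  if not exists (select 1 from orderlineitem where orderid = NEW.id) then
    raise exception 'order % has no line items', NEW.id;
  end if;
  return NULL;
end;
$$ language plpgsql;

-- DEFERRABLE INITIALLY DEFERRED postpones the check until COMMIT,
-- so the order row and its line items can be inserted in either sequence
create constraint trigger order_has_items
  after insert on "order"
  deferrable initially deferred
  for each row execute procedure check_order_has_items();
```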

Regards,
dave



[SQL] constraint and ordered value

2005-12-27 Thread David Garamond
Is it possible to use only CHECK constraint (and not triggers) to
completely enforce ordered value of a column (colx) in a table? By that
I mean:

1. Rows must be inserted in the order of colx=1, then colx=2, 3, and so on;

2. When deleting (or updating), "holes" must not be formed, e.g. if
there are three rows then row with colx=3 must be the first one deleted,
and then colx=2 the second, and so on.

I can see #1 being accomplished using a NOT NULL + UNIQUE constraint and
a CHECK constraint that calls some PL function, where the function does a
simple check (new.colx=1 if COUNT(colx)=0, otherwise new.colx=MAX(colx)+1).

But is it possible to do #2 using only constraints?

Regards,
dave



Re: [SQL] constraint and ordered value

2005-12-29 Thread David Garamond
Daryl Richter wrote:
>> No. A constraint only applies to one row at a time. If you try to work
>> around this by calling a function that does queries, it isn't guaranteed
>> to work. And if you are thinking of calling a function that does a query,
>> you aren't looking at saving time over using triggers.
>>
>> Also, if you are going to have concurrent updates, you are going to need
>> to do table locking to make this work.
> 
> And, finally, you should ask yourself *why* you are doing this, given
> that one of the fundamental properties of a table (relation) is that
> the rows (tuples) are *unordered*. So much of what makes a relational
> db a wonderful thing for storing data depends on this notion.
> 
> If you provide an explanation of what you are trying to model, perhaps
> we can help you find a better schema design.

Thanks for the insightful answers. Actually I'm just learning about and
trying out CHECK constraints in Postgres (and Firebird) :-)

Regards,
dave



[SQL] autoupdating mtime column

2006-08-04 Thread David Garamond
Dear all,

Please see the SQL below. I'm already satisfied with everything except
that, in #4, I wish mtime were automatically updated to NOW() when not
explicitly SET in the UPDATE statement. Is there a way to make the mtime
column behave more like I want? Thanks in advance.

create table t1 (
  id int primary key,
  t text,
  ctime timestamp with time zone,
  mtime timestamp with time zone
);

create or replace function update_times() returns trigger as $$
begin
  if tg_op='INSERT' then
    if NEW.ctime is null then NEW.ctime = NOW(); end if;
    if NEW.mtime is null then NEW.mtime = NOW(); end if;
  elsif tg_op='UPDATE' then
    if NEW.ctime is null then NEW.ctime = OLD.ctime; end if;
    if NEW.mtime is null then NEW.mtime = NOW(); end if;
  end if;
  return NEW;
end;
$$ language plpgsql;

create trigger update_times before update or insert on t1
  for each row execute procedure update_times();

-- #1. mtime & ctime set to NOW()
insert into t1 values (1,'text1',null,null);
-- #2. mtime & ctime set to '2000-01-01'
insert into t1 values (2,'text2','2000-01-01','2000-01-01');
-- #3. mtime and ctime set to '2006-06-06'
update t1 set t='new text1',ctime='2006-6-6',mtime='2006-6-6' where id=1;
-- #4. mtime and ctime unchanged
update t1 set t='new text1' where id=1;
-- #5. mtime automatically updated to NOW()
update t1 set t='new text1',mtime=null where id=1;



Re: [SQL] autoupdating mtime column

2006-08-04 Thread David Garamond
On 8/4/06, Richard Huxton <[email protected]> wrote:
> David Garamond wrote:
>> Please see SQL below. I'm already satisfied with everything except I
>> wish in #4, mtime got automatically updated to NOW() if not explicitly
>> SET in UPDATE statement. Is there a way to make the mtime column behave
>> more like I wanted? Thanks in advance.
>
> Just check for OLD.mtime = NEW.mtime, or am I missing something here?

How do I differentiate between:

  UPDATE t SET mtime=mtime ...;

in which mtime is specifically set and should not change, and

  UPDATE t SET foo=bar ...;

in which mtime is not mentioned and should be updated automagically to
NOW()?

--
dave


Re: [SQL] autoupdating mtime column

2006-08-04 Thread David Garamond
On 8/4/06, Rodrigo De León <[EMAIL PROTECTED]> wrote:
> How about:
>
> create or replace function update_times() returns trigger as $$
> begin
>   if TG_OP='INSERT' then
>     new.ctime = coalesce(new.ctime,now());
>     new.mtime = coalesce(new.mtime,now());
>   elsif TG_OP='UPDATE' then
>     new.ctime = old.ctime;
>     new.mtime = now();
>   end if;
>   return new;
> end;
> $$ language plpgsql;

But that would disallow setting mtime and ctime to arbitrary values,
which I want to permit. ctime and mtime should be set to the current
time if not specified in INSERT, and mtime to the current time if not
specified in UPDATE. I guess what I want is MySQL's TIMESTAMP autoupdate
behaviour, which I can't seem to emulate in Postgres, because the
trigger is given no information about which columns are specified in the
SET clause, only the NEW and OLD records.

--
dave


Re: [SQL] autoupdating mtime column

2006-08-04 Thread David Garamond
On 8/4/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> If you are really intent on having a way to suppress the mtime update
> you could dedicate an additional field to the purpose, eg
>
>   UPDATE t SET foo=..., bar=..., keepmtime = true ...
>
> and in the trigger something like
>
>   if new.keepmtime then
>     new.keepmtime = false;
>   else
>     new.mtime = now();
>
> As long as nothing else ever touches keepmtime this would work.
> Personally I'm dubious that it's worth the trouble

Yeah, it's too expensive an overhead just for the sake of a slightly
shorter UPDATE statement.

> --- do you have a real use-case for suppressing mtime updates?

Syncing tables between databases (a la "rsync --times"). Btw, I'm
considering temporarily disabling the update_times() trigger when
syncing.

Thanks,
--
dave


Re: [SQL] autoupdating mtime column

2006-08-04 Thread David Garamond
On 8/5/06, Richard Huxton wrote:
>>> have a real use-case for suppressing mtime updates?
>> Syncing tables between databases (a la "rsync --times"). Btw, I'm
>> considering temporarily disabling the update_times() trigger when
>> syncing.
>
> I'd consider running the sync as a different (privileged) user and
> checking the current user in the trigger. Ordinary users always get
> now(), privileged users always get what they provide (and they are
> forced to provide some value). Does what you want and adds a safety
> catch too.
>
> Alternatively, you could do something similar with views.

That's a nice idea indeed. Thanks!

--
dave


[SQL] query: last N price for each product?

2008-07-04 Thread David Garamond
Dear SQL masters,

The query for "latest price for each product" goes like this (which I
can grasp quite easily):

 SELECT * FROM price p1
 WHERE ctime=(SELECT MAX(ctime) FROM price p2 WHERE p1.product_id=p2.product_id)

or:

 SELECT * FROM price p1
 WHERE NOT EXISTS (SELECT * FROM price p2 WHERE
p1.product_id=p2.product_id AND p1.ctime < p2.ctime)

but how do you query for the "latest 3 prices for each product"? The
"price" table contains the current price as well as price histories.

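If window functions are available (PostgreSQL 8.4+, which postdates this post), one way to sketch "latest 3 prices per product" is the following; the exact column list of "price" beyond product_id and ctime is an assumption:

```sql
-- number each product's prices from newest to oldest,
-- then keep only the first three per product
select * from (
  select p.*,
         row_number() over (partition by product_id
                            order by ctime desc) as rn
  from price p
) t
where rn <= 3;  -- rn = 1 is the latest price for that product
```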
Regards,
dave



[SQL] grouping/clustering query

2008-10-21 Thread David Garamond
Dear all,

I have an invoices (inv) table and bank transaction (tx) table.
There's also the payment table which is a many-to-many relation
between the former two tables, because each invoice can be paid by one
or more bank transactions, and each bank transaction can pay for one
or more invoices. Example:

# (invoiceid, txid)
(A, 1)
(A, 3)
(B, 1)
(B, 2)
(C, 5)
(D, 6)
(D, 7)
(E, 8)
(F, 8)

For journalling, I need to group/cluster this together. Is there a SQL
query that can generate this output:

# (journal: invoiceids, txids)
[A,B] , [1,2,3]
[C], [5]
[D], [6,7]
[E,F], [8]

Regards,
dave



Re: [SQL] grouping/clustering query

2008-10-24 Thread David Garamond
Tony, Joe, Steve,

Thanks for the follow-ups. Yes, the problem is related to double-entry
accounting, where one needs to balance total debit and credit
(payments and invoices) in each journal/transaction.

Due to time constraints, I ended up doing this in client-side code,
since I am nowhere near fluent in PLs. The algorithm should be simple
(at least the "brute force" version): check whether either element of
the pair (txid, invoiceid) is already mentioned in some journal; if it
is, add the pair to that journal, otherwise create a new journal with
the pair as its first entry. I believe this can easily be implemented
in a PL. But I still wonder if there is some SQL incantation that can
do the same without any PL.
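For the record, the grouping (which is really a connected-components problem) can be done in plain SQL with a recursive CTE, a feature added in PostgreSQL 8.4 after this thread. A sketch against the trans(inv_id, tx_id) table from Steve's script quoted below:

```sql
with recursive reach(inv_id, root) as (
    -- every invoice reaches itself ...
    select inv_id, inv_id from trans
  union
    -- ... and every invoice sharing a transaction with a reached one
    select t2.inv_id, r.root
    from reach r
    join trans t1 on t1.inv_id = r.inv_id
    join trans t2 on t2.tx_id = t1.tx_id
),
comp as (
  -- label each invoice's component by its smallest reachable invoice id
  select inv_id, min(root) as cid from reach group by inv_id
)
select array_agg(distinct t.inv_id) as invoiceids,
       array_agg(distinct t.tx_id) as txids
from comp c
join trans t on t.inv_id = c.inv_id
group by c.cid;
```

The UNION (rather than UNION ALL) deduplicates pairs, so the recursion terminates once no new (invoice, root) pair is reachable.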

Regards,
dave

On Sat, Oct 25, 2008 at 3:48 AM, Tony Wasson <[EMAIL PROTECTED]> wrote:
> On Fri, Oct 24, 2008 at 10:24 AM, Tony Wasson <[EMAIL PROTECTED]> wrote:
>> On Fri, Oct 24, 2008 at 8:04 AM, Steve Midgley <[EMAIL PROTECTED]> wrote:
>>> At 11:28 AM 10/23/2008, Joe wrote:

 Steve Midgley wrote:
>>
>> # (invoiceid, txid)
>> (A, 1)
>> (A, 3)
>> (B, 1)
>> (B, 2)
>> (C, 5)
>> (D, 6)
>> (D, 7)
>> (E, 8)
>> (F, 8)
>>
>> For journalling, I need to group/cluster this together. Is there a SQL
>> query that can generate this output:
>>
>> # (journal: invoiceids, txids)
>> [A,B] , [1,2,3]
>> [C], [5]
>> [D], [6,7]
>> [E,F], [8]
>
> Hi Dave,
>
> I'm not following the logic here. A has 1,3 and B has 1,2. So why does
> the first line print:
>
>> [A,B] , [1,2,3]
>
> What's the rule that tells the query to output this way? Is it that all
> of B's values are between A's values?

 From a purely accounting standpoint, since transaction 1 was applied to
 both invoices A and B, you need to group the invoices so that you can
 compare total invoiced against total paid.
>>>
>>> I tinkered around briefly but didn't come up with a good idea, but I bet
>>> someone on this list can. However, I did create a CREATE script for your
>>> table design which, in my experience, makes it more likely that a real
>>> expert will take on your problem..
>>>
>>> Hope this helps,
>>>
>>> Steve
>>>
>>> DROP TABLE IF EXISTS trans;
>>>
>>> CREATE TABLE trans
>>> (
>>>  id serial NOT NULL,
>>>  inv_id character varying,
>>>  tx_id character varying,
>>>  CONSTRAINT pk_id PRIMARY KEY (id)
>>> )
>>> WITH (OIDS=FALSE);
>>>
>>> insert into trans (inv_id, tx_id) values('A','1');
>>> insert into trans (inv_id, tx_id) values('A','3');
>>> insert into trans (inv_id, tx_id) values('B','1');
>>> insert into trans (inv_id, tx_id) values('B','2');
>>> insert into trans (inv_id, tx_id) values('C','5');
>>> insert into trans (inv_id, tx_id) values('D','6');
>>> insert into trans (inv_id, tx_id) values('D','7');
>>> insert into trans (inv_id, tx_id) values('E','8');
>>> insert into trans (inv_id, tx_id) values('F','8');
>
>
> Here's a stab at a custom aggregate attempting to explain what's going
> on inside.
>
> CREATE OR REPLACE FUNCTION varchar_array_accum_unique (x VARCHAR[], y
> VARCHAR) RETURNS VARCHAR[] AS $$
> DECLARE
>  res VARCHAR[];
> BEGIN
>  RAISE NOTICE 'input state is %',x;
>  RAISE NOTICE 'input variable is %',y;
>  IF x = '{}' THEN
>RAISE NOTICE 'x is empty, returning input variable %',y;
>res[1] := y;
>  ELSE
>RAISE NOTICE 'input array is not empty, checking if input
> variable is a member %',y;
>res := x;
>IF y = ANY(res) THEN
>RAISE NOTICE 'y is already in array %, skipping',res;
>ELSE
>res := array_append(res, y);
>RAISE NOTICE 'appending input variable %',y;
>END IF;
>  END IF;
>  RETURN res;
> END
> $$ LANGUAGE plpgsql STRICT;
>
> DROP AGGREGATE array_accum_unique(VARCHAR);
> CREATE AGGREGATE array_accum_unique
> (
>basetype = VARCHAR
>, sfunc = varchar_array_accum_unique
>, stype = VARCHAR[]
>, initcond = '{}'
> );
>
> SELECT array_accum_unique(inv_id) AS invoiceids
>  , array_accum_unique(tx_id) AS transactionids
> FROM (
>SELECT tx_id, inv_id
>FROM trans
>WHERE inv_id IN (
>SELECT inv_id
>FROM trans
>WHERE id IN (
>SELECT id FROM trans
> WHERE tx_id=1
>)
>)
>ORDER BY tx_id, inv_id
>  ) AS ss
> ;
>
>
> Returns this result for transaction id 1.
>
>  invoiceids | transactionids
> ------------+----------------
>  {A,B}  | {1,2,3}
> (1 row)
>
> Hope this helps!
> Tony
>



[SQL] Array: comparing first N elements?

2009-05-12 Thread David Garamond
I have a "materialized path" tree table like this (simplified):

CREATE TABLE product (
id SERIAL PRIMARY KEY,
parents INT[] NOT NULL,
name TEXT NOT NULL,
UNIQUE (parents, name)
);
CREATE INDEX name ON product(name);

Previously I used a TEXT column for parents, but arrays look interesting and
convenient, so I'm considering migrating to arrays. However, how do I rewrite
this using arrays?

SELECT * FROM product
WHERE parents LIKE '0001/0010/%';

In other words, testing against the first N elements in an array.
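For a known prefix length, PostgreSQL's array slice syntax can express the test directly, though note that (unlike LIKE 'prefix%' on a text column) a plain btree index on parents will not be used for it:

```sql
-- hypothetical: products whose first two ancestors are 1 and 10
SELECT * FROM product
WHERE parents[1:2] = ARRAY[1, 10];
```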

Regards,
Dave


Re: [SQL] Array: comparing first N elements?

2009-05-12 Thread David Garamond
On Tue, May 12, 2009 at 3:28 PM, Pavel Stehule wrote:

> Hello
>
> create or replace function eqn(anyarray, anyarray, int)
> returns boolean as $$
>  select not exists(select $1[i] from generate_series(1,$3) g(i)
>   except
>   select $2[i] from generate_series(1,$3) g(i))
> $$ language sql immutable strict;
>
> postgres=# select eqn(array[1,2,3,4,5], array[1,2,3,5,6], 3);
>  eqn
> -----
>  t
> (1 row)
>
> Time: 1,590 ms
> postgres=# select eqn(array[1,2,3,4,5], array[1,2,3,5,6], 4);
>  eqn
> -----
>  f
> (1 row)
>

Hi Pavel,

Thanks for the solution, but that's too slow. I'd rather just do this
instead:

select * from product
where parents[1:(select array_length(parents,1) from product where
name='wanted')+1]=
  (select parents from product where name='wanted')||
  (select id from product where name='wanted');

but the above query is also unable to use any indices (unlike LIKE 'foo%').

Regards,
Dave


Re: [SQL] Array: comparing first N elements?

2009-05-12 Thread David Garamond
2009/5/12 Achilleas Mantzios 

> you would want to look at the intarray contrib package for index support
> and many other goodies,
> also you might want to write functions first(parents), last(parents) and
> then have an index on those as well.
> This way searching for the direct children of a node is very fast.


Thanks for the suggestions! Index support is exactly what I'm looking for.
Will look into intarray.

Regards,
dave