Re: [SQL] celko nested set functions

2002-10-29 Thread greg



> I'm wondering if anyone has written code that enables you to move
> entities between parents in a nested set model. Specifically something
> that can do it without deleting any of the children involved in the
> process.  I looked in the postgresql cookbook which had adding and
> building trees, but not moving.  I'm hoping to find something
> postgresql specific but if not that's ok. Thanks in advance,


CREATE FUNCTION MoveTree (integer, integer) RETURNS text
AS '

-- Moves part of a nested set tree to another part.
-- Pass in the left of the child (from) and the left of the parent (to)

DECLARE

cleft INTEGER; cright INTEGER;
pleft INTEGER; pright INTEGER;
leftbound INTEGER; rightbound INTEGER;
treeshift INTEGER; cwidth INTEGER;

BEGIN

SELECT lft, rht FROM tree WHERE lft = $1 INTO cleft, cright;
SELECT lft, rht FROM tree WHERE lft = $2 INTO pleft, pright;

-- Make sure the child exists
IF cleft IS NULL THEN
  RETURN ''No entry found with a left of ''||$1;
END IF;

-- Make sure the parent exists
IF pleft IS NULL THEN
  RETURN ''No entry found with a left of ''||$2;
END IF;

-- Self-move makes no sense
IF cleft = pleft THEN
  RETURN ''Cannot move: entries are identical'';
END IF;

-- Parent cannot be underneath the child
IF pleft BETWEEN cleft AND cright THEN
  RETURN ''Cannot move: first entry contains second'';
END IF;

-- Child may already be in the proper place
IF cleft = pleft+1 THEN
  RETURN ''No changes need to be made'';
END IF;

IF cleft > pleft THEN 
  treeshift  := pleft - cleft + 1;
  leftbound  := pleft+1;
  rightbound := cleft-1;
  cwidth := cright-cleft+1;
ELSE
  treeshift  := pleft - cright;
  leftbound  := cright + 1;
  rightbound := pleft;
  cwidth := cleft-cright-1;
END IF;


UPDATE tree
  SET lft = CASE
WHEN lft BETWEEN leftbound AND rightbound THEN lft + cwidth
WHEN lft BETWEEN cleft AND cright THEN lft + treeshift
ELSE lft END,
  rht = CASE
WHEN rht BETWEEN leftbound AND rightbound THEN rht + cwidth
WHEN rht BETWEEN cleft AND cright THEN rht + treeshift
ELSE rht END;


RETURN ''Tree has been moved'';

END;
'
LANGUAGE 'plpgsql';
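
A quick usage sketch (the sample table and data below are assumptions for
illustration; the function itself only needs a tree table with lft/rht):

CREATE TABLE tree (name TEXT, lft INTEGER NOT NULL, rht INTEGER NOT NULL);
INSERT INTO tree VALUES ('root', 1, 8);
INSERT INTO tree VALUES ('a',    2, 3);
INSERT INTO tree VALUES ('b',    4, 7);
INSERT INTO tree VALUES ('b1',   5, 6);

-- Move the subtree rooted at 'a' (lft=2) underneath 'b' (lft=4):
SELECT MoveTree(2, 4);
-- Afterwards 'a' sits inside 'b': root(1,8), b(2,7), a(3,4), b1(5,6)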



Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200210291424







Re: [SQL] Trees: maintaining pathnames

2002-11-17 Thread greg



Instead of storing the path in each row, why not let Postgres 
take care of computing it with a function? Then make a view 
and you've got the same table, without all the triggers.

CREATE TABLE tree (
 idINTEGER NOT NULL,
 parent_id INTEGER,
 "name"TEXT NOT NULL,
 PRIMARY KEY (id)
);


INSERT INTO tree VALUES (1,NULL,'');
INSERT INTO tree VALUES (2,1,'usr');
INSERT INTO tree VALUES (3,1,'tmp');
INSERT INTO tree VALUES (4,1,'home');
INSERT INTO tree VALUES (5,4,'greg');
INSERT INTO tree VALUES (6,5,'etc');

CREATE OR REPLACE FUNCTION pathname(INTEGER)
RETURNS TEXT AS
'

DECLARE 
  mypath TEXT;
  myname TEXT;
  myid   INTEGER;

BEGIN

  SELECT parent_id,name FROM tree WHERE id=$1 INTO myid,mypath;
  IF mypath IS NULL THEN
RETURN ''No such id\n'';
  END IF;

  LOOP
SELECT parent_id,name FROM tree WHERE id=myid INTO myid,myname;
mypath := ''/'' || mypath;
EXIT WHEN myid IS NULL;
mypath := myname || mypath;
  END LOOP;

RETURN mypath;

END;
' LANGUAGE 'plpgsql';

CREATE VIEW mytree AS SELECT *, PATHNAME(id) AS path FROM tree;

SELECT * FROM tree ORDER BY id;

 id | parent_id | name
----+-----------+------
  1 |           |
  2 |         1 | usr
  3 |         1 | tmp
  4 |         1 | home
  5 |         4 | greg
  6 |         5 | etc
(6 rows)

SELECT * FROM mytree ORDER BY id;

 id | parent_id | name |      path
----+-----------+------+----------------
  1 |           |      | /
  2 |         1 | usr  | /usr
  3 |         1 | tmp  | /tmp
  4 |         1 | home | /home
  5 |         4 | greg | /home/greg
  6 |         5 | etc  | /home/greg/etc
(6 rows)

UPDATE tree SET name='users' WHERE id=4;

SELECT * FROM mytree ORDER BY id;

 id | parent_id | name  |      path
----+-----------+-------+-----------------
  1 |           |       | /
  2 |         1 | usr   | /usr
  3 |         1 | tmp   | /tmp
  4 |         1 | users | /users
  5 |         4 | greg  | /users/greg
  6 |         5 | etc   | /users/greg/etc
(6 rows)


Greg Sabino Mullane  [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200211172015







Re: [SQL] celko nested set functions

2002-11-25 Thread greg



Robert Treat and I came up with a better way to move 
nodes from one branch to another inside of a nested tree:



CREATE or REPLACE FUNCTION move_tree (integer, integer) RETURNS text
AS '

-- Moves part of a nested set tree to another part.
-- Pass in the left of the child (from) and the left of the parent (to)

DECLARE

cleft INTEGER; cright INTEGER;
pleft INTEGER; pright INTEGER;
leftbound INTEGER; rightbound INTEGER;
treeshift INTEGER; cwidth INTEGER;
leftrange INTEGER; rightrange INTEGER;

BEGIN

-- Self-move makes no sense
IF $1 = $2 THEN
  RETURN ''Cannot move: entries are identical'';
END IF;

SELECT lft, rgt FROM tree WHERE lft = $1 INTO cleft, cright;
SELECT lft, rgt FROM tree WHERE lft = $2 INTO pleft, pright;

-- Make sure the child exists
IF cleft IS NULL THEN
  RETURN ''No entry found with a left of ''||$1;
END IF;

-- Make sure the parent exists
IF pleft IS NULL THEN
  RETURN ''No entry found with a left of ''||$2;
END IF;

-- Parent cannot be underneath the child
IF pleft BETWEEN cleft AND cright THEN
  RETURN ''Cannot move: first entry contains second'';
END IF;

-- Child may already be in the proper place
IF cleft = pleft+1 THEN
  RETURN ''No changes need to be made'';
END IF;

IF cleft > pleft THEN 
  treeshift  := pleft - cleft + 1;
  leftbound  := pleft+1;
  rightbound := cleft-1;
  cwidth := cright-cleft+1;
  leftrange := cright;
  rightrange  := pleft;
ELSE
  treeshift  := pleft - cright;
  leftbound  := cright + 1;
  rightbound := pleft;
  cwidth := cleft-cright-1;
  leftrange  := pleft+1;
  rightrange := cleft;
END IF;


UPDATE tree
  SET lft = CASE
WHEN lft BETWEEN leftbound AND rightbound THEN lft + cwidth
WHEN lft BETWEEN cleft AND cright THEN lft + treeshift
ELSE lft END,
  rgt = CASE
WHEN rgt BETWEEN leftbound AND rightbound THEN rgt + cwidth
WHEN rgt BETWEEN cleft AND cright THEN rgt + treeshift
ELSE rgt END
WHERE lft < leftrange OR rgt > rightrange;

RETURN ''Tree has been moved'';

END;
'
LANGUAGE 'plpgsql';




Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200211251526








Re: [SQL] Delete 1 Record of 2 Duplicate Records

2003-01-30 Thread greg



If they are truly identical, then you must use the hidden 'oid' column to 
differentiate the two. No need to peek at the oid, just do this:

DELETE FROM test WHERE oid = (SELECT oid FROM test WHERE column_id=5 LIMIT 1);
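
(A caveat from a later perspective: tables created WITHOUT OIDS -- the
default on modern PostgreSQL -- have no oid column at all. The system
column ctid works the same way for this trick:)

DELETE FROM test WHERE ctid = (SELECT ctid FROM test WHERE column_id=5 LIMIT 1);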

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200301301006






Re: [SQL] Help with a query for charting

2003-02-02 Thread greg



> I'm trying to do a query to count the number of tickets opened on 
> each day of a month. It'll always be from the 1st to the end of 
> the month.
> ...
> But it doesn't give me a zero for the days when no tickets were opened

The problem is that SQL has no concept of how many days there are supposed 
to be inside of the range you gave it, nor does it have a way of easily 
determining how many months are in a year. You will have to put that 
information into the database: a simple table with a date field and one 
row per day should do it. Make sure that you go well beyond any days 
you will ever need. For example:

CREATE TABLE mydates (date_opened date);

(Using the same column name allows us to use "USING" instead of "ON" 
in our JOIN later on.)

Populate it somehow. Here is a quick and dirty way to add 1000 days:

perl -e \
"{print \"INSERT INTO mydates VALUES('\" . scalar 
localtime($^T+(\$x*86400)). \"');\n\"; redo if \$x++<1000}" \
| psql

Watch those escape characters!
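
(On PostgreSQL 8.0 and later the same population step can be done without
Perl via generate_series -- a sketch, not part of the original approach:)

INSERT INTO mydates
  SELECT d::date
    FROM generate_series(now(), now() + interval '1000 days', interval '1 day') AS g(d);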

Once you have such a table, you will need to join your query to it, 
by using a RIGHT OUTER JOIN (RIGHT OUTER as we are listing the 
important table first, then making sure that we have at least one 
row from the second, or "right" table). We also need to wrap the 
query for the first table inside of a subselect to allow us to use 
the GROUP BY with a JOIN. The date specification is only needed on 
the second table (mydates), although you could add it to the first 
as well if you wish. The TO_CHAR has been moved to the "outer level", 
so we can simply join on the DATE_TRUNC'ed column. Finally, a COALESCE 
on the count is added, in order to generate the wanted zeroes:

SELECT TO_CHAR(DATE_TRUNC('day',T2.date_opened), 'DD') AS "day", 
   COALESCE(T1.mycount,0) AS "count"
FROM 
  (SELECT date_opened, COUNT(*) AS mycount 
   FROM ticket GROUP BY date_opened) AS T1
RIGHT OUTER JOIN 
  (SELECT DISTINCT date_opened
   FROM mydates
   WHERE date_opened BETWEEN '23-Jan-2003' AND '26-Jan-2003') AS T2
USING (date_opened)
ORDER BY "day" ASC;

The DISTINCT is not strictly needed, but is a safeguard in case the 
mydates table has more than one entry with the same date.

Hope that helps.


--
Greg Sabino Mullane  [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200302021403








Re: [SQL] Elegant SQL solution:

2003-06-08 Thread greg



> SELECT date_part('month',rowdate) as month, count(*) as rows FROM mytable GROUP
> BY month;
>...
> I could create a one-column table with values 1 - 12 in it, and select from
> that table with a where clause matching "month". I could also create a view 
> "SELECT 1 UNION SELECT 2 UNION ..." and select against the view. There MUST be
> a more elegant way to do this.

The first solution is probably the best one. It does not seem that "unelegant" 
to me. Another way would be just to do it in the application itself.

...or you could consider this one I came up with. Use at your own risk ;)

SELECT dos.mym AS "Month", COALESCE(uno.rc,0) AS "Total" FROM 
  (SELECT date_part('month',rowdate) AS mym, count(*) AS rc
   FROM mytable GROUP BY 1) AS uno
RIGHT JOIN 
  (SELECT oid::integer-15 AS mym 
   FROM pg_type ORDER BY oid ASC LIMIT 12) AS dos
USING (mym);

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200306072131






Re: [SQL] max length of sql select statement (long!)

2003-07-15 Thread greg



> ... I don't want to take GO:06 into account (two parents in which 
> I am interested in). That means, whenever I ask for children of two nodes, 
> I want a DISTINCT SET of children.

To start with, you can avoid the Java and do this in SQL:

SELECT child FROM gograph WHERE parent='GO:002' OR parent='GO:005'
EXCEPT
(SELECT child FROM gograph WHERE parent='GO:002'
INTERSECT 
SELECT child FROM gograph WHERE parent='GO:005');


And yes, I would certainly start by normalizing things a little bit:


CREATE SEQUENCE goid_seq;

CREATE TABLE goID (
  idname TEXT,
  id INTEGER NOT NULL DEFAULT nextval('goid_seq')
);

INSERT INTO goid(idname) SELECT DISTINCT parent FROM gograph;

INSERT INTO goid(idname) 
SELECT DISTINCT child FROM gograph WHERE NOT EXISTS (SELECT 1 FROM goID WHERE idname = child);

CREATE TABLE gomap (
  parent INTEGER,
  child INTEGER
);

INSERT INTO gomap SELECT 
(SELECT id FROM goid WHERE idname=parent),
(SELECT id FROM goid WHERE idname=child)
FROM gograph;


As far as the binaryInteraction table, a little more information is needed: 
how are each of these tables being populated? Why the distinct? Is it because 
there may be duplicate rows in the table? The reason I ask is that it might be 
better to use triggers to compute some of the information as it comes in, 
depending on which tables are changed and how often.


--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200307151035






Re: [SQL] Count dates distinct within an interval

2003-07-15 Thread greg




> Now, I want to count the occurrences of each value of stuff in the table, 
> but so that entries within 24 hours from each other count as one...
>...
> A = 2001-01-01 20:20:00
> B = 2001-01-02 20:19:00
> C = 2001-01-02 20:21:00
> Should be counted as *two* (A === B, and B === C, but *not* A === C)...

You need to elaborate on your logic some more, and state exactly what you 
would want in the A,B,C case above. Does B get lumped with A or with C? 
It is within 24 hours of both, after all. Does C not get lumped in with 
B simply because B has already been lumped in with A?

Perhaps showing us the sample output of an ideal query would be best, 
along with some sample rows (e.g. use real insert statements)


--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200307151045






Re: [SQL] Count dates distinct within an interval

2003-07-15 Thread greg



> Also, I could certainly write a simple function, that would get all the 
> entries in order, and scan through them, counting according to my rules...
> But I was hoping to find some way to do this in plain sql though...

In this example, you are best off using a function.
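
If it helps, here is one possible sketch. It assumes a table
mytable(stuff text, ts timestamp) and one reading of your rule: an entry
only starts a new count when it falls more than 24 hours after the last
counted entry.

CREATE OR REPLACE FUNCTION count_lumped(text) RETURNS integer AS '
DECLARE
  myrec  RECORD;
  lastts TIMESTAMP;
  total  INTEGER := 0;
BEGIN
  FOR myrec IN SELECT ts FROM mytable WHERE stuff = $1 ORDER BY ts LOOP
    IF lastts IS NULL OR myrec.ts > lastts + INTERVAL ''24 hours'' THEN
      total := total + 1;   -- this entry starts a new 24-hour lump
      lastts := myrec.ts;
    END IF;
  END LOOP;
  RETURN total;
END;
' LANGUAGE 'plpgsql';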

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200307151137





Re: [SQL] parse error for function def

2003-07-17 Thread greg



> I'm trying to create a function to use on a trigger to check reference 
> to views since pg does not support foreign keys referencing views.

Can you explain exactly what you are trying to do and why? You are getting 
the error because a SQL function does not RETURN, it must end with a 
SELECT statement. It also has no DECLARE, BEGIN, or END. You can either 
remove all of those or change the language to plpgsql. See:

http://www.postgresql.org/docs/7.3/static/xfunc-sql.html

In addition, you cannot (with SQL) use an argument as the tablename. You 
also probably want to use EXISTS, not "count..::bool".
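
For illustration only (the view and column are made-up names): a
SQL-language version of such a check has no DECLARE/BEGIN/END and simply
ends with the SELECT whose result it returns:

CREATE FUNCTION ref_exists(integer) RETURNS boolean AS '
    SELECT EXISTS (SELECT 1 FROM myview WHERE id = $1)
' LANGUAGE sql;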

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200307171005






Re: [SQL] obtuse plpgsql function needs

2003-07-23 Thread greg




Perhaps something like this? It would be called like so:

SELECT a,b,c,msgmaker('t1',ctid) FROM t1 WHERE a=b;


CREATE OR REPLACE FUNCTION msgmaker(text,tid) RETURNS text AS '

DECLARE

  mytable ALIAS FOR $1;
  mytid   ALIAS FOR $2;
  myctid  TEXT;

  myquery TEXT;
  mylen   SMALLINT := 20;
  yourlen SMALLINT;
  mydec   SMALLINT;
  myinfo  TEXT;
  myrec   RECORD;
  biglist TEXT := \'Error\';

BEGIN

  myquery := \'
SELECT length(attname) AS lenny FROM pg_attribute
WHERE attnum >=1
AND attrelid = (SELECT oid FROM pg_class WHERE relname = \'\'\' || mytable || \'\'\')
ORDER BY 1 DESC LIMIT 1\';

  FOR myrec IN EXECUTE myquery LOOP
mylen := myrec.lenny;
  END LOOP;

  myquery := \'
SELECT attname, atttypid, atttypmod FROM pg_attribute
WHERE attnum >=1
AND attrelid = (SELECT oid FROM pg_class WHERE relname = \'\'\' || mytable || \'\'\')
ORDER BY attname ASC\';

  myinfo := \'SELECT \';

  FOR myrec IN EXECUTE myquery LOOP
myinfo := myinfo || \'\'\'- \' || myrec.attname || \': \';
yourlen := LENGTH(myrec.attname);
LOOP
  myinfo := myinfo || \' \';
  yourlen := yourlen + 1;
  EXIT WHEN yourlen > mylen;
END LOOP;
myinfo := myinfo || \'\'\' || COALESCE(\';
IF myrec.atttypid = 1184 THEN
  myinfo := myinfo || \'TO_CHAR(\' || myrec.attname || \',\'\'Mon DD, YYYY HH24:MI\'\')\';
ELSIF myrec.atttypid = 16 THEN
  myinfo := myinfo || \'CASE WHEN \' || myrec.attname || \' IS TRUE THEN \'\'True\'\' ELSE \'\'False\'\' END\';
ELSIF myrec.atttypid = 17 THEN
  myinfo := myinfo || \'ENCODE(\' || myrec.attname || \',\'\'hex\'\')\';
ELSIF myrec.atttypid = 1700 THEN
  SELECT substr(rtrim(format_type(myrec.atttypid, myrec.atttypmod),\')\'), position(\',\' IN format_type(myrec.atttypid, myrec.atttypmod))+1) INTO mydec;
  myinfo := myinfo || \'TO_CHAR(\' || myrec.attname || \',\'\'FM990\';
  IF mydec > 1 THEN
myinfo := myinfo || \'.\';
LOOP
  myinfo := myinfo || \'0\';
  mydec := mydec - 1;
  EXIT WHEN mydec < 1;
END LOOP;
  END IF;
  myinfo := myinfo || \'\'\')\';
ELSE 
  myinfo := myinfo || myrec.attname;
END IF;
myinfo := myinfo || \'::text,\'\'\'\'::text) || \'\'n\'\' || \\n\';
  END LOOP;

  SELECT mytid INTO myctid;

  myinfo := myinfo || \'\'\'n\'\' AS info FROM \' || mytable || \' WHERE ctid = \'\'\' || myctid || \'\'\'\';

  FOR myrec IN EXECUTE myinfo LOOP
biglist := myrec.info;
  END LOOP;

  RETURN biglist;

END;

' LANGUAGE 'plpgsql';




--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200307231536





Re: [SQL] Is there something wrong with Perl`s dbi and PostgreSQL?

2003-09-29 Thread greg



> I just tried running a perl script that I wrote about a year ago. It had
> worked for me in the past. Now it doesn't work.

You have AutoCommit set to off, and never commit the transaction. Therefore, 
the table creation is rolled back. Add a 

$dbh->commit()

after your execute line and it should work as expected.

Also note that your "die" on connect will not work because you have RaiseError 
set: Leave it off for the connect, then turn it on again immediately via:

$dbh->{RaiseError}=1;


--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200309291445






Re: [SQL] How can I produce the following desired result?

2003-10-15 Thread greg



> How can I produce the following desired result?

SELECT * FROM mytable
JOIN 
(SELECT goodid, MAX(storehistoryid) AS storehistoryid FROM mytable GROUP by 1) AS a
USING (goodid,storehistoryid);
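
An alternative using PostgreSQL's DISTINCT ON, on the assumption that you
want the row carrying the highest storehistoryid per goodid:

SELECT DISTINCT ON (goodid) *
  FROM mytable
 ORDER BY goodid, storehistoryid DESC;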


--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200310151046







Re: [SQL] Equivalent of Reverse() functions

2003-11-29 Thread greg

 
 
> I am migrating an SQL Server 2000 database to Postgres 7.3.4 running
> on RH Linux 7.2. While migrating I encounter SQL Server REVERSE( )
> function, seems no such functions at Postgres.
>
> Is there a equivalent function available at Postgres? Please
> shed some light
 
No. If you have plperl installed, as somebody already posted, you can
take advantage of perl's built in reverse() function. Here is a
plpgsql version:
  
CREATE OR REPLACE FUNCTION reverse(TEXT) RETURNS TEXT AS '
  DECLARE
original ALIAS FOR $1;
reversed TEXT := \'\';
onechar  VARCHAR;
myposINTEGER;
  BEGIN
SELECT LENGTH(original) INTO mypos;
LOOP
  EXIT WHEN mypos < 1;
  SELECT substring(original FROM mypos FOR 1) INTO onechar;
  reversed := reversed || onechar;
  mypos := mypos -1;
END LOOP;
RETURN reversed;
  END
' LANGUAGE plpgsql IMMUTABLE RETURNS NULL ON NULL INPUT;
  
pg=> SELECT reverse('A man, a plan, a canal, Panama');
 
            reverse
--------------------------------
 amanap ,lanac a ,nalp a ,nam A
 
 
--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200311291246





Re: [SQL] Updating session id based on accesstimeout

2003-12-02 Thread greg

 
 
> I want to set a session id based on a session timeout per user. I
> have a very large logfile with more than 1.3 Mio records and look
> for a very efficient solution - maybe with PL/pgSQL .
 
Can you post a clearer example with the exact columns in the table,
and what you wish to do?
 
--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200312022156





Re: [SQL] Fetch a single record

2003-12-09 Thread greg

 
 
> I'm looking for the fastest way to fetch a single record from a table.
> I don't care what record it is.
 
That's a strange request. If you really don't care what comes back, you
don't even need to query a table:
 
SELECT 1;
 
If you perhaps want the column names, query the system tables.
 
Otherwise the LIMIT 1 should be very fast, especially if you have no
ordering. Avoid the "SELECT *" if you don't need all the columns
for a little more speed.
 
--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200312091943
 





[SQL] ANN: www.SQL-Scripts.Com

2004-04-19 Thread Greg



Hello,

Announcing the release of a new web site: www.SQL-Scripts.com

At www.SQL-Scripts.Com you can find a collection of SQL scripts for many
different database systems. Using our search system you can find scripts
that you need quickly and simply. If you have scripts that you use, why not
lodge them on our site? If you lodge your script you could win a prize. In
addition to scripts there are many documents, news feeds and articles.
We have a monthly contest for the first people to lodge 20 scripts in one
month. Please help to make this site great.

Webmaster at www.SQL-Scripts.com


[SQL] Retrieving the new "nextval" for primary keys....

2002-08-28 Thread Greg Patnude

I am using postgreSQL with Perl::CGI and Perl::DBI::Pg... I would like to be
able to insert a row from my Perl script [$SQL->exec();] and have postgreSQL
return the id of the newly inserted record (new.id) directly to the Perl
script for further processing... Anyone with a solution / idea ???

Nearly EVERY table I create in postgreSQL (7.2) has the following minimum
structure:

create table "tblName" (

id int4 primary key nextval ("tblName_id_seq"),

..field...
..field...
..field...

create_dt date default 'CURRENT_DATE',
change_dt timestamptz default 'now()',
active_flag bool default 'TRUE'

)
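
One common answer (a sketch using the table above; "somefield" stands in
for a real column): run currval() on the same connection immediately after
the INSERT. currval() is per-session, so other backends cannot disturb it.

INSERT INTO "tblName" (somefield) VALUES ('abc');  -- somefield is hypothetical
SELECT currval('"tblName_id_seq"');                -- id assigned by this session's insert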










[SQL] Why must the function that a trigger calls return "opaque" ???

2002-08-30 Thread Greg Patnude

This seems rather limiting...  Suppose I want a trigger that after insert,
returns the currval(sequence) of the newly inserted row automatically
without having to run another query ???

GP








Re: [SQL] Passing array to PL/SQL and looping

2002-09-28 Thread Greg Johnson




Is it possible to construct and return an array with plpgsql 

like..

CREATE FUNCTION test_array( ) RETURNS VARCHAR[] AS '
DECLARE 
    return_array VARCHAR[];
BEGIN
    return_array[0] := ''test'';
    return_array[1] := ''test 1'';
    return_array[2] := ''test 2'';
RETURN (return_array);
END;'
LANGUAGE 'plpgsql';

I get the following error when I try to run it:
ERROR:  parse error at or near "[" on line 4.

If this worked I could clean up a LOT of hacky plpgsql code I have had to write.

On Thu, 2002-09-26 at 18:20, Josh Berkus wrote:

Peter,

> I'm a newbie to PL/SQL and need help badly. I'm trying to pass my array of
> id's to the function and then loop through until the array is empty. I know
> there must be atleast five things I'm doing wrong.

Simplified example:

CREATE FUNCTION test_array (
	INT[] ) 
RETURNS INT AS '
DECLARE id_array ALIAS for $1;
	count_it INT;
BEGIN
count_it := 1;
WHILE id_array[count_it] LOOP
	count_it := count_it + 1;
END LOOP;
RETURN (count_it - 1);
END;'
LANGUAGE 'plpgsql';

returns the number of elements in the supplied array.

-- 
Josh Berkus
[EMAIL PROTECTED]
Aglio Database Solutions
San Francisco





-- 
Greg Johnson <[EMAIL PROTECTED]>








Re: [SQL] Inquiry From Form [pgsql]

2003-03-02 Thread Greg Stark

Jack Kiss <[EMAIL PROTECTED]> writes:

> 1) Do you have a function which is similar to Oracle's DECODE.

Yes

> 2) Can you extract day of week (Monday, etc) from your date functions.

Yes

Check out the "Date/Time Function and Operators" and the "Conditional
Expressions" sections of this:

http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/functions.html
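
For illustration (the table and column here are made-up names): DECODE maps
onto a CASE expression, and the day of week comes out of EXTRACT or TO_CHAR:

-- Oracle: DECODE(status, 1, 'open', 2, 'closed', 'other')
SELECT CASE status WHEN 1 THEN 'open' WHEN 2 THEN 'closed' ELSE 'other' END FROM orders;

SELECT EXTRACT(DOW FROM CURRENT_DATE);   -- 0 = Sunday ... 6 = Saturday
SELECT TO_CHAR(CURRENT_DATE, 'Day');     -- 'Monday   ', blank-padded name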


-- 
greg




Re: [SQL] HardCORE QUERY HELP!!!

2003-03-03 Thread Greg Stark

Metnetsky <[EMAIL PROTECTED]> writes:

> It's for a class and my professor has a thing for brain teaser type
> questions.

Incidentally, TAs and Profs aren't stupid, and have been known to check on
newsgroups and mailing lists for students asking for people to do their
homework for them.

--
greg




Re: [SQL] Forcing query to use an index

2003-03-03 Thread Greg Stark

One suggestion I'll make about your data model -- I'm not sure it would
actually help this query, but might help elsewhere:

 WHERE ( C.Disabled > '2003-02-28'
  OR C.Disabled IS NULL
   )

Don't use NULL values like this. Most databases don't index NULLs (Oracle) or
even if they do, don't make "IS NULL" an indexable operation (postgres).
There's been some talk of changing this in postgres but even then, it wouldn't
be able to use an index for an OR clause like this.

If you used a very large date, like 9999-01-01, as your "not deactivated" value
then the constraint would be C.disabled > '2003-02-28' and postgres could use
an index on "disabled".

Alternatively if you have a disabled_flag and disabled_date then you could
have an index on disabled_flag,disabled_date and uhm, there should be a way to
use that index though I'm not seeing it right now. 

This won't matter at first when 99% of your customers are active. And ideally
in this query you find some way to use an index to find "kate" rather than
doing a fully table scan. But later when 90% of the clients are disabled, then
in a bigger batch job where you actually want to process every active record
it could prevent postgres from having to dig through a table full of old
inactive records.

> This may make better use of your index, because the planner will have a more 
> accurate estimate of the number of rows returned from the outer join.
> 
> AND:
> 
>AND ( C.Accountnum~* 'kate'
>   OR C.Firstname ~* 'kate'
>   OR C.Lastname  ~* 'kate'
>   OR C.Organization  ~* 'kate'
>   OR C.Address   ~* 'kate'
>   OR C.Postal~* 'kate'
>   OR C.City  ~* 'kate'
>   OR EA.Name || '@' || JoinDomain(EA.Domain) ~* 'kate'
> 
> This set of expressions has "seq scan" written all over it.   I highly suggest 
> that you try to find a way to turn these into anchored text searches, perhaps 
> using functional indexes on lower(column).

If you really need to find substring matches everywhere you might want to look
into the full text search module in contrib/tsearch. I haven't started using
it yet but I expect I will have to when I get to that part of my project. 

> Finally:
> 
>   OR CMS.Package ~* 'kate'

*confusion*. Oooh, Yeah, this one is a big problem. It means it's not clear
which end of the join to start with. Maybe it would be better to separate this
into two separate queries, give the user the option to search for a user
"kate" or a package "kate" but not both simultaneously.

-- 
greg




Re: [SQL] Forcing query to use an index

2003-03-03 Thread Greg Stark


> ->  Merge Join  (cost=6106.42..6335.30 rows=2679 width=265) (actual time=859.77..948.06 rows=1 loops=1)

Actually another problem, notice the big discrepancy between the estimated row
and the actual rows. That's because you have the big OR clause so postgres
figures there's a good chance one of the clauses will be true so it estimates
a lot of rows will match. In fact of course they're all very selective and
you'll usually probably only get a few records.

If you're stuck with the unanchored text search it will always do a full table
scan so it will never be lightning fast. But it would probably be a bit
faster if you put a limit clause (on a subquery) on the table that's doing the
full table scan. 

That will convince postgres that there won't be thousands of resulting
records, which might convince it to do a nested loop.

Also, as a beneficial side effect, it will also limit the damage if one of your
users
does a search for "e"...

This only really helps if you can get rid of the OR CMS.package clause...
otherwise it actually needs all the records in case they match a summary
record with a kate package.

-- 
greg




Re: [SQL] Gist indexes on int arrays

2003-03-04 Thread Greg Stark

Greg Stark <[EMAIL PROTECTED]> writes:

> Can I have a GiST index on (foo_id, attribute_set_array) and have it be just
> as fast at narrowing the search to just foo_id = 900 but also speed up the ~
> operation?

Hm, so if I understand what I'm reading I can do this if I load the btree_gist
contrib module as well. I'm still not sure whether it'll be worthwhile for
this application though.

I have a bit of a problem though. Is building GiST indexes supposed to take
much much longer than building btree indexes? It's been running nearly an hour
and it's still going. The hard drive is hardly moving so it seems to be all
cpu usage. I don't even see any pgsql_tmp usage.

db=# CREATE INDEX cache_gist_idx on cache using gist ( foo_id, attribute_set gist__int_ops);

postgres 30176 86.3 22.2 64896 57344 ?  R  11:08  40:32 postgres: postgres slo [local] CREATE INDEX


I don't remember exact numbers but building the normal btree index took on the
order of 15m. This will have to be rebuilt nightly, an hour long index build
won't be practical.

--
greg




Re: [SQL] Complex outer joins?

2003-03-26 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> The SQL-standard way of writing this would presumably be either
> 
> from G left join L on (G.SELID = L.SELID)
>  left join C on (L.SELID = C.SELID and L.SELLEVEL = C.SELLEVEL)

I would think of it as this one.

> from G left join
>  (L left join C on (L.SELID = C.SELID and L.SELLEVEL = C.SELLEVEL))
>  on (G.SELID = L.SELID)

I don't see how that would be at all different.

> depending on which join you think ought to be done first.  It might be
> that the results are the same in this case, but I'm not convinced of
> that.  In general the results of outer joins definitely depend on join
> order.

I'm pretty sure Oracle actually builds an abstract join representation where
the two queries above would actually be represented the same way. Then decides
the order from amongst the equivalent choices based on performance decisions.

Can you show an example where the join order would affect the result set? I
can't think of any.


-- 
greg




Re: [SQL] LEAST and GREATEST functions?

2003-07-01 Thread Greg Stark
Josh Berkus <[EMAIL PROTECTED]> writes:

> Stefan,
> 
> > I know the LEAST and GREATEST functions are not part
> > of standard SQL, but they sure were handy where I came
> > from (Oracle-land).
> 
> Um, what's wrong with MAX and MIN, exactly?

MAX and MIN are single-parameter aggregate functions. LEAST and GREATEST are
two-parameter (though in postgres they could be defined for 3 and more
parameters) scalar functions.

eg:

SELECT max(a) FROM bar 

would return a single tuple with the maximum value of a from amongst every
record. whereas:

SELECT greatest(a,b) FROM bar

would return one tuple for every record in the table with a single value
representing the greater of bar.a and bar.b.

You could define your own functions to do this but it would be tiresome to
define one for every datatype.
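
A sketch of such a function for one datatype (it sidesteps the
NULL-handling details a built-in version would define):

CREATE FUNCTION greatest(integer, integer) RETURNS integer AS
  'SELECT CASE WHEN $1 > $2 THEN $1 ELSE $2 END'
LANGUAGE sql IMMUTABLE;

SELECT greatest(a,b) FROM bar;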

-- 
greg




Re: [SQL] summing tables

2003-07-15 Thread Greg Stark

To solve this problem efficiently you probably need the lead/lag analytic
functions. Unfortunately Postgres doesn't have them.

You could do it with something like:

update foo set c = a+b+(select c from foo as x where seq < foo.seq ORDER BY seq desc LIMIT 1)

or the more standard but likely to be way slower:

update foo set c = a+b+(select c from foo as x where seq = (select max(seq) from foo as y where seq < foo.seq))


However, i would suggest that if you have an implicit relationship between
records you should make that relationship explicit with a foreign key. If you
had a column that contained the seq of the parent record then this would be
easy. I'm really puzzled how this query as currently specified could be
useful.


-- 
greg




Re: [SQL] summing tables

2003-07-15 Thread Greg Stark

"Viorel Dragomir" <[EMAIL PROTECTED]> writes:

> Anyway, in real life this update modifies only one row with a value which is
> diff of null. It was really handy if it was specified the option ORDER for
> the update command.

Are you hoping to produce a running total? That's very difficult in standard
SQL. That would be very different from the query you asked for. 

Running totals, ranking, lead/lag, are all things that are very difficult to
do in standard SQL. They don't fit in the unordered set model that SQL follows
so doing them without special non-standard functions is very hard and
inefficient. 

The functions to do them don't fit well within the SQL universe either, which
might be why they don't exist yet in postgres.
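
(A footnote from a later vantage point: window functions did eventually
arrive in PostgreSQL 8.4, at which point a running total over a table
foo(seq, a, b) is one line:)

SELECT seq, a, b, SUM(a + b) OVER (ORDER BY seq) AS running_total FROM foo;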

-- 
greg




Re: [SQL] Recursive request ...

2003-07-17 Thread Greg Stark
Joe Conway <[EMAIL PROTECTED]> writes:

> BenLaKnet wrote:
> > I see connect by in Oracle
> > ??? is there an equivalent in PostgreSQL or not ??
> 
> Someone is working on the SQL99 equivalent, but it isn't done yet. Perhaps for
> 7.5.

There's a connectby hack in the contrib/tablefunc directory. I haven't used it
so I'm not clear on how powerful it is compared to the real deal, but people
on one of the pgsql lists seemed to find it useful when it came up in the
past.


connectby(text relname, text keyid_fld, text parent_keyid_fld,
text start_with, int max_depth [, text branch_delim])
  - returns keyid, parent_keyid, level, and an optional branch string
  - requires anonymous composite type syntax in the FROM clause. See
the instructions in the documentation below.
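
A usage sketch, assuming a table tree(id integer, parent_id integer) with a
root row of id 1; the AS clause supplies the anonymous composite type
mentioned above:

SELECT * FROM connectby('tree', 'id', 'parent_id', '1', 0, '/')
    AS t(keyid integer, parent_keyid integer, level integer, branch text);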


-- 
greg




Re: [SQL] Table Partitioning and Rules

2003-07-17 Thread Greg Stark

"Girish Bajaj" <[EMAIL PROTECTED]> writes:

> I cant possibly index all the cols in the table. So I thought Id best manage
> the data by splitting up the table into multiple partitions and eventually
> depending on application logic, only scan those tables that are necessary to
> scan sequentially instead of the whole big table.

But that's only going to help if one of the columns they're searching on is
the last name column isn't it?

I'm a fan of partitioned tables but you have to consider what advantage you're
trying to achieve to understand if it will actually be helpful for you:


Partitioned tables only really improve query performance if virtually all
queries use a common constraint. The canonical example is accounting tables
being partitioned based on fiscal year. Virtually all the queries--even ones
doing massive batch queries best served by sequential scans--will only scan
the current fiscal year.

In your case unless you can impose a constraint on the UI that users always
perform their queries on a single letter of the alphabet at a time and only
see results for people whose last names match that letter, it's not really a
great match as far as query performance.


The other advantage of partitioned tables is space management; it allows
placing each partition on a separate physical storage space. However without
native support in Postgres doing it via rules is going to be a headache. I
would think you would be better off striping the disks together and storing it
as a single large table. That's the only clean approach Postgres really allows
at this point anyways. 


Finally, if I WAS going to partition based on the first letter of a text
string, which I doubt I would, I would probably create 26 partitions right off
the bat. Not try to make up arbitrary break points. If those arbitrary
breakpoints turn out to be poorly chosen it'll be a complex manual job to move
them. Whereas if you just have 26 partitions some will be large and some small
and you can move partitions between physical storage freely to balance things.

-- 
greg




Re: [SQL] casting to arrays

2003-07-19 Thread Greg Stark

Joe Conway <[EMAIL PROTECTED]> writes:

> Not possible in current releases, but it will be in 7.4 (about to start beta).
> It looks like this:

Well there is the int_array_aggregate function in the contrib/intagg
directory. It has to be compiled separately, and it has a few quirks (like the
arrays are zero-based instead of 1-based) but it works more or less, and it
does exactly what you describe.
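
A usage sketch of the contrib version (hypothetical table and column names):

-- one int4[] of category ids per product id:
SELECT pid, int_array_aggregate(pcid) AS pcids
  FROM p_pc
 GROUP BY pid;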

But the 7.4 stuff should be much cleaner and more flexible, so if you don't
need it right now you're better off waiting.

-- 
greg




Re: [SQL] min() and NaN

2003-07-24 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> NULL can be special, because it acts specially in comparisons anyway.
> But NaN is just a value of the datatype.

Does postgres intend to support all the different types of NaN? Do you
intend to have +Inf and -Inf and underflow detection and all the other goodies
you actually need to make it useful?

If not it seems more useful to just use the handy unknown-value thing SQL
already has and turn NaN into a NULL. 

-- 
greg




Re: [SQL] numeric and float converts to int differently?

2003-10-28 Thread Greg Stark

"SZÛCS Gábor" <[EMAIL PROTECTED]> writes:

> > cannot see is that the float values are not actually exactly 0.5
> 
> Yes I could guess that (floating point vs fixed), but is this a coincidence
> that both '0.5'::float and '-0.5'::float are closer to 0, whereas they could
> be closer to +/-1, as well as both closer to the lower or upper bound.

Wouldn't 0.5 and -0.5 be representable exactly as floats?


-- 
greg




Re: [SQL] Table versions

2003-10-29 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Returning to the original problem, it seems to me that comparing "pg_dump
> -s" output is a reasonable way to proceed.  

I've actually started checking in a pg_dump -s output file into my CVS tree. 

However I prune a few key lines from it. I prune the TOC OID numbers from it,
and anything not owned by the user I'm interested in.

The makefile rule I use looks like:

schema.sql:
	pg_dump -U postgres -s user | sed '/^-- TOC entry/d;/^\\connect - postgres/,/^\\connect - user/d;/^SET search_path/d;/^$$/d;/^--$$/d' > $@


This still suffers from one major deficiency. The order that objects are
output isn't necessarily consistent between databases. If I add tables to
the development server but then add them to the production server in a
different order the schema still shows differences even though the objects in
the two databases are identical.

-- 
greg




Re: [SQL] Getting last insert value

2003-11-15 Thread Greg Stark

Guillaume LELARGE <[EMAIL PROTECTED]> writes:

> Doing a "select currval() from my_table" after your insert should work.

That's "select currval('my_table_pkcol_seq')" actually. 

The above would have called the currval() function for every record of the
table which isn't what you want and in any case currval takes an argument.

-- 
greg




Re: [SQL] Programatically switching database

2003-11-16 Thread Greg Stark
ow <[EMAIL PROTECTED]> writes:

> My concern though ... wouldn't pgSql server collapse when faced with
> transaction spanning across 100M+ records? 

The number of records involved really doesn't faze Postgres at all. However
the amount of time spent in the transaction could be an issue if there is
other activity in other schemas of the same database.

As long as the transaction is running none of the deleted or old updated data
in any schema of the database can be cleaned up by vacuum as postgres thinks
the big transaction "might" need to see it sometime.

So if the rest of the database is still active the tables and indexes being
updated may grow larger than normal. If it goes on for a _really_ long time
they might need a VACUUM FULL at some point to clean them up.

-- 
greg




Re: [SQL] Expressional Indexes

2003-11-18 Thread Greg Stark

"Randolf Richardson, DevNet SysOp 29" <[EMAIL PROTECTED]> writes:

>   For example, if I want to index on a date field but only have the index 
> keep track of the most recent 30 days (and then create a secondary index for 
> all dates) so as to improve performance on more heavily loaded systems.
> 
>   Am I understanding this new terminology correctly?  Thanks in advance.

No, you could do the above using "partial indexes" but it wouldn't work very
well in this case because the "last 30 days" keeps moving and you would have
to keep redefining the index periodically. It also wouldn't really help
performance.

Expression Indexes are just more powerful "functional indexes". In 7.3 they
could be used for indexing expressions like "lower(foo)". In 7.4 they're more
powerful and you can index expressions other than simple function calls. 

They still should be things that always return the same value, which excludes
subqueries.

-- 
greg




Re: [SQL] Expressional Indexes

2003-11-21 Thread Greg Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> No, because the above represents a moving cutoff; it will (and should)
> be rejected as a non-immutable predicate condition.  You could do
> something like
> 
>   CREATE INDEX my_Nov_03_index on my_table (create_date)
> WHERE (create_date >= date '2003-11-01');
> 
> and then a month from now replace this with
> 
>   CREATE INDEX my_Dec_03_index on my_table (create_date)
> WHERE (create_date >= date '2003-12-01');
> 
> bearing in mind that this index can be used with queries that contain
> WHERE conditions like "create_date >= some-date-constant".  The planner
> must be able to convince itself that the right-hand side of the WHERE
> condition is >= the cutoff in the index's predicate condition.  Since
> the planner is not very bright, both items had better be simple DATE
> constants, or it won't be able to figure it out ...

Note that if you're just doing this to speed up regular queries where you have
create_date in some small range, then you'll likely not see much of an
increase. Mainly you'll just save space.

What can be interesting is to create a partial index like this but over a
second unrelated column. Something like:

CREATE INDEX my_dec_03_index on my_table (userid)
 WHERE (create_date >= date '2003-11-02');

Then you can do queries like

SELECT * FROM my_table WHERE userid = ? AND create_date >= date '2003-11-02'

And it'll be able to efficiently pull out just those records, even if there
are thousands more records that are older than 2003-11-02.

This avoids having to create a two-column index with a low-selectivity column
like "month".

-- 
greg




Re: [SQL] Scaler forms as function arguments

2003-11-28 Thread Greg Stark

Joe Conway <[EMAIL PROTECTED]> writes:

> In 7.4 you could use an array. It would look like this:


Though note that 7.4 doesn't know how to optimize this form:


db=> explain select * from foo where foo_id in (1,2);
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Index Scan using foo_pkey, foo_pkey on foo  (cost=0.00..6.05 rows=2 width=756)
   Index Cond: ((foo_id = 1) OR (foo_id = 2))
(2 rows)


db=> explain select * from foo where foo_id = ANY (array[1,2]);
  QUERY PLAN  
--
 Seq Scan on foo  (cost=0.00..1132.82 rows=5955 width=756)
   Filter: (foo_id = ANY ('{1,2}'::integer[]))
(2 rows)

-- 
greg




Re: [SQL] OFFSET and subselects

2003-11-28 Thread Greg Stark

[EMAIL PROTECTED] (Dmitri Bichko) writes:

> I am running in trouble with pagination here, somehow (rather naively) I
> assumed that when doing a LIMIT and OFFSET, the subselects on the records
> before the OFFSET would not be performed, which quite apparently is not the
> case. So, LIMIT 50 OFFSET 0 takes 100ms to run, LIMIT 50 OFFSET 50 takes
> 200ms, LIMIT 50 OFFSET 100 takes 300ms; and so forth, this really becomes
> unacceptable after a few pages.

If you don't need any of the results of the subqueries in your WHERE clause
then you can do this by introducing a view in your query like:

SELECT *,
   (SELECT ...) AS sub_1,
   (SELECT ...) AS sub_2,
   (SELECT ...) AS sub_3
  FROM (
SELECT x,y,z
  FROM ...
 WHERE ...
   ) AS t
 LIMIT 50
OFFSET 50


If you do use the results of the subqueries in your where clause or order by
clause then, well, you're SOL. Since the OFFSET and LIMIT clauses only kick in
after the where clause restrictions are taken into account.

-- 
greg




Re: [SQL] Index not recognized

2003-12-06 Thread Greg Stark

"Grace C. Unson" <[EMAIL PROTECTED]> writes:

> Why is it that my index for text[] data type is not recognized by the
> Planner?
> 
> I did these steps:
> 
> 1. create function textarr(text[]) returns text language sql as 'select
> $1[1]' strict immutable
> 2. create index org_idx on EmpData (textarr(org));

This index will only be used if you use the expression textarr(org) in your
query. You would probably have some success if you did:

 select * from empdata where textarr(org) = 'math' 

> 3. vacuum full
> 4. explain analyze select name from EmpData where org *= 'math';

Is this *= operator from the contrib/array directory? It's not an indexable
operator at all using standard btree indexes.

The GiST indexing does make indexable operators that can do things like *= but
that's a whole other ball of wax.

What are you really trying to do?

> Result:
> =
> Seq Scan on EmpData (cost=0.00..3193.20 rows=102 width=488)
> (actual time=371.35..371.35 rows=0 loops=1)
> 
> Filter: (org[0]='math'::text)

Well that's awfully odd. I don't know how that expression came out of the
query you gave. You'll have to give a lot more information about how you're
defining *= and why you think it's related to the function you used to define
the index.

-- 
greg




Re: [SQL] How can I get the last element out of GROUP BY sets?

2004-01-22 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Robert Creager <[EMAIL PROTECTED]> writes:
> > ... one piece of data I need is the last value for each GROUP BY
> > period.  Alas, I cannot figure out how to do this.
> 
> SELECT DISTINCT ON (rather than GROUP BY) could get this done for you.

Or if you need to combine this with other aggregate functions like sum, count,
etc:

CREATE FUNCTION first_accum (integer, integer) RETURNS integer AS 'select coalesce($1,$2)' LANGUAGE sql;
CREATE FUNCTION last_accum (integer, integer) RETURNS integer AS 'select $2' LANGUAGE sql;
CREATE AGGREGATE first (BASETYPE = integer, SFUNC = first_accum, STYPE = integer);
CREATE AGGREGATE last (BASETYPE = integer, SFUNC = last_accum, STYPE = integer);

Then you can do first() and last(). These definitions only work for integer
but you can pattern match for other datatypes. You might be able to get a
universal function working using anyelement now, I haven't tried.
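
A usage sketch (hypothetical table; it relies on the subquery's ORDER BY
carrying through to the aggregate input, which works in practice though the
standard doesn't promise it):

SELECT first(val) AS earliest, last(val) AS latest, sum(val), count(*)
  FROM (SELECT val FROM readings ORDER BY taken_at) AS s;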

-- 
greg




Re: [SQL] limit 1 and functional indexes

2004-01-29 Thread Greg Stark

Bruno Wolff III <[EMAIL PROTECTED]> writes:

> > QUERY PLAN
> > ---------------------------------------------------------------------------
> >  Sort  (cost=11824.16..11831.55 rows=2956 width=33) (actual time=248.17..248.17 rows=0 loops=1)
> >    Sort Key: order_date
> >    ->  Index Scan using transactions_pop_i on transactions  (cost=0.00..11653.79 rows=2956 width=33) (actual time=126.13..126.13 rows=0 loops=1)
> >          Index Cond: (upper((pop)::text) = '79BCDC8A4A4F99E7C111'::text)
> >  Total runtime: 248.25 msec


Yeah, the problem with functional indexes is that the optimizer doesn't have
any clue how the records are distributed since it only has statistics for
columns, not your expression. Notice it's estimating 2956 rows where in fact
there are 0.

I think someone was actually working on this so it may be improved in 7.5 but
I'm not sure.

Given the type of data you're storing, which looks like hex strings, are you
sure you need to do a case-insensitive search here? Can't you just uppercase
it when you store it?

The other option would be to use a subquery and force the planner not to pull
it up, something like:


 select code
   from (
 select code, order_date
   from transactions
  where UPPER(pop) = UPPER('79bcdc8a4a4f99e7c111')
 offset 0
) as foo
 order by order_date DESC;


The offset 0 prevents the optimizer from pulling the subquery into the outer
query. I think this will prevent it from even considering the order_date index
scan, but you'll have to try to be sure.

-- 
greg




Re: [SQL] limit 1 and functional indexes: SOLVED

2004-01-30 Thread Greg Stark

"Alexandra Birch" <[EMAIL PROTECTED]> writes:

> It works perfectly - thanks a million!
> Strangely the offset 0 does not seem to make any difference.
> Gotta read up more about subqueries :)
> 
>  explain analyze
>  select code,order_date
>from (
>  select code, order_date
>from transactions
>   where UPPER(pop) = UPPER('c892eb2f877e3a28ddc8e196cd5a8aad')
>   limit 1
> ) as foo
>  order by order_date DESC;

I think what you're trying to do here is get the last order? Then you'll want
the limit to be on the outer query where it's ordered by order_date:

  select code,order_date
    from (
  select code, order_date
    from transactions
   where UPPER(pop) = UPPER('c892eb2f877e3a28ddc8e196cd5a8aad')
  offset 0
 ) as foo
   order by order_date DESC
   limit 1;

Note that in theory postgres should be able to find the same plan for this
query as yours since it's equivalent. It really ought to use the order_date
index since it thinks it would be more efficient. 

However it's unable to because postgres doesn't try every possible index, only
the ones that look like they'll be useful for a where clause or an order by.
And the order by on the outer query isn't considered when it's looking at the
subquery. 

It normally handles this case by merging the subquery into the outer query,
but it can't do that if there's a limit or offset. So an "offset 0" is
convenient for fooling it into thinking the subquery can't be pulled up
without actually changing the output.

You could do "order by upper(pop)" instead which might be clearer for someone
reading the query in that it makes it look like you're trying to encourage it
to use the index on upper(pop). In theory "order by"s on subqueries are
useless and postgres could ignore them, but it doesn't.
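
That variant would look something like this (untested sketch):

  select code, order_date
    from (
  select code, order_date
    from transactions
   where UPPER(pop) = UPPER('c892eb2f877e3a28ddc8e196cd5a8aad')
   order by UPPER(pop)
 ) as foo
   order by order_date DESC
   limit 1;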

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] Index not used - now me

2004-02-09 Thread Greg Stark

Christoph Haller <[EMAIL PROTECTED]> writes:

> Why appear? If the Index Scan has a Total runtime: 2.46 msec and the Seq Scan 
> a Total runtime: 46.19 msec, then the Index Scan is much faster. 
> Or am I completely off the track reading the explain analyze output? 

To estimate the relative costs of a sequential scan and an index scan Postgres
has to take into account the likelihood that the blocks needed will be in the
disk cache. In your example your database is otherwise idle and the table is
small enough that the entire index is probably in cache.

This means that the random access pattern of the index isn't really hurting
the index scan at all. Whereas in a busy database with less available RAM the
random access pattern makes a big difference.

You could try raising effective_cache_size to give postgres a better chance at
guessing that all the blocks will be in cache. But that may no longer be true
when the query is run on a busy database.

You could also try lowering random_page_cost. Some people find as low as 1.2
or so to be useful, but that would almost certainly be lying to postgres about
the costs of random access and would cause it to use index scans aggressively
even when they're not faster.
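
For example, per-session experimentation (the numbers are illustrative, not
recommendations):

SET effective_cache_size = 10000;  -- measured in 8kB pages, so roughly 80MB
SET random_page_cost = 2;          -- default is 4; lower favors index scans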

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] determining how many products are in how many categories

2004-02-15 Thread Greg Patnude
You might be better off with something more like this:

SELECT COUNT(P.*) AS p_count, P.p_product_category_id
  FROM p_product P
 WHERE P.p_product_category_id IN (SELECT DISTINCT id FROM p_product_category)
 GROUP BY P.p_product_category_id;

obviously tailored to YOUR schema... not mine...
-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"David Garamond" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> # product table (simplified):
> create table p (
>id char(22) not null primary key,
>name text,
>desc text
> );
>
> # product category table (simpl.):
> create table pc (
>id char(22) not null primary key,
>name text,
>desc text
> );
>
> # table that maps products into categories:
> create table p_pc (
>id char(22) not null primary key,
>pid char(22) not null references p(id),
>pcid char(22) not null references pc(id)
> );
> create index i_ppc_pid on p_pc(pid);
> create index i_ppc_pcid on p_pc(pcid);
> create unique index i_ppc_pid_pcid on p_pc(pid, pcid);
>
> There are about 50k rows in p, 40k in pc, and 140k in p_pc. Most
> products are only placed in 1 category, some in 2, fewer in 3, and fewer
> stills in 4, 5, 6 categories.
>
> I want to know how many products are placed in 1 category, how many in
> 2, and so on. Here's my query:
>
> select count(pid),num_cat from (
>select pid,count(pcid) as num_cat
>from p_pc group by pid) as f
> group by num_cat;
>
> A sample output (4000 products are categorized in 5 different places,
> 4998 in 4, and so on):
>
>   count | num_cat
> ---+-
>4000 |   5
>4998 |   4
>7502 |   3
>   10001 |   2
>   17499 |   1
> (5 rows)
>
> However, this query is taking about 2.5 minutes. Any idea on how to make
> it faster?
>
> -- 
> dave
>
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
>
>http://www.postgresql.org/docs/faqs/FAQ.html
>



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] Unique Constraint with foreign Key

2004-02-15 Thread Greg Patnude
Please also note that the referenced column in the foreign table either needs
to be the PRIMARY KEY or have a unique constraint on it (or maybe it just
requires an index on it -- I'm not sure). I discovered that if the column
in the foreign table (containing the REFERENCED key...) is NOT the primary
key column, the REFERENCES a(x) will fail unless a.x is specified as
'UNIQUE' -- as in the following example:

create table a (
y integer not null primary key default nextval('nexta_seq'),
x varchar not null UNIQUE
);

create table b (
z integer not null PRIMARY KEY default nextval('nextbz_seq'),
x varchar NOT NULL REFERENCES a(x)
);


-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"Bruno Wolff III" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> On Tue, Feb 10, 2004 at 12:03:36 -0500,
>   [EMAIL PROTECTED] wrote:
> > Hi all i am using postgres and torque
> > I have a column x in Table A of type varchar and NOT NULL. I want to
> > impose a unique key constraint on that table and column y is the primary
> > key.
> > I have another Table B with column x. can i make this column x as a
> > foreign key on column x in table A.
>
> Yes. The reference will need to specify column x, since the primary key
> column (y) will be the default.
>
> > or Can i have varchar types  of size 50 as primary keys in Postgres.
>
> Yes. But unless the 50 character limit comes from a business rule, you
> might want to use the type 'text'.
>
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings
>



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] nextval problem

2004-02-17 Thread Greg Patnude
New feature for PostgreSQL 7.3 and later -- sequences are automatically dropped
when the owning table is dropped -- so you need to recreate the sequence.

The easiest way is to declare the column serial again in your CREATE TABLE:

CREATE TABLE tblName (

id serial PRIMARY KEY,
...

) WITH OIDS;
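
For illustration, a small sketch of the behavior (the sequence name is derived
from the table and column names):

CREATE TABLE address (addressid serial PRIMARY KEY);
SELECT nextval('address_addressid_seq');  -- works; serial created the sequence
DROP TABLE address;                       -- on 7.3+ this drops the sequence too
-- recreating the table with a plain integer column does NOT bring the
-- sequence back, so the nextval above would now fail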

-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"Ivo Anon" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> I'm new to postgresql and I'm having problems with the 'nextval' function
> (or whatever it is called in postgresql).
>
> I created a table (called ADDRESS) using 'pgadmin' which has a serial field
> (called addressid). When I tried the sql statement:
>
> select nextval('public."ADDRESS_addressid_seq"')
>
> everything seemed to be working fine.
>
> I then used the create table statement (displayed in pgadmin when the table
> 'ADDRESS' is selected) to create the same table (of course after first
> dropping 'ADDRESS'). When I try the same select-statement I suddenly got an
> error message saying:
>
> ERROR:  Relation "public"."ADDRESS_addressid_seq" does not exist
>
> Can anybody help me with this problem?
>
>




---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] Unique Constraint with foreign Key

2004-02-17 Thread Greg Patnude
You've got it wrong when you reference the data column (a.x) -- your
foreign key should reference the primary key in the referenced table
(a.y)...

Besides, in your table A the rows (1, 99) and (2, 99) would violate your
unique constraint on column 'x' -- they could never happen...

What I suggested is like this:

create table a (

y integer not null primary key default nextval('nexta_seq'),
x varchar not null UNIQUE

);

create table b (

z integer not null PRIMARY KEY default nextval('nextbz_seq'),
x integer NOT NULL REFERENCES a(y)

);

Table A would have 

1, 99
2, 99
..

99,99

and table B would have

1, 1
1, 2
..
1, 99








Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID. 83835
(208) 762-0762
 
Send replies to: [EMAIL PROTECTED]

Website: http://www.left-center.com

-Original Message-
From: Jan Wieck [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 17, 2004 6:42 AM
To: Greg Patnude
Cc: [EMAIL PROTECTED]
Subject: Re: [SQL] Unique Constraint with foreign Key

Greg Patnude wrote:
> Please also note that the referenced column in the foreign table either
> needs to be the PRIMARY KEY or have a unique constraint on it (or maybe it
> just requires an index on it -- I'm not sure). I discovered that if the
> column in the foreign table (containing the REFERENCED key...) is NOT the
> primary key column, the REFERENCES a(x) will fail unless a.x is specified
> as 'UNIQUE' -- as in the following example:

This is according to the SQL specification, which doesn't like doubts.

Imagine rows (1, 99), (2, 99) in table a and row (3, 99) in table b. 
Which of the a-rows is now referenced and am I allowed to delete the 
other? There are good arguments either way, but if you require a UNIQUE 
on a.x, then this question will never come up.


Jan

> 
> create table a (
> y integer not null primary key default nextval('nexta_seq'),
> x varchar not null UNIQUE
> 
> );
> 
> create table b (
> 
> z integer not null PRIMARY KEY default nextval('nextbz_seq'),
> x varchar NOT NULL REFERENCES a(x),
> 
> );
> 
> 


-- 
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


---(end of broadcast)---
TIP 8: explain analyze is your friend


[SQL] User defined types -- Social Security number...

2004-02-20 Thread Greg Patnude
Anyone have a good pre-built user-defined type definition for creating /
maintaining / manipulating a SSN ... where valid chars are in the range
000-00-0000 through 999-99-9999.

I imagine that the storage column is probably varchar(11) -- I am looking
for a type definition that

1) checks that all characters are digits in the range [0..9]
2) inserts the dashes at the appropriate positions
3) accepts either '000123456' OR '000-12-3456' as the input
4) leading zeroes (as in the above) must be stored -- hence varchar(11)
instead of numeric
5) always outputs in the format '000-12-3456'

Either that or the question is: How can I coerce postgreSQL into using an
input / output "mask"...

-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] User defined types -- Social Security number...

2004-02-29 Thread Greg Patnude
Thanks Josh -- I understand that there are valid and invalid SSN's -- 
similar rules apply to zip codes and area codes...

I tried this:

SELECT to_char(123456789, '000-00-0000');
which yields 123-45-6789 -- nicely, I might add...

the trick is getting postgreSQL to do this without having to create an ON
SELECT and ON UPDATE TRIGGER...

an ON UPDATE, SELECT, and INSERT re-write RULE might do the trick...

SSN's CAN in fact start with a leading 0 -- mine does -- "041-xx-xxxx" --

I do agree that there are valid ranges -- my main concern is being able to
store any leading zeros - I just need to make sure that something "looks"
like a valid SSN in the formatting (nnn-nn-nnnn) and that I can store /
retrieve it with the appropriate format -- what I am really trying to
accomplish is an "input mask"...

I hadn't considered using a Domain -- I'll have to look at that.
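
A rough sketch of the domain idea (format check only -- it does not encode the
validity rules Josh mentions below):

CREATE DOMAIN ssn AS varchar(11)
  CHECK (VALUE ~ '^[0-9]{3}-[0-9]{2}-[0-9]{4}$');

CREATE TABLE person (id integer PRIMARY KEY, social ssn);
INSERT INTO person VALUES (1, '041-12-3456');   -- accepted
-- INSERT INTO person VALUES (2, '041123456');  -- rejected: the dash-insertion
-- still has to happen in a trigger, rule, or the application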

-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"Josh Berkus" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Greg,
>
> > Anyone have a good pre-built user-defined type definition for creating /
> > maintaining / manipulating a SSN ... where valid values are in the range
> > 000-00-0000 through 999-99-9999.
>
> Actually, the range is more narrowly defined than that.  I'm not sure of the
> exact rules, but you will never see a leading 0 or a -00- in an SSN.
>
> > I imagine that the storage column is probably varchar(11) -- I am looking
> > for a type definition that
>
> Use DOMAINs, not a custom type.  It's less work.
>
> > Either that or the question is: How can I coerce postgreSQL into using an
> > input / output "mask"...
>
> After you've created your DOMAIN, based on the TEXT type, you can overload the
> input and output functions to format correctly.   Beware, though: input &
> output functions pretty much have to be written in C.
>
> -- 
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>
> ---(end of broadcast)---
> TIP 8: explain analyze is your friend
>



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] Slow sub-selects, max and count(*)

2004-03-27 Thread Greg Stark

Josh Berkus <[EMAIL PROTECTED]> writes:

> Max() and Count() cannot use indexes for technical reasons.   Browse through 
> the archives of SQL, PERFORM, and HACKERS for about 12,000 discussions on the 
> subject.

Please don't confuse the issue by throwing Max() and Count() into the same
basket.

The issue with Min()/Max() is that the result could be generated efficiently
by scanning indexes but it's just hard, especially given generic aggregate
functions, and the work simply hasn't been done, or even started, yet.
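
The usual workaround, for what it's worth: given a table "tab" with an indexed
NOT NULL column "ts", this is equivalent to SELECT max(ts) FROM tab (modulo the
empty-table case, where max() returns NULL and this returns no row) but can
walk the index instead of scanning the whole table:

SELECT ts FROM tab ORDER BY ts DESC LIMIT 1;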

The issue with Count() is that people want the result to be cached in a single
per-table counter, but that can't be done as simply as that because of
transactions. People have discussed complex solutions to this but it's a much
more complex problem than it appears.

They're really two entirely separate issues.

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] SQL challenge--top 10 for each key value?

2004-04-08 Thread Greg Stark

Jeff Boes <[EMAIL PROTECTED]> writes:

> I headed off in the direction of groups of SELECTs and UNIONs, and quit when I
> got to something like four levels of "SELECT ... AS FOO" ...

four? wimp, that's nothing!

ok, seriously I think there's no way to do this directly with straight SQL.
You would have to define a non-immutable function that has some temporary
storage where it keeps track of how many it has seen. 

The generic function that would help here would be some kind of rank(value)
that would give you the equivalent of rownum except with a level break every
time value changes. I've been hoping to see something like this on the list
for a long time but haven't yet.

If the value of n is constant and small you could cheat with an aggregate
function with an array of the top n values.

db=> create function first3_accum(integer[], integer) returns integer[] as
db-> 'select case when array_upper($1,1) >= 3 then $1 else array_append($1,$2) end'
db-> language sql strict immutable;
CREATE FUNCTION
db=> create aggregate first3 (basetype = integer, sfunc = first3_accum,
db(> stype = integer[], initcond = '{}');
CREATE AGGREGATE

then something like:

SELECT first3(id)
  FROM (SELECT query, id
  FROM my_table 
 ORDER BY query, 
  CASE WHEN include THEN 1 ELSE 2 END ASC, 
  score DESC) as x
 GROUP BY query

(The subquery needs an alias, and it has to expose the query column for the
GROUP BY.)

But then you'll have to go back to the table to refetch the original records
that you've found. The best way I find to do that is with the int_array_enum()
function from the int_aggregate contrib module.

SELECT * 
  FROM my_table 
 WHERE id IN (
   SELECT int_array_enum(f3)
 FROM (
   SELECT first3(id) as f3
 FROM (SELECT query, id
 FROM my_table 
    ORDER BY query, 
 CASE WHEN include THEN 1 ELSE 2 END ASC, 
 score DESC) as x
GROUP BY query
 ) as y
   )


This last step is kind of annoying since you've already seen all those
records. And it requires writing a new aggregate function every time the value
of n changes though, which kind of sucks.

In theory if the new work in 7.5 handling structured datatypes is as cool as
it sounds you could have an array of complete records and when UNNEST is
eventually incorporated into the array code then you could expand those
instead of using the int_array_enum function. Neither of those things are
ready yet as far as I know though.

-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] SQL challenge--top 10 for each key value?

2004-04-09 Thread Greg Stark
Josh Berkus <[EMAIL PROTECTED]> writes:

> Rod,
> 
> > Something along the lines of the below would accomplish what you want
> > according to spec. ROW_NUMBER() is a spec defined function. (6.10 of
> > SQL200N)
> 
> Great leaping little gods!   They added something called "row number" to the 
> spec? 
> 
> Boy howdy, folks were right ... the ANSI committee really has completely
> blown off the relational model.

If it's like Oracle's rownum then it's the row number of the *output*, not the
position on disk. So it's not entirely blowing off the relational model any
more than ORDER BY does.

The weird thing is the number of cases where you want ORDER BY or rownum
inside subselects. Which the solution to the original question needed.

> When a standards committee becomes hostage to a handful of vendors, kiss
> real standards goodbye.

In the case of SQL was there ever any pretension otherwise? Was the SQL
standard ever really useful as a "real standard"? I can write useful ANSI C89
code that will compile and work on any C compiler. Trying to write portable
SQL92 code that does any useful work is about as productive as stapling bagels
to your forehead.

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [SQL] SQL challenge--top 10 for each key value?

2004-04-10 Thread Greg Stark

elein <[EMAIL PROTECTED]> writes:

> create or replace function pycounter(integer)
> returns integer as
> '
>if args[0] == 0:
>   SD["nextno"] = 1
>   return SD["nextno"]
>try:
>   SD["nextno"] += 1
>except:
>   SD["nextno"] = 1
>return SD["nextno"]
> ' language 'plpythonu';
> 
> And clearly it can be done faster as a little
> C function.

Does this approach have a hope of working if it's used twice in the same
query?


-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [SQL] trigger/for key help

2004-04-12 Thread Greg Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> Bret Hughes <[EMAIL PROTECTED]> writes:
> > FWIW I tried to use alter table but could never get the parser to accept
> > $1 as a constraint name.  I used single and double quotes as well as a
> > lame attempt \$1.
> 
> Hm, "$1" works for me ...

Hm, this reminds me. When I was first learning this stuff I was stymied by the
same issue. It took me quite a while to figure out how to drop constraints
because of the quoting issue.

Of course now it seems obvious, but for someone just starting it adds another
roadblock. Is there a reason postgres goes out of its way to pick names that
will be harder to work with than necessary?

Or is it considered a good thing on the theory that if it's hard to reference
it's also hard to accidentally use such names in conflicting ways?

Perhaps names like _1 _2 ... would be easier to handle? 
Or perhaps making $ not require quoting would be helpful?

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] trigger/for key help

2004-04-13 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Greg Stark <[EMAIL PROTECTED]> writes:
> > Is there a reason postgres goes out of its way to pick names that
> > will be harder to work with than necessary?
> 
> If we use ordinary identifiers for system-generated names then we will
> be infringing on user name space --- ie, there's a potential for
> conflict.  I suppose we could use long randomly-generated names like
> ewjncm343cnlen, but are those really easier to work with?

I don't see an unseverable link between "user name space" and "identifiers
that don't need to be quoted". Mixed case names for instance seem like
perfectly good user name space identifiers.

Postgres could just as easily say "the system reserves all identifiers
starting with $" and still not require quoting $.

> I think a more useful approach is to treat it as a documentation
> problem.  Perhaps an example in the ALTER TABLE man page would help.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] Concatenate results of a single column query

2004-04-18 Thread Greg Stark

Marco Lazzeri <[EMAIL PROTECTED]> writes:

> SELECT
>   p.name, ARRAY(SELECT nicknames FROM people WHERE people.id = p.id)
> FROM people AS p

> Any suggestions?

Something like:

db=> create aggregate array_aggregate (basetype = integer, sfunc = array_append, stype 
= integer[], initcond = '{}');
CREATE AGGREGATE

db=> select array_aggregate(id) from tab;
                                   array_aggregate
------------------------------------------------------------------------------------------------------------------------------------------------------------
 {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,28,29,30,31,32,33,34,36,37,38,39,40,41,42,43,27,26,44,45,46,47,48,49,50,51,52,53,54,35}
(1 row)

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] Concatenate results of a single column query

2004-04-19 Thread Greg Stark
Christoph Haller <[EMAIL PROTECTED]> writes:

> Interesting feature, but I cannot find function array_append: 
> ERROR:  AggregateCreate: function array_append(integer[], integer) does not exist

It's new in Postgres 7.4

I think you could do this in 7.3 though, it would just be more awkward. Try ||
but I think that's new in 7.4 as well. Otherwise I think you would have to
pick out the upper bound of the array with array_dims and set the upper+1'th
element of the array.

If you're doing text you may want to go directly to a textual concatenation
like:

CREATE FUNCTION concat_agg_accum(text, text) RETURNS text
AS 'select $1 || '', '' || $2'
LANGUAGE sql IMMUTABLE STRICT;

CREATE AGGREGATE concat_agg (
BASETYPE = text,
SFUNC = concat_agg_accum,
STYPE = text
);


-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] isolation level

2004-05-13 Thread Greg Stark

Bruno Wolff III <[EMAIL PROTECTED]> writes:

> On Thu, May 13, 2004 at 18:13:23 +,
>   Jaime Casanova <[EMAIL PROTECTED]> wrote:
> > Hi all, is there a way to set the isolation level  to something like the 
> > sql standard dirty read.
> 
> No. There will be a way to use the standard name in a SET command, but
> you will actually get READ COMMITTED isolation (which is the next step up).

I wonder how hard this would be to implement. It doesn't seem like it should
be very hard.

It would be very convenient for debugging and for checking on the progress of
large batch updates or loads.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] Multiple outer join on same table

2004-05-13 Thread Greg Stark
Marco Lazzeri <[EMAIL PROTECTED]> writes:

> Hi!
> I'm searching a better (quicker) way to retrieve data as I used to do
> using the following query...
> 
> ==
> 
> SELECT main.codice,
>other.value AS value_one,
>other.value AS value_two
>   FROM main 
>   LEFT OUTER JOIN other   ON (main.id = other.id_main)
>   LEFT OUTER JOIN other AS other2 ON (main.id = other2.id_main)
>  WHERE other.type = 'type_one'
>AND other2.type = 'type_two'
> ;

a) you're better off sending the actual query rather than retyping it. I
assume you made a typo in the select column list and it should be
"other2.value AS value_two"? Also the parentheses are required on the ON
clause.

b) The WHERE clause will effectively make this a plain inner join, not an
outer join at all. Since any values that aren't found would have a NULL type
column and cause the row to not be selected.

I think the query you meant to write would be

SELECT codice, 
   other1.value AS value_one, 
   other2.value AS value_two
  FROM main
  LEFT OUTER JOIN other as other1 ON (main.id = other1.id_main AND type = 'type_one')
  LEFT OUTER JOIN other as other2 ON (main.id = other2.id_main AND type = 'type_two)

Another way to write this query that might be faster or might not depending
would be:

SELECT codice,
   (SELECT value FROM other WHERE id_main = main.id AND type = 'type_one') AS value_one,
   (SELECT value FROM other WHERE id_main = main.id AND type = 'type_two') AS value_two
  FROM main

In theory the two queries really ought to always result in the same plan
because they're equivalent. However the Postgres optimizer as clever as it is
is incapable of seeing this. 

The first form with the outer join leaves the optimizer with a lot more
flexibility though, including at least one plan that is effectively identical
to what the optimizer is forced to do for the second query. So really the
first one should be no worse than the second.

If you find the second faster (or if they're both still slow) you might
consider posting explain analyze output for both queries. It may be that you
have other issues preventing the optimizer from finding a good plan.

You have run analyze on these tables recently? And you vacuum regularly? And
for the second query you would really want an index on other.id_main too. For
the first one it would depend on the data in the two tables.

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] Selecting "sample" data from large tables.

2004-06-03 Thread Greg Stark

Joseph Turner <[EMAIL PROTECTED]> writes:

> I have a table with a decent number of rows (let's say for example a
> billion rows).  I am trying to construct a graph that displays the
> distribution of that data.  However, I don't want to read in the
> complete data set (as reading a billion rows would take a while).  Can
> anyone think of a way to do this in postgresql?

One way would be to have an indexed column with random values in it. Then you
could use an index scan to pull out samples.

However this has a few downsides. 

a) index scans are a lot less efficient than sequential scans. Effectively
reducing the sample size you can get for a given amount of time even further.
a 10% sample using this technique is probably almost as slow as reading the
entire table, for example. If you only need .1% though this might be a good
approach.

b) the data in the random column would have to be static meaning multiple
samples wouldn't be entirely independent. Depending on what you're doing with
the samples this might be a problem or not.

Oh, and if you're content with always using the same sample but want to
analyze it multiple different ways, you might want to use a partial index. You
could have partial indexes matching the order by and where clauses of your
analyses but with a where clause of its own selecting on the random data. Then
make sure that where clause is in every query.

But still, the sample has to be small enough that even using an index scan
you're winning over reading the entire data.
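
A sketch of the indexed-random-column approach (all names invented; note the
one-time UPDATE is itself expensive on a huge table):

ALTER TABLE big_table ADD COLUMN sample_key double precision;
UPDATE big_table SET sample_key = random();
CREATE INDEX big_table_sample_idx ON big_table (sample_key);

SELECT * FROM big_table WHERE sample_key < 0.001;  -- roughly a 0.1% sample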

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] Empty array IS NULL?

2004-06-28 Thread Greg Stark

Markus Bertheau <[EMAIL PROTECTED]> writes:

> oocms=# SELECT ARRAY(SELECT 1 WHERE FALSE);
>  ?column?
> ----------
>   

This one seems strange to me. Shouldn't it result in an empty array?


-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: FW: [SQL] "=" operator vs. "IS"

2004-06-29 Thread Greg Stark
Rich Hall <[EMAIL PROTECTED]> writes:

> "(anything) = NULL" is always Null, this cannot be what the coder intended.

I often have such things in my SQL. Consider what happens when you have SQL
constructed dynamically. Or more frequently, consider that many drivers still
don't use the new binary placeholder syntax and emulate it by putting the
parameters directly into the SQL.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: FW: [SQL] "=" operator vs. "IS"

2004-06-29 Thread Greg Stark

Stephan Szabo <[EMAIL PROTECTED]> writes:

> IS TRUE and IS FALSE have a different effect from =true and =false when
> the left hand side is NULL. The former will return false, the latter will
> return NULL.

No, actually they both return false.

(But thanks, I didn't even realize they were special this way)

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] How to create an aggregate?

2004-08-01 Thread Greg Stark

Ray Aspeitia <[EMAIL PROTECTED]> writes:

> I also would like to pass the delimiter to the aggregate as a parameter and
> I am not sure if it can handle that.

Currently aggregates that take multiple parameters are just not supported.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] reply to setting

2004-08-22 Thread Greg Stark

Joe Conway <[EMAIL PROTECTED]> writes:

> This is very true. In fact, I get mildly annoyed when people *don't* include
> the direct reply to me, because I very actively filter/redirect my mail.
> Replies directly to me are pretty much guaranteed to be seen quickly, but the
> ones that go to the list might get lost among the hundreds of posts that go
> into my "postgres" inbox every day. I think many other people do something
> similar.

Just as a side comment, one trick I found very helpful in my mail filters is
to treat any message with one of my message-ids in the references as a
personal message as far as mail notifications. This way I get notifications
for any message on a thread following a post of my own.

This is easy in Gnus since the message id has the sending hostname and also
the first few characters has a base64 encoded copy of the unix userid. You
would have to figure out how to recognize message-ids from your MUA.


-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] sleep function

2004-08-22 Thread Greg Stark

John DeSoi <[EMAIL PROTECTED]> writes:

> On Aug 10, 2004, at 10:57 AM, Bruce Momjian wrote:
> 
> > I can't think of one, no.  I think you will have to use one of the
> > server-side languages and call a sleep in there.
> 
> This is no good in the real world since it pounds the CPU, but it worked well
> enough for my testing purposes.

You went the long way around. I think what he meant was something as simple
as:

bash-2.05b$ /usr/lib/postgresql/bin/createlang -U postgres plperlu test

bash-2.05b$ psql -d test -U postgres
Welcome to psql 7.4.2, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help on internal slash commands
   \g or terminate with semicolon to execute query
   \q to quit

test=# create or replace function sleep(integer) returns integer as
test-# 'return sleep(shift)' language plperlu;
CREATE FUNCTION

test=# \timing
Timing is on.

test=# select sleep(10) ;
 sleep 
---
10
(1 row)

Time: 10002.493 ms


-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] only last records in subgroups

2004-08-22 Thread Greg Stark

Dino Vliet <[EMAIL PROTECTED]> writes:

> x,0 and y,4 but how do I manage this in sql? I was
> hoping for a keyword LAST or so, where I can specify
> that when I've ordered my results with order by, I
> could only get the last of the subgroups (the first
> one is easy because I could use limit 1)

There's no concept of "first" and "last" in SQL outside of the ORDER BY clause
of your query. And you can easily reverse the order of the ORDER BY sort by
putting "DESC" after the columns you're sorting on.

But I don't understand how you intend to use "LIMIT 1" to solve your problem.
As you describe the problem you want the last (or first) record of *each*
*group*. Solving that using LIMIT would require a complex query with a
subquery in the column list which would be quite a pain.

As the other poster suggested, if you're just looking to fetch a single column
you can just use min() or max() to solve this. 

If you're looking to fetch more than one column Postgres provides a
non-standard SQL extension for dealing with this situation, "DISTINCT ON".

SELECT DISTINCT ON (id) id,day,other,columns FROM tab ORDER BY id,day

That gets the lowest value of "day". Use "ORDER BY id, day DESC" to get the
greatest value of "day".

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] from PG_DUMP to CVS

2004-08-26 Thread Greg Stark
"Riccardo G. Facchini" <[EMAIL PROTECTED]> writes:

> After searching throught the list, I assume you mean this link:
> http://www.rbt.ca/autodoc/index.html
> by Rod Taylor.
> 
> Looks promising, but still what I need is a proper CVS output, as I
> need to review the changes made to the specific database structure.

Well, CVS can still be useful even if the changes are all in one file. Look at
"cvs annotate" for example. And CVS diff would still give you useful
information.

-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] Aggregate query for multiple records

2004-08-28 Thread Greg Stark

Scott Gerhardt <[EMAIL PROTECTED]> writes:

> Hello, I am new to the list, my apology if this question is beyond the scope or
> charter of this list.
> 
> My questions is:
> What is the best method to perform an aggregate query to calculate sum() values
> for each distinct wid as in the example below, but except for all wid's (not
> just WHERE wid='01/1-6-1-30w1/0').
> 
> Also, performance wise, would it be better to build a function for this query.
> The table has 9 million records and these aggregate queries take hours.

The "top n" type query (or in this case "first n" or "last n" but it's the
same thing) is actually very tricky to do in standard SQL. The best solution
seen here for postgres is to use arrays and custom aggregate functions. 

The following is based on a previous answer from Tom Lane to a similar
question. (I thought I already posted this for you on pgsql-general but the
list archives are down and you don't seem to have seen it, so I'm resending
it)

It allows you to do the whole query with a single sort for the grouping and
the ordering by date together. You would have to use it with something like:

SELECT sum_first_6(oil) 
  FROM (SELECT wid, oil FROM prd_data ORDER BY wid, "date") AS x
 GROUP BY wid

(The subquery needs an alias, and it has to expose wid for the GROUP BY.)

If you pump up sort_mem enough -- you can do it within the session for the
single connection using "SET sort_mem" it should be pretty fast.

I think it's the best you're going to get. If you're absolutely sure the data
is physically stored in chronological order -- which I would only feel
comfortable with if you've never done any updates or deletes, only inserts and
perhaps occasional truncates, then you might be able to get by without
ordering and convince it to do a hash aggregate. That would be the optimal
result, no sorts at all. But it would be hard to make sure it would always
work.

test=> create or replace function first_6_accum (integer[], integer) returns integer[]
test-> language sql immutable as
test-> 'select case when array_upper($1,1) >= 6 then $1 else $1 || $2 end';
CREATE FUNCTION

test=> create function sum_6(integer[]) returns integer immutable language sql as
test-> 'select $1[1]+$1[2]+$1[3]+$1[4]+$1[5]+$1[6]';
CREATE FUNCTION

test=> create aggregate sum_first_6 (basetype=integer, sfunc=first_6_accum,
test(> stype=integer[], initcond='{}', finalfunc=sum_6);
CREATE AGGREGATE

test=> select sum_first_6(i) from (select i from (select 1 as i union select 2
test(> union select 3 union select 4 union select 5 union select 6 union
test(> select 7 union select 8) as x order by i desc) as x;
 sum_first_6 
-------------
          33
(1 row)

test=> select sum_first_6(i) from (select i from (select 1 as i union select 2
test(> union select 3 union select 4 union select 5 union select 6 union
test(> select 7 union select 8) as x order by i asc) as x;
 sum_first_6 
-------------
          21
(1 row)

This can easily be switched around to make it "last_6" and you can write
functions to handle 6 records or 9 records. And all of these could be combined
in a single query, so you only have to do the sort once.

Unfortunately you cannot make aggregate functions that take multiple
parameters, nor can you pass extra parameters to the state function. So you'll
have to create a separate set of functions for each variant.

Also, you'll have to change it to use reals.


-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [SQL] Isnumeric function?

2004-09-08 Thread Greg Stark

Theo Galanakis <[EMAIL PROTECTED]> writes:

>   error: btree item size 2744 exceeds maximum 2713.
> 
> I assume I had to change some server settings to extend the maximum, however

I would guess the block size. But I'm just guessing.

> in the end this column holds content, and even applying an index would be
> incredible slow to search across hundred of thousands of "content" records
> looking for a primary key.

Perhaps you could have an indexed column that contains a crc32 hash? Then you
could do searches by comparing crc32 which make for fast efficient integer
index lookups. You should still include a comparison against the original
content column since it is possible for there to be a rare crc32 collision.

This doesn't let you do range lookups efficiently. But it does let you look up
specific values quickly even when they aren't numeric.
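
A sketch of that idea (names invented; the crc32 itself has to be computed
client-side, e.g. with zlib, since there is no built-in crc32() function):

ALTER TABLE botched_table ADD COLUMN content_crc integer;
CREATE INDEX idx_content_crc ON botched_table (content_crc);

-- fast integer probe, then recheck the real column to weed out collisions:
SELECT * FROM botched_table
 WHERE content_crc = 123456789          -- application-computed crc32 of the value
   AND content = 'the original value';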

-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] Isnumeric function?

2004-09-09 Thread Greg Stark

Theo Galanakis <[EMAIL PROTECTED]> writes:

> I created the Index you specified, however it chooses to run a seq scan on
> the column rather than a Index scan. How can you force it to use that
> Index..
> 
> CREATE INDEX idx_content_numeric ON botched_table(content) WHERE content ~
> '^[0-9]{1,9}$';
> 
> select * from botched_table where content = 200::integer

You need to put a "and content ~ '^[0-9]{1,9}$'" in your query, the clause has
to match the clause in the partial index pretty closely.

perhaps you would find it convenient to make a view of
 select * from botched_table where content ~ '^[0-9]{1,9}$'
and then just always select these values from that view.

Also the "::integer" is useless. It actually gets cast to text here anyways.
The index is on the text contents of the content column.

You might consider making the index a functional index on content::integer
instead. I suspect that would be faster and smaller than an index on the text
version of content:

slo=> create table botched_table (content text);
CREATE TABLE
slo=> create index idx_content_numeric on botched_table (content)
slo->   where content ~ '^[0-9]{1,9}$';
CREATE INDEX
slo=> create view botched_view as (select content::integer as content_id, *
slo->   from botched_table where content ~ '^[0-9]{1,9}$');
CREATE VIEW
slo=> explain select * from botched_view where content_id = 1;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Index Scan using idx_botched_table on botched_table  (cost=0.00..3.72 rows=3 width=32)
   Index Cond: ((content)::integer = 1)
   Filter: (content ~ '^[0-9]{1,9}$'::text)
(3 rows)


-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] Isnumeric function?

2004-09-10 Thread Greg Stark
Greg Stark <[EMAIL PROTECTED]> writes:

> Theo Galanakis <[EMAIL PROTECTED]> writes:
> 
> > I created the Index you specified, however it chooses to run a seq scan on
> > the column rather than a Index scan. How can you force it to use that
> > Index..
> > 
> > CREATE INDEX idx_content_numeric ON botched_table(content) WHERE content ~
> > '^[0-9]{1,9}$';
> > 
> > select * from botched_table where content = 200::integer
> 
> You need to put a "and content ~ '^[0-9]{1,9}$'" in your query, the clause has
> to match the clause in the partial index pretty closely.

Well this is weird. I tried to come up with a cleaner way to arrange this than
the view I described before using a function. But postgres isn't using the
partial index when it seems it ought to be available.

When I say it has to match "pretty closely" in this case I think it would have
to match exactly, however in the case of simple range operators postgres knows
how to figure out implications. Ie, "where a>1" should use a partial index
built on "where a>0".

slo=> create table test (a integer);
CREATE TABLE
slo=> create index idx_text on test (a) where a > 0;
CREATE INDEX
slo=> explain select * from test where a > 0;
                               QUERY PLAN
-------------------------------------------------------------------------
 Index Scan using idx_text on test  (cost=0.00..17.50 rows=334 width=4)
   Index Cond: (a > 0)
(2 rows)

slo=> explain select * from test where a > 1;
                               QUERY PLAN
-------------------------------------------------------------------------
 Index Scan using idx_text on test  (cost=0.00..17.50 rows=334 width=4)
   Index Cond: (a > 1)
(2 rows)



That's all well and good. But when I tried to make a version of your situation
that used a function I found it doesn't work so well with functional indexes:



slo=> create function test(integer) returns integer as
slo-> 'begin return $1; end;' language plpgsql immutable;
CREATE FUNCTION
slo=> create index idx_test_2 on test (test(a)) where test(a) > 0;
CREATE INDEX
slo=> explain select test(a) from test where test(a) > 0;
QUERY PLAN
--
 Index Scan using idx_test_2 on test  (cost=0.00..19.17 rows=334 width=4)
   Index Cond: (test(a) > 0)
(2 rows)

slo=> explain select test(a) from test where test(a) > 1;
  QUERY PLAN   
---
 Seq Scan on test  (cost=0.00..25.84 rows=334 width=4)
   Filter: (test(a) > 1)
(2 rows)


I can't figure out why this is happening. I would think it has something to do
with the lack of statistics on functional indexes except a) none of the tables
is analyzed anyways and b) the estimated row count is the same anyways.


-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] Isnumeric function?

2004-09-10 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Greg Stark <[EMAIL PROTECTED]> writes:
> > That's all well and good. But when I tried to make a version of your
> > situation that used a function I found it doesn't work so well with
> > functional indexes:
> > ...
> > I can't figure out why this is happening.
> 
> You're using 7.3 or older?

7.4.3.


-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] implementing an out-of-transaction trigger

2004-09-15 Thread Greg Stark

"Iain" <[EMAIL PROTECTED]> writes:

> Though, as far as I can tell, there is no way to have the notify activate a
> pl/pgsql function directly. I'll still need to write a client program to
> create a session and actually do the listening, that is if I haven't missed
> anything else...

Right, presumably some sort of daemon that sits and waits for events. Much
like you would have with Oracle advanced queuing and such.

The big difference is that NOTIFY doesn't pass along any parameters. You will
need some way for your daemon to find any pending data it needs to process.
You might need some kind of queue table, or you might be able to get by
without one.
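
One possible shape for that (a sketch; all names invented):

CREATE TABLE event_queue (
id  serial PRIMARY KEY,
payload text
);

-- producer, e.g. in a trigger or application transaction:
INSERT INTO event_queue (payload) VALUES ('work to do');
NOTIFY event_queue_ready;

-- the daemon keeps a session open that has executed:
LISTEN event_queue_ready;
-- and on each wakeup drains the queue, deleting rows as it processes them:
SELECT id, payload FROM event_queue ORDER BY id;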

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] 1-byte integers

2004-09-18 Thread Greg Stark

stig erikson <[EMAIL PROTECTED]> writes:

> how can i specify an integer to be one byte or even 4 bits long?
> int1, int(1), tinyint are nowhere to be seen.
> smallest i can find is smallint that is 2 bytes.

There's a type called "char" (the double quotes are needed). It's used by
postgres system catalogues but it's available for users too. It's a little
quirky since it has operators that treat it as a 1 byte text data type and
other operators that treat it as an integer data type. But that doesn't
normally lead to any problems, just strange symptoms when your code has a bug.
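
A quick illustration of the two faces (this assumes the built-in casts between
"char" and integer, which the system catalogues rely on):

CREATE TABLE flags (f "char");
INSERT INTO flags VALUES ('y');
SELECT f, f::int FROM flags;  -- the same byte as text ('y') and as an integer (121)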

Other than that there are things like bit(4) which has a cast to and from
integer. But they won't save you any storage space. If you have multiple
columns like this and want to define a new type that aggregates them all for
storage but lets you access them individually that could be useful.

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] ORDER BY and NULLs

2004-09-19 Thread Greg Stark

T E Schmitz <[EMAIL PROTECTED]> writes:

> I just dug out the PostgreSQL book again because I thought I might've garbled
> it:
> 
> Quote: "PostgreSQL will not index NULL values. Because an index will never
> include NULL values, it cannot be used to satisfy the ORDER BY clause of a
> query that returns all rows in a table."

You should just cross out that whole section. It's just flatly wrong. 

I had always assumed it was just people bringing assumptions over from Oracle
where it is true. Perhaps this book is to blame for some of the confusion.
Which book is it?

Postgres indexes NULLs. It can use them for ORDER BY clauses. 

Where it cannot use them is to satisfy "WHERE foo IS NULL" or "WHERE foo IS
NOT NULL" constraints though. That's an implementation detail, but it can be
worked around with partial indexes.
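
The workaround looks like this (sketch, with invented names):

CREATE INDEX tab_foo_null_idx ON tab (foo) WHERE foo IS NULL;
-- a query with "WHERE foo IS NULL" can now scan tab_foo_null_idx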

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] ORDER BY and NULLs

2004-09-19 Thread Greg Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> > The paragraph continues:
> > "If the SELECT command included the clause WHERE phone NOT NULL, 
> > PostgreSQL could use the index to satisfy the ORDER BY clause.
> > An index that covers optional (NOT NULL) columns will not be used to 
> > speed table joins either."
> 
> My goodness, it seems to be a veritable fount of misinformation :-(
> 
> I wonder how much of this is stuff that is true for Oracle and they just
> assumed it carried over?

The first part is true for Oracle. You have to add the WHERE phone NOT NULL to
convince Oracle it can use an index. Or just make the column NOT NULL to begin
with I think.

However as far as I recall the second part is not true. Oracle is smart enough
to realize that an equijoin clause implies NOT NULL and therefore allows it to
use the index.

(This may have all changed in Oracle 9+. The last I saw of Oracle was 8i)

I wonder if they just tried explain on a bunch of queries and noticed that
postgres wasn't using an index for SELECT * FROM foo ORDER BY bar and came up
with explanations for the patterns they saw?

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] COUNT(*) to find records which have a certain number of dependencies ?

2004-09-20 Thread Greg Stark

T E Schmitz <[EMAIL PROTECTED]> writes:

> I want to select only those BRAND/MODEL combinations, where the MODEL has more
> than one TYPE, but only where one of those has TYPE_NAME='xyz'.
> I am not interested in MODELs with multiple TYPEs where none of them are called
> 'xyz'.


There are lots of approaches to this with various pros and cons.

The simplest one off the top of my head:

select * 
  from brand 
  join model on (brand_pk = brand_fk)
 where exists (select 1 from type where model_fk = model_pk and type_name = 'xyz')
   and (select count(*) from type where model_fk = model_pk) > 1


You could try to be clever about avoiding the redundant access to the type table:

select * 
  from brand 
  join model on (brand_pk = brand_fk)
 where (select count(*)
  from type 
 where model_fk = model_pk 
    having sum(case when type_name = 'xyz' then 1 else 0 end) >= 1
   ) > 1

I'm haven't tested that, it might need some tweaking. In any case I don't
think it's worth the added complexity, assuming you have indexes on type. I'm
not even sure it would run faster.

You could try to be really clever about it by turning the whole thing into a
join:

select * 
  from brand 
  join model on (brand_pk = brand_fk)
  join (select model_fk
  from type 
 group by model_fk
    having sum(case when type_name = 'xyz' then 1 else 0 end) >= 1
   and count(*) > 1
   ) on (model_fk = model_pk)

This would let the planner have a more plans to choose from and might be a big
win if there are lots of brands and models but few that satisfy the criteria
you're looking for.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] COUNT(*) to find records which have a certain number of

2004-09-20 Thread Greg Stark

T E Schmitz <[EMAIL PROTECTED]> writes:

> SELECT
> BRAND_NAME,MODEL_NAME
...
> intersect
...

Huh, I never think of the set operation solutions. I'm curious how it compares
speed-wise.

-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] COUNT(*) to find records which have a certain number of dependencies ?

2004-09-20 Thread Greg Stark

T E Schmitz <[EMAIL PROTECTED]> writes:

> ) as somealias on (model_fk = model_pk)
> 
> (subquery in FROM must have an alias)

ARGH! This is one of the most annoying things about postgres! It bites me all
the time. Obviously it's totally insignificant since it's easy for me to just
throw an "AS x" on the end of it. But damn.

I see there's a comment foreseeing some annoyance value for this in the
source:

  /*
   * The SQL spec does not permit a subselect
   * (<subquery>) without an alias clause,
   * so we don't either.  This avoids the problem
   * of needing to invent a unique refname for it.
   * That could be surmounted if there's sufficient
   * popular demand, but for now let's just implement
   * the spec and see if anyone complains.
   * However, it does seem like a good idea to emit
   * an error message that's better than "syntax error".
   */

So where can I officially register my complaint? :)

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [SQL] JOIN performance

2004-09-21 Thread Greg Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> Fixing this properly is a research project, and I haven't thought of any
> quick-and-dirty hacks that aren't too ugly to consider :-(

Just thinking out loud here. Instead of trying to peek inside the CASE
couldn't the optimizer just wrap the non-strict expression in a conditional
that tests whether the row was found?


-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] [GENERAL] need ``row number``

2004-09-24 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Karsten Hilbert <[EMAIL PROTECTED]> writes:
> > I am not convinced I'll need a SRF. I am not trying to
> > calculate something that isn't there yet. I am just trying to
> > join two views appropriately. I might have to employ some
> > variant of Celko's integer helper table but I'm not sure how
> > to proceed.
> 
> A fairly common hack for this is to use a sequence:
> 
>   create temp sequence tseq;
>   select nextval('tseq'), * from (select ... order by ...) ss;

But I thought she wanted to get the row number within a group. Not the row
number for the entire result set. A sequence can't do that. Or at least, I
suppose it could but it would be mighty strange to see setval() in a SELECT
query. And I can't think of how to detect the level break in a select query
either.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] Inserting into table only if the row does not already exist.

2004-10-14 Thread Greg Stark

"C. Bensend" <[EMAIL PROTECTED]> writes:

> INSERT INTO table ( column1, column2, column3 )
>    SELECT column1, column2, column3
>    WHERE NOT EXISTS (
>       SELECT column1, column2, column3 FROM table WHERE
>          column1 = $column1 AND
>          column2 = $column2 AND
>          column3 = $column3 )
> 
>.. which gave me 'ERROR: column1 does not exist'.  Nuts.

Well you're not selecting from any table so "column1" isn't going to exist.
You just have to put it in the select list as a constant. If you're feeling
generous to the next programmer to read it you could put "AS column1" after
each one, but the column name doesn't actually have to match the column you're
inserting into.

 INSERT INTO table ( column1, column2, column3 )
   (
SELECT $column1, $column2, $column3
 WHERE NOT EXISTS (
   SELECT 1
 FROM table 
WHERE column1 = $column1
  AND column2 = $column2
  AND column3 = $column3 )
)

Note that this is going to have some concurrency issues. I think it will be
possible for a second query to execute before the first commits. In that case
it won't see the record the first query inserted and try to insert again.
You'll just get a primary key violation though which I guess you can just
ignore.

Which raises a question. Why not forgo this complicated SQL and just try to do
the insert? If you get a primary key violation, well there's your answer... If
you don't care about the failure just ignore it and move on. I would suggest
checking specifically for a primary key violation and still stopping execution
on unexpected errors though.

If you're doing this inside a bigger transaction that's a bit more of a pain.
Until 8.0 postgres can't handle ignoring an error on a query without aborting
the entire transaction. But if you're in autocommit mode then you can just
ignore the primary key violation and continue. 

Incidentally, if you're putting your parameters directly into your queries
using $column1 then you've got a potential security problem. Unless you're
quoting every variable everywhere religiously using postgres's quoting
functions an attacker can sneak extra SQL into your queries. Potentially
including whole new statements such as "DELETE FROM table"...

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] Inserting into table only if the row does not already

2004-10-16 Thread Greg Stark

"C. Bensend" <[EMAIL PROTECTED]> writes:

> The risk of a cron gone wild is acceptable to me at this moment.

Gee, now I have images of late-night advertisements for bofh-porn video tapes
of Cron Jobs Gone Wild(tm) dancing through my head... thanks.

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] help on a query

2004-10-08 Thread Greg Stark

Michelle Murrain <[EMAIL PROTECTED]> writes:

> The OUTER JOIN version is quite a bit more efficient (by an order of magnitude)
> than the option with WHERE NOT EXISTS subquery.

This is going to be heavily dependent on the version of postgres. IN/NOT IN
execution has improved a lot in 7.4 and later. If you're still on 7.3 then the
outer join will be better. But in 7.4 I would actually expect the NOT IN or
the NOT EXISTS to be faster.

-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [SQL] Cross tabulations

2004-10-19 Thread Greg Stark
"Muhyiddin A.M Hayat" <[EMAIL PROTECTED]> writes:

> Dear all,
> 
> I need to do something similar to a cross tabulation, but without any
> aggregation.

join your table to itself four times:

select *
  from (select check_time::date as date, employee_id,
   check_time - check_time::date as in
  from test where state = 'In') as a
  join (select check_time::date as date, employee_id,
   check_time - check_time::date as break_out
  from test where state = 'Break Out') as b using (employee_id, date)
  join (select check_time::date as date, employee_id,
   check_time - check_time::date as break_in
  from test where state = 'Break In') as d using (employee_id, date)
  join (select check_time::date as date, employee_id,
   check_time - check_time::date as out
  from test where state = 'Out') as e using (employee_id, date);

Note that this will do strange things if you don't have precisely four records
for each employee.

Alternatively use subqueries:

select date, employee_id,
   (select check_time - check_time::date from test
 where employee_id = x.employee_id and check_time::date = date
   and state = 'In') as in,
   (select check_time - check_time::date from test
 where employee_id = x.employee_id and check_time::date = date
   and state = 'Break Out') as break_out,
   (select check_time - check_time::date from test
 where employee_id = x.employee_id and check_time::date = date
   and state = 'Break In') as break_in,
   (select check_time - check_time::date from test
 where employee_id = x.employee_id and check_time::date = date
   and state = 'Out') as out
  from (select distinct employee_id, check_time::date as date from test) as x;

This will at least behave fine if there are missing records, and will give an
error (instead of doing strange things) if there are multiple records.

Neither of these will be particularly pretty on the performance front.

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] vacuum analyze slows sql query

2004-11-03 Thread Greg Stark

patrick ~ <[EMAIL PROTECTED]> writes:

> I noticed that a freshly created db with freshly inserted data (from
> a previous pg_dump) would result in quite fast results.  However,
> after running 'vacuum analyze' the very same query slowed down about
> 1250x (Time: 1080688.921 ms vs Time: 864.522 ms).

If it gets slower immediately after a single vacuum analyze then the problem
is that one of the queries is getting planned wrong when stats are available.
If it takes a while to slow down then it could be other problems such as index
bloat etc.

Don't use "explain verbose" use "explain analyze". I don't know if it was in
7.1 but you say you reproduced the problem with 7.4. It would be helpful to
see the results of "explain analyze select ..." on the query before and after
the vacuum analyze.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] Simple SQL Question

2004-11-06 Thread Greg Stark

Tomasz Myrta <[EMAIL PROTECTED]> writes:

> > select * from table1 LIMIT x
> > gives me the first x row of the result.
> > After that, I save the last value, and next time, I adjust
> > the query as
> > select * from table1 where itemkey>:lastvalue LIMIT x
> 
> Why do you complicate it so much? Everything you need is:
> 
> select * from table1 LIMIT x
> select * from table1 LIMIT x OFFSET x
> select * from table1 LIMIT x OFFSET 2*x
> 
> Remember to sort rows before using limit/offset.

There are two good reasons to prefer Andras' solution to yours.

a) If the data is modified between the two queries his will continue from
where the previous page left off. Yours will either skip records or overlap
with the previous page.

b) If itemkey is indexed his will be an efficient index scan that performs
similarly regardless of what page is being fetched. Yours will perform more
and more slowly as the user gets deeper into the results.


Note that both queries are wrong as written, however: you need an "ORDER BY
itemkey", or else nothing guarantees that the second page has any relation at
all to the first.
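
Putting the pieces together, a sketch of the corrected paging queries:

SELECT * FROM table1 ORDER BY itemkey LIMIT x;
-- then, with :lastvalue set to the largest itemkey already shown:
SELECT * FROM table1 WHERE itemkey > :lastvalue ORDER BY itemkey LIMIT x;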

-- 
greg


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] Comparing Dates

2004-11-18 Thread Greg Stark

"Thomas F.O'Connell" <[EMAIL PROTECTED]> writes:

> select 2004-06-08;
>   ?column?
> --
>   1990
> 
> I'm not exactly sure how the bare string is converted internally, but it's
> clearly not a complete date like you're expecting.

What string? That's just integer arithmetic: 2004 - 06 - 08 = 1990.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] Making dirty reads possible?

2004-12-06 Thread Greg Stark

[EMAIL PROTECTED] writes:

> But not possible for real at the moment?
> 
> So, summarising:
> - Nested transactions is not (yet) supported
> - READ UNCOMMITTED isolation level is not (yet) supported
> - the EXECUTE plpgsql construct does not circumvent the transaction

Well, nested transactions are in 8.0, but I don't think they help you much.

I've been stymied using server-side functions for large batch jobs for pretty
much the same reason. I find it works better, and is more flexible, to write
client-side programs in the language of my choice that connect to the database
and do the batch jobs.

They can output progress logs or keep information about their progress in some
shared space. They can also control the transaction more freely, committing in
the middle of the job if it's safe.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] replacing mysql enum

2004-12-11 Thread Greg Stark

Ian Barwick <[EMAIL PROTECTED]> writes:

> What I still don't quite understand is why IN in a CHECK context is
> handled differently to say: select 1 where 'x' in (null,'a','b','c') ?
> This could be a bit of a gotcha for anyone constructing a constraint
> similar to the original poster's and not realising it has no effect.

Well,

WHERE foo IN (null, ...)

returns null if foo isn't explicitly in the list (i.e., "it may or may not
equal the unknown value in the list"). And constraints that evaluate to null
are deemed to have succeeded.
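
A quick illustration, using the original poster's own test case:

SELECT 1 WHERE 'x' IN (null, 'a', 'b', 'c');
-- returns no rows: the condition is null, not true
-- but a CHECK constraint whose expression is null counts as satisfied,
-- so CHECK (foo IN (null, 'a', 'b', 'c')) never rejects any value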

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] Column with recycled sequence value

2005-01-13 Thread Greg Stark
Andrew Sullivan <[EMAIL PROTECTED]> writes:

> You can set the sequence up to cycle (so once it gets to the end, it
> wraps around to the beginning again).  The keyword is CYCLE at CREATE
> SEQUENCE time.  It defaults to NO CYCLE.
> 
> One potential problem, of course, are collisions on the table,
> because some value wasn't cleared out.  It sounds like you don't have
> that problem though.

Alternatively you can go through the database and make sure all the foreign
keys are declared and marked ON UPDATE CASCADE. Then go through and renumber
all your entries sequentially starting at 1 and reset your sequence. 

I'm not sure this is such a hot idea really. But I don't really like the idea
of letting the sequence wrap around much either. You'll have to consider the
pros and cons of each approach (and of just moving to bigserial too).

If you're going to do this, you'll want an index on each of the foreign key
columns, that is, the columns referring to this value from other tables.
Otherwise the cascaded updates would be very slow.
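
A sketch of the prerequisites, with hypothetical table and column names:

ALTER TABLE orders
  ADD CONSTRAINT orders_customer_fk
  FOREIGN KEY (customer_id) REFERENCES customer (id)
  ON UPDATE CASCADE;

CREATE INDEX orders_customer_id_idx ON orders (customer_id);

The index is what keeps each cascaded update from scanning the whole
referencing table.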

And you'll probably want to schedule downtime for this; otherwise, application
code that holds values in local state might get very confused.

I think I would do it with a program that connects and updates each record
individually and commits periodically, rather than with a single big update,
simply because I like having control and having things that give me progress
information and can be interrupted without losing work.

Oh, and this won't work if you have any external references to these values
from outside your database. Say if the value is something like a customer
account number that you've previously sent to customers...

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] OFFSET impact on Performance???

2005-01-20 Thread Greg Stark
"Andrei Bintintan" <[EMAIL PROTECTED]> writes:

> > If you're using this to provide "pages" of results, could you use a cursor?
> What do you mean by that? Cursor?
> 
> Yes I'm using this to provide "pages", but If I jump to the last pages it goes
> very slow.

The best way to do paging is not to use offset or cursors but to use an index.
This only works if you can enumerate all the sort orders the application might
be using and can have an index on each of them.

To do this the query would look something like:

SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50

Then you take note of the last value used on a given page and if the user
selects "next" you pass that as the starting point for the next page.

This query takes the same amount of time no matter how many records are in the
table and no matter what page of the result set the user is on. It should
actually be instantaneous even if the user is on the hundredth page of millions
of records, because it uses an index both for finding the right starting point
and for the ordering.

It also has the advantage that it works even if the list of items changes as
the user navigates. If you use OFFSET and someone inserts a record in the
table then the "next" page will overlap the current page. Worse, if someone
deletes a record then "next" will skip a record.

The disadvantages of this are a) it's hard (but not impossible) to go
backwards, and b) it's impossible to give the user a list of pages and let
them skip around willy-nilly.


(If this is for a web page then I specifically don't recommend cursors. They
would mean you'd need some complex session-management system that guarantees
the user always comes back to the same postgres session, plus some garbage
collection for users who disappear. And it means the URL is only good for a
limited amount of time: if they bookmark it, it'll break when they come back
the next day.)

-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] [SQL] OFFSET impact on Performance???

2005-01-25 Thread Spiegelberg, Greg
Isn't this a prime example of when to use a servlet or something similar
in function?  It will create the cursor, maintain it, and fetch against
it for a particular page.

Greg


-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 20, 2005 10:21 AM
To: Andrei Bintintan
Cc: pgsql-sql@postgresql.org; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] [SQL] OFFSET impact on Performance???


Andrei Bintintan wrote:
>> If you're using this to provide "pages" of results, could you use a 
>> cursor?
> 
> What do you mean by that? Cursor?
> 
> Yes I'm using this to provide "pages", but If I jump to the last pages 
> it goes very slow.

DECLARE mycursor CURSOR FOR SELECT * FROM ...
FETCH FORWARD 10 IN mycursor;
CLOSE mycursor;

Repeated FETCHes would let you step through your results. That won't 
work if you have a web-app making repeated connections.

If you've got a web-application then you'll probably want to insert the 
results into a cache table for later use.
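
A minimal sketch of the cache-table idea (table and column names are
hypothetical; a real application would also key the cache by session and
expire old rows):

CREATE TABLE results_cache AS
  SELECT id, name FROM items ORDER BY name;
-- later requests page through the precomputed set:
SELECT id, name FROM results_cache ORDER BY name LIMIT 10 OFFSET 20;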

--
   Richard Huxton
   Archonet Ltd

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] [SQL] OFFSET impact on Performance???

2005-01-25 Thread Greg Stark

Alex Turner <[EMAIL PROTECTED]> writes:

> I am also very interesting in this very question.. Is there any way to
> declare a persistant cursor that remains open between pg sessions? 
> This would be better than a temp table because you would not have to
> do the initial select and insert into a fresh table and incur those IO
> costs, which are often very heavy, and the reason why one would want
> to use a cursor.

TANSTAAFL. How would such a persistent cursor be implemented if not by
building a temporary table somewhere behind the scenes?

There could be some advantage if the data were stored in a temporary table
marked as not having to be WAL logged. Instead it could be automatically
cleared on every database start.

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-25 Thread Greg Stark

"Joel Fradkin" <[EMAIL PROTECTED]> writes:

> I also tried a simple select * from tblcase where clientum = 'SAKS'

Try:

explain analyze select * from tblcase where clientum = 'SAKS'

Send the output.

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Greg Stark
"Joel Fradkin" <[EMAIL PROTECTED]> writes:

> QUERY PLAN
> "Seq Scan on tblcase  (cost=0.00..30066.21 rows=37401 width=996) (actual
> time=0.344..962.260 rows=22636 loops=1)"
> "  Filter: ((clientnum)::text = 'SAKS'::text)"
> "Total runtime: 1034.434 ms"

Well, that says it only took 1s. So it seems this is highly dependent on
whether the data is in cache. Perhaps it was in cache when you profiled it on
MSSQL and not on postgres?

You could put an index on clientnum, but if the data is usually in cache like
this it might not even be necessary.
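
Should it turn out to help, the index is a one-liner (the index name here is
arbitrary):

CREATE INDEX tblcase_clientnum_idx ON tblcase (clientnum);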

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

