Re: [SQL] celko nested set functions
> I'm wondering if anyone has written code that enables you to move
> entities between parents in a nested set model. Specifically something
> that can do it without deleting any of the children involved in the
> process. I looked in the postgresql cookbook which had adding and
> building trees, but not moving. I'm hoping to find something
> postgresql specific but if not that's ok. Thanks in advance,

CREATE FUNCTION MoveTree (integer, integer) RETURNS text AS '
-- Moves part of a nested set tree to another part.
-- Pass in the left of the child (from) and the left of the parent (to)
DECLARE
  cleft      INTEGER;
  cright     INTEGER;
  pleft      INTEGER;
  pright     INTEGER;
  leftbound  INTEGER;
  rightbound INTEGER;
  treeshift  INTEGER;
  cwidth     INTEGER;
BEGIN
  SELECT lft, rht FROM tree WHERE lft = $1 INTO cleft, cright;
  SELECT lft, rht FROM tree WHERE lft = $2 INTO pleft, pright;

  -- Make sure the child exists
  IF cleft IS NULL THEN
    RETURN ''No entry found with a left of ''||$1;
  END IF;

  -- Make sure the parent exists
  IF pleft IS NULL THEN
    RETURN ''No entry found with a left of ''||$2;
  END IF;

  -- Self-move makes no sense
  IF cleft = pleft THEN
    RETURN ''Cannot move: entries are identical'';
  END IF;

  -- Parent cannot be underneath the child
  IF pleft BETWEEN cleft AND cright THEN
    RETURN ''Cannot move: first entry contains second'';
  END IF;

  -- Child may already be in the proper place
  IF cleft = pleft+1 THEN
    RETURN ''No changes need to be made'';
  END IF;

  IF cleft > pleft THEN
    treeshift  := pleft - cleft + 1;
    leftbound  := pleft + 1;
    rightbound := cleft - 1;
    cwidth     := cright - cleft + 1;
  ELSE
    treeshift  := pleft - cright;
    leftbound  := cright + 1;
    rightbound := pleft;
    cwidth     := cleft - cright - 1;
  END IF;

  UPDATE tree SET
    lft = CASE
      WHEN lft BETWEEN leftbound AND rightbound THEN lft + cwidth
      WHEN lft BETWEEN cleft AND cright THEN lft + treeshift
      ELSE lft END,
    rht = CASE
      WHEN rht BETWEEN leftbound AND rightbound THEN rht + cwidth
      WHEN rht BETWEEN cleft AND cright THEN rht + treeshift
      ELSE rht END;

  RETURN ''Tree has been moved'';
END;
' LANGUAGE 'plpgsql';

--
Greg Sabino Mullane [EMAIL PROTECTED]
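To try the function out, here is a minimal, hypothetical tree table and a sample call; the table and column names match the function above, but the data is invented:

CREATE TABLE tree (
  lft  INTEGER NOT NULL,
  rht  INTEGER NOT NULL,
  name TEXT
);

-- A tiny tree: root (1,6) with children a (2,3) and b (4,5)
INSERT INTO tree VALUES (1, 6, 'root');
INSERT INTO tree VALUES (2, 3, 'a');
INSERT INTO tree VALUES (4, 5, 'b');

-- Move node a (lft = 2) underneath node b (lft = 4):
SELECT MoveTree(2, 4);

Afterwards b spans (2,5) with a nested inside it at (3,4), and root is still (1,6).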
Re: [SQL] Trees: maintaining pathnames
Instead of storing the path in each row, why not let Postgres take care of computing it with a function? Then make a view and you've got the same table, without all the triggers.

CREATE TABLE tree (
  id        INTEGER NOT NULL,
  parent_id INTEGER,
  "name"    TEXT NOT NULL,
  PRIMARY KEY (id)
);

INSERT INTO tree VALUES (1,NULL,'');
INSERT INTO tree VALUES (2,1,'usr');
INSERT INTO tree VALUES (3,1,'tmp');
INSERT INTO tree VALUES (4,1,'home');
INSERT INTO tree VALUES (5,4,'greg');
INSERT INTO tree VALUES (6,5,'etc');

CREATE OR REPLACE FUNCTION pathname(INTEGER) RETURNS TEXT AS '
DECLARE
  mypath TEXT;
  myname TEXT;
  myid   INTEGER;
BEGIN
  SELECT parent_id, name FROM tree WHERE id=$1 INTO myid, mypath;
  IF mypath IS NULL THEN
    RETURN ''No such id\n'';
  END IF;
  LOOP
    SELECT parent_id, name FROM tree WHERE id=myid INTO myid, myname;
    mypath := ''/'' || mypath;
    EXIT WHEN myid IS NULL;
    mypath := myname || mypath;
  END LOOP;
  RETURN mypath;
END;
' LANGUAGE 'plpgsql';

CREATE VIEW mytree AS
  SELECT *, pathname(id) AS path FROM tree;

SELECT * FROM tree ORDER BY id;

 id | parent_id | name
----+-----------+------
  1 |           |
  2 |         1 | usr
  3 |         1 | tmp
  4 |         1 | home
  5 |         4 | greg
  6 |         5 | etc
(6 rows)

SELECT * FROM mytree ORDER BY id;

 id | parent_id | name | path
----+-----------+------+----------------
  1 |           |      | /
  2 |         1 | usr  | /usr
  3 |         1 | tmp  | /tmp
  4 |         1 | home | /home
  5 |         4 | greg | /home/greg
  6 |         5 | etc  | /home/greg/etc
(6 rows)

UPDATE tree SET name='users' WHERE id=4;

SELECT * FROM mytree ORDER BY id;

 id | parent_id | name  | path
----+-----------+-------+-----------------
  1 |           |       | /
  2 |         1 | usr   | /usr
  3 |         1 | tmp   | /tmp
  4 |         1 | users | /users
  5 |         4 | greg  | /users/greg
  6 |         5 | etc   | /users/greg/etc
(6 rows)

--
Greg Sabino Mullane [EMAIL PROTECTED]
Re: [SQL] celko nested set functions
Robert Treat and I came up with a better way to move nodes from one branch to another inside of a nested tree:

CREATE OR REPLACE FUNCTION move_tree (integer, integer) RETURNS text AS '
-- Moves part of a nested set tree to another part.
-- Pass in the left of the child (from) and the left of the parent (to)
DECLARE
  cleft      INTEGER;
  cright     INTEGER;
  pleft      INTEGER;
  pright     INTEGER;
  leftbound  INTEGER;
  rightbound INTEGER;
  treeshift  INTEGER;
  cwidth     INTEGER;
  leftrange  INTEGER;
  rightrange INTEGER;
BEGIN
  -- Self-move makes no sense
  IF $1 = $2 THEN
    RETURN ''Cannot move: entries are identical'';
  END IF;

  SELECT lft, rgt FROM tree WHERE lft = $1 INTO cleft, cright;
  SELECT lft, rgt FROM tree WHERE lft = $2 INTO pleft, pright;

  -- Make sure the child exists
  IF cleft IS NULL THEN
    RETURN ''No entry found with a left of ''||$1;
  END IF;

  -- Make sure the parent exists
  IF pleft IS NULL THEN
    RETURN ''No entry found with a left of ''||$2;
  END IF;

  -- Parent cannot be underneath the child
  IF pleft BETWEEN cleft AND cright THEN
    RETURN ''Cannot move: first entry contains second'';
  END IF;

  -- Child may already be in the proper place
  IF cleft = pleft+1 THEN
    RETURN ''No changes need to be made'';
  END IF;

  IF cleft > pleft THEN
    treeshift  := pleft - cleft + 1;
    leftbound  := pleft + 1;
    rightbound := cleft - 1;
    cwidth     := cright - cleft + 1;
    leftrange  := cright;
    rightrange := pleft;
  ELSE
    treeshift  := pleft - cright;
    leftbound  := cright + 1;
    rightbound := pleft;
    cwidth     := cleft - cright - 1;
    leftrange  := pleft + 1;
    rightrange := cleft;
  END IF;

  UPDATE tree SET
    lft = CASE
      WHEN lft BETWEEN leftbound AND rightbound THEN lft + cwidth
      WHEN lft BETWEEN cleft AND cright THEN lft + treeshift
      ELSE lft END,
    rgt = CASE
      WHEN rgt BETWEEN leftbound AND rightbound THEN rgt + cwidth
      WHEN rgt BETWEEN cleft AND cright THEN rgt + treeshift
      ELSE rgt END
  WHERE lft < leftrange OR rgt > rightrange;

  RETURN ''Tree has been moved'';
END;
' LANGUAGE 'plpgsql';

--
Greg Sabino Mullane [EMAIL PROTECTED]
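The gain over the earlier MoveTree is the final WHERE clause: rows lying entirely outside the affected range are skipped instead of being rewritten with unchanged values. A quick sketch of a call, reusing the toy table from the first post in this thread (with the right-value column named rgt to match this version):

-- Move the node whose lft = 2 underneath the node whose lft = 4:
SELECT move_tree(2, 4);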
Re: [SQL] Delete 1 Record of 2 Duplicate Records
If they are truly identical, then you must use the hidden 'oid' column to differentiate the two. No need to peek at the oid, just do this:

DELETE FROM test
 WHERE oid = (SELECT oid FROM test WHERE column_id=5 LIMIT 1);

--
Greg Sabino Mullane [EMAIL PROTECTED]
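If the table happens to have been created WITHOUT OIDS, the same trick works with the ctid system column, which every row has; a sketch against the same hypothetical table:

DELETE FROM test
 WHERE ctid = (SELECT ctid FROM test WHERE column_id=5 LIMIT 1);

Note that a row's ctid changes on update and after VACUUM FULL, so it is only safe to rely on it immediately, within a single statement like this.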
Re: [SQL] Help with a query for charting
> I'm trying to do a query to count the number of tickets opened on
> each day of a month. It'll always be from the 1st to the end of
> the month.
> ...
> But it doesn't give me a zero for the days when no tickets were opened

The problem is that SQL has no concept of how many days there are supposed to be inside of the range you gave it, nor does it have a way of easily determining how many months are in a year. You will have to put that information into the database: a simple table with a date field and one row per day should do it. Make sure that you go well beyond any days you will ever need. For example:

CREATE TABLE mydates (date_opened date);

(Using the same column name allows us to use "USING" instead of "ON" in our JOIN later on.)

Populate it somehow. Here is a quick and dirty way to add 1000 days:

perl -e \
"{print \"INSERT INTO mydates VALUES('\" . scalar localtime($^T+(\$x*86400)). \"');\n\"; redo if \$x++<1000}" \
| psql

Watch those escape characters!

Once you have such a table, you will need to join your query to it, by using a RIGHT OUTER JOIN (RIGHT OUTER as we are listing the important table first, then making sure that we have at least one row from the second, or "right" table). We also need to wrap the query for the first table inside of a subselect to allow us to use the GROUP BY with a JOIN. The date specification is only needed on the second table (mydates), although you could add it to the first as well if you wish. The TO_CHAR has been moved to the "outer level", so we can simply join on the DATE_TRUNC'ed column. Finally, a COALESCE on the count is added, in order to generate the wanted zeroes:

SELECT
  TO_CHAR(DATE_TRUNC('day', T2.date_opened), 'DD') AS "day",
  COALESCE(T1.mycount, 0) AS "count"
FROM
  (SELECT date_opened, COUNT(*) AS mycount
     FROM ticket GROUP BY date_opened) AS T1
RIGHT OUTER JOIN
  (SELECT DISTINCT date_opened FROM mydates
    WHERE date_opened BETWEEN '23-Jan-2003' AND '26-Jan-2003') AS T2
USING (date_opened)
ORDER BY "day" ASC;

The DISTINCT is not strictly needed, but is a safeguard in case the mydates table has more than one entry with the same date.

Hope that helps.

--
Greg Sabino Mullane [EMAIL PROTECTED]
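On PostgreSQL 8.0 and later the helper table can be populated without Perl, using the generate_series() set-returning function; a sketch under that assumption:

INSERT INTO mydates
SELECT current_date + n
  FROM generate_series(0, 1000) AS g(n);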
Re: [SQL] Elegant SQL solution:
> SELECT date_part('month',rowdate) AS month, count(*) AS rows
> FROM mytable GROUP BY month;
> ...
> I could create a one-column table with values 1 - 12 in it, and select from
> that table with a where clause matching "month". I could also create a view
> "SELECT 1 UNION SELECT 2 UNION ..." and select against the view. There MUST
> be a more elegant way to do this.

The first solution is probably the best one. It does not seem that "inelegant" to me. Another way would be just to do it in the application itself.

...or you could consider this one I came up with. Use at your own risk ;)

SELECT dos.mym AS "Month", COALESCE(uno.rc,0) AS "Total"
FROM
  (SELECT date_part('month',rowdate) AS mym, count(*) AS rc
     FROM mytable GROUP BY 1) AS uno
RIGHT JOIN
  (SELECT oid::integer-15 AS mym FROM pg_type ORDER BY oid ASC LIMIT 12) AS dos
USING (mym);

--
Greg Sabino Mullane [EMAIL PROTECTED]
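On PostgreSQL 8.0 and later there is a less fragile way to produce the twelve month numbers than arithmetic on pg_type oids; a sketch using generate_series():

SELECT dos.mym AS "Month", COALESCE(uno.rc,0) AS "Total"
FROM
  (SELECT date_part('month',rowdate) AS mym, count(*) AS rc
     FROM mytable GROUP BY 1) AS uno
RIGHT JOIN
  (SELECT generate_series(1,12) AS mym) AS dos
USING (mym);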
Re: [SQL] max length of sql select statement (long!)
> ... I don't want to take GO:06 into account (two parents in which
> I am interested). That means, whenever I ask for children of two nodes,
> I want a DISTINCT SET of children.

To start with, you can avoid the Java and do this in SQL:

SELECT child FROM gograph WHERE parent='GO:002' OR parent='GO:005'
EXCEPT
(SELECT child FROM gograph WHERE parent='GO:002'
 INTERSECT
 SELECT child FROM gograph WHERE parent='GO:005');

And yes, I would certainly start by normalizing things a little bit:

CREATE SEQUENCE goid_seq;

CREATE TABLE goID (
  idname TEXT,
  id     INTEGER NOT NULL DEFAULT nextval('goid_seq')
);

INSERT INTO goid(idname) SELECT DISTINCT parent FROM gograph;

INSERT INTO goid(idname) SELECT DISTINCT child FROM gograph
  WHERE NOT EXISTS (SELECT 1 FROM goID WHERE idname = child);

CREATE TABLE gomap (
  parent INTEGER,
  child  INTEGER
);

INSERT INTO gomap SELECT
  (SELECT id FROM goid WHERE idname=parent),
  (SELECT id FROM goid WHERE idname=child)
FROM gograph;

As far as the binaryInteraction table, a little more information is needed: how are each of these tables being populated? Why the distinct? Is it because there may be duplicate rows in the table? The reason I ask is that it might be better to use triggers to compute some of the information as it comes in, depending on which tables are changed and how often.

--
Greg Sabino Mullane [EMAIL PROTECTED]
Re: [SQL] Count dates distinct within an interval
> Now, I want to count the occurrences of each value of stuff in the table,
> but so that entries within 24 hours from each other count as one...
> ...
> A = 2001-01-01 20:20:00
> B = 2001-01-02 20:19:00
> C = 2001-01-02 20:21:00
> Should be counted as *two* (A === B, and B === C, but *not* A === C)...

You need to elaborate on your logic some more, and state exactly what you would want in the A, B, C case above. Does B get lumped with A or with C? It is within 24 hours of both, after all. Does C not get lumped in with B simply because B has already been lumped in with A? Perhaps showing us the sample output of an ideal query would be best, along with some sample rows (e.g. use real insert statements).

--
Greg Sabino Mullane [EMAIL PROTECTED]
Re: [SQL] Count dates distinct within an interval
> Also, I could certainly write a simple function, that would get all the
> entries in order, and scan through them, counting according to my rules...
> But I was hoping to find some way to do this in plain sql though...

In this example, you are best off using a function.

--
Greg Sabino Mullane [EMAIL PROTECTED]
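A minimal sketch of such a function, assuming a hypothetical table mytable(stuff text, happened timestamp). It walks the rows for one value of stuff in time order and starts a new group whenever an entry falls more than 24 hours after the start of the current group, which matches the "A and B together, C separate" reading of the example earlier in this thread. Untested:

CREATE OR REPLACE FUNCTION count_within_day(text) RETURNS integer AS '
DECLARE
  myrec      RECORD;
  groupstart TIMESTAMP;
  total      INTEGER := 0;
BEGIN
  FOR myrec IN SELECT happened FROM mytable
                WHERE stuff = $1 ORDER BY happened LOOP
    IF groupstart IS NULL
       OR myrec.happened - groupstart > ''24 hours''::interval THEN
      total := total + 1;
      groupstart := myrec.happened;
    END IF;
  END LOOP;
  RETURN total;
END;
' LANGUAGE 'plpgsql';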
Re: [SQL] parse error for function def
> I'm trying to create a function to use on a trigger to check reference
> to views since pg does not support foreign keys referencing views.

Can you explain exactly what you are trying to do and why?

You are getting the error because a SQL function does not RETURN, it must end with a SELECT statement. It also has no DECLARE, BEGIN, or END. You can either remove all of those or change the language to plpgsql. See:

http://www.postgresql.org/docs/7.3/static/xfunc-sql.html

In addition, you cannot (with SQL) use an argument as the tablename. You also probably want to use EXISTS, not "count..::bool".

--
Greg Sabino Mullane [EMAIL PROTECTED]
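A hedged sketch of what the plpgsql version might look like, with hypothetical names throughout (a view myview with key column id, and a referencing column view_id on the table carrying the trigger); the view name is hard-coded rather than passed as an argument, per the limitation above:

CREATE OR REPLACE FUNCTION check_view_ref() RETURNS TRIGGER AS '
BEGIN
  IF NOT EXISTS (SELECT 1 FROM myview WHERE id = NEW.view_id) THEN
    RAISE EXCEPTION ''view_id % not found in myview'', NEW.view_id;
  END IF;
  RETURN NEW;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER mytable_view_ref
  BEFORE INSERT OR UPDATE ON mytable
  FOR EACH ROW EXECUTE PROCEDURE check_view_ref();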
Re: [SQL] obtuse plpgsql function needs
Perhaps something like this? Called like thus:

SELECT a,b,c,msgmaker('t1',ctid) FROM t1 WHERE a=b;

CREATE OR REPLACE FUNCTION msgmaker(text,tid) RETURNS text AS '
DECLARE
  mytable ALIAS FOR $1;
  mytid   ALIAS FOR $2;
  myctid  TEXT;
  myquery TEXT;
  mylen   SMALLINT := 20;
  yourlen SMALLINT;
  mydec   SMALLINT;
  myinfo  TEXT;
  myrec   RECORD;
  biglist TEXT := \'Error\';
BEGIN
  -- Find the longest column name, for padding
  myquery := \'SELECT length(attname) AS lenny FROM pg_attribute
    WHERE attnum >= 1 AND attrelid =
      (SELECT oid FROM pg_class WHERE relname = \'\'\' || mytable || \'\'\')
    ORDER BY 1 DESC LIMIT 1\';
  FOR myrec IN EXECUTE myquery LOOP
    mylen := myrec.lenny;
  END LOOP;

  -- Build a SELECT that formats each column according to its type
  myquery := \'SELECT attname, atttypid, atttypmod FROM pg_attribute
    WHERE attnum >= 1 AND attrelid =
      (SELECT oid FROM pg_class WHERE relname = \'\'\' || mytable || \'\'\')
    ORDER BY attname ASC\';
  myinfo := \'SELECT \';
  FOR myrec IN EXECUTE myquery LOOP
    myinfo := myinfo || \'\'\'- \' || myrec.attname || \': \';
    yourlen := LENGTH(myrec.attname);
    LOOP
      myinfo := myinfo || \' \';
      yourlen := yourlen + 1;
      EXIT WHEN yourlen > mylen;
    END LOOP;
    myinfo := myinfo || \'\'\' || COALESCE(\';
    IF myrec.atttypid = 1184 THEN  -- timestamptz
      myinfo := myinfo || \'TO_CHAR(\' || myrec.attname || \',\'\'Mon DD, HH24:MI\'\')\';
    ELSIF myrec.atttypid = 16 THEN  -- boolean
      myinfo := myinfo || \'CASE WHEN \' || myrec.attname ||
        \' IS TRUE THEN \'\'True\'\' ELSE \'\'False\'\' END\';
    ELSIF myrec.atttypid = 17 THEN  -- bytea
      myinfo := myinfo || \'ENCODE(\' || myrec.attname || \',\'\'hex\'\')\';
    ELSIF myrec.atttypid = 1700 THEN  -- numeric: recover the scale
      SELECT substr(rtrim(format_type(myrec.atttypid, myrec.atttypmod),\')\'),
             position(\',\' IN format_type(myrec.atttypid, myrec.atttypmod))+1)
        INTO mydec;
      myinfo := myinfo || \'TO_CHAR(\' || myrec.attname || \',\'\'FM990\';
      IF mydec > 1 THEN
        myinfo := myinfo || \'.\';
        LOOP
          myinfo := myinfo || \'0\';
          mydec := mydec - 1;
          EXIT WHEN mydec < 1;
        END LOOP;
      END IF;
      myinfo := myinfo || \'\'\')\';
    ELSE
      myinfo := myinfo || myrec.attname;
    END IF;
    myinfo := myinfo || \'::text,\'\'\'\'::text) || \'\'\\n\'\' || \';
  END LOOP;

  SELECT mytid INTO myctid;
  myinfo := myinfo || \'\'\'\\n\'\' AS info FROM \' || mytable ||
    \' WHERE ctid = \'\'\' || myctid || \'\'\'\';
  FOR myrec IN EXECUTE myinfo LOOP
    biglist := myrec.info;
  END LOOP;
  RETURN biglist;
END;
' LANGUAGE 'plpgsql';

--
Greg Sabino Mullane [EMAIL PROTECTED]
Re: [SQL] Is there something wrong with Perl`s dbi and PostgreSQL?
> I just tried running a perl script that I wrote about a year ago. It had
> worked for me in the past. Now it doesn't work.

You have AutoCommit set to off, and never commit the transaction. Therefore, the table creation is rolled back. Add a $dbh->commit() after your execute line and it should work as expected.

Also note that your "die" on connect will not work because you have RaiseError set: leave it off for the connect, then turn it on again immediately via:

$dbh->{RaiseError}=1;

--
Greg Sabino Mullane [EMAIL PROTECTED]
Re: [SQL] How can I produce the following desired result?
> How can I produce the following desired result?

SELECT * FROM mytable
JOIN (SELECT goodid, MAX(storehistoryid) AS storehistoryid
        FROM mytable GROUP BY 1) AS a
USING (goodid, storehistoryid);

--
Greg Sabino Mullane [EMAIL PROTECTED]
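A PostgreSQL-specific alternative that avoids the self-join, assuming the same hypothetical mytable: DISTINCT ON keeps the first row of each goodid group, and the ORDER BY makes that the row with the highest storehistoryid:

SELECT DISTINCT ON (goodid) *
  FROM mytable
 ORDER BY goodid, storehistoryid DESC;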
Re: [SQL] Equivalent of Reverse() functions
> I am migrating an SQL Server 2000 database to Postgres 7.3.4 running
> on RH Linux 7.2. While migrating I encounter SQL Server REVERSE( )
> function, seems no such functions at Postgres.
>
> Is there a equivalent function available at Postgres? Please
> shed some light

No. If you have plperl installed, as somebody already posted, you can take advantage of perl's built-in reverse() function. Here is a plpgsql version:

CREATE OR REPLACE FUNCTION reverse(TEXT) RETURNS TEXT AS '
DECLARE
  original ALIAS FOR $1;
  reversed TEXT := \'\';
  onechar  VARCHAR;
  mypos    INTEGER;
BEGIN
  SELECT LENGTH(original) INTO mypos;
  LOOP
    EXIT WHEN mypos < 1;
    SELECT substring(original FROM mypos FOR 1) INTO onechar;
    reversed := reversed || onechar;
    mypos := mypos - 1;
  END LOOP;
  RETURN reversed;
END
' LANGUAGE plpgsql IMMUTABLE RETURNS NULL ON NULL INPUT;

pg=> SELECT reverse('A man, a plan, a canal, Panama');
            reverse
--------------------------------
 amanap ,lanac a ,nalp a ,nam A

--
Greg Sabino Mullane [EMAIL PROTECTED]
Re: [SQL] Updating session id based on accesstimeout
> I want to set a session id based on a session timeout per user. I
> have a very large logfile with more than 1.3 million records and look
> for a very efficient solution - maybe with PL/pgSQL.

Can you post a clearer example with the exact columns in the table, and what you wish to do?

--
Greg Sabino Mullane [EMAIL PROTECTED]
Re: [SQL] Fetch a single record
> I'm looking for the fastest way to fetch a single record from a table.
> I don't care what record it is.

That's a strange request. If you really don't care what comes back, you don't even need to query a table:

SELECT 1;

If you perhaps want the column names, query the system tables. Otherwise the LIMIT 1 should be very fast, especially if you have no ordering. Avoid the "SELECT *" if you don't need all the columns, for a little more speed.

--
Greg Sabino Mullane [EMAIL PROTECTED]
[SQL] ANN: www.SQL-Scripts.Com
Hello,

Announcing the release of a new web site: www.SQL-Scripts.com

At www.SQL-Scripts.Com you can find a collection of SQL scripts for many different database systems. Using our search system you can find scripts that you need quickly and simply. If you have scripts that you use, why not lodge them on our site? If you lodge your script you could win a prize. In addition to scripts there are many documents, news feeds and articles. We have a monthly contest for the first people to lodge 20 scripts in one month.

Please help to make this site great.

Webmaster at www.SQL-Scripts.com
[SQL] Retrieving the new "nextval" for primary keys....
I am using PostgreSQL with Perl::CGI and Perl::DBI::Pg... I would like to be able to insert a row from my Perl script [$SQL->exec();] and have PostgreSQL return the id of the newly inserted record (new.id) directly to the Perl script for further processing... Anyone with a solution / idea???

Nearly EVERY table I create in PostgreSQL (7.2) has the following minimum structure:

create table "tblName" (
  id int4 primary key default nextval('"tblName_id_seq"'),
  ..field...
  ..field...
  ..field...
  create_dt date default CURRENT_DATE,
  change_dt timestamptz default now(),
  active_flag bool default TRUE
);
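A sketch of one common answer, assuming the serial column above is backed by the sequence "tblName_id_seq" and using a hypothetical column name: call currval() in the same session right after the INSERT. currval is session-local, so concurrent inserts on other connections do not disturb it:

INSERT INTO "tblName" (somefield) VALUES ('example');
SELECT currval('"tblName_id_seq"');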
[SQL] Why must the function that a trigger calls return "opaque" ???
This seems rather limiting... Suppose I want a trigger that, after insert, returns the currval(sequence) of the newly inserted row automatically, without having to run another query???

GP
Re: [SQL] Passing array to PL/SQL and looping
Is it possible to construct and return an array with plpgsql, like...

CREATE FUNCTION test_array() RETURNS VARCHAR[] AS '
DECLARE
  return_array VARCHAR[];
BEGIN
  return_array[0] := ''test'';
  return_array[1] := ''test 1'';
  return_array[2] := ''test 2'';
  RETURN (return_array);
END;
' LANGUAGE 'plpgsql';

I get the following error when I try to run it:

ERROR: parse error at or near "[" on line 4.

If this worked I could clean up a LOT of hacky plpgsql code I have had to write.

On Thu, 2002-09-26 at 18:20, Josh Berkus wrote:

> Peter,
>
> > I'm a newbie to PL/SQL and need help badly. I'm trying to pass my array
> > of id's to the function and then loop through until the array is empty.
> > I know there must be at least five things I'm doing wrong.
>
> Simplified example:
>
> CREATE FUNCTION test_array ( INT[] ) RETURNS INT AS '
> DECLARE
>   id_array ALIAS FOR $1;
>   count_it INT;
> BEGIN
>   count_it := 1;
>   WHILE id_array[count_it] LOOP
>     count_it := count_it + 1;
>   END LOOP;
>   RETURN (count_it - 1);
> END;
> ' LANGUAGE 'plpgsql';
>
> returns the number of elements in the supplied array.
>
> --
> Josh Berkus [EMAIL PROTECTED]
> Aglio Database Solutions
> San Francisco

--
Greg Johnson <[EMAIL PROTECTED]>
Re: [SQL] Inquiry From Form [pgsql]
Jack Kiss <[EMAIL PROTECTED]> writes:

> 1) Do you have a function which is similar to Oracle's DECODE?

Yes.

> 2) Can you extract day of week (Monday, etc) from your date functions?

Yes.

Check out the "Date/Time Functions and Operators" and the "Conditional Expressions" sections of this:

http://www.ca.postgresql.org/users-lounge/docs/7.3/postgres/functions.html

--
greg
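A short sketch of both answers (the orders table and its status column are hypothetical): CASE is the standard SQL replacement for DECODE, and to_char() or extract() give day-of-week information:

SELECT CASE status
         WHEN 1 THEN 'new'
         WHEN 2 THEN 'shipped'
         ELSE 'unknown'
       END AS status_label
  FROM orders;

SELECT to_char(current_date, 'Day');    -- blank-padded name, e.g. 'Monday   '
SELECT extract(dow FROM current_date);  -- 0 = Sunday .. 6 = Saturday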
Re: [SQL] HardCORE QUERY HELP!!!
Metnetsky <[EMAIL PROTECTED]> writes:

> It's for a class and my professor has a thing for brain teaser type
> questions.

Incidentally, TAs and profs aren't stupid, and have been known to check on newsgroups and mailing lists for students asking for people to do their homework for them.

--
greg
Re: [SQL] Forcing query to use an index
One suggestion I'll make about your data model -- I'm not sure it would actually help this query, but might help elsewhere:

  WHERE ( C.Disabled > '2003-02-28' OR C.Disabled IS NULL )

Don't use NULL values like this. Most databases don't index NULLs (Oracle) or, even if they do, don't make "IS NULL" an indexable operation (postgres). There's been some talk of changing this in postgres, but even then it wouldn't be able to use an index for an OR clause like this.

If you used a very large date as your "not deactivated" value, then the constraint would be C.disabled > '2003-02-28' and postgres could use an index on "disabled".

Alternatively, if you have a disabled_flag and disabled_date, then you could have an index on (disabled_flag, disabled_date) and, uhm, there should be a way to use that index, though I'm not seeing it right now.

This won't matter at first when 99% of your customers are active. And ideally in this query you find some way to use an index to find "kate" rather than doing a full table scan. But later, when 90% of the clients are disabled, then in a bigger batch job where you actually want to process every active record, it could prevent postgres from having to dig through a table full of old inactive records.

> This may make better use of your index, because the planner will have a more
> accurate estimate of the number of rows returned from the outer join.
>
> AND:
>
>   AND ( C.Accountnum    ~* 'kate'
>      OR C.Firstname     ~* 'kate'
>      OR C.Lastname      ~* 'kate'
>      OR C.Organization  ~* 'kate'
>      OR C.Address       ~* 'kate'
>      OR C.Postal        ~* 'kate'
>      OR C.City          ~* 'kate'
>      OR EA.Name || '@' || JoinDomain(EA.Domain) ~* 'kate' )
>
> This set of expressions has "seq scan" written all over it. I highly suggest
> that you try to find a way to turn these into anchored text searches, perhaps
> using functional indexes on lower(column).

If you really need to find substring matches everywhere, you might want to look into the full text search module in contrib/tsearch. I haven't started using it yet but I expect I will have to when I get to that part of my project.

> Finally:
>
>   OR CMS.Package ~* 'kate'
>
> *confusion*.

Oooh, yeah, this one is a big problem. It means it's not clear which end of the join to start with. Maybe it would be better to separate this into two separate queries: give the user the option to search for a user "kate" or a package "kate", but not both simultaneously.

--
greg
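A sketch of the anchored-search idea with a functional index (the table and column names are hypothetical, and note that in 7.3/7.4 a LIKE prefix match can only use a btree index under the C locale):

CREATE INDEX customer_lastname_idx ON customer (lower(lastname));

SELECT * FROM customer WHERE lower(lastname) LIKE 'kate%';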
Re: [SQL] Forcing query to use an index
> -> Merge Join (cost=6106.42..6335.30 rows=2679 width=265)
>               (actual time=859.77..948.06 rows=1 loops=1)

Actually, another problem: notice the big discrepancy between the estimated rows and the actual rows. That's because you have the big OR clause, so postgres figures there's a good chance one of the clauses will be true and estimates that a lot of rows will match. In fact, of course, they're all very selective and you'll usually only get a few records.

If you're stuck with the unanchored text search it will always do a full table scan, so it will never be lightning fast. But it would probably be a bit faster if you put a limit clause (on a subquery) on the table that's doing the full table scan. That will convince postgres that there won't be thousands of resulting records, which might convince it to do a nested loop. Also, as a beneficial side effect, it will limit the damage if one of your users does a search for "e"...

This only really helps if you can get rid of the OR CMS.Package clause... otherwise it actually needs all the records in case they match a summary record with a kate package.

--
greg
Re: [SQL] Gist indexes on int arrays
Greg Stark <[EMAIL PROTECTED]> writes:

> Can I have a GiST index on (foo_id, attribute_set_array) and have it be just
> as fast at narrowing the search to just foo_id = 900 but also speed up the ~
> operation?

Hm, so if I understand what I'm reading, I can do this if I load the btree_gist contrib module as well. I'm still not sure whether it'll be worthwhile for this application though.

I have a bit of a problem though. Is building GiST indexes supposed to take much, much longer than building btree indexes? It's been running nearly an hour and it's still going. The hard drive is hardly moving, so it seems to be all cpu usage. I don't even see any pgsql_tmp usage.

db=# CREATE INDEX cache_gist_idx ON cache
     USING gist (foo_id, attribute_set gist__int_ops);

postgres 30176 86.3 22.2 64896 57344 ? R 11:08 40:32 postgres: postgres slo [local] CREATE INDEX

I don't remember exact numbers, but building the normal btree index took on the order of 15m. This will have to be rebuilt nightly; an hour-long index build won't be practical.

--
greg
Re: [SQL] Complex outer joins?
Tom Lane <[EMAIL PROTECTED]> writes:

> The SQL-standard way of writing this would presumably be either
>
>   from G left join L on (G.SELID = L.SELID)
>          left join C on (L.SELID = C.SELID and L.SELLEVEL = C.SELLEVEL)

I would think of it as this one.

>   from G left join
>        (L left join C on (L.SELID = C.SELID and L.SELLEVEL = C.SELLEVEL))
>        on (G.SELID = L.SELID)

I don't see how that would be at all different.

> depending on which join you think ought to be done first. It might be
> that the results are the same in this case, but I'm not convinced of
> that. In general the results of outer joins definitely depend on join
> order.

I'm pretty sure Oracle actually builds an abstract join representation where the two queries above would actually be represented the same way, then decides the order from amongst the equivalent choices based on performance decisions.

Can you show an example where the join order would affect the result set? I can't think of any.

--
greg
Re: [SQL] LEAST and GREATEST functions?
Josh Berkus <[EMAIL PROTECTED]> writes:

> Stefan,
>
> > I know the LEAST and GREATEST functions are not part
> > of standard SQL, but they sure were handy where I came
> > from (Oracle-land).
>
> Um, what's wrong with MAX and MIN, exactly?

MAX and MIN are single-parameter aggregate functions. LEAST and GREATEST are two-parameter (though in postgres they could be defined for 3 and more parameters) scalar functions. E.g.:

SELECT max(a) FROM bar;

would return a single tuple with the maximum value of a from amongst every record, whereas:

SELECT greatest(a,b) FROM bar;

would return one tuple for every record in the table, with a single value representing the greater of bar.a and bar.b.

You could define your own functions to do this, but it would be tiresome to define one for every datatype.

--
greg
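For instance, a minimal integer-only definition (a sketch; as noted above, you would have to repeat it for each datatype you care about):

CREATE FUNCTION greatest(integer, integer) RETURNS integer AS
  'SELECT CASE WHEN $1 > $2 THEN $1 ELSE $2 END'
LANGUAGE sql;

SELECT greatest(a, b) FROM bar;  -- bar being the hypothetical table above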
Re: [SQL] summing tables
To solve this problem efficiently you probably need the lead/lag analytic functions. Unfortunately Postgres doesn't have them.

You could do it with something like:

update foo
   set c = a+b+(select c from foo as x
                 where seq < foo.seq
                 order by seq desc limit 1);

or the more standard but likely to be way slower:

update foo
   set c = a+b+(select c from foo as x
                 where seq = (select max(seq) from foo as y
                               where seq < foo.seq));

However, I would suggest that if you have an implicit relationship between records, you should make that relationship explicit with a foreign key. If you had a column that contained the seq of the parent record then this would be easy. I'm really puzzled how this query as currently specified could be useful.

--
greg
Re: [SQL] summing tables
"Viorel Dragomir" <[EMAIL PROTECTED]> writes: > Anyway, in real life this update modifies only one row with a value wich is > diff of null. It was really handy if it was specified the option ORDER for > the update command. Are you hoping to produce a running total? That's very difficult in standard SQL. That would be very different from the query you asked for. Running totals, ranking, lead/lag, are all things that are very difficult to do in standard SQL. They don't fit in the unordered set model that SQL follows so doing them without special non-standard functions is very hard and inefficient. The functions to do them don't fit well within the SQL universe either, which might be why they don't exist yet in postgres. -- greg ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] Recursive request ...
Joe Conway <[EMAIL PROTECTED]> writes:

> BenLaKnet wrote:
> > I see connect by in Oracle
> > ??? is there an equivalent in PostgreSQL or not ??
>
> Someone is working on the SQL99 equivalent, but it isn't done yet.
> Perhaps for 7.5.

There's a connectby hack in the contrib/tablefunc directory. I haven't used it so I'm not clear on how powerful it is compared to the real deal, but people on one of the pgsql lists seemed to find it useful when it came up in the past.

connectby(text relname, text keyid_fld, text parent_keyid_fld,
          text start_with, int max_depth [, text branch_delim])
  - returns keyid, parent_keyid, level, and an optional branch string
  - requires anonymous composite type syntax in the FROM clause. See
    the instructions in the documentation below.

--
greg
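A hedged usage sketch, assuming a table like tree(id integer, parent_id integer) similar to the one earlier in this collection; the column-definition list after AS is the "anonymous composite type syntax" the notes refer to, '1' is the starting key, and a max_depth of 0 means no depth limit:

SELECT *
  FROM connectby('tree', 'id', 'parent_id', '1', 0)
    AS t(keyid integer, parent_keyid integer, level integer);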
Re: [SQL] Table Partitioning and Rules
"Girish Bajaj" <[EMAIL PROTECTED]> writes: > I cant possibly index all the cols in the table. So I thought Id best manage > the data by splitting up the table into multiple partitions and eventually > depending on application logic, only scan those tables that are necessary to > scan sequentially instead of the whole big table. But that's only going to help if one of the columns they're searching on is the last name column isn't it? I'm a fan of partitioned tables but you have to consider what advantage you're trying to achieve to understand if it will actually be helpful for you: Partitioned tables only really improve query performance if virtually all queries use a common constraint. The canonical example is accounting tables being partitioned based on fiscal year. Virtually all the queries--even ones doing massive batch queries best served by sequential scans--will only scan the current fiscal year. In your case unless you can impose a constraint on the UI that users always perform their queries on a single letter of the alphabet at a time and only see results for people whose last names match that letter, it's not really a great match as far as query performance. The other advantage of partitioned tables is space management; it allows placing each partition on a separate physical storage space. However without native support in Postgres doing it via rules is going to be a headache. I would think you would be better off striping the disks together and storing it as a single large table. That's the only clean approach Postgres really allows at this point anyways. Finally, if I WAS going to partition based on the first letter of a text string, which I doubt I would, I would probably create 26 partitions right off the bat. Not try to make up arbitrary break points. If those arbitrary breakpoints turn out to be poorly chosen it'll be a complex manual job to move them. Whereas if you just have 26 partitions some will be large and some small and you can move partitions between physical storage freely to balance things. -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] casting to arrays
Joe Conway <[EMAIL PROTECTED]> writes:

> Not possible in current releases, but it will be in 7.4 (about to start
> beta). It looks like this:

Well, there is the int_array_aggregate function in the contrib/intagg directory. It has to be compiled separately, and it has a few quirks (like the arrays being zero-based instead of 1-based), but it works more or less, and it does exactly what you describe.

But the 7.4 stuff should be much cleaner and more flexible, so if you don't need it right now you're better off waiting.

--
greg
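For reference, the 7.4 idiom alluded to is a user-defined aggregate over the new polymorphic array support; this sketch follows the example in the 7.4 documentation (the grouping table and columns are hypothetical):

CREATE AGGREGATE array_accum (
  BASETYPE = anyelement,
  SFUNC = array_append,
  STYPE = anyarray,
  INITCOND = '{}'
);

SELECT some_key, array_accum(some_val)
  FROM some_table
 GROUP BY some_key;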
Re: [SQL] min() and NaN
Tom Lane <[EMAIL PROTECTED]> writes:

> NULL can be special, because it acts specially in comparisons anyway.
> But NaN is just a value of the datatype.

Does postgres intend to support all the different types of NaN? Do you intend to have +Inf and -Inf and underflow detection and all the other goodies you actually need to make it useful? If not, it seems more useful to just use the handy unknown-value thing SQL already has and turn NaN into a NULL.

--
greg
Re: [SQL] numeric and float converts to int differently?
"SZÛCS Gábor" <[EMAIL PROTECTED]> writes: > > cannot see is that the float values are not actually exactly 0.5 > > Yes I could guess that (floating point vs fixed), but is this a coincidence > that both '0.5'::float and '-0.5'::float are closer to 0, whereas they could > be closer to +/-1, as well as both closer to the lower or upper bound. Wouldn't 0.5 and -0.5 be representable exactly as floats? -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] Table versions
Tom Lane <[EMAIL PROTECTED]> writes:

> Returning to the original problem, it seems to me that comparing "pg_dump
> -s" output is a reasonable way to proceed.

I've actually started checking a pg_dump -s output file into my CVS tree. However, I prune a few key lines from it: the TOC OID numbers, and anything not owned by the user I'm interested in. The makefile rule I use looks like:

schema.sql:
	pg_dump -U postgres -s user | sed '/^-- TOC entry/d;/^\\connect - postgres/,/^\\connect - user/d;/^SET search_path/d;/^$$/d;/^--$$/d' > $@

This still suffers from one major deficiency. The order that objects are output in isn't necessarily consistent between databases. If I add tables to the development server but then add them to the production server in a different order, the schema still shows differences even though the objects in the two databases are identical.

--
greg
Re: [SQL] Getting last insert value
Guillaume LELARGE <[EMAIL PROTECTED]> writes:

> Doing a "select currval() from my_table" after your insert should work.

That's "select currval('my_table_pkcol_seq')" actually. The above would have called the currval() function for every record of the table, which isn't what you want, and in any case currval takes an argument.

--
greg
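A quick sketch of the whole pattern, with hypothetical names (a table my_table whose serial primary key pkcol is backed by the sequence my_table_pkcol_seq); currval is local to the session, so concurrent inserts on other connections do not affect the value returned:

INSERT INTO my_table (name) VALUES ('example');
SELECT currval('my_table_pkcol_seq');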
Re: [SQL] Programatically switching database
ow <[EMAIL PROTECTED]> writes:

> My concern though ... wouldn't pgSql server collapse when faced with a
> transaction spanning 100M+ records?

The number of records involved really doesn't faze Postgres at all. However, the amount of time spent in the transaction could be an issue if there is other activity in other schemas of the same database.

As long as the transaction is running, none of the deleted or old updated data in any schema of the database can be cleaned up by vacuum, as postgres thinks the big transaction "might" need to see it sometime. So if the rest of the database is still active, the tables and indexes being updated may grow larger than normal. If it goes on for a _really_ long time, they might need a VACUUM FULL at some point to clean them up.

--
greg
Re: [SQL] Expressional Indexes
"Randolf Richardson, DevNet SysOp 29" <[EMAIL PROTECTED]> writes: > For example, if I want to index on a date field but only have the index > keep track of the most recent 30 days (and then create a secondary index for > all dates) so as to improve performance on more heavily loaded systems. > > Am I understanding this new terminology correctly? Thanks in advance. No, you could do the above using "partial indexes" but it wouldn't work very well in this case because the "last 30 days" keeps moving and you would have to keep redefining the index periodically. It also wouldn't really help performance. Expression Indexes are just more powerful "functional indexes". In 7.3 they could be used for indexing expressions like "lower(foo)". In 7.4 they're more powerful and you can index expressions other than simple function calls. They still should be things that always return the same value, which excludes subqueries. -- greg ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Expressional Indexes
Tom Lane <[EMAIL PROTECTED]> writes:

> No, because the above represents a moving cutoff; it will (and should)
> be rejected as a non-immutable predicate condition. You could do
> something like
>
>   CREATE INDEX my_Nov_03_index ON my_table (create_date)
>     WHERE (create_date >= date '2003-11-01');
>
> and then a month from now replace this with
>
>   CREATE INDEX my_Dec_03_index ON my_table (create_date)
>     WHERE (create_date >= date '2003-12-01');
>
> bearing in mind that this index can be used with queries that contain
> WHERE conditions like "create_date >= some-date-constant". The planner
> must be able to convince itself that the right-hand side of the WHERE
> condition is >= the cutoff in the index's predicate condition. Since
> the planner is not very bright, both items had better be simple DATE
> constants, or it won't be able to figure it out ...

Note that if you're just doing this to speed up regular queries where you have create_date in some small range, then you'll likely not see much of an increase. Mainly you'll just save space.

What can be interesting is to create a partial index like this but over a second, unrelated column. Something like:

CREATE INDEX my_dec_03_index ON my_table (userid)
  WHERE (create_date >= date '2003-11-02');

Then you can do queries like:

SELECT * FROM my_table
 WHERE userid = ?
   AND create_date >= date '2003-11-02';

And it'll be able to efficiently pull out just those records, even if there are thousands more records that are older than 2003-11-02. This avoids having to create a two-column index with a low-selectivity column like "month".

--
greg
Re: [SQL] Scaler forms as function arguments
Joe Conway <[EMAIL PROTECTED]> writes:

> In 7.4 you could use an array. It would look like this:

Though note that 7.4 doesn't know how to optimize this form:

db=> explain select * from foo where foo_id in (1,2);

                                 QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using foo_pkey, foo_pkey on foo  (cost=0.00..6.05 rows=2 width=756)
   Index Cond: ((foo_id = 1) OR (foo_id = 2))
(2 rows)

db=> explain select * from foo where foo_id = ANY (array[1,2]);

                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..1132.82 rows=5955 width=756)
   Filter: (foo_id = ANY ('{1,2}'::integer[]))
(2 rows)

--
greg
Re: [SQL] OFFSET and subselects
[EMAIL PROTECTED] (Dmitri Bichko) writes:

> I am running into trouble with pagination here, somehow (rather naively) I
> assumed that when doing a LIMIT and OFFSET, the subselects on the records
> before the OFFSET would not be performed, which quite apparently is not the
> case. So, LIMIT 50 OFFSET 0 takes 100ms to run, LIMIT 50 OFFSET 50 takes
> 200ms, LIMIT 50 OFFSET 100 takes 300ms; and so forth, this really becomes
> unacceptable after a few pages.

If you don't need any of the results of the subqueries in your WHERE clause, then you can do this by introducing a view in your query, with the LIMIT and OFFSET applied inside the FROM clause so the select-list subqueries only run for the rows actually returned:

SELECT *,
       (SELECT ...) AS sub_1,
       (SELECT ...) AS sub_2,
       (SELECT ...) AS sub_3
  FROM (
        SELECT x,y,z FROM ... WHERE ...
        LIMIT 50 OFFSET 50
       ) AS sub;

If you do use the results of the subqueries in your where clause or order by clause then, well, you're SOL, since the OFFSET and LIMIT clauses only kick in after the where clause restrictions are taken into account.

--
greg
Re: [SQL] Index not recognized
"Grace C. Unson" <[EMAIL PROTECTED]> writes: > Why is it that my index for text[] data type is not recognized by the > Planner? > > I did these steps: > > 1. create function textarr(text[]) returns text language sql as 'select > $1[1]' strict immutable > 2. create index org_idx on EmpData (textarr(org)); This index will only be used if you use the expression textarr(org) in your query. You would probably have some success if you did: select * from empdata where textarr(org) = 'math' > 3. vacuum full > 4. explain analyze select name from EmpData where org *= 'math'; Is this *= operator from the contrib/array directory? It's not an indexable operator at all using standard btree indexes. The GiST indexing does make indexable operators that can do things like *= but that's a whole other ball of wax. What are you really trying to do? > Result: > = > Seq Scan on EmpData (cost=0.00..3193.20 rows=102 width=488) > (actual time=3.71.35..371.35 rows=0 loops=1) > > Filter: (org[0]='math'::text) Well that's awfully odd. I don't know how that expression came out of the query you gave. You'll have to give a lot more information about how you're defining *= and why you think it's related to the function you used to define the index. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] How can I get the last element out of GROUP BY sets?
Tom Lane <[EMAIL PROTECTED]> writes:

> Robert Creager <[EMAIL PROTECTED]> writes:
> > ... one piece of data I need is the last value for each GROUP BY
> > period. Alas, I cannot figure out how to do this.
>
> SELECT DISTINCT ON (rather than GROUP BY) could get this done for you.

Or if you need to combine this with other aggregate functions like sum, count, etc:

CREATE FUNCTION first_accum (integer, integer) RETURNS integer
  AS 'select coalesce($1,$2)' LANGUAGE sql;
CREATE FUNCTION last_accum (integer, integer) RETURNS integer
  AS 'select $2' LANGUAGE sql;

CREATE AGGREGATE first (BASETYPE = integer, SFUNC = first_accum, STYPE = integer);
CREATE AGGREGATE last  (BASETYPE = integer, SFUNC = last_accum,  STYPE = integer);

Then you can do first() and last(). These definitions only work for integer, but you can pattern match for other datatypes. You might be able to get a universal function working using anyelement now; I haven't tried.

--
greg
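An untested sketch of the polymorphic variant mentioned at the end, using the anyelement pseudo-type so a single definition covers all datatypes (this assumes 7.4 or later):

CREATE FUNCTION first_accum (anyelement, anyelement) RETURNS anyelement
  AS 'select coalesce($1,$2)' LANGUAGE sql;
CREATE FUNCTION last_accum (anyelement, anyelement) RETURNS anyelement
  AS 'select $2' LANGUAGE sql;

CREATE AGGREGATE first (BASETYPE = anyelement, SFUNC = first_accum, STYPE = anyelement);
CREATE AGGREGATE last  (BASETYPE = anyelement, SFUNC = last_accum,  STYPE = anyelement);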
Re: [SQL] limit 1 and functional indexes
Bruno Wolff III <[EMAIL PROTECTED]> writes:

> QUERY PLAN
>
> Sort (cost=11824.16..11831.55 rows=2956 width=33)
>      (actual time=248.17..248.17 rows=0 loops=1)
>   Sort Key: order_date
>   -> Index Scan using transactions_pop_i on transactions
>      (cost=0.00..11653.79 rows=2956 width=33)
>      (actual time=126.13..126.13 rows=0 loops=1)
>      Index Cond: (upper((pop)::text) = '79BCDC8A4A4F99E7C111'::text)
> Total runtime: 248.25 msec

Yeah, the problem with functional indexes is that the optimizer doesn't have any clue how the records are distributed, since it only has statistics for columns, not your expression. Notice it's estimating 2956 rows where in fact there are 0. I think someone was actually working on this, so it may be improved in 7.5, but I'm not sure.

Given the type of data you're storing, which looks like hex strings, are you sure you need to do a case-insensitive search here? Can't you just uppercase it when you store it?

The other option would be to use a subquery and force the planner not to pull it up, something like:

select code
  from (
    select code, order_date
      from transactions
     where UPPER(pop) = UPPER('79bcdc8a4a4f99e7c111')
    offset 0
  ) as foo
 order by order_date DESC;

The offset 0 prevents the optimizer from pulling the subquery into the outer query. I think this will prevent it from even considering the order_date index scan, but you'll have to try to be sure.

--
greg
Re: [SQL] limit 1 and functional indexes: SOLVED
"Alexandra Birch" <[EMAIL PROTECTED]> writes: > It works perfectly - thanks a million! > Strangely the offset 0 does not seem to make any difference. > Gotta read up more about subqueries :) > > explain analyze > select code,order_date >from ( > select code, order_date >from transactions > where UPPER(pop) = UPPER('c892eb2f877e3a28ddc8e196cd5a8aad') > limit 1 > ) as foo > order by order_date DESC; I think what you're trying to do here is get the last order? Then you'll want the limit to be on the outer query where it's ordered by order_date: select code,order_date from ( select code, order_date from transactions where UPPER(pop) = UPPER('c892eb2f877e3a28ddc8e196cd5a8aad') offset 0 ) as foo order by order_date DESC; limit 1 Note that in theory postgres should be able to find the same plan for this query as yours since it's equivalent. It really ought to use the order_date index since it thinks it would be more efficient. However it's unable to because postgres doesn't try every possible index, only the ones that look like they'll be useful for a where clause or an order by. And the order by on the outer query isn't considered when it's looking at the subquery. It normally handles this case by merging the subquery into the outer query, but it can't do that if there's a limit or offset. So an "offset 0" is convenient for fooling it into thinking the subquery can't be pulled up without actually changing the output. You could do "order by upper(pop)" instead which might be clearer for someone reading the query in that it makes it look like you're trying to encourage it to use the index on upper(pop). In theory "order by"s on subqueries are useless and postgres could ignore them, but it doesn't. -- greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] Index not used - now me
Christoph Haller <[EMAIL PROTECTED]> writes:

> Why appear? If the Index Scan has a Total runtime: 2.46 msec and the Seq
> Scan a Total runtime: 46.19 msec, then the Index Scan is much faster.
> Or am I completely off the track reading the explain analyze output?

To estimate the relative costs of a sequential scan and an index scan, Postgres has to take into account the likelihood that the blocks needed will be in the disk cache. In your example your database is otherwise idle and the table is small enough that the entire index is probably in cache. This means that the random access pattern of the index isn't really hurting the index scan at all. Whereas in a busy database with less available RAM, the random access pattern makes a big difference.

You could try raising effective_cache_size to give postgres a better chance at guessing that all the blocks will be in cache. But that may no longer be true when the query is run on a busy database.

You could also try lowering random_page_cost. Some people find as low as 1.2 or so to be useful, but that would almost certainly be lying to postgres about the costs of random access and would cause it to use index scans aggressively even when they're not faster.

--
greg
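Both settings can be tried per-session before touching postgresql.conf; a sketch (the numbers are illustrative, not recommendations):

SET effective_cache_size = 20000;  -- measured in 8 kB pages, so roughly 160 MB
SET random_page_cost = 2;
EXPLAIN ANALYZE SELECT ...;        -- re-run the query in question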
Re: [SQL] determining how many products are in how many categories
You might be better off with something more like this:

SELECT COUNT(P.*) AS p_count, P.p_product_category_id
  FROM p_product P
 WHERE P.p_product_category_id IN
       (SELECT DISTINCT id FROM p_product_category)
 GROUP BY P.p_product_category_id;

obviously tailored to YOUR schema... not mine...

--
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"David Garamond" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]

> # product table (simplified):
> create table p (
>   id char(22) not null primary key,
>   name text,
>   desc text
> );
>
> # product category table (simpl.):
> create table pc (
>   id char(22) not null primary key,
>   name text,
>   desc text
> );
>
> # table that maps products into categories:
> create table p_pc (
>   id char(22) not null primary key,
>   pid char(22) not null references p(id),
>   pcid char(22) not null references pc(id)
> );
> create index i_ppc_pid on p_pc(pid);
> create index i_ppc_pcid on p_pc(pcid);
> create unique index i_ppc_pid_pcid on p_pc(pid, pcid);
>
> There are about 50k rows in p, 40k in pc, and 140k in p_pc. Most
> products are only placed in 1 category, some in 2, fewer in 3, and
> fewer still in 4, 5, 6 categories.
>
> I want to know how many products are placed in 1 category, how many in
> 2, and so on. Here's my query:
>
> select count(pid), num_cat from (
>   select pid, count(pcid) as num_cat
>     from p_pc group by pid) as f
> group by num_cat;
>
> A sample output (4000 products are categorized in 5 different places,
> 4998 in 4, and so on):
>
>  count | num_cat
> -------+---------
>   4000 |       5
>   4998 |       4
>   7502 |       3
>  10001 |       2
>  17499 |       1
> (5 rows)
>
> However, this query is taking about 2.5 minutes. Any idea on how to
> make it faster?
>
> --
> dave
Re: [SQL] Unique Constraint with foreign Key
Please also note that the referenced column in the foreign table either needs
to be the PRIMARY KEY or have a unique constraint on it, or maybe it just
requires an index on it -- I'm not sure, but I discovered that if the column
in the foreign table (containing the REFERENCED key...) is NOT the primary
key column -- the REFERENCES a(x) will fail unless a.x is specified as
'UNIQUE' -- as in the following example:

create table a (
    y integer not null primary key default nextval('nexta_seq'),
    x varchar not null UNIQUE
);

create table b (
    z integer not null PRIMARY KEY default nextval('nextbz_seq'),
    x varchar NOT NULL REFERENCES a(x)
);

-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"Bruno Wolff III" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> On Tue, Feb 10, 2004 at 12:03:36 -0500,
>   [EMAIL PROTECTED] wrote:
> > Hi all i am using postgres and torque
> > I have a column x in Table A of type varchar and NOT NULL. I want to
> > impose a unique key constraint on that table and column y is the primary
> > key.
> > I have another Table B with column x. can i make this column x as a
> > foreign key on column x in table A.
>
> Yes. The reference will need to specify column x, since the primary key
> column (y) will be the default.
>
> > or Can i have varchar types of size 50 as primary keys in Postgres.
>
> Yes. But unless the 50 character limit comes from a business rule, you
> might want to use the type 'text'.
>
> ---(end of broadcast)---
> TIP 7: don't forget to increase your free space map settings
>

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] nextval problem
New feature in postgreSQL 7.3 & later -- sequences are automatically dropped
when the owning table is dropped, so you need to recreate the sequence. The
easiest way is to use a serial column in your CREATE TABLE:

CREATE TABLE tblName (
    id serial PRIMARY KEY,
    blah...,
    blah...
) WITH OIDS;

-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"Ivo Anon" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> I'm new to postgresql and I'm having problems with the 'nextval' function
> (or whatever it is called in postgresql).
>
> I created a table (called ADDRESS) using 'pgadmin' which has a serial field
> (called addressid). When I tried the sql statement:
>
> select nextval('public."ADDRESS_addressid_seq"')
>
> everything seemed to be working fine.
>
> I then used the create table statement (displayed in pgadmin when the table
> 'ADDRESS' is selected) to create the same table (of course after first
> dropping 'ADDRESS'). When I try the same select-statement I suddenly got an
> error message saying:
>
> ERROR: Relation "public"."ADDRESS_addressid_seq" does not exist
>
> Can anybody help me with this problem?

---(end of broadcast)---
TIP 8: explain analyze is your friend
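Concretely, for the original ADDRESS example the recreated table might look
like this (columns other than addressid are invented):

  CREATE TABLE "ADDRESS" (
      addressid serial PRIMARY KEY,
      street    text
  );

  -- serial creates and owns "ADDRESS_addressid_seq", so this works again:
  SELECT nextval('public."ADDRESS_addressid_seq"');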
Re: [SQL] Unique Constraint with foreign Key
You've got it wrong when you reference the data column (a.x) -- your foreign
key should reference the primary key in the referenced table (a.y)...

Besides, in your table A --

1, 99
2, 99

violates your unique constraint on column 'X' -- it would never happen...

What I suggested is like this:

create table a (
    y integer not null primary key default nextval('nexta_seq'),
    x varchar not null UNIQUE
);

create table b (
    z integer not null PRIMARY KEY default nextval('nextbz_seq'),
    x integer NOT NULL REFERENCES a(y)
);

Table A would have

1, 99
2, 99
..
99, 99

and table B would have

1, 1
1, 2
..
1, 99

Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID. 83835
(208) 762-0762
Send replies to: [EMAIL PROTECTED]
Website: http://www.left-center.com

-Original Message-
From: Jan Wieck [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 17, 2004 6:42 AM
To: Greg Patnude
Cc: [EMAIL PROTECTED]
Subject: Re: [SQL] Unique Constraint with foreign Key

Greg Patnude wrote:

> Please also note that the referenced column in the foreign table either needs
> to be the PRIMARY KEY or have a unique constraint on it or maybe it just
> requires an index on it -- I'm not sure but I discovered that if the column
> in the foreign table (containing the REFERENCED key...) is NOT the primary
> key column -- the REFERENCES a(x) will fail unless a.x is specified as
> 'UNIQUE' -- as in the following example:

This is according to the SQL specification, which doesn't like doubts.
Imagine rows (1, 99), (2, 99) in table a and row (3, 99) in table b. Which of
the a-rows is now referenced and am I allowed to delete the other? There are
good arguments either way, but if you require a UNIQUE on a.x, then this
question will never come up.

Jan

> create table a (
>     y integer not null primary key default nextval('nexta_seq'),
>     x varchar not null UNIQUE
> );
>
> create table b (
>     z integer not null PRIMARY KEY default nextval('nextbz_seq'),
>     x varchar NOT NULL REFERENCES a(x)
> );

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 8: explain analyze is your friend
[SQL] User defined types -- Social Security number...
Anyone have a good pre-built user-defined type definition for creating /
maintaining / manipulating a SSN ... where valid chars are in the range
000-00-0000 through 999-99-9999.

I imagine that the storage column is probably varchar(11) -- I am looking
for a type definition that

1) checks that all numbers are in the range [0..9]
2) inserts the dashes at the appropriate positions
3) accepts either '000123456' OR '000-12-3456' as the input
4) leading zeroes (as in the above) must be stored -- hence varchar(11)
   instead of numeric
5) always outputs in the format '000-12-3456'

Either that or the question is: How can I coerce postgreSQL into using an
input / output "mask"...

-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [SQL] User defined types -- Social Security number...
Thanks Josh --

I understand that there are valid and invalid SSN's -- similar rules apply to
zip codes and area codes...

I tried this:

SELECT to_char(123456789, '000-00-0000');

which yields 123-45-6789 -- nicely, I might add... the trick is getting
postgreSQL to do this without having to create an ON SELECT and ON UPDATE
TRIGGER... an ON UPDATE, SELECT, and INSERT re-write RULE might do the
trick...

SSN's CAN in fact start with a leading 0 -- mine does -- "041-xx-xxxx" -- I
do agree that there are valid ranges -- my main concern is being able to
store any leading zeros - I just need to make sure that something "looks"
like a valid SSN in the formatting (nnn-nn-nnnn) and that I can store /
retrieve it with the appropriate format -- what I am really trying to
accomplish is an "input mask"...

I hadn't considered using a Domain -- I'll have to look at that

-- 
Greg Patnude / The Digital Demention
2916 East Upper Hayden Lake Road
Hayden Lake, ID 83835
(208) 762-0762

"Josh Berkus" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Greg,
>
> > Anyone have a good pre-built user-defined type definition for creating /
> > maintaining / manipulating a SSN ... where valid chars are in the range
> > 000-00-0000 through 999-99-9999.
>
> Actually, the range is more narrowly defined than that. I'm not sure of the
> exact rules, but you will never see a leading 0 or a -00- in an SSN.
>
> > I imagine that the storage column is probably varchar(11) -- I am looking
> > for a type definition that
>
> Use DOMAINs, not a custom type. It's less work.
>
> > Either that or the question is: How can I coerce postgreSQL into using an
> > input / output "mask"...
>
> After you've created your DOMAIN, based on the TEXT type, you can overload the
> input and output functions to format correctly. Beware, though: input &
> output functions pretty much have to be written in C.
>
> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>
> ---(end of broadcast)---
> TIP 8: explain analyze is your friend
>

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
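A rough sketch of the DOMAIN route Josh suggests, staying in plain SQL rather
than C I/O functions (the domain name and regexp are mine, and CHECK
constraints on domains need 7.4 or later):

  CREATE DOMAIN ssn AS varchar(11)
      CHECK (VALUE ~ '^[0-9]{3}-[0-9]{2}-[0-9]{4}$');

  CREATE TABLE person (
      id  serial PRIMARY KEY,
      ssn ssn NOT NULL
  );

  INSERT INTO person (ssn) VALUES ('041-12-3456');  -- accepted
  INSERT INTO person (ssn) VALUES ('041123456');    -- rejected by the CHECK

  -- normalizing unformatted digits on the way in; the ltrim strips the
  -- leading sign space that to_char emits:
  INSERT INTO person (ssn) VALUES (ltrim(to_char(41123456, '000-00-0000')));

This stores and outputs the dashed format, but it is not a true input mask;
that is where the C input function (or a rule/trigger) would come in.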
Re: [SQL] Slow sub-selects, max and count(*)
Josh Berkus <[EMAIL PROTECTED]> writes: > Max() and Count() cannot use indexes for technical reasons. Browse through > the archives of SQL, PERFORM, and HACKERS for about 12,000 discussions on the > subject. Please don't confuse the issue by throwing Max() and Count() into the same basket. The issue with Min()/Max() is that the result could be generated efficiently by scanning indexes but it's just hard, especially given generic aggregate functions, and the work simply hasn't been done, or even started, yet. The issue with Count() is that people want the result to be cached in a single per-table counter, but that can't be done as simply as that because of transactions. People have discussed complex solutions to this but it's a much more complex problem than it appears. They're really two entirely separate issues. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
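For the Min()/Max() half there is a well-known workaround: rewrite the
aggregate as an ORDER BY ... LIMIT query, which can use an index on the
column:

  -- instead of SELECT max(col) FROM tab:
  SELECT col FROM tab ORDER BY col DESC LIMIT 1;

  -- instead of SELECT min(col) FROM tab:
  SELECT col FROM tab ORDER BY col ASC LIMIT 1;

One caveat: max() ignores NULLs, but NULLs sort as if larger than every other
value, so on a nullable column add WHERE col IS NOT NULL to the DESC form.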
Re: [SQL] SQL challenge--top 10 for each key value?
Jeff Boes <[EMAIL PROTECTED]> writes:

> I headed off in the direction of groups of SELECTs and UNIONs, and quit when I
> got to something like four levels of "SELECT ... AS FOO" ...

four? wimp, that's nothing!

ok, seriously I think there's no way to do this directly with straight SQL.
You would have to define a non-immutable function that has some temporary
storage where it keeps track of how many it has seen.

The generic function that would help here would be some kind of rank(value)
that would give you the equivalent of rownum except with a level break every
time value changes. I've been hoping to see something like this on the list
for a long time but haven't yet.

If the value of n is constant and small you could cheat with an aggregate
function with an array of the top n values.

db=> create function first3_accum(integer[],integer) returns integer[]
     as 'select case when array_upper($1,1) >= 3 then $1 else array_append($1,$2) end'
     language sql strict immutable;
CREATE FUNCTION
db=> create aggregate first3 (basetype = integer, sfunc = first3_accum, stype = integer[], initcond = '{}');
CREATE AGGREGATE

then something like:

SELECT query, first3(id)
  FROM (SELECT query, id
          FROM my_table
         ORDER BY query, CASE WHEN include THEN 1 ELSE 2 END ASC, score DESC) as x
 GROUP BY query

But then you'll have to go back to the table to refetch the original records
that you've found. The best way I find to do that is with the
int_array_enum() function from the int_aggregate contrib module.

SELECT *
  FROM my_table
 WHERE id IN (
   SELECT int_array_enum(f3)
     FROM (
       SELECT first3(id) as f3
         FROM (SELECT query, id
                 FROM my_table
                ORDER BY query, CASE WHEN include THEN 1 ELSE 2 END ASC, score DESC) as x
        GROUP BY query
     ) as x
 )

This last step is kind of annoying since you've already seen all those
records. And it requires writing a new aggregate function every time the
value of n changes though, which kind of sucks.

In theory if the new work in 7.5 handling structured datatypes is as cool as
it sounds you could have an array of complete records and when UNNEST is
eventually incorporated into the array code then you could expand those
instead of using the int_array_enum function. Neither of those things are
ready yet as far as I know though.

-- 
greg

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] SQL challenge--top 10 for each key value?
Josh Berkus <[EMAIL PROTECTED]> writes: > Rod, > > > Something along the lines of the below would accomplish what you want > > according to spec. ROW_NUMBER() is a spec defined function. (6.10 of > > SQL200N) > > Great leaping little gods! They added something called "row number" to the > spec? > > Boy howdy, folks were right ... the ANSI committee really has completly blown > off the relational model completely. If it's like Oracle's rownum then it's the row number of the *output*, not the position on disk. So it's not entirely blowing off the relational model any more than ORDER BY does. The weird thing is the number of cases where you want ORDER BY or rownum inside subselects. Which the solution to the original question needed. > When a standards committee becomes hostage to a handful of vendors, kiss > real standards goodbye. In the case of SQL was there ever any pretension otherwise? Was the SQL standard ever really useful as a "real standard"? I can write useful ANSI C89 code that will compile and work on any C compiler. Trying to write portable SQL92 code that does any useful work is about as productive as stapling bagels to your forehead. -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] SQL challenge--top 10 for each key value?
elein <[EMAIL PROTECTED]> writes:

> create or replace function pycounter(integer)
> returns integer as
> '
>    if args[0] == 0:
>       SD["nextno"] = 1
>       return SD["nextno"]
>    try:
>       SD["nextno"] += 1
>    except:
>       SD["nextno"] = 1
>    return SD["nextno"]
> ' language 'plpythonu';
>
> And clearly it can be done faster as a little
> C function.

Does this approach have a hope of working if it's used twice in the same
query?

-- 
greg

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
Re: [SQL] trigger/for key help
Tom Lane <[EMAIL PROTECTED]> writes: > Bret Hughes <[EMAIL PROTECTED]> writes: > > FWIW I tried to use alter table but could never get the parser to accept > > $1 as a constraint name. I used single and double quotes as well as a > > lame attempt \$1. > > Hm, "$1" works for me ... Hm, this reminds me. When I was first learning this stuff I was stymied by the same issue. It took me quite a while to figure out how to drop constraints because of the quoting issue. Of course now it seems obvious, but for someone just starting it adds another roadblock. Is there a reason postgres goes out of its way to pick names that will be harder to work with than necessary? Or is it considered a good thing on the theory that if it's hard to reference it's also hard to accidentally use such names in conflicting ways? Perhaps names like _1 _2 ... would be easier to handle? Or perhaps making $ not require quoting would be helpful? -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
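For the archives, the incantation that works, with the double quotes that are
easy to miss (the table name is invented):

  ALTER TABLE mytable DROP CONSTRAINT "$1";

Without the quotes the parser rejects the $ as the start of an identifier,
which is the roadblock described above.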
Re: [SQL] trigger/for key help
Tom Lane <[EMAIL PROTECTED]> writes: > Greg Stark <[EMAIL PROTECTED]> writes: > > Is there a reason postgres goes out of its way to pick names that > > will be harder to work with than necessary? > > If we use ordinary identifiers for system-generated names then we will > be infringing on user name space --- ie, there's a potential for > conflict. I suppose we could use long randomly-generated names like > ewjncm343cnlen, but are those really easier to work with? I don't see an unseverable link between "user name space" and "identifiers that don't need to be quoted". Mixed case names for instance seem like perfectly good user name space identifiers. Postgres could just as easily say "the system reserves all identifiers starting with $" and still not require quoting $. > I think a more useful approach is to treat it as a documentation > problem. Perhaps an example in the ALTER TABLE man page would help. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] Concatenate results of a single column query
Marco Lazzeri <[EMAIL PROTECTED]> writes: > SELECT > p.name, ARRAY(SELECT nicknames FROM people WHERE people.id = p.id) > FROM people AS p > Any suggestions? Something like: db=> create aggregate array_aggregate (basetype = integer, sfunc = array_append, stype = integer[], initcond = '{}'); CREATE AGGREGATE db=> select array_aggregate(id) from tab; array_aggregate {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,28,29,30,31,32,33,34,36,37,38,39,40,41,42,43,27,26,44,45,46,47,48,49,50,51,52,53,54,35} (1 row) -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] Concatenate results of a single column query
Christoph Haller <[EMAIL PROTECTED]> writes: > Interesting feature, but I cannot find function array_append: > ERROR: AggregateCreate: function array_append(integer[], integer) does not exist It's new in Postgres 7.4 I think you could do this in 7.3 though, it would just be more awkward. Try || but I think that's new in 7.4 as well. Otherwise I think you would have to pick out the upper bound of the array with array_dims and set the upper+1'th element of the array. If you're doing text you may want to go directly to a textual concatenation like: CREATE FUNCTION concat_agg_accum(text, text) RETURNS text AS 'select $1 || '', '' || $2' LANGUAGE sql IMMUTABLE STRICT; CREATE AGGREGATE concat_agg ( BASETYPE = text, SFUNC = concat_agg_accum, STYPE = text ); -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [SQL] isolation level
Bruno Wolff III <[EMAIL PROTECTED]> writes:

> On Thu, May 13, 2004 at 18:13:23 +0000,
>   Jaime Casanova <[EMAIL PROTECTED]> wrote:
> > Hi all, is there a way to set the isolation level to something like the
> > sql standard dirty read.
>
> No. There will be a way to use the standard name in a SET command, but
> you will actually get READ COMMITTED isolation (which is the next step up).

I wonder how hard this would be to implement. It doesn't seem like it should
be very hard. It would be very convenient for debugging and for checking on
the progress of large batch updates or loads.

-- 
greg

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] Multiple outer join on same table
Marco Lazzeri <[EMAIL PROTECTED]> writes:

> Hi!
> I'm searching a better (quicker) way to retrieve data as I used to do
> using the following query...
>
> ==
>
> SELECT main.codice,
>        other.value AS value_one,
>        other.value AS value_two
>   FROM main
>   LEFT OUTER JOIN other ON (main.id = other.id_main)
>   LEFT OUTER JOIN other AS other2 ON (main.id = other2.id_main)
>  WHERE other.type = 'type_one'
>    AND other2.type = 'type_two'
> ;

a) you're better off sending the actual query rather than retyping it. I
assume you made a typo in the select column list and it should be
"other2.value AS value_two"? Also the parentheses are required on the ON
clause.

b) The WHERE clause will effectively make this a plain inner join, not an
outer join at all. Since any values that aren't found would have a NULL type
column and cause the row to not be selected.

I think the query you meant to write would be

SELECT codice,
       other1.value AS value_one,
       other2.value AS value_two
  FROM main
  LEFT OUTER JOIN other AS other1 ON (main.id = other1.id_main AND other1.type = 'type_one')
  LEFT OUTER JOIN other AS other2 ON (main.id = other2.id_main AND other2.type = 'type_two')

Another way to write this query that might be faster or might not depending
would be:

SELECT codice,
       (SELECT value FROM other WHERE id_main = id AND type = 'type_one') AS value_one,
       (SELECT value FROM other WHERE id_main = id AND type = 'type_two') AS value_two
  FROM main

In theory the two queries really ought to always result in the same plan
because they're equivalent. However the Postgres optimizer as clever as it is
is incapable of seeing this. The first form with the outer join leaves the
optimizer with a lot more flexibility though, including at least one plan
that is effectively identical to what the optimizer is forced to do for the
second query. So really the first one should be no worse than the second.

If you find the second faster (or if they're both still slow) you might
consider posting explain analyze output for both queries. It may be that you
have other issues preventing the optimizer from finding a good plan. You have
run analyze on these tables recently? And you vacuum regularly?

And for the second query you would really want an index on other.id_main too.
For the first one it would depend on the data in the two tables.

-- 
greg

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [SQL] Selecting "sample" data from large tables.
> Joseph Turner <[EMAIL PROTECTED]> writes:
> > I have a table with a decent number of rows (let's say for example a
> > billion rows). I am trying to construct a graph that displays the
> > distribution of that data. However, I don't want to read in the
> > complete data set (as reading a billion rows would take a while). Can
> > anyone think of a way to do this in postgresql?

One way would be to have an indexed column with random values in it. Then you
could use an index scan to pull out samples.

However this has a few downsides.

a) index scans are a lot less efficient than sequential scans. Effectively
reducing the sample size you can get for a given amount of time even further.
a 10% sample using this technique is probably almost as slow as reading the
entire table, for example. If you only need .1% though this might be a good
approach.

b) the data in the random column would have to be static meaning multiple
samples wouldn't be entirely independent. Depending on what you're doing with
the samples this might be a problem or not.

Oh, and if you're content with always using the same sample but want to
analyze it multiple different ways, you might want to use a partial index.
You could have partial indexes matching the order by and where clauses of
your analyses but with a where clause of its own selecting on the random
data. Then make sure that where clause is in every query. But still, the
sample has to be small enough that even using an index scan you're winning
over reading the entire data.

-- 
greg

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
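A sketch of the indexed-random-column approach (all names invented; note the
one-time UPDATE is itself a full pass over the billion rows):

  ALTER TABLE bigtable ADD COLUMN rnd double precision;
  UPDATE bigtable SET rnd = random();
  CREATE INDEX bigtable_rnd_idx ON bigtable (rnd);
  ANALYZE bigtable;

  -- roughly a 0.1% sample:
  SELECT * FROM bigtable WHERE rnd < 0.001;

And per point (b) above, the sample is only as independent as the stored rnd
values allow; re-randomizing means another full-table UPDATE.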
Re: [SQL] Empty array IS NULL?
Markus Bertheau <[EMAIL PROTECTED]> writes: > oocms=# SELECT ARRAY(SELECT 1 WHERE FALSE); > ?column? > -- > This one seems strange to me. Shouldn't it result in an empty array? -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: FW: [SQL] "=" operator vs. "IS"
Rich Hall <[EMAIL PROTECTED]> writes: > "(anything) = NULL" is always Null, this cannot be what the coder intended. I often have such things in my SQL. Consider what happens when you have SQL constructed dynamically. Or more frequently, consider that many drivers still don't use the new binary placeholder syntax and emulate it by putting the parameters directly into the SQL. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
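Spelled out, since the distinction is exactly what bites generated SQL:

  SELECT NULL = NULL;       -- NULL, not true
  SELECT NULL IS NULL;      -- true
  SELECT 1 WHERE 1 = NULL;  -- zero rows: the NULL result acts like false

There is also the transform_null_equals setting, which rewrites "expr = NULL"
into "expr IS NULL" as a compatibility hack for clients that generate such
queries.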
Re: FW: [SQL] "=" operator vs. "IS"
Stephan Szabo <[EMAIL PROTECTED]> writes:

> IS TRUE and IS FALSE have a different effect from =true and =false when
> the left hand side is NULL. The former will return false, the latter will
> return NULL.

Right -- as bare expressions NULL IS TRUE yields false while NULL = true
yields NULL, though in a WHERE clause you can't tell them apart since a NULL
result filters out the row just as false does. (But thanks, I didn't even
realize they were special this way)

-- 
greg

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] How to create an aggregate?
Ray Aspeitia <[EMAIL PROTECTED]> writes: > I also would like to pass the delimiter to the aggregate as a parameter and > I am not sure if it can handle that. Currently aggregates that take multiple parameters are just not supported. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] reply to setting
Joe Conway <[EMAIL PROTECTED]> writes:

> This is very true. In fact, I get mildly annoyed when people *don't* include
> the direct reply to me, because I very actively filter/redirect my mail.
> Replies directly to me are pretty much guaranteed to be seen quickly, but the
> ones that go to the list might get lost among the hundreds of posts that go
> into my "postgres" inbox every day. I think many other people do something
> similar.

Just as a side comment, one trick I found very helpful in my mail filters is
to treat any message with one of my message-ids in the references as a
personal message as far as mail notifications are concerned. This way I get
notifications for any message on a thread following a post of my own.

This is easy in Gnus since the message-id has the sending hostname and also
the first few characters have a base64 encoded copy of the unix userid. You
would have to figure out how to recognize message-ids from your MUA.

-- 
greg

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] sleep function
John DeSoi <[EMAIL PROTECTED]> writes: > On Aug 10, 2004, at 10:57 AM, Bruce Momjian wrote: > > > I can't think of one, no. I think you will have to use one of the > > server-side languages and call a sleep in there. > > This is no good in the real world since it pounds the CPU, but it worked well > enough for my testing purposes. You went the long way around. I think what he meant was something as simple as: bash-2.05b$ /usr/lib/postgresql/bin/createlang -U postgres plperlu test bash-2.05b$ psql -d test -U postgres Welcome to psql 7.4.2, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands \g or terminate with semicolon to execute query \q to quit test=# create or replace function sleep(integer) returns integer as 'return sleep(shift)' language plperlu; CREATE FUNCTION test=# \timing Timing is on. test=# select sleep(10) ; sleep --- 10 (1 row) Time: 10002.493 ms -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] only last records in subgroups
Dino Vliet <[EMAIL PROTECTED]> writes:

> x,0 and y,4 but how do I manage this in sql? I was
> hoping for a keyword LAST or so, where I can specify
> that when I've ordered my results with order by, I
> could only get the last of the subgroups (the first
> one is easy because I could use limit 1)

There's no concept of "first" and "last" in SQL outside of the ORDER BY
clause of your query. And you can easily reverse the order of the ORDER BY
sort by putting "DESC" after the columns you're sorting on.

But I don't understand how you intend to use "LIMIT 1" to solve your problem.
As you describe the problem you want the last (or first) record of *each*
*group*. Solving that using LIMIT would require a complex query with a
subquery in the column list which would be quite a pain.

As the other poster suggested, if you're just looking to fetch a single
column you can just use min() or max() to solve this. If you're looking to
fetch more than one column Postgres provides a non-standard SQL extension for
dealing with this situation, "DISTINCT ON".

SELECT DISTINCT ON (id) id,day,other,columns
  FROM tab
 ORDER BY id,day

That gets the row with the lowest value of "day" for each "id". Use "ORDER BY
id, day DESC" to get the row with the greatest value of "day".

-- 
greg

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [SQL] from PG_DUMP to CVS
"Riccardo G. Facchini" <[EMAIL PROTECTED]> writes: > After searching throught the list, I assume you mean this link: > http://www.rbt.ca/autodoc/index.html > by Rod Taylor. > > Looks promising, but still what I need is a proper CVS output, as I > need to review the changes made to the specific database structure. Well, CVS can still be useful even if the changes are all in one file. Look at "cvs annotate" for example. And CVS diff would still give you useful information. -- greg ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Aggregate query for multiple records
Scott Gerhardt <[EMAIL PROTECTED]> writes:

> Hello, I am new to the list, my apology if this question is beyond the scope or
> charter of this list.
>
> My question is:
> What is the best method to perform an aggregate query to calculate sum() values
> for each distinct wid as in the example below, but except for all wid's (not
> just WHERE wid='01/1-6-1-30w1/0').
>
> Also, performance wise, would it be better to build a function for this query.
> The table has 9 million records and these aggregate queries take hours.

The "top n" type query (or in this case "first n" or "last n" but it's the
same thing) is actually very tricky to do in standard SQL. The best solution
seen here for postgres is to use arrays and custom aggregate functions. The
following is based on a previous answer from Tom Lane to a similar question.

(I thought I already posted this for you on pgsql-general but the list
archives are down and you don't seem to have seen it, so I'm resending it)

It allows you to do the whole query with a single sort for the grouping and
the ordering by date together. You would have to use it with something like:

SELECT wid, sum_first_6(oil)
  FROM (SELECT wid, oil FROM prd_data ORDER BY wid, "date") AS x
 GROUP BY wid

If you pump up sort_mem enough -- you can do it within the session for the
single connection using "SET sort_mem" -- it should be pretty fast. I think
it's the best you're going to get.

If you're absolutely sure the data is physically stored in chronological
order -- which I would only feel comfortable with if you've never done any
updates or deletes, only inserts and perhaps occasional truncates, then you
might be able to get by without ordering and convince it to do a hash
aggregate. That would be the optimal result, no sorts at all. But it would be
hard to make sure it would always work.

test=> create or replace function first_6_accum (integer[], integer)
       returns integer[] language sql immutable
       as 'select case when array_upper($1,1)>=6 then $1 else $1||$2 end';
CREATE FUNCTION
test=> create function sum_6(integer[]) returns integer immutable language sql
       as 'select $1[1]+$1[2]+$1[3]+$1[4]+$1[5]+$1[6]';
CREATE FUNCTION
test=> create aggregate sum_first_6 (basetype=integer, sfunc=first_6_accum, stype=integer[],initcond='{}',finalfunc=sum_6);
CREATE AGGREGATE
test=> select sum_first_6(i) from (select i from (select 1 as i union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8) as x order by i desc) as x;
 sum_first_6
-
          33
(1 row)

test=> select sum_first_6(i) from (select i from (select 1 as i union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8) as x order by i asc) as x;
 sum_first_6
-
          21
(1 row)

This can easily be switched around to make it "last_6" and you can write
functions to handle 6 records or 9 records. And all of these could be
combined in a single query, so you only have to do the sort once.

Unfortunately you cannot make aggregate functions that take multiple
parameters, nor can you pass extra parameters to the state function. So
you'll have to create a separate set of functions for each variant.

Also, you'll have to change it to use reals.

-- 
greg

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
Re: [SQL] Isnumeric function?
Theo Galanakis <[EMAIL PROTECTED]> writes:

> error: btree item size 2744 exceeds maximum 2713.
>
> I assume I had to change some server settings to extend the maximum, however

I would guess the block size. But I'm just guessing.

> in the end this column holds content, and even applying an index would be
> incredibly slow to search across hundreds of thousands of "content" records
> looking for a primary key.

Perhaps you could have an indexed column that contains a crc32 hash? Then you
could do searches by comparing crc32 which make for fast efficient integer
index lookups. You should still include a comparison against the original
content column since it is possible for there to be a rare crc32 collision.

This doesn't let you do range lookups efficiently. But it does let you look
up specific values quickly even when they aren't numeric.

-- 
greg

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
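A sketch of that idea using postgres' built-in (but undocumented) hashtext()
in place of an application-side crc32 -- being undocumented, its values
shouldn't be relied on across versions, which is another reason to keep the
recheck clause (table name invented):

  ALTER TABLE bigcontent ADD COLUMN content_hash integer;
  UPDATE bigcontent SET content_hash = hashtext(content);
  CREATE INDEX bigcontent_hash_idx ON bigcontent (content_hash);

  -- fast integer index probe, then recheck to weed out collisions:
  SELECT * FROM bigcontent
   WHERE content_hash = hashtext('some long value')
     AND content = 'some long value';

A trigger would be needed to keep content_hash current on INSERT and UPDATE.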
Re: [SQL] Isnumeric function?
Theo Galanakis <[EMAIL PROTECTED]> writes: > I created the Index you specified, however it chooses to run a seq scan on > the column rather than a Index scan. How can you force it to use that > Index.. > > CREATE INDEX idx_content_numeric ON botched_table(content) WHERE content ~ > '^[0-9]{1,9}$'; > > select * from botched_table where content = 200::integer You need to put a "and content ~ '^[0-9]{1,9}$'" in your query, the clause has to match the clause in the partial index pretty closely. perhaps you would find it convenient to make a view of select * from botched_table where content ~ '^[0-9]{1,9}$' and then just always select these values from that view. Also the "::integer" is useless. It actually gets cast to text here anyways. The index is on the text contents of the content column. You might consider making the index a functional index on content::integer instead. I suspect that would be faster and smaller than an index on the text version of content: slo=> create table botched_table (content text); CREATE TABLE slo=> create index idx_botched_table on botched_table ((content::integer)) where content ~ '^[0-9]{1,9}$'; CREATE INDEX slo=> create view botched_view as (select content::integer as content_id, * from botched_table where content ~ '^[0-9]{1,9}$'); CREATE VIEW slo=> explain select * from botched_view where content_id = 1; QUERY PLAN Index Scan using idx_botched_table on botched_table (cost=0.00..3.72 rows=3 width=32) Index Cond: ((content)::integer = 1) Filter: (content ~ '^[0-9]{1,9}$'::text) (3 rows) -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] Isnumeric function?
Greg Stark <[EMAIL PROTECTED]> writes:

> Theo Galanakis <[EMAIL PROTECTED]> writes:
>
> > I created the Index you specified, however it chooses to run a seq scan on
> > the column rather than a Index scan. How can you force it to use that
> > Index..
> >
> > CREATE INDEX idx_content_numeric ON botched_table(content) WHERE content ~
> > '^[0-9]{1,9}$';
> >
> > select * from botched_table where content = 200::integer
>
> You need to put a "and content ~ '^[0-9]{1,9}$'" in your query, the clause has
> to match the clause in the partial index pretty closely.

Well this is weird. I tried to come up with a cleaner way to arrange this
than the view I described before using a function. But postgres isn't using
the partial index when it seems it ought to be available.

When I say it has to match "pretty closely" in this case I think it would
have to match exactly, however in the case of simple range operators postgres
knows how to figure out implications. Ie, "where a>1" should use a partial
index built on "where a>0".

slo=> create table test (a integer);
CREATE TABLE
slo=> create index idx_text on test (a) where a > 0;
CREATE INDEX
slo=> explain select * from test where a > 0;
                              QUERY PLAN
 Index Scan using idx_text on test  (cost=0.00..17.50 rows=334 width=4)
   Index Cond: (a > 0)
(2 rows)

slo=> explain select * from test where a > 1;
                              QUERY PLAN
 Index Scan using idx_text on test  (cost=0.00..17.50 rows=334 width=4)
   Index Cond: (a > 1)
(2 rows)

That's all well and good. But when I tried to make a version of your
situation that used a function I found it doesn't work so well with
functional indexes:

slo=> create function test(integer) returns integer as 'select $1' language sql immutable;
CREATE FUNCTION
slo=> create index idx_test_2 on test (test(a)) where test(a) > 0;
CREATE INDEX
slo=> explain select test(a) from test where test(a) > 0;
                               QUERY PLAN
--
 Index Scan using idx_test_2 on test  (cost=0.00..19.17 rows=334 width=4)
   Index Cond: (test(a) > 0)
(2 rows)

slo=> explain select test(a) from test where test(a) > 1;
                      QUERY PLAN
---
 Seq Scan on test  (cost=0.00..25.84 rows=334 width=4)
   Filter: (test(a) > 1)
(2 rows)

I can't figure out why this is happening. I would think it has something to
do with the lack of statistics on functional indexes except a) none of the
tables is analyzed anyways and b) the estimated row count is the same
anyways.

-- 
greg

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Isnumeric function?
Tom Lane <[EMAIL PROTECTED]> writes: > Greg Stark <[EMAIL PROTECTED]> writes: > > That's all well and good. But when I tried to make a version of your > > situation that used a function I found it doesn't work so well with > > functional indexes: > > ... > > I can't figure out why this is happening. > > You're using 7.3 or older? 7.4.3. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] implementing an out-of-transaction trigger
"Iain" <[EMAIL PROTECTED]> writes: > Though, as far as I can tell, there is no way to have the notify activate a > pl/pgsql function directly. I'll still need to write a client program to > create a session and actually do the listening, that is if I havn't missed > anything else... Right, presumably some sort of daemon that sits and waits for events. Much like you would have with Oracle advanced queuing and such. The big difference is that NOTIFY doesn't pass along any parameters. You will need some way for your daemon to find any pending data it needs to process. You might need some kind of queue table, or you might be able to get by without one. -- greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [SQL] 1-byte integers
stig erikson <[EMAIL PROTECTED]> writes: > how can i specify an integer to be one byte byte or even 4 bits long? > int1, int(1), tinyint are nowhere to be seen. > smallest i can find is smallint that is 2 bytes. There's a type called "char" (the double quotes are needed). It's used by postgres system catalogues but it's available for users too. It's a little quirky since it has operators that treat it as a 1 byte text data type and other operators that treat it as an integer data type. But that doesn't normally lead to any problems, just strange symptoms when your code has a bug. Other than that there are things like bit(4) which has a cast to and from integer. But they won't save you any storage space. If you have multiple columns like this and want to define a new type that aggregates them all for storage but lets you access them individually that could be useful. -- greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
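A quick illustration of the quoting and the dual text/integer personality
described above (a sketch; the exact cast behavior may vary by version):

  CREATE TABLE flags (
      f "char"   -- one byte of storage; without the quotes you'd get
                 -- char(1), an ordinary (and larger) text type
  );
  INSERT INTO flags VALUES ('A');
  SELECT f, f::int FROM flags;   -- 'A' as text, 65 as integer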
Re: [SQL] ORDER BY and NULLs
T E Schmitz <[EMAIL PROTECTED]> writes: > I just dug out the PostgreSQL book again because I thought I might've garbled > it: > > Quote: "PostgreSQL will not index NULL values. Because an index will never > include NULL values, it cannot be used to satisfy the ORDER BY clause of a > query that returns all rows in a table." You should just cross out that whole section. It's just flatly wrong. I had always assumed it was just people bringing assumptions over from Oracle where it is true. Perhaps this book is to blame for some of the confusion. Which book is it? Postgres indexes NULLs. It can use them for ORDER BY clauses. Where it cannot use them is to satisfy "WHERE foo IS NULL" or "WHERE foo IS NOT NULL" constraints though. That's an implementation detail, but it can be worked around with partial indexes. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
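The partial-index workaround mentioned above looks like this (table and
column names invented):

  -- "WHERE phone IS NULL" can't use an ordinary index on phone, but a
  -- partial index whose predicate matches the clause works:
  CREATE INDEX people_phone_null ON people (id) WHERE phone IS NULL;

  SELECT id FROM people WHERE phone IS NULL;  -- can use the partial index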
Re: [SQL] ORDER BY and NULLs
Tom Lane <[EMAIL PROTECTED]> writes: > > The paragraph continues: > > "If the SELECT command included the clause WHERE phone NOT NULL, > > PostgreSQL could use the index to satisfy the ORDER BY clause. > > An index that covers optional (NOT NULL) columns will not be used to > > speed table joins either." > > My goodness, it seems to be a veritable fount of misinformation :-( > > I wonder how much of this is stuff that is true for Oracle and they just > assumed it carried over? The first part is true for Oracle. You have to add the WHERE phone NOT NULL to convince Oracle it can use an index. Or just make the column NOT NULL to begin with I think. However as far as I recall the second part is not true. Oracle is smart enough to realize that an equijoin clause implies NOT NULL and therefore allows it to use the index. (This may have all changed in Oracle 9+. The last I saw of Oracle was 8i) I wonder if they just tried explain on a bunch of queries and noticed that postgres wasn't using an index for SELECT * FROM foo ORDER BY bar and came up with explanations for the patterns they saw? -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [SQL] COUNT(*) to find records which have a certain number of dependencies ?
T E Schmitz <[EMAIL PROTECTED]> writes:

> I want to select only those BRAND/MODEL combinations, where the MODEL has more
> than one TYPE, but only where one of those has TYPE_NAME='xyz'.
> I am not interested in MODELs with multiple TYPEs where none of them are called
> 'xyz'.

There are lots of approaches to this with various pros and cons. The simplest
one off the top of my head:

select * from brand join model on (brand_pk = brand_fk)
 where exists (select 1 from type where model_fk = model_pk and type_name = 'xyz')
   and (select count(*) from type where model_fk = model_pk) > 1

You could try to be clever about avoiding the redundant access to the type
table:

select * from brand join model on (brand_pk = brand_fk)
 where (select count(*)
          from type
         where model_fk = model_pk
        having sum(case when type_name = 'xyz' then 1 else 0 end) >= 1
       ) > 1

I haven't tested that, it might need some tweaking. In any case I don't think
it's worth the added complexity, assuming you have indexes on type. I'm not
even sure it would run faster.

You could try to be really clever about it by turning the whole thing into a
join:

select * from brand
  join model on (brand_pk = brand_fk)
  join (select model_fk
          from type
         group by model_fk
        having sum(case when type_name = 'xyz' then 1 else 0 end) >= 1
           and count(*) > 1
       ) on (model_fk = model_pk)

This would let the planner have more plans to choose from and might be a big
win if there are lots of brands and models but few that satisfy the criteria
you're looking for.

-- 
greg

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [SQL] COUNT(*) to find records which have a certain number of
T E Schmitz <[EMAIL PROTECTED]> writes: > SELECT > BRAND_NAME,MODEL_NAME ... > intersect ... Huh, I never think of the set operation solutions. I'm curious how it compares speed-wise. -- greg ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] COUNT(*) to find records which have a certain number of dependencies ?
T E Schmitz <[EMAIL PROTECTED]> writes: > ) as somealias on (model_fk = model_pk) > > (subquery in FROM must have an alias) ARGH! This is one of the most annoying things about postgres! It bites me all the time. Obviously it's totally insignificant since it's easy for my to just throw an "AS x" on the end of it. But damn. I see there's a comment foreseeing some annoyance value for this in the source: /* * The SQL spec does not permit a subselect * () without an alias clause, * so we don't either. This avoids the problem * of needing to invent a unique refname for it. * That could be surmounted if there's sufficient * popular demand, but for now let's just implement * the spec and see if anyone complains. * However, it does seem like a good idea to emit * an error message that's better than "syntax error". */ So where can I officially register my complaint? :) -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SQL] JOIN performance
Tom Lane <[EMAIL PROTECTED]> writes: > Fixing this properly is a research project, and I haven't thought of any > quick-and-dirty hacks that aren't too ugly to consider :-( Just thinking out loud here. Instead of trying to peek inside the CASE couldn't the optimizer just wrap the non-strict expression in a conditional that tests whether the row was found? -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [SQL] [GENERAL] need ``row number``
Tom Lane <[EMAIL PROTECTED]> writes: > Karsten Hilbert <[EMAIL PROTECTED]> writes: > > I am not convinced I'll need a SRF. I am not trying to > > calculate something that isn't there yet. I am just trying to > > join two views appropriately. I might have to employ some > > variant of Celko's integer helper table but I'm not sure how > > to proceed. > > A fairly common hack for this is to use a sequence: > > create temp sequence tseq; > select nextval('tseq'), * from (select order by ...) ss; But I thought she wanted to get the row number within a group. Not the row number for the entire result set. A sequence can't do that. Or at least, I suppose it could but it would be mighty strange to see setval() in a SELECT query. And I can't think of how to detect the level break in a select query either. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] Inserting into table only if the row does not already exist.
"C. Bensend" <[EMAIL PROTECTED]> writes: > INSERT INTO table ( column1, column2, column3 ) >SELECT column1, column2, column3 >WHERE NOT EXISTS ( > SELECT column1, column2, column3 FROM table WHERE > column1 = $column1 AND > column2 = $column2 AND > column3 = $column3 ) > >.. which gave me 'ERROR: column1 does not exist'. Nuts. Well you're not selecting from any table so "column1" isn't going to exist. You just have to put it in the select list as a constant. If you're feeling generous to the next programmer to read it you could put "AS column1" after each one, but the column name doesn't actually have to match the column you're inserting into. INSERT INTO table ( column1, column2, column3 ) ( SELECT $column1, $column2, $column3 WHERE NOT EXISTS ( SELECT 1 FROM table WHERE column1 = $column1 AND column2 = $column2 AND column3 = $column3 ) ) Note that this is going to have some concurrency issues. I think it will be possible for a second query to execute before the first commits. In that case it won't see the record the first query inserted and try to insert again. You'll just get a primary key violation though which I guess you can just ignore. Which raises a question. Why not forgoe this complicated SQL and try to do the insert. If you get a primary key violation, well there's your answer... If you don't care about the failure just ignore it and move on. I would suggest checking specifically for a primary key violation and still stopping execution on unexpected errors though. If you're doing this inside a bigger transaction that's a bit more of a pain. Until 8.0 postgres can't handle ignoring an error on a query without aborting the entire transaction. But if you're in autocommit mode then you can just ignore the primary key violation and continue. Incidentally, if you're putting your parameters directly into your queries using $column1 then you've got a potential security problem. Unless you're quoting every variable everywhere religiously using postgres's quoting functions an attacker can sneak extra SQL into your queries. Potentially including whole new statements such as "DELETE FROM table"... -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [SQL] Inserting into table only if the row does not already
"C. Bensend" <[EMAIL PROTECTED]> writes: > The risk of a cron gone wild is acceptable to me at this moment. Gee, now I have images of late-night advertisements for bofh-porn video tapes of Cron Jobs Gone Wild(tm) dancing through my head... thanks. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [SQL] help on a query
Michelle Murrain <[EMAIL PROTECTED]> writes: > The OUTER JOIN version is quite a bit more efficient (by an order of magnitude) > than the option with WHERE NOT EXISTS subquery. This is going to be heavily dependent on the version of postgres. IN/NOT IN execution has improved a lot in 7.4 and later. If you're still on 7.3 then the outer join will be better. But in 7.4 I would actually expect the NOT IN or the NOT EXISTS to be faster. -- greg ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] Cross tabulations
"Muhyiddin A.M Hayat" <[EMAIL PROTECTED]> writes: > Dear all, > > I need to do something similar to a cross tabulation, but without any > aggregation. join your table to itself four times: select * from (select check_time::date as date, employee_id, check_time-check_time::date as in from test where state = 'In') as a join (select check_time::date as date, employee_id, check_time-check_time::date as break_out from test where state = 'Break Out') as b using (employee_id,date) join (select check_time::date as date, employee_id, check_time-check_time::date as break_in from test where state = 'Break In') as d using (employee_id,date) join (select check_time::date as date, employee_id, check_time-check_time::date as out from test where state = 'Out') as e using (employee_id,date) ; Note that this will do strange things if you don't have precisely four records for each employee. Alternatively use subqueries: select date, employee_id, (select check_time-check_time::date from test where employee_id = x.employee_id and check_time::date = date and state = 'In') as in, (select check_time-check_time::date from test where employee_id = x.employee_id and check_time::date = date and state = 'Break Out') as break_out, (select check_time-check_time::date from test where employee_id = x.employee_id and check_time::date = date and state = 'Break In') as break_in, (select check_time-check_time::date from test where employee_id = x.employee_id and check_time::date = date and state = 'Out') as out from (select distinct employee_id, check_time::date as date from test) as x; This will at least behave fine if there are missing records and will give an error if there are multiple records instead of doing strange things. Neither of these will be particularly pretty on the performance front. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [SQL] vacuum analyze slows sql query
patrick ~ <[EMAIL PROTECTED]> writes: > I noticed that a freshly created db with freshly inserted data (from > a previous pg_dump) would result in quite fast results. However, > after running 'vacuum analyze' the very same query slowed down about > 1250x (Time: 1080688.921 ms vs Time: 864.522 ms). If it gets slower immediately after a single vacuum analyze then the problem is that one of the queries is getting planned wrong when stats are available. If it takes a while to slow down then it could be other problems such as index bloat etc. Don't use "explain verbose" use "explain analyze". I don't know if it was in 7.1 but you say you reproduced the problem with 7.4. It would be helpful to see the results of "explain analyze select ..." on the query before and after the vacuum analyze. -- greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [SQL] Simple SQL Question
Tomasz Myrta <[EMAIL PROTECTED]> writes: > > select * from table1 LIMIT x > > gives me the first x row of the result. > > After that, I save the last value, and next time, I adjust > > the query as > > select * from table1 where itemkey>:lastvalue LIMIT x > > Why do you complicate it so much? Everything you need is: > > select * from table1 LIMIT x > select * from table1 LIMIT x OFFSET x > select * from table1 LIMIT x OFFSET 2*x > > Remember to sort rows before using limit/offset. There are two good reasons to prefer his Andras' solution to yours. a) If the data is modified between the two queries his will continue from where the previous page left off. Yours will either skip records or overlap with the previous page. b) If itemkey is indexed his will be an efficient index scan that performs similarly regardless of what page is being fetched. Yours will perform more and more slowly as the user gets deeper into the results. Note that both queries are wrong however. You need an "ORDER BY itemkey" or else nothing guarantees the second page has any relation at all to the first page. -- greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
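Putting both corrections together, the keyset approach with its required
ORDER BY (50 is an illustrative page size):

  -- first page:
  SELECT * FROM table1 ORDER BY itemkey LIMIT 50;

  -- each later page, passing the last itemkey previously seen:
  SELECT * FROM table1 WHERE itemkey > :lastvalue ORDER BY itemkey LIMIT 50;

With an index on itemkey each page costs about the same no matter how deep
the user has paged, which is exactly the advantage over OFFSET.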
Re: [SQL] Comparing Dates
"Thomas F.O'Connell" <[EMAIL PROTECTED]> writes: > select 2004-06-08; > ?column? > -- > 1990 > > I'm not exactly sure how the bare string is converted internally, but it's > clearly not a complete date like you're expecting. What string? That's just integer arithmetic. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] Making dirty reads possible?
[EMAIL PROTECTED] writes: > But not possible for real at the moment? > > So, summarising: > - Nested transactions is not (yet) supported > - READ UNCOMMITTED isolation level is not (yet) supported > - the EXECUTE plpgsql construct does not circumvent the transaction Well nested transactions are in 8.0 but I don't think they help you much. I find I've been stymied using server-side functions for large batch jobs for pretty much the same reason. I find it works better and it's more flexible to write client-side programs in the language of my choice that connect to the database and do the batch jobs. They can output progress logs or keep information about their progress in some shared space. They can also control the transaction more freely committing in the middle of the job if it's safe. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [SQL] replacing mysql enum
Ian Barwick <[EMAIL PROTECTED]> writes: > What I still don't quite understand is why IN in a CHECK context is > handled differently to say: select 1 where 'x' in (null,'a','b','c') ? > This could be a bit of a gotcha for anyone constructing a constraint > similar to the original poster's and not realising it has no effect. well WHERE foo IN (null, ...) returns null if foo isn't explicitly in the list (ie, "it may or may not equal the unknown value in the list"). And I think constraints that return null are deemed to have succeeded. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
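The gotcha, demonstrated (table and values invented):

  CREATE TABLE t (c text CHECK (c IN (NULL, 'a', 'b')));
  INSERT INTO t VALUES ('z');  -- succeeds: 'z' IN (NULL,'a','b') is NULL,
                               -- and a NULL check result counts as passing

So a NULL in the IN-list silently disables the constraint for every
non-matching value; write CHECK (c IN ('a','b')) and let the column's
nullability handle NULLs instead.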
Re: [SQL] Column with recycled sequence value
Andrew Sullivan <[EMAIL PROTECTED]> writes: > You can set the sequence up to cycle (so once it gets to the end, it > wraps around to the beginning again). The keyword is CYCLE at CREATE > SEQUENCE time. It defaults to NO CYCLE. > > One potential problem, of course, are collisions on the table, > because some value wasn't cleared out. It sounds like you don't have > that problem though. Alternatively you can go through the database and make sure all the foreign keys are declared and marked ON UPDATE CASCADE. Then go through and renumber all your entries sequentially starting at 1 and reset your sequence. I'm not sure this is such a hot idea really. But I don't really like the idea of letting the sequence wrap around much either. You'll have to consider the pros and cons of each approach (and of just moving to bigserial too). If you're going to do this you'll want an index on all the foreign key columns. That is, the columns referring to this value from other tables. Otherwise the automatic updates would be very slow. And will probably want to schedule down-time for this. Otherwise application code that holds values in local state might get very confused. I think I would do it with a program that connects and updates each record individually and commits periodically rather than with a single big update. Just because I like having control and having things that give me progress information and can be interrupted without losing work. Oh, and this won't work if you have any external references to these values from outside your database. Say if the value is something like a customer account number that you've previously sent to customers... -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [SQL] OFFSET impact on Performance???
"Andrei Bintintan" <[EMAIL PROTECTED]> writes: > > If you're using this to provide "pages" of results, could you use a cursor? > What do you mean by that? Cursor? > > Yes I'm using this to provide "pages", but If I jump to the last pages it goes > very slow. The best way to do pages for is not to use offset or cursors but to use an index. This only works if you can enumerate all the sort orders the application might be using and can have an index on each of them. To do this the query would look something like: SELECT * FROM tab WHERE col > ? ORDER BY col LIMIT 50 Then you take note of the last value used on a given page and if the user selects "next" you pass that as the starting point for the next page. This query takes the same amount of time no matter how many records are in the table and no matter what page of the result set the user is on. It should actually be instantaneous even if the user is on the hundredth page of millions of records because it uses an index both for the finding the right point to start and for the ordering. It also has the advantage that it works even if the list of items changes as the user navigates. If you use OFFSET and someone inserts a record in the table then the "next" page will overlap the current page. Worse, if someone deletes a record then "next" will skip a record. The disadvantages of this are a) it's hard (but not impossible) to go backwards. And b) it's impossible to give the user a list of pages and let them skip around willy nilly. (If this is for a web page then specifically don't recommend cursors. It will mean you'll have to have some complex session management system that guarantees the user will always come to the same postgres session and has some garbage collection if the user disappears. And it means the URL is only good for a limited amount of time. If they bookmark it it'll break if they come back the next day.) -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] [SQL] OFFSET impact on Performance???
Isn't this a prime example of when to use a servlet or something similar
in function? It will create the cursor, maintain it, and fetch against it
for a particular page.

Greg

-----Original Message-----
From: Richard Huxton [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 20, 2005 10:21 AM
To: Andrei Bintintan
Cc: pgsql-sql@postgresql.org; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] [SQL] OFFSET impact on Performance???

Andrei Bintintan wrote:
>> If you're using this to provide "pages" of results, could you use a
>> cursor?
>
> What do you mean by that? Cursor?
>
> Yes I'm using this to provide "pages", but If I jump to the last pages
> it goes very slow.

  DECLARE mycursor CURSOR FOR SELECT * FROM ...;
  FETCH FORWARD 10 IN mycursor;
  CLOSE mycursor;

Repeated FETCHes would let you step through your results. That won't work
if you have a web-app making repeated connections. If you've got a
web-application then you'll probably want to insert the results into a
cache table for later use.

--
  Richard Huxton
  Archonet Ltd
Re: [PERFORM] [SQL] OFFSET impact on Performance???
Alex Turner <[EMAIL PROTECTED]> writes:

> I am also very interested in this very question. Is there any way to
> declare a persistent cursor that remains open between pg sessions?
> This would be better than a temp table because you would not have to
> do the initial select and insert into a fresh table and incur those IO
> costs, which are often very heavy, and the reason why one would want
> to use a cursor.

TANSTAAFL (there ain't no such thing as a free lunch). How would such a
persistent cursor be implemented, if not by building a temporary table
somewhere behind the scenes?

There could be some advantage if the data were stored in a temporary table
marked as not having to be WAL-logged; instead it could simply be cleared
automatically on every database start.

-- greg
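A sketch of the temp-table workaround being weighed here, assuming a
hypothetical expensive query against "bigtable" sorted by "created_at";
the heavy cost is paid once per session, after which pages are cheap:

  CREATE TEMP TABLE page_cache AS
      SELECT * FROM bigtable ORDER BY created_at;

  SELECT * FROM page_cache ORDER BY created_at LIMIT 50 OFFSET 200;

  DROP TABLE page_cache;  -- or let it vanish when the session ends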
Re: [SQL] same question little different test MSSQL vs Postgres
"Joel Fradkin" <[EMAIL PROTECTED]> writes: > I also tried a simple select * from tblcase where clientum = 'SAKS' Try: explain analyze select * from tblcase where clientum = 'SAKS' Send the output. -- greg ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] same question little different test MSSQL vs Postgres
"Joel Fradkin" <[EMAIL PROTECTED]> writes: > QUERY PLAN > "Seq Scan on tblcase (cost=0.00..30066.21 rows=37401 width=996) (actual > time=0.344..962.260 rows=22636 loops=1)" > " Filter: ((clientnum)::text = 'SAKS'::text)" > "Total runtime: 1034.434 ms" Well that says it only took 1s. So it seems this is highly dependent on whether the data is in cache. Perhaps it was in cache on MSSQL when you profiled it there and not on postgres? You could put an index on clientnum, but if the data is usually in cache like this it might not even be necessary. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])