Re: [PERFORM] Optimising queries involving unions
Hi, I'm using a workaround for this kind of issues: consider: select A from (select B from T1 where C union select B from T2 where C union select B from T3 where C ) foo where D in your case: SELECT u.txt FROM ( SELECT id, txt FROM largetable1,smalltable t WHERE t.id = u.id AND t.foo = 'bar' UNION ALL SELECT id, txt FROM largetable2,smalltable t WHERE t.id = u.id AND t.foo = 'bar' ) u and select A from foo where C and D (A, B, C, D being everything you want, C and D may also include GROUP BY,ORDER...) The first version will be handled correctly by the optimiser, whereas in the second version, Postgres will first build the UNION and then run the query on it. I'm having large tables with identical structure, one per day. Instead of defining a view on all tables, I' using functions that distribute my query on all tables. The only issue if that I need to define a type that match the result structure and each return type needs its own function. Example: (The first parameter is a schema name, the four next corresponds to A, B, C, D - create type T_i2_vc1 as (int_1 int,int_2 int,vc_1 varchar); CREATE OR REPLACE FUNCTION vq_T_i2_vc1(varchar,varchar,varchar,varchar,varchar) RETURNS setof T_i2_vc1 AS $$ DECLARE result T_i2_vc1%rowtype; mviews RECORD; sql varchar; counter int; BEGIN select into counter 1; -- loop on all daily tables FOR mviews IN SELECT distinct this_day FROM daylist order by plainday desc LOOP IF counter =1 THEN select INTO sql 'SELECT '||mviews.this_day||' AS plainday, '||$2||' FROM '||$3||'_'||mviews.plainday||' WHERE '||$4; ELSE select INTO sql sql||' UNION ALL SELECT '||mviews.this_day||' AS plainday, '||$2||' FROM '||$3||'_'||mviews.plainday||' WHERE '||$4; END IF; select into counter counter+1; END LOOP; select INTO sql 'SELECT '||$1||' FROM ('||sql||')foo '||$5; for result in EXECUTE (sql) LOOP return NEXT result; end loop; return ; END; $$ LANGUAGE plpgsql; Note: in your case the function shoud have a further parameter to join largetable(n) to smalltable in the sub queries HTH, Marc I've got a query that I think the query optimiser should be able to work it's magic on but it doesn't! I've had a look around and asked on the IRC channel and found that the current code doesn't attempt to optimise for what I'm asking it to do at the moment. Here's a bad example: SELECT u.txt FROM smalltable t, ( SELECT id, txt FROM largetable1 UNION ALL SELECT id, txt FROM largetable2) u WHERE t.id = u.id AND t.foo = 'bar'; I was hoping that smalltable would get moved up into the union, but it doesn't at the moment and the database does a LOT of extra work. In this case, I can manually do quite a couple of transforms to move things around and it does the right thing: SELECT txt FROM ( SELECT l.id as lid, r.id as rid, r.foo, l.txt FROM largetable1 l, smalltable r UNION ALL SELECT l.id as lid, r.id as rid, r.foo, l.txt FROM largetable1 l, smalltable r) WHERE foo = 'bar'; AND lid = rid The optimiser is intelligent enough to move the where clauses up into the union and end end up with a reasonably optimal query. Unfortunatly, in real life, the query is much larger and reorganising everything manually isn't really feasible! -- Weitersagen: GMX DSL-Flatrates mit Tempo-Garantie! Ab 4,99 Euro/Monat: http://www.gmx.net/de/go/dsl ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] slow queries, possibly disk io
On 5/26/05, Christopher Kings-Lynne [EMAIL PROTECTED] wrote: I have some queries that have significan't slowed down in the last couple days. It's gone from 10 seconds to over 2 mins. The cpu has never gone over 35% in the servers lifetime, but the load average is over 8.0 right now. I'm assuming this is probably due to disk io. You sure it's not a severe lack of vacuuming that's the problem? It's vacuumed hourly. If it needs to be more than that I could do it I guess. But from everything I've been told, hourly should be enough. -Josh ---(end of broadcast)--- TIP 8: explain analyze is your friend
[PERFORM] slow queries, possibly disk io
Few mandatory questions: 1. Do you vacuum your db on regular basis? :) It's vacuumed once every hour. The table sizes and data are constantly changing. 2. Perhaps statistics for tables in question are out of date, did you try alter table set statistics? No I haven't. What would that do for me? 3. explain analyze of the slow query? Here is the function that is ran: CREATE OR REPLACE FUNCTION adaption.funmsgspermin() RETURNS int4 AS ' DECLARE this_rServerIds RECORD; this_sQuery TEXT; this_iMsgsPerMin INT; this_rNumSentRECORD; BEGIN this_iMsgsPerMin := 0; FOR this_rServerIds IN SELECT iId FROM adaption.tblServers LOOP this_sQuery := \' SELECT COALESCE( SUM( iNumSent ), 0 ) AS iNumSent FROM adaption.tblBatchHistory_\' || this_rServerIds.iId || \' WHERE tStamp now() - interval \'\'5 mins\'\'; \'; FOR this_rNumSent IN EXECUTE this_sQuery LOOP this_iMsgsPerMin := this_iMsgsPerMin + this_rNumSent.iNumSent; END LOOP; END LOOP; this_iMsgsPerMin := this_iMsgsPerMin / 5; RETURN this_iMsgsPerMin; END; ' LANGUAGE 'plpgsql' VOLATILE; Here is the explain analyze of one loops of the sum: Aggregate (cost=31038.04..31038.04 rows=1 width=4) (actual time=14649.602..14649.604 rows=1 loops=1) - Seq Scan on tblbatchhistory_1 (cost=0.00..30907.03 rows=52401 width=4) (actual time=6339.223..14648.433 rows=919 loops=1) Filter: (tstamp (now() - '00:05:00'::interval)) Total runtime: 14649.709 ms 4. if you for some reason cannot give explain analyze, please try to describe the type of query (what kind of join(s)) and amount of data found in the tables. 2 minutes from 10 seconds is a huge leap, and it may mean that PostgreSQL for some reason is not planning as well as it could. Throwing more RAM at the problem can help, but it would be better to hint the planner to do the right thing. It may be a good time to play with planner variables. :) Is there any documentation on planner vars? And how would I throw more ram at it? It has 2 gigs right now. How do I know if postgres is using that? -Josh ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[PERFORM] postgresql-8.0.1 performance tuning
Hi @ all, i'm trying to tune my postgresql-db but i don't know if the values are right set. I use the following environment for the postgres-db: # Hardware cpu: 2x P4 3Ghz ram: 1024MB DDR 266Mhz partitions: /dev/sda3 23G 9,6G 13G 44% / /dev/sda1 11G 156M 9,9G 2% /var /dev/sdb1 69G 13G 57G 19% /var/lib/pgsql /dev/sda is in raid 1 (2x 35GB / 1upm / sca) /dev/sdb is in raid 10 (4x 35GB / 1upm / sca) # /Hardware # Config /etc/sysctl.conf: kernel.shmall = 786432000 kernel.shmmax = 786432000 /etc/fstab: /dev/sdb1 /var/lib/pgsql reiserfs acl,user_xattr,noatime,data="" 1 2 /var/lib/pgsql/data/postgresql.conf superuser_reserved_connections = 2 shared_buffers = 3000 work_mem = 131072 maintenance_work_mem = 131072 max_stack_depth = 2048 max_fsm_pages = 2 max_fsm_relations = 1000 max_files_per_process = 1000 vacuum_cost_delay = 10 vacuum_cost_page_hit = 1 vacuum_cost_page_miss = 10 vacuum_cost_page_dirty = 20 vacuum_cost_limit = 200 bgwriter_delay = 200 bgwriter_percent = 1 bgwriter_maxpages = 100 fsync = true wal_sync_method = fsync wal_buffers = 64 commit_delay = 0 commit_siblings = 5 checkpoint_segments = 256 checkpoint_timeout = 900 checkpoint_warning = 30 effective_cache_size = 1 random_page_cost = 4 cpu_tuple_cost = 0.01 cpu_index_tuple_cost = 0.001 cpu_operator_cost = 0.0025 geqo = true geqo_threshold = 12 geqo_effort = 5 geqo_pool_size = 0 geqo_generations = 0 geqo_selection_bias = 2.0 deadlock_timeout = 1000 max_locks_per_transaction = 64 # /Config # Transactions we have about 115-300 transactions/min in about 65 tables. # /Transactions I'm really new at using postgres. So i need some experience to set this parameters in the postgresql- and the system-config. I can't find standard calculations for this. :/ The postgresql-documentation doesn't help me to set the best values for this. The database must be high-availble. I configured rsync to sync the complete /var/lib/pgsql-directory to my hot-standby. On the hotstandby i will make the dumps of the database to improve the performance of the master-db. In my tests the synchronization works fine. I synchronised the hole directory and restarted the database of the hotstandby. While restarting, postgresql turned back the old (not archived) wals and the database of my hotstandby was consistent. Is this solution recommended? Or must i use archived wal's with real system-snapshots? best regards, Martin Fandel
Re: [PERFORM] slow queries, possibly disk io
Josh Close [EMAIL PROTECTED] writes: this_sQuery := \' SELECT COALESCE( SUM( iNumSent ), 0 ) AS iNumSent FROM adaption.tblBatchHistory_\' || this_rServerIds.iId || \' WHERE tStamp now() - interval \'\'5 mins\'\'; \'; Here is the explain analyze of one loops of the sum: Aggregate (cost=31038.04..31038.04 rows=1 width=4) (actual time=14649.602..14649.604 rows=1 loops=1) - Seq Scan on tblbatchhistory_1 (cost=0.00..30907.03 rows=52401 width=4) (actual time=6339.223..14648.433 rows=919 loops=1) Filter: (tstamp (now() - '00:05:00'::interval)) Total runtime: 14649.709 ms I think you really want that seqscan to be an indexscan, instead. I'm betting this is PG 7.4.something? If so, probably the only way to make it happen is to simplify the now() expression to a constant: SELECT COALESCE( SUM( iNumSent ), 0 ) AS iNumSent FROM adaption.tblBatchHistory_\' || this_rServerIds.iId || \' WHERE tStamp \\\'' || (now() - interval \'5 mins\')::text || \'\\\'\'; because pre-8.0 the planner won't realize that the inequality is selective enough to favor an indexscan, unless it's comparing to a simple constant. (BTW, 8.0's dollar quoting makes this sort of thing a lot less painful) regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] slow queries, possibly disk io
I think you really want that seqscan to be an indexscan, instead. I'm betting this is PG 7.4.something? If so, probably the only way to make it happen is to simplify the now() expression to a constant: SELECT COALESCE( SUM( iNumSent ), 0 ) AS iNumSent FROM adaption.tblBatchHistory_\' || this_rServerIds.iId || \' WHERE tStamp \\\'' || (now() - interval \'5 mins\')::text || \'\\\'\'; The dollar sign thing would be a lot easier. I can't get this to work. I'm using a db manager where I can just use ' instead of \'. How would it look for that? In other words, it doesn't have the create or replace function as ' --stuff ' language 'plpgsql' it just has the actual function. Makes things a little easier. I'm getting an error at or near 5. because pre-8.0 the planner won't realize that the inequality is selective enough to favor an indexscan, unless it's comparing to a simple constant. (BTW, 8.0's dollar quoting makes this sort of thing a lot less painful) regards, tom lane -- -Josh ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] slow queries, possibly disk io
Doing the query explain SELECT COALESCE( SUM( iNumSent ), 0 ) AS iNumSent FROM adaption.tblBatchHistory_1 WHERE tStamp ( now() - interval '5 mins' )::text gives me this: Aggregate (cost=32138.33..32138.33 rows=1 width=4) - Seq Scan on tblbatchhistory_1 (cost=0.00..31996.10 rows=56891 width=4) Filter: ((tstamp)::text ((now() - '00:05:00'::interval))::text) Still not an index scan. On 5/27/05, Tom Lane [EMAIL PROTECTED] wrote: Josh Close [EMAIL PROTECTED] writes: this_sQuery := \' SELECT COALESCE( SUM( iNumSent ), 0 ) AS iNumSent FROM adaption.tblBatchHistory_\' || this_rServerIds.iId || \' WHERE tStamp now() - interval \'\'5 mins\'\'; \'; Here is the explain analyze of one loops of the sum: Aggregate (cost=31038.04..31038.04 rows=1 width=4) (actual time=14649.602..14649.604 rows=1 loops=1) - Seq Scan on tblbatchhistory_1 (cost=0.00..30907.03 rows=52401 width=4) (actual time=6339.223..14648.433 rows=919 loops=1) Filter: (tstamp (now() - '00:05:00'::interval)) Total runtime: 14649.709 ms I think you really want that seqscan to be an indexscan, instead. I'm betting this is PG 7.4.something? If so, probably the only way to make it happen is to simplify the now() expression to a constant: SELECT COALESCE( SUM( iNumSent ), 0 ) AS iNumSent FROM adaption.tblBatchHistory_\' || this_rServerIds.iId || \' WHERE tStamp \\\'' || (now() - interval \'5 mins\')::text || \'\\\'\'; because pre-8.0 the planner won't realize that the inequality is selective enough to favor an indexscan, unless it's comparing to a simple constant. (BTW, 8.0's dollar quoting makes this sort of thing a lot less painful) regards, tom lane -- -Josh ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[PERFORM] OID vs overall system performances on high load databases.
What are the effect of having a table with arround 500 insert/update/delete on two to eight table in a time frame of 2 minutes 24/24h, when you have oid enabled versus the same setup when you dont have oid? That deployment is done on a postgres with 8 to 9 databases, each having those 2 to 8 high load tables with oid enabled. Would the oid colum slow down table scan when you have over 20 millions row? Would the cost of maintaining the oid column inside thoses high load tables when there is no oid reference used for data seeking costy for postgres ressources!? Eric Lauzon [Recherche Développement] Above Sécurité / Above Security Tél : (450) 430-8166 Fax : (450) 430-1858 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[PERFORM] Redundant indexes?
Would I be correct in assuming that the following two indexes are completely redundant except for the fact that one complains about uniqueness constraint violations and the other does not? Or is there are legitimate use for having BOTH indexes? I'm trying to figure out if it's okay to delete the non-unique index. (I have a bunch of tables suffering this malady from some problematic application code). Table public.erf Column | Type | Modifiers +-+--- rid| integer | not null cid| integer | not null Indexes: erf_rid_key unique btree (rid), erf_rid_idx btree (rid) Index public.erf_rid_idx Column | Type +- rid| integer btree, for table public.erf Index public.erf_rid_key Column | Type +- rid| integer unique, btree, for table public.erf ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Redundant indexes?
Jeffrey Tenny [EMAIL PROTECTED] writes: Would I be correct in assuming that the following two indexes are completely redundant except for the fact that one complains about uniqueness constraint violations and the other does not? Yup ... regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] OID vs overall system performances on high load
On Fri, 2005-05-27 at 13:05 -0400, Eric Lauzon wrote: What are the effect of having a table with arround 500 insert/update/delete on two to eight table in a time frame of 2 minutes 24/24h, when you have oid enabled versus the same setup when you dont have oid? That deployment is done on a postgres with 8 to 9 databases, each having those 2 to 8 high load tables with oid enabled. Would the oid colum slow down table scan when you have over 20 millions row? Would the cost of maintaining the oid column inside thoses high load tables when there is no oid reference used for data seeking costy for postgres ressources!? The OID column is an extra few bytes on each row. If you don't have any use for it (and let's face it: most of us don't), then create your tables without OID. The amount of impact that it makes will depend on what the general row size is. If they are rows with a couple of integers then the size of an OID column will be a significant portion of the size of each row, and removing it will make the physical on-disk data size significantly smaller. If the size of the average row is (e.g.) 2k then the OID will only be a very small fraction of the data, and removing it will only make a small difference. Regards, Andrew McMillan. - Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington WEB: http://catalyst.net.nz/PHYS: Level 2, 150-154 Willis St DDI: +64(4)803-2201 MOB: +64(272)DEBIAN OFFICE: +64(4)499-2267 ... I want a COLOR T.V. and a VIBRATING BED!!! - signature.asc Description: This is a digitally signed message part