Re: [PERFORM] could not open relation...
On 04/10/2010 12:29 AM, Tom Lane [...@sss.pgh.pa.us] wrote:
> Seems a bit off-topic for pgsql-performance,

What would be the appropriate forum?

> but anyway: the main known cause for that is if one of the tables used in the query got dropped (by another session) just after the query started. Could that have happened to you?

Interesting and possible; but why doesn't locking prevent this?

Brian
[PERFORM] could not open relation...
I saw this in the postgres log. Anyone know what would cause this?

Thanks, Brian

postgres 8.3.5 on RHEL4 update 6

[3358-cemdb-admin-2010-04-09 04:00:19.029 PDT]ERROR: could not open relation with OID 170592
[3358-cemdb-admin-2010-04-09 04:00:19.029 PDT]STATEMENT: select lm.ts_login_name,sm.ts_session_id from ts_user_logins_map lm join ts_user_sessions_map sm on sm.ts_user_id=lm.ts_user_id where not sm.ts_soft_delete and not lm.ts_soft_delete and lm.ts_user_id != 1 and lm.ts_app_id in (61) order by sm.ts_creation_time
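If the relation still exists, the OID from the log can be mapped back to a name; a minimal sketch, using the OID from the error above:

    -- look up which relation OID 170592 refers to, if anything
    select relname, relkind from pg_class where oid = 170592;

No row back would mean the relation is gone, which fits the dropped-table explanation given in the reply above.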
Re: [PERFORM] query slow; strace output worrisome
On 04/06/2010 01:18 AM, Craig Ringer [cr...@postnewspapers.com.au] wrote:
> I'm wondering if the issue is with strace rather than Pg. That is to say, that strace is trying to print:

Thanks, Craig: I do think that this is a strace issue. As for what Pg is doing: creat() returns -1 on error and a file descriptor otherwise, so the return value appears to indicate success.

> I'm kind of wondering what the Pg backend is actually _doing_ though - if it was using sort temp files you'd expect open()/write()/read()/close() not just creat() calls.

My question exactly: what is the backend doing calling creat() over and over again? Note that this query does complete -- in 30-40 mins.

Brian
[PERFORM] query slow; strace output worrisome
The SELECT shown below has been running for 30+ mins and the strace output is alarming:

[r...@dione ~]# strace -p 10083
Process 10083 attached - interrupt to quit
creat(umovestr: Input/output error 0x2, 0377) = 1025220608
creat(umovestr: Input/output error 0x2, 0377) = 1025220608
creat(umovestr: Input/output error 0x2, 0377) = 1025220608
creat(NULL, 0377) = 216203264
restart_syscall(<... resuming interrupted call ...>) = 8192
creat(umovestr: Input/output error 0x2, 0377) = 1025220608
creat(umovestr: Input/output error 0x2, 0377) = 1025220608
creat(umovestr: Input/output error 0x2, 0377) = 1025220608
creat(umovestr: Input/output error 0x2, 0377) = 1025220608

however, I can find no indication of I/O errors in the postgres log. Anyone have any idea what's going on here?

postgres 8.3.7

[r...@dione ~]# uname -a
Linux dione.ca.com 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Thanks, Brian

top:
  PID USER     PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+  COMMAND
10083 postgres 25  0 2288m 822m 817m R 100.0  1.7 28:08.79 postgres

cemdb=# select procpid,query_start,current_query from pg_stat_activity;
10083 | 2010-04-05 17:18:34.989022-07 | select b.ts_id from ts_stats_transetgroup_usergroup_daily b , ts_stats_transet_user_interval c, ts_transetgroup_transets_map tm, ts_usergroup_users_map um where b.ts_transet_group_id = tm.ts_transet_group_id and tm.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_usergroup_id = um.ts_usergroup_id and um.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= '2010-03-29 21:00' and c.ts_interval_start_time < '2010-03-29 22:00' and b.ts_interval_start_time >= '2010-03-29 00:00' and b.ts_interval_start_time < '2010-03-30 00:00'

cemdb=# explain select b.ts_id from ts_stats_transetgroup_usergroup_daily b , ts_stats_transet_user_interval c, ts_transetgroup_transets_map tm, ts_usergroup_users_map um where b.ts_transet_group_id = tm.ts_transet_group_id and tm.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_usergroup_id = um.ts_usergroup_id and um.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= '2010-03-29 21:00' and c.ts_interval_start_time < '2010-03-29 22:00' and b.ts_interval_start_time >= '2010-03-29 00:00' and b.ts_interval_start_time < '2010-03-30 00:00';

                               QUERY PLAN
---------------------------------------------------------------------------
 Hash Join (cost=1169.95..7046.23 rows=1 width=8)
   Hash Cond: ((b.ts_transet_group_id = tm.ts_transet_group_id) AND (c.ts_transet_incarnation_id = tm.ts_transet_incarnation_id))
   -> Nested Loop (cost=1159.95..7036.15 rows=10 width=24)
        -> Nested Loop (cost=0.00..28.16 rows=6 width=24)
             -> Index Scan using ts_stats_transetgroup_usergroup_daily_starttimeindex on ts_stats_transetgroup_usergroup_daily b (cost=0.00..8.86 rows=1 width=24)
                  Index Cond: ((ts_interval_start_time >= '2010-03-29 00:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2010-03-30 00:00:00-07'::timestamp with time zone))
             -> Index Scan using ts_usergroup_users_map_usergroupindex on ts_usergroup_users_map um (cost=0.00..19.23 rows=6 width=16)
                  Index Cond: (um.ts_usergroup_id = b.ts_usergroup_id)
        -> Bitmap Heap Scan on ts_stats_transet_user_interval c (cost=1159.95..1167.97 rows=2 width=16)
             Recheck Cond: ((c.ts_user_incarnation_id = um.ts_user_incarnation_id) AND (c.ts_interval_start_time >= '2010-03-29 21:00:00-07'::timestamp with time zone) AND (c.ts_interval_start_time < '2010-03-29 22:00:00-07'::timestamp with time zone))
             -> BitmapAnd (cost=1159.95..1159.95 rows=2 width=0)
                  -> Bitmap Index Scan on ts_stats_transet_user_interval_userincarnationidindex (cost=0.00..14.40 rows=438 width=0)
                       Index Cond: (c.ts_user_incarnation_id = um.ts_user_incarnation_id)
                  -> Bitmap Index Scan on ts_stats_transet_user_interval_starttime (cost=0.00..1134.09 rows=44856 width=0)
                       Index Cond: ((c.ts_interval_start_time >= '2010-03-29 21:00:00-07'::timestamp with time zone) AND (c.ts_interval_start_time < '2010-03-29 22:00:00-07'::timestamp with time zone))
   -> Hash (cost=7.00..7.00 rows=200 width=16)
        -> Seq Scan on ts_transetgroup_transets_map tm (cost=0.00..7.00 rows=200 width=16)
(17 rows)
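One way to check whether the backend really is spilling sort/hash data to temp files (which would explain file-creation syscalls) is 8.3's log_temp_files setting; a minimal sketch, assuming superuser access:

    -- 0 logs every temporary file a query creates, with its size
    set log_temp_files = 0;
    -- re-run the query, then look for "temporary file" lines in the postgres log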
Re: [PERFORM] query slow; strace output worrisome
On 04/05/2010 09:53 PM, Craig Ringer [cr...@postnewspapers.com.au] wrote:
> Anything in `dmesg' (command) or /var/log/syslog ?

Nothing out of the ordinary.

Brian
Re: [PERFORM] query has huge variance in execution times
On 03/31/2010 12:37 AM, David Wilson [david.t.wil...@gmail.com] wrote:
> These won't necessarily get the same plan. If you want to see what plan the prepared query is getting, you'll need to prepare it (prepare foo as query) and then explain *that* via explain execute foo. The prepared version likely has a much more generic plan, whereas the regular query gets optimized for the actual values provided.

I didn't know this. Thanks. The plans are indeed different:

cemdb=# prepare sq as select b.ts_id from ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4;

cemdb=# explain execute sq('2010-3-29 01:00', '2010-3-29 02:00', '2010-3-29', '2010-3-30');

                               QUERY PLAN
---------------------------------------------------------------------------
 Merge Join (cost=7885.37..8085.91 rows=30 width=8)
   Merge Cond: ((b.ts_transet_incarnation_id = c.ts_transet_incarnation_id) AND (b.ts_tranunit_id = c.ts_tranunit_id) AND (b.ts_user_incarnation_id = c.ts_user_incarnation_id))
   -> Sort (cost=1711.82..1716.81 rows=3994 width=32)
        Sort Key: b.ts_transet_incarnation_id, b.ts_tranunit_id, b.ts_user_incarnation_id
        -> Index Scan using ts_stats_tranunit_user_daily_starttime on ts_stats_tranunit_user_daily b (cost=0.00..1592.36 rows=3994 width=32)
             Index Cond: ((ts_interval_start_time >= $3) AND (ts_interval_start_time < $4))
   -> Sort (cost=6173.55..6218.65 rows=36085 width=24)
        Sort Key: c.ts_transet_incarnation_id, c.ts_tranunit_id, c.ts_user_incarnation_id
        -> Index Scan using ts_stats_tranunit_user_interval_starttime on ts_stats_tranunit_user_interval c (cost=0.00..4807.81 rows=36085 width=24)
             Index Cond: ((ts_interval_start_time >= $1) AND (ts_interval_start_time < $2))
(10 rows)

cemdb=# explain select b.ts_id from ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= '2010-3-29 01:00' and c.ts_interval_start_time < '2010-3-29 02:00' and b.ts_interval_start_time >= '2010-3-29' and b.ts_interval_start_time < '2010-3-30';

                               QUERY PLAN
---------------------------------------------------------------------------
 Hash Join (cost=291965.90..335021.46 rows=13146 width=8)
   Hash Cond: ((c.ts_transet_incarnation_id = b.ts_transet_incarnation_id) AND (c.ts_tranunit_id = b.ts_tranunit_id) AND (c.ts_user_incarnation_id = b.ts_user_incarnation_id))
   -> Index Scan using ts_stats_tranunit_user_interval_starttime on ts_stats_tranunit_user_interval c (cost=0.00..11783.36 rows=88529 width=24)
        Index Cond: ((ts_interval_start_time >= '2010-03-29 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2010-03-29 02:00:00-07'::timestamp with time zone))
   -> Hash (cost=285681.32..285681.32 rows=718238 width=32)
        -> Index Scan using ts_stats_tranunit_user_daily_starttime on ts_stats_tranunit_user_daily b (cost=0.00..285681.32 rows=718238 width=32)
             Index Cond: ((ts_interval_start_time >= '2010-03-29 00:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2010-03-30 00:00:00-07'::timestamp with time zone))
(7 rows)

I notice that the row estimates are substantially different; is this due to the lack of actual values?
But, this prepared query runs in ~4 secs:

[r...@rdl64xeoserv01 log]# cat /tmp/select.sql
prepare sq as select b.ts_id from ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4;
execute sq('2010-3-29 01:00', '2010-3-29 02:00', '2010-3-29', '2010-3-30');

[r...@rdl64xeoserv01 log]# time PGPASSWORD=quality psql -U postgres -d cemdb -f /tmp/select.sql > /tmp/select1.txt 2>&1

real    0m4.131s
user    0m0.119s
sys     0m0.007s

so the question still remains: why did it take 20 mins? To see if it was due to autovacuum running ANALYZE, I turned off autovacuum, created a table using SELECT * INTO temp FROM ts_stats_tranunit_user_daily, added the index on ts_interval_start_time and ran the prepared
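A minimal sketch of that test setup, assuming a scratch copy named temp as described above (the index name is my own):

    -- autovacuum = off in postgresql.conf, then reload the server
    select * into temp from ts_stats_tranunit_user_daily;
    create index temp_starttime on temp (ts_interval_start_time);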
[PERFORM] query has huge variance in execution times
postgres 8.3.5 on RHEL4 update 6

This query starts executing at 18:41:

cemdb=> select query_start,current_query from pg_stat_activity where procpid=10022;
          query_start          | current_query
-------------------------------+---------------
 2010-03-30 18:41:11.685261-07 | select b.ts_id from ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4
(1 row)

about 5 mins later, I, suspecting problems, do (the values are the same as for $1 et al above; EXPLAIN was done on purpose to keep stats [hopefully] the same as when pid 10022 started; there are 80,000 rows in each of the 2 tables at the time of this EXPLAIN and when 10022 started):

cemdb=> explain select b.ts_id from ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= '2010-3-29 01:00' and c.ts_interval_start_time < '2010-3-29 02:00' and b.ts_interval_start_time >= '2010-3-29' and b.ts_interval_start_time < '2010-3-30';

                               QUERY PLAN
---------------------------------------------------------------------------
 Merge Join (cost=33574.89..34369.38 rows=25207 width=8)
   Merge Cond: ((b.ts_transet_incarnation_id = c.ts_transet_incarnation_id) AND (b.ts_tranunit_id = c.ts_tranunit_id) AND (b.ts_user_incarnation_id = c.ts_user_incarnation_id))
   -> Sort (cost=13756.68..13854.96 rows=78623 width=32)
        Sort Key: b.ts_transet_incarnation_id, b.ts_tranunit_id, b.ts_user_incarnation_id
        -> Index Scan using ts_stats_tranunit_user_daily_starttime on ts_stats_tranunit_user_daily b (cost=0.00..10560.13 rows=78623 width=32)
             Index Cond: ((ts_interval_start_time >= '2010-03-29 00:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2010-03-30 00:00:00-07'::timestamp with time zone))
   -> Sort (cost=19818.21..19959.72 rows=113207 width=24)
        Sort Key: c.ts_transet_incarnation_id, c.ts_tranunit_id, c.ts_user_incarnation_id
        -> Index Scan using ts_stats_tranunit_user_interval_starttime on ts_stats_tranunit_user_interval c (cost=0.00..15066.74 rows=113207 width=24)
             Index Cond: ((ts_interval_start_time >= '2010-03-29 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2010-03-29 02:00:00-07'::timestamp with time zone))
(10 rows)

cemdb=> \q

I then run the query manually:

[r...@rdl64xeoserv01 log]# time PGPASSWORD=quality psql -U admin -d cemdb -c "select b.ts_id from ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= '2010-3-29 01:00' and c.ts_interval_start_time < '2010-3-29 02:00' and b.ts_interval_start_time >= '2010-3-29' and b.ts_interval_start_time < '2010-3-30'" > /tmp/select.txt 2>&1

real    0m0.813s
user    0m0.116s
sys     0m0.013s

I let process 10022 run for an hour. an strace shows lots of I/O:

[r...@rdl64xeoserv01 log]# strace -p 10022
read(18, "\214\2\0\0\374\200#\1\0\0\0\0P\3\0 \4 \0\0\0\0\320\234"..., 8192) = 8192
semop(73007122, 0xbfe0fc20, 1) = 0
_llseek(18, 538451968, [538451968], SEEK_SET) = 0
read(18, "\214\2\0\0\274\347\t#\1\0\0\0\0P\3\0 \4 \0\0\0\0\320\234"..., 8192) = 8192
_llseek(18, 535928832, [535928832], SEEK_SET) = 0
read(18, "\214\2\0\0\310\300\226\\1\0\0\0\0P\3\0 \4 \0\0\0\0\320"..., 8192) = 8192
_llseek(18, 532398080, [532398080], SEEK_SET) = 0
... many more similar lines ...

I then kill 10022 and the application retries the same query:

[10022-cemdb-admin-2010-03-30 19:02:37.460 PDT]FATAL: terminating connection due to administrator command
[10022-cemdb-admin-2010-03-30 19:02:37.460 PDT]STATEMENT: select b.ts_id from ts_stats_tranunit_user_daily b, ts_stats_tranunit_user_interval c where b.ts_transet_incarnation_id = c.ts_transet_incarnation_id and b.ts_tranunit_id = c.ts_tranunit_id and b.ts_user_incarnation_id = c.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and
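As an aside to the kill step above: a runaway query can also be cancelled from SQL without terminating the whole connection; a sketch, using the pid from this thread:

    -- sends SIGINT to backend 10022, cancelling only its current query
    select pg_cancel_backend(10022);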
Re: [PERFORM] query looping?
On 01/05/2010 08:34 PM, Robert Haas [robertmh...@gmail.com] wrote:
> - If you have other queries where this index helps (even though it is hurting this one), then you're going to have to find a way to execute the query without using bound parameters - i.e. with the actual values in there instead of $1 through $4. That will allow the planner to see that the index scan is a loser because it will see that there are a lot of rows in the specified range of ts_interval_start_times.

I think that this is possible without too much work.

FYI - this test is still running, and the same query has been executed at least 2 more times (it gets done 1-24 times per day) since the run that took 124M ms, with acceptable response times (several secs). I don't see how either of the 2 query plans posted could've taken that long (and the actual execution times I posted confirm this), so I'm assuming that some other plan was used for the 124M-ms execution. Seems like it must have been some NxM plan. Do you think that autovacuuming more frequently would prevent the query planner from making this poor choice?

Brian
Re: [PERFORM] query looping?
> SELECT SUM(1) FROM ts_stats_transetgroup_user_weekly b WHERE ts_interval_start_time >= [value] AND ts_interval_start_time < [value]; ...and similarly for the bitmap index scan.

cemdb=> SELECT SUM(1) FROM ts_stats_transetgroup_user_weekly b WHERE ts_interval_start_time >= '2009-12-28' AND ts_interval_start_time < '2010-01-04';
  sum
-------
 89758
(1 row)

cemdb=> select sum(1) from ts_stats_transet_user_interval where ts_interval_start_time >= '2009-01-03' and ts_interval_start_time < '2009-01-03 08:00';
 sum
-----

(1 row)

cemdb=> select sum(1) from ts_stats_transet_user_interval where ts_interval_start_time >= '2010-01-03' and ts_interval_start_time < '2010-01-03 08:00';
 sum
-----
  80
(1 row)

the estimates in the 1st query plan are OK (since they are the same). The 2nd, however, look to be too low.

FYI, this query finally completed, so it wasn't looping, but the query plan is very poor:

[24585-cemdb-admin-2010-01-05 10:54:49.511 PST]LOG: duration: 124676746.863 ms execute <unnamed>: select count(distinct b.ts_id) from ts_stats_transetgroup_user_weekly b, ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4
[24585-cemdb-admin-2010-01-05 10:54:49.511 PST]DETAIL: parameters: $1 = '2010-01-03 00:00:00-08', $2 = '2010-01-03 08:00:00-08', $3 = '2010-01-01 00:00:00-08', $4 = '2010-01-04 00:00:00-08'

compare to:

[r...@rdl64xeoserv01 log]# time PGPASSWORD= psql -U admin -d cemdb -c "select count(distinct b.ts_id) from ts_stats_transetgroup_user_weekly b, ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2010-01-03 00:00' and c.ts_interval_start_time < '2010-01-03 08:00' and b.ts_interval_start_time >= '2009-12-28 00:00' and b.ts_interval_start_time < '2010-01-04 00:00'"
 count
-------
 89758
(1 row)

real    0m3.804s
user    0m0.001s
sys     0m0.003s

so why was the former roughly 33,000 times slower (124,676,746 ms is ~34.6 hours, vs. ~3.8 seconds here)?

Thanks, Brian
[PERFORM] query looping?
The query shown below [select count(distinct...] seems to be looping (99-101% CPU as shown by top for 11+ hours). This is using postgres 8.3.5 on a dual quad core machine (Intel(R) Xeon(R) CPU X5460 @ 3.16GHz) with 32G RAM. Can I provide any other info to help investigate this issue? Or any thoughts on how to prevent/avoid it?

Thanks, Brian

top - 11:03:39 up 91 days, 22:39, 2 users, load average: 3.73, 2.14, 1.42
Tasks: 135 total, 3 running, 132 sleeping, 0 stopped, 0 zombie
Cpu(s): 27.3% us, 7.7% sy, 0.0% ni, 54.0% id, 11.0% wa, 0.0% hi, 0.0% si
Mem: 33264272k total, 33247780k used, 16492k free, 17232k buffers
Swap: 4088532k total, 334264k used, 3754268k free, 26760304k cached

  PID USER     PR NI VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
24585 postgres 17  0 572m 494m 484m R   99  1.5 646:13.63 postmaster

cemdb=# select procpid,query_start,current_query from pg_stat_activity;
 procpid |          query_start          | current_query
---------+-------------------------------+---------------
   13210 | 2010-01-04 10:54:04.490107-08 | <IDLE>
   24496 | 2010-01-04 10:56:14.982997-08 | <IDLE>
   30636 | 2010-01-04 10:54:04.488569-08 | <IDLE>
    5309 | 2010-01-04 10:56:22.850159-08 | select procpid,query_start,current_query from pg_stat_activity;
   30637 | 2010-01-04 10:54:04.490152-08 | <IDLE>
   24500 | 2010-01-04 10:56:14.98354-08  | <IDLE>
   13213 | 2010-01-04 10:54:04.491743-08 | <IDLE>
   13214 | 2010-01-04 10:56:04.197887-08 | <IDLE>
   24499 | 2010-01-04 10:56:14.981709-08 | <IDLE>
   24502 | 2010-01-04 10:56:14.985027-08 | <IDLE>
   13217 | 2010-01-04 10:54:04.487039-08 | <IDLE>
   24504 | 2010-01-04 10:56:14.984631-08 | <IDLE>
   24505 | 2010-01-04 10:56:14.985248-08 | <IDLE>
   27938 | 2010-01-04 10:54:04.485782-08 | <IDLE>
    1104 | 2010-01-04 10:54:04.489896-08 | <IDLE>
   27941 | 2010-01-04 10:54:04.491925-08 | <IDLE>
   24585 | 2010-01-04 00:16:52.764899-08 | select count(distinct b.ts_id) from ts_stats_transetgroup_user_weekly b, ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4
   24586 | 2010-01-04 10:56:14.986201-08 | <IDLE>
   13224 | 2010-01-04 10:54:04.487762-08 | <IDLE>
   24591 | 2010-01-04 10:56:14.98333-08  | <IDLE>
   24592 | 2010-01-04 10:56:14.98157-08  | <IDLE>
(21 rows)

cemdb=# select locktype,database,relation,virtualtransaction,mode,granted from pg_locks where pid=24585;
  locktype  | database | relation | virtualtransaction |      mode       | granted
------------+----------+----------+--------------------+-----------------+---------
 relation   |    74755 |    76064 | 23/1037332         | AccessShareLock | t
 virtualxid |          |          | 23/1037332         | ExclusiveLock   | t
 relation   |    74755 |    75245 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76062 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76188 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    74822 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76187 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76186 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76189 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76057 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    75846 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76063 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76058 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76185 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    75874 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76061 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76191 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76059 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76363 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76364 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76192 | 23/1037332         | AccessShareLock | t
 relation   |    74755 |    76362 | 23/1037332         | AccessShareLock | t
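On the prevent/avoid question: one blunt guard, if it fits the application, is a statement timeout so a runaway plan errors out instead of running for days; a sketch (the 15-minute value is arbitrary):

    -- per session, or in postgresql.conf to apply everywhere
    set statement_timeout = '15min';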
Re: [PERFORM] query looping?
On 01/04/2010 04:53 PM, Robert Haas [robertmh...@gmail.com] wrote:
> PREPARE foo AS <the query, with the $x entries still in there>
> EXPLAIN EXECUTE foo(<the values>);

Thanks for the response. Results below.

Brian

cemdb=> prepare foo as select count(distinct b.ts_id) from ts_stats_transetgroup_user_weekly b, ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= $1 and c.ts_interval_start_time < $2 and b.ts_interval_start_time >= $3 and b.ts_interval_start_time < $4;
PREPARE
cemdb=> explain execute foo('2010-01-03 00:00','2010-01-03 08:00','2009-12-28 00:00','2010-01-04 00:00');

                               QUERY PLAN
---------------------------------------------------------------------------
 Aggregate (cost=325382.51..325382.51 rows=1 width=8)
   -> Hash Join (cost=3486.00..325382.00 rows=406 width=8)
        Hash Cond: ((b.ts_transet_group_id = m.ts_transet_group_id) AND (c.ts_transet_incarnation_id = m.ts_transet_incarnation_id))
        -> Hash Join (cost=3484.17..325370.84 rows=1944 width=24)
             Hash Cond: (c.ts_user_incarnation_id = b.ts_user_incarnation_id)
             -> Bitmap Heap Scan on ts_stats_transet_user_interval c (cost=2177.34..322486.61 rows=96473 width=16)
                  Recheck Cond: ((ts_interval_start_time >= $1) AND (ts_interval_start_time < $2))
                  -> Bitmap Index Scan on ts_stats_transet_user_interval_starttime (cost=0.00..2165.28 rows=96473 width=0)
                       Index Cond: ((ts_interval_start_time >= $1) AND (ts_interval_start_time < $2))
             -> Hash (cost=1301.21..1301.21 rows=898 width=24)
                  -> Index Scan using ts_stats_transetgroup_user_weekly_starttimeindex on ts_stats_transetgroup_user_weekly b (cost=0.00..1301.21 rows=898 width=24)
                       Index Cond: ((ts_interval_start_time >= $3) AND (ts_interval_start_time < $4))
        -> Hash (cost=1.33..1.33 rows=67 width=16)
             -> Seq Scan on ts_transetgroup_transets_map m (cost=0.00..1.33 rows=67 width=16)
(14 rows)
Re: [PERFORM] Planner question - bit data types
pgsql-performance-ow...@postgresql.org wrote:
> On Sat, Sep 5, 2009 at 8:19 PM, Karl Denninger <k...@denninger.net> wrote:
>> There was a previous thread and I referenced it. I don't have the other one in my email system any more to follow up to it. I give up; the attack-dog crowd has successfully driven me off. Ciao.
>
> Perhaps I'm biased by knowing some of the people involved, but I don't think anyone on this thread has been anything but polite.

I use several online forums and this -- hands down -- is the best: not only for politeness, even when the information I provided was misleading or the question I asked was, in retrospect, "Duh?", but also for 1) speed of response, 2) breadth of ideas and 3) accuracy of information -- often on complex issues with no simple solution, from folk who probably have more to do than sit around waiting for the next post. My thanks to the knowledgeable people on this forum.

Brian
Re: [PERFORM] autovacuum hung?
Tom Lane [...@sss.pgh.pa.us] wrote:
> Oh, and don't forget the more-complete pg_locks state. We'll want all the columns of pg_locks, not just the ones you showed before.

auto vacuum of ts_user_sessions_map has been running for 17 hours. This table has 2,204,488 rows. I hope that I've captured enough info.

Thanks, Brian

cemdb=# select procpid,current_query,query_start from pg_stat_activity;
 procpid | current_query                                   | query_start
---------+-------------------------------------------------+-------------------------------
    8817 | <IDLE>                                          | 2009-07-09 16:48:12.656419-07
    8818 | autovacuum: VACUUM public.ts_user_sessions_map  | 2009-07-09 16:48:18.873677-07

cemdb=# select l.pid,c.relname,l.mode,l.granted,l.virtualxid,l.virtualtransaction from pg_locks l left outer join pg_class c on c.oid=l.relation order by l.pid;
  pid  | relname                                    | mode                     | granted | virtualxid | virtualtransaction
-------+--------------------------------------------+--------------------------+---------+------------+--------------------
  8818 | ts_user_sessions_map_interimsessionidindex | RowExclusiveLock         | t       |            | 2/3
  8818 | ts_user_sessions_map_appindex              | RowExclusiveLock         | t       |            | 2/3
  8818 | ts_user_sessions_map_sessionidindex        | RowExclusiveLock         | t       |            | 2/3
  8818 | ts_user_sessions_map                       | ShareUpdateExclusiveLock | t       |            | 2/3
  8818 |                                            | ExclusiveLock            | t       | 2/3        | 2/3
  8818 | ts_user_sessions_map_pkey                  | RowExclusiveLock         | t       |            | 2/3
 13706 |                                            | ExclusiveLock            | t       | 6/50       | 6/50
 13706 | pg_class_oid_index                         | AccessShareLock          | t       |            | 6/50
 13706 | pg_class_relname_nsp_index                 | AccessShareLock          | t       |            | 6/50
 13706 | pg_locks                                   | AccessShareLock          | t       |            | 6/50
 13706 | pg_class                                   | AccessShareLock          | t       |            | 6/50
(11 rows)

[r...@rdl64xeoserv01 log]# strace -p 8818 -o /tmp/strace.log
Process 8818 attached - interrupt to quit
Process 8818 detached
[r...@rdl64xeoserv01 log]# more /tmp/strace.log
select(0, NULL, NULL, NULL, {0, 13000}) = 0 (Timeout)
read(36, "`\0\0\0\370\354\250u\1\0\0\0\34\0\264\37\360\37\4 \0\0"..., 8192) = 8192
read(36, "`\0\0\0\340\f\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\300,\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0(L\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0\264"..., 8192) = 8192
read(36, "`\0\0\0|M\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0\264"..., 8192) = 8192
read(36, "`\0\0\0\\~\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0\264"..., 8192) = 8192
read(36, "`\0\0\0D\234\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\34\255\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\4\315\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\234\2334x\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\354\354\251u\1\0\0\0\34\0\264\37\360\37\4 \0\0"..., 8192) = 8192
read(36, "`\0\0\0\324\f\252u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\274,\252u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\244L\252u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0008^\252u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0,\233\252u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\370\330\252u\1\0\0\0\34\0\264\37\360\37\4 \0\0"..., 8192) = 8192
read(36, "`\0\0\\371\252u\1\0\0\0\34\0\270\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\364\30\253u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\2448\253u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
select(0, NULL, NULL, NULL, {0, 2}) = 0 (Timeout)
read(36, "`\0\0\0dX\253u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0\264"..., 8192) = 8192
read(36, "`\0\0\0X\216\253u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\10\256\253u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\300\315\253u\1\0\0\0\34\0\264\37\360\37\4 \0\0"..., 8192) = 8192
read(36, "`\0\0\0\304\f\254u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\354=\254u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0\254]\254u\1\0\0\0\34\0\264\37\360\37\4 \0\0\0\0"..., 8192) = 8192
read(36, "`\0\0\0d}\254u\1\0\0\0\34\0\270\37\360\37\4 \0\0\0\0\270"..., 8192) = 8192
read(36,
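Those select(0, ...) timeout calls interleaved with the reads look consistent with cost-based vacuum delay sleeps (an observation, not a diagnosis from the thread). The effective throttle settings can be checked with:

    select name, setting, source
    from pg_settings
    where name like '%vacuum_cost%';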
Re: [PERFORM] select max() much slower than select min()
Tom Lane [...@sss.pgh.pa.us] wrote:
> Some experimentation suggests that it might help to provide a 2-column index on (ts_id, ts_interval_start_time). This is still going to be scanned in order by ts_id, but it will be possible to check the ts_interval_start_time condition in the index, eliminating a large number of useless trips to the heap. Whether this type of query is important enough to justify maintaining an extra index for is something you'll have to decide for yourself...

Thanks to all for the analysis and suggestions. Since the number of rows in an hour is ~500,000, brute force looks to be a fast solution:

select ts_id from ... where ts_interval_start_time >= ... and ...

This query runs very fast, as does a single pass through the ids to find the min and max.

Brian
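For reference, a sketch of the 2-column index Tom describes (the index name here is my own):

    create index ts_stats_transet_user_interval_id_time
        on ts_stats_transet_user_interval (ts_id, ts_interval_start_time);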
Re: [PERFORM] select max() much slower than select min()
David Rees [dree...@gmail.com] wrote:
> Along those lines, couldn't you just have the DB do the work?
> select max(ts_id), min(ts_id) from ... where ts_interval_start_time >= ... and ...
> Then you don't have to transfer 500k ids across the network...

I guess you didn't read the entire thread: I started it because the query you suggest took 15 mins to complete.

Brian
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote:
> Yup. So according to those stats, all ts_id values fall in the range 61 .. 6000250068. It's no wonder it's not expecting to find anything between 0 and 10. I think maybe you forgot to re-analyze after loading data ... although this being 8.3, I'd have expected autovacuum to update the stats at some point ...

yes, this is a concern. I may have to do the vacuum analyze myself or learn how to make autovacuum run more frequently.

> Recommendation: re-ANALYZE, check that the plan changes to something with a higher estimate for the number of rows for this table, and then abort and restart those processes. Lord knows how long you'll be waiting for them to finish with their current plans :-(

these queries are still running now 27.5 hours later... These queries are generated by some java code and in putting it into a test program so I could capture the queries, I failed to get the id range correct -- sorry for wasting your time with bogus data. Below is the EXPLAIN output from the 4 correct queries. I can't tell which one is being executed by PID 7397, but the query plans, except the last, do look very similar. In any event, as I mentioned, all 4 are still running.

Thanks, Brian

cemdb=# explain select * from ts_stats_transetgroup_user_daily a where a.ts_id in (select b.ts_id from ts_stats_transetgroup_user_daily b,ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2009-6-16 01:00' and c.ts_interval_start_time < '2009-6-16 02:00') and a.ts_id > 61 and a.ts_id < 62 order by a.ts_id;

                               QUERY PLAN
---------------------------------------------------------------------------
 Sort (cost=138722.75..138734.37 rows=9299 width=779)
   Sort Key: a.ts_id
   -> Hash IN Join (cost=131710.94..138416.28 rows=9299 width=779)
        Hash Cond: (a.ts_id = b.ts_id)
        -> Index Scan using ts_stats_transetgroup_user_daily_pkey on ts_stats_transetgroup_user_daily a (cost=0.00..6602.21 rows=9299 width=779)
             Index Cond: ((ts_id > 61::bigint) AND (ts_id < 62::bigint))
        -> Hash (cost=130113.34..130113.34 rows=255616 width=8)
             -> Hash Join (cost=82370.45..130113.34 rows=255616 width=8)
                  Hash Cond: ((m.ts_transet_group_id = b.ts_transet_group_id) AND (c.ts_user_incarnation_id = b.ts_user_incarnation_id))
                  -> Hash Join (cost=3.32..29255.47 rows=229502 width=16)
                       Hash Cond: (c.ts_transet_incarnation_id = m.ts_transet_incarnation_id)
                       -> Index Scan using ts_stats_transet_user_interval_starttime on ts_stats_transet_user_interval c (cost=0.00..27674.33 rows=229502 width=16)
                            Index Cond: ((ts_interval_start_time >= '2009-06-16 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2009-06-16 02:00:00-07'::timestamp with time zone))
                       -> Hash (cost=2.58..2.58 rows=117 width=16)
                            -> Seq Scan on ts_transetgroup_transets_map m (cost=0.00..2.58 rows=117 width=16)
                  -> Hash (cost=80511.26..80511.26 rows=247451 width=24)
                       -> Seq Scan on ts_stats_transetgroup_user_daily b (cost=0.00..80511.26 rows=247451 width=24)
(17 rows)

cemdb=# explain select * from ts_stats_transetgroup_user_daily a where a.ts_id in (select b.ts_id from ts_stats_transetgroup_user_daily b,ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2009-6-16 01:00' and c.ts_interval_start_time < '2009-6-16 02:00') and a.ts_id > 62 and a.ts_id < 63 order by a.ts_id;

                               QUERY PLAN
---------------------------------------------------------------------------
 Sort (cost=139430.64..139443.43 rows=10237 width=779)
   Sort Key: a.ts_id
   -> Hash IN Join (cost=131710.94..139089.71 rows=10237 width=779)
        Hash Cond: (a.ts_id = b.ts_id)
        -> Index Scan using ts_stats_transetgroup_user_daily_pkey on ts_stats_transetgroup_user_daily a (cost=0.00..7265.23 rows=10237 width=779)
             Index Cond: ((ts_id > 62::bigint) AND (ts_id
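A sketch of the re-ANALYZE step Tom recommends, followed by the check that the estimate moved:

    analyze verbose ts_stats_transetgroup_user_daily;
    -- then re-run the EXPLAINs above and compare the estimated row counts
    -- on the pkey index scan against the 9299/10237 figures shown here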
Re: [PERFORM] very slow selects on a small table
Grzegorz Jakiewicz [gryz...@gmail.com] wrote:
> this might be quite a bogus question, just a hint - but what is your work_mem set to? Guys, isn't postgresql giving huge cost, when it can't sort in memory?

work_mem = 64MB
[PERFORM] select max() much slower than select min()
ts_stats_transet_user_interval has ~48M rows. ts_id is the PK and there is an index on ts_interval_start_time. I reindexed it and ran vacuum analyze. Only SELECTs have been done since these operations.

cemdb=# explain select min(ts_id) from ts_stats_transet_user_interval a where 0=0 and a.ts_interval_start_time >= '2009-6-16 01:00' and a.ts_interval_start_time < '2009-6-16 02:00';

                               QUERY PLAN
---------------------------------------------------------------------------
 Result (cost=12.19..12.20 rows=1 width=0)
   InitPlan
     -> Limit (cost=0.00..12.19 rows=1 width=8)
          -> Index Scan using ts_stats_transet_user_interval_pkey on ts_stats_transet_user_interval a (cost=0.00..5496152.30 rows=450799 width=8)
               Filter: ((ts_id IS NOT NULL) AND (ts_interval_start_time >= '2009-06-16 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2009-06-16 02:00:00-07'::timestamp with time zone))
(5 rows)

cemdb=# explain select max(ts_id) from ts_stats_transet_user_interval a where 0=0 and a.ts_interval_start_time >= '2009-6-16 01:00' and a.ts_interval_start_time < '2009-6-16 02:00';

                               QUERY PLAN
---------------------------------------------------------------------------
 Result (cost=12.19..12.20 rows=1 width=0)
   InitPlan
     -> Limit (cost=0.00..12.19 rows=1 width=8)
          -> Index Scan Backward using ts_stats_transet_user_interval_pkey on ts_stats_transet_user_interval a (cost=0.00..5496152.30 rows=450799 width=8)
               Filter: ((ts_id IS NOT NULL) AND (ts_interval_start_time >= '2009-06-16 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2009-06-16 02:00:00-07'::timestamp with time zone))
(5 rows)

[r...@rdl64xeoserv01 log]# time PGPASSWORD=quality psql -U admin -d cemdb -c "select min(ts_id) from ts_stats_transet_user_interval a where a.ts_interval_start_time >= '2009-6-16 01:00' and a.ts_interval_start_time < '2009-6-16 02:00'"
  min
--------
 603210
(1 row)

real    1m32.025s
user    0m0.000s
sys     0m0.003s

[r...@rdl64xeoserv01 log]# time PGPASSWORD=quality psql -U admin -d cemdb -c "select max(ts_id) from ts_stats_transet_user_interval a where a.ts_interval_start_time >= '2009-6-16 01:00' and a.ts_interval_start_time < '2009-6-16 02:00'"
  max
--------
 603239
(1 row)

real    16m39.412s
user    0m0.002s
sys     0m0.002s

seems like max() shouldn't take any longer than min() and certainly not 10 times as long. Any ideas on how to determine the max more quickly?

Thanks, Brian
Re: [PERFORM] select max() much slower than select min()
Kevin Grittner [kevin.gritt...@wicourts.gov] wrote:
> Is there any correlation between ts_id and ts_interval_start_time?

only vaguely: increasing ts_interval_start_time implies increasing ts_id, but there may be many rows (100,000's) with the same ts_interval_start_time

> Perhaps if you tried min and max with different time ranges it would find a row on a backward scan faster. It'll take ten times as long if it has to scan through ten times as many rows to find a match.

it looks like there are fewer rows backwards than forwards:

cemdb=> select count(*) from ts_stats_transet_user_interval where ts_interval_start_time < '2009-6-16 01:00';
 count
-------
  3210
(1 row)

cemdb=> select count(*) from ts_stats_transet_user_interval where ts_interval_start_time >= '2009-6-16 02:00';
 count
-------
  1350
(1 row)

> I don't suppose you have an index on ts_interval_start_time?

there is an index. I mentioned this in my original posting.

Thanks, Brian
[PERFORM] very slow selects on a small table
There are 4 threads (4 postgres processes) loading all rows from a table with 50,018 rows. The table has an int8 PK that is incremented by 1 for each new row, and the PK is used by the threads to partition the rows so that each loads distinct rows. As you can see below, these 4 SELECTs have been running since 6:30am (it's now 11:30am) -- sluggish at best -- and each of the postgres processes is using 100% CPU. The table schema is long (~160 cols), so I'm omitting it but will provide it if deemed necessary. Any ideas about the cause of this are appreciated.

Thanks, Brian

cemdb=# select procpid, xact_start, current_query from pg_stat_activity;
 procpid |          xact_start           | current_query
---------+-------------------------------+---------------
   27825 |                               | <IDLE>
   27826 |                               | <IDLE>
   27824 |                               | <IDLE>
   27828 |                               | <IDLE>
   27827 |                               | <IDLE>
   27829 |                               | <IDLE>
   27830 |                               | <IDLE>
   27831 |                               | <IDLE>
   27832 |                               | <IDLE>
   27833 |                               | <IDLE>
   14031 | 2009-06-17 05:48:02.931503-07 | autovacuum: VACUUM ANALYZE public.ts_stats_transet_user_weekly
   16044 |                               | <IDLE>
   32169 | 2009-06-17 08:17:39.034142-07 | autovacuum: VACUUM ANALYZE public.ts_stats_transetgroup_user_weekly
    7165 |                               | <IDLE>
   16043 |                               | <IDLE>
   22130 | 2009-06-17 11:22:05.339582-07 | select procpid, xact_start, current_query from pg_stat_activity;
    7169 | 2009-06-17 06:31:26.997641-07 | select a.ts_id as id1_0_, a.version_info as versionI2_1_0_, a.ts_user_id as userId1_0_, null as tranSetId1_0_, null as tranUnitId1_0_, null as userGro10_1_0_, a.ts_transet_group_id as tranSetG7_1_0_, a.ts_last_aggregated_row as lastAgg12_1_0_, null as tranSetI5_1_0_, a.ts_user_incarnation_id as userInca9_1_0_, a.ts_interval_start_time as interva11_1_0_, a.ts_interval_wire_time as interva13_1_0_, a.ts_total_transactions as totalTr14_1_0_, a.ts_bad_transactions as badTran15_1_0_, a.ts_opportunities as opportu16_1_0_, a.ts_defects as defects1_0_, a.ts_data_type as dataType1_0_, a.ts_tth_type as tthType1_0_, a.ts_tth_lower_spec as tthLowe20_1_0_, a.ts_tth_upper_spec as tthUppe21_1_0_, a.ts_tth_b0 as tthB22_1_0_, a.ts_tth_b1 as tthB23_1_0_, a.ts_tth_b2 as tthB24_1_0_, a.ts_tth_b3 as tthB25_1_0_, a.ts_tth_b4 as tthB26_1_0_, a.ts_tth_b5 as tthB27_1_0_, a.ts_tth_b6 as tthB28_1_0_, a.ts_tth_b7 as tthB29_1_0_, a.ts_tth_b8 as tthB30_1_0_, a.ts_tth_b9 as tthB31_1_0_, a.ts_tth_b10 as tthB32_1_0_, a.ts_tth_b11 as tthB33_1
    7171 |                               | <IDLE>
    7172 |                               | <IDLE>
   28106 |                               | <IDLE>
    7392 | 2009-06-17 06:31:26.997985-07 | select a.ts_id as id1_0_, a.version_info as versionI2_1_0_, a.ts_user_id as userId1_0_, null as tranSetId1_0_, null as tranUnitId1_0_, null as userGro10_1_0_, a.ts_transet_group_id as tranSetG7_1_0_, a.ts_last_aggregated_row as lastAgg12_1_0_, null as tranSetI5_1_0_, a.ts_user_incarnation_id as userInca9_1_0_, a.ts_interval_start_time as interva11_1_0_, a.ts_interval_wire_time as interva13_1_0_, a.ts_total_transactions as totalTr14_1_0_, a.ts_bad_transactions as badTran15_1_0_, a.ts_opportunities as opportu16_1_0_, a.ts_defects as defects1_0_, a.ts_data_type as dataType1_0_, a.ts_tth_type as tthType1_0_, a.ts_tth_lower_spec as tthLowe20_1_0_, a.ts_tth_upper_spec as tthUppe21_1_0_, a.ts_tth_b0 as tthB22_1_0_, a.ts_tth_b1 as tthB23_1_0_, a.ts_tth_b2 as tthB24_1_0_, a.ts_tth_b3 as tthB25_1_0_, a.ts_tth_b4 as tthB26_1_0_, a.ts_tth_b5 as tthB27_1_0_, a.ts_tth_b6 as tthB28_1_0_, a.ts_tth_b7 as tthB29_1_0_, a.ts_tth_b8 as tthB30_1_0_,
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote:
> That is a pretty odd trace for a Postgres backend; apparently it's repeatedly acquiring and releasing a meg or two worth of memory, which is not very normal within a single query. Can you tell us more about the query it's running? An EXPLAIN plan would be particularly interesting. Also, could you try to determine which files 144 and 65 refer to (see lsof)?

Here's the explain and a current strace and lsof. The strace shows even less I/O activity.

Thanks, Brian

cemdb=# explain select * from ts_stats_transetgroup_user_daily a where a.ts_id in (select b.ts_id from ts_stats_transetgroup_user_daily b,ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2009-6-16 01:00' and c.ts_interval_start_time < '2009-6-16 02:00');

                               QUERY PLAN
---------------------------------------------------------------------------
 Hash IN Join (cost=128162.63..211221.43 rows=231127 width=779)
   Hash Cond: (a.ts_id = b.ts_id)
   -> Seq Scan on ts_stats_transetgroup_user_daily a (cost=0.00..80511.26 rows=247451 width=779)
   -> Hash (cost=126718.09..126718.09 rows=231127 width=8)
        -> Hash Join (cost=82370.45..126718.09 rows=231127 width=8)
             Hash Cond: ((m.ts_transet_group_id = b.ts_transet_group_id) AND (c.ts_user_incarnation_id = b.ts_user_incarnation_id))
             -> Hash Join (cost=3.32..27316.61 rows=211716 width=16)
                  Hash Cond: (c.ts_transet_incarnation_id = m.ts_transet_incarnation_id)
                  -> Index Scan using ts_stats_transet_user_interval_starttime on ts_stats_transet_user_interval c (cost=0.00..25857.75 rows=211716 width=16)
                       Index Cond: ((ts_interval_start_time >= '2009-06-16 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2009-06-16 02:00:00-07'::timestamp with time zone))
                  -> Hash (cost=2.58..2.58 rows=117 width=16)
                       -> Seq Scan on ts_transetgroup_transets_map m (cost=0.00..2.58 rows=117 width=16)
             -> Hash (cost=80511.26..80511.26 rows=247451 width=24)
                  -> Seq Scan on ts_stats_transetgroup_user_daily b (cost=0.00..80511.26 rows=247451 width=24)
(14 rows)

[r...@rdl64xeoserv01 log]# strace -p 7397
Process 7397 attached - interrupt to quit
mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000
mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000
mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000
munmap(0x95393000, 1052672) = 0
munmap(0x95494000, 528384) = 0
munmap(0x95515000, 266240) = 0
brk(0x8603000) = 0x8603000
brk(0x85fb000) = 0x85fb000
_llseek(164, 0, [201940992], SEEK_END) = 0
brk(0x85eb000) = 0x85eb000
brk(0x8613000) = 0x8613000
mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000
mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000
mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000
munmap(0x95393000, 1052672) = 0
munmap(0x95494000, 528384) = 0
munmap(0x95515000, 266240) = 0
brk(0x8603000) = 0x8603000
brk(0x85fb000) = 0x85fb000
_llseek(164, 0, [201940992], SEEK_END) = 0
brk(0x85eb000) = 0x85eb000
brk(0x8613000) = 0x8613000
mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000
mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000
mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000
munmap(0x95393000, 1052672) = 0
munmap(0x95494000, 528384) = 0
munmap(0x95515000, 266240) = 0
brk(0x8603000) = 0x8603000
brk(0x85fb000) = 0x85fb000
_llseek(164, 0, [201940992], SEEK_END) = 0
brk(0x85eb000) = 0x85eb000
brk(0x8613000) = 0x8613000
mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000
mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000
mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000
munmap(0x95393000, 1052672) = 0
munmap(0x95494000,
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote:
> Um, are you sure that is the query that PID 7397 is running? It doesn't match your previous pg_stat_activity printout, nor do I see anything about partitioning by PKs.

Umm, indeed. I had to construct the query by hand and left out the partition part. Here's the full query. Also, I took the liberty of reducing the completely expanded column list (shown in part in the pg_stat_activity printout) in the actual query to *.

Thanks, Brian

cemdb=# explain select * from ts_stats_transetgroup_user_daily a where a.ts_id in (select b.ts_id from ts_stats_transetgroup_user_daily b,ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2009-6-16 01:00' and c.ts_interval_start_time < '2009-6-16 02:00') and a.ts_id > 0 and a.ts_id < 10 order by a.ts_id;

                               QUERY PLAN
---------------------------------------------------------------------------
 Nested Loop IN Join (cost=82370.45..128489.59 rows=1 width=779)
   Join Filter: (b.ts_id = a.ts_id)
   -> Index Scan using ts_stats_transetgroup_user_daily_pkey on ts_stats_transetgroup_user_daily a (cost=0.00..8.22 rows=1 width=779)
        Index Cond: ((ts_id > 0) AND (ts_id < 10))
   -> Hash Join (cost=82370.45..127026.87 rows=232721 width=8)
        Hash Cond: ((m.ts_transet_group_id = b.ts_transet_group_id) AND (c.ts_user_incarnation_id = b.ts_user_incarnation_id))
        -> Hash Join (cost=3.32..27507.92 rows=213176 width=16)
             Hash Cond: (c.ts_transet_incarnation_id = m.ts_transet_incarnation_id)
             -> Index Scan using ts_stats_transet_user_interval_starttime on ts_stats_transet_user_interval c (cost=0.00..26039.02 rows=213176 width=16)
                  Index Cond: ((ts_interval_start_time >= '2009-06-16 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2009-06-16 02:00:00-07'::timestamp with time zone))
             -> Hash (cost=2.58..2.58 rows=117 width=16)
                  -> Seq Scan on ts_transetgroup_transets_map m (cost=0.00..2.58 rows=117 width=16)
        -> Hash (cost=80511.26..80511.26 rows=247451 width=24)
             -> Seq Scan on ts_stats_transetgroup_user_daily b (cost=0.00..80511.26 rows=247451 width=24)
(14 rows)
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote:
> This seems like kind of a stupid plan anyway (which PG version was this exactly?) but certainly the big issue is the catastrophically bad rowcount estimate for the indexscan. Do you have ANALYZE stats for ts_stats_transetgroup_user_daily at all (look in pg_stats)?

postgres 8.3.5. Yes, here's a count(*) from pg_stats:

cemdb=# select count(*) from pg_stats where tablename='ts_stats_transetgroup_user_daily';
 count
-------
   186
(1 row)

Thanks, Brian
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote:
> OK, so what's the entry for column ts_id?

Is this what you requested?

Brian

cemdb=# select * from pg_stats where tablename='ts_stats_transetgroup_user_daily' and attname = 'ts_id';
 schemaname | tablename                        | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+----------------------------------+---------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
 public     | ts_stats_transetgroup_user_daily | ts_id   |         0 |         8 |         -1 |                  |                   | {61,602537,605139,607918,610330,613206,615829,618440,621018,623430,625887,628165,630571,633290,636434,638845,641276,643702,645978,648385,650648,653220,655602,658138,660613,663114,665750,668440,670859,673162,675597,678199,681054,683455,686049,688753,691231,693942,696229,698598,6000101190,6000103723,6000105917,6000108273,6000110687,6000113114,6000115528,6000118024,6000121085,6000123876,6000126548,6000128749,6000131260,6000133668,6000135988,6000138755,6000141251,6000143855,6000146302,6000148963,6000151424,6000153772,6000156222,6000159005,6000161293,6000163783,6000166624,6000168913,6000171220,6000173349,6000175584,6000177882,6000180605,6000183207,6000185420,6000187949,6000190128,6000192738,6000195452,6000197843,6000200173,6000202838,6000205245,6000207579,6000210566,6000212935,6000215382,6000218095,6000220940,6000223634,6000226196,6000228596,6000230733,6000232988,6000235066,6000237064,6000239736,6000242470,6000244915,6000247102,6000250068} | 0.94954
(1 row)
Re: [PERFORM] autovacuum hung?
Tom Lane [...@sss.pgh.pa.us] wrote:
> Are those processes actually doing anything, or just waiting? strace or local equivalent would be the most conclusive check.

These must not have been hung, because they finally completed (after 10-15 hrs -- some time between 11pm and 8am). The question is: why does it take so long to do this on such a relatively small table?

> This query isn't very helpful because it fails to show locks that are not directly associated with tables.

How can that (locks not directly associated with tables) be determined?

Thanks, Brian
Re: [PERFORM] autovacuum hung?
Tom Lane [...@sss.pgh.pa.us] wrote:
> They might have been blocked behind some other process that was sitting in an open transaction for some reason. The other likely cause is badly chosen autovacuum delay, but I think that was already covered.

Well, after I noticed this running for a while, I shut down the postgres port and restarted postgres. The autovacuum of these tables kicked in promptly when postgres was back up. I then let them run. So, I don't think that surmise #1 is likely. As for #2, I'm using the default. These tables get updated once a day, with each row (potentially) being updated 1-24 times over many minutes to a handful of hours. Do you think it would be better to manually vacuum these tables? If so, would it be best to disable autovacuum of them? And while I'm at it, if you disable autovacuum of the master table, will that disable it for the actual partitions?

> Don't assume every row in pg_locks has a join partner in pg_class. You could use an outer join ...

Yes, of course. It never occurred to me that there could be db locks not associated with tables.

Thanks, Brian
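For reference, a sketch of the outer join Tom suggests, so that locks with no pg_class partner (virtual transaction IDs, etc.) still show up:

    select l.pid, l.locktype, c.relname, l.mode, l.granted, l.virtualxid
    from pg_locks l
    left outer join pg_class c on c.oid = l.relation
    order by l.pid;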
Re: [PERFORM] autovacuum hung?
Tom Lane [...@sss.pgh.pa.us] wrote:
> No, no, and no. What would be best is to find out what actually happened. The evidence is gone now, but if you see it again please take a closer look.

OK. You mentioned strace. It's got a lot of options; any in particular that would be useful if this happens again?

Brian
[PERFORM] autovacuum hung?
autovacuum has been running on 2 tables for 5 hours. These tables are not huge (see below). For the past ~1 hour, I've shut off all other activity on this database. The other table being vacuumed has more rows (1897810). Anyone have any ideas about why this is taking so long?

Thanks,
Brian

[r...@rdl64xeoserv01 log]# fgrep autov /var/lib/pgsql/data/postgresql.conf
autovacuum = on                        # enable autovacuum subprocess?
autovacuum_naptime = 60s               # time between autovacuum runs, in secs
autovacuum_vacuum_threshold = 200      # min # of tuple updates before
autovacuum_analyze_threshold = 50      # min # of tuple updates before
autovacuum_vacuum_scale_factor = 0.2   # fraction of rel size before
autovacuum_analyze_scale_factor = 0.1  # fraction of rel size before
#autovacuum_vacuum_cost_delay = -1     # default vacuum cost delay for
                                       # autovac, -1 means use
#autovacuum_vacuum_cost_limit = -1     # default vacuum cost limit for

Welcome to psql 8.3.5, the PostgreSQL interactive terminal.
Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

cemdb=# select procpid,query_start,current_query from pg_stat_activity;
 procpid |          query_start          | current_query
---------+-------------------------------+------------------------------------------------------------------
   24866 | 2009-05-29 13:50:11.251397-07 | autovacuum: VACUUM public.ts_user_sessions_map
   24869 | 2009-05-29 11:46:54.221713-07 | autovacuum: VACUUM ANALYZE public.ts_stats_transet_user_daily
   24872 | 2009-05-29 11:31:28.324954-07 | autovacuum: VACUUM ANALYZE public.ts_stats_transet_user_weekly
   28097 | 2009-05-29 15:58:49.24832-07  | select procpid,query_start,current_query from pg_stat_activity;
(4 rows)

cemdb=# select count(*) from ts_stats_transet_user_daily;
 count
--------
 558321
(1 row)

cemdb=# select count(*) from ts_stats_transet_user_weekly;
 count
-------
    24
(1 row)

cemdb=# select c.oid,c.relname,l.pid,l.mode,l.granted from pg_class c join pg_locks l on c.oid=l.relation order by l.pid;
   oid    | relname                                                |  pid  | mode                     | granted
----------+--------------------------------------------------------+-------+--------------------------+---------
 26612062 | ts_user_sessions_map                                   | 24866 | ShareUpdateExclusiveLock | t
 26613644 | ts_user_sessions_map_interimsessionidindex             | 24866 | RowExclusiveLock         | t
 26613645 | ts_user_sessions_map_sessionidindex                    | 24866 | RowExclusiveLock         | t
 26612846 | ts_user_sessions_map_appindex                          | 24866 | RowExclusiveLock         | t
 26612417 | ts_user_sessions_map_pkey                              | 24866 | RowExclusiveLock         | t
 27208308 | ts_stats_transet_user_daily_userindex                  | 24869 | RowExclusiveLock         | t
 27208305 | ts_stats_transet_user_daily_transetincarnationidindex  | 24869 | RowExclusiveLock         | t
 27208310 | ts_stats_transet_user_daily_yearindex                  | 24869 | RowExclusiveLock         | t
 27208307 | ts_stats_transet_user_daily_userincarnationidindex     | 24869 | RowExclusiveLock         | t
 27208302 | ts_stats_transet_user_daily_lastaggregatedrowindex     | 24869 | RowExclusiveLock         | t
 27208309 | ts_stats_transet_user_daily_weekindex                  | 24869 | RowExclusiveLock         | t
 26612320 | ts_stats_transet_user_daily_pkey                       | 24869 | RowExclusiveLock         | t
 27208306 | ts_stats_transet_user_daily_transetindex               | 24869 | RowExclusiveLock         | t
 26611722 | ts_stats_transet_user_daily                            | 24869 | ShareUpdateExclusiveLock | t
 27208303 | ts_stats_transet_user_daily_monthindex                 | 24869 | RowExclusiveLock         | t
 27208304 | ts_stats_transet_user_daily_starttimeindex             | 24869 | RowExclusiveLock         | t
 27208300 | ts_stats_transet_user_daily_dayindex                   | 24869 | RowExclusiveLock         | t
 27208301 | ts_stats_transet_user_daily_hourindex                  | 24869 | RowExclusiveLock         | t
 26612551 | ts_stats_transet_user_weekly_lastaggregatedrowindex    | 24872 | RowExclusiveLock         | t
 26612558 | ts_stats_transet_user_weekly_yearindex                 | 24872 | RowExclusiveLock         | t
 26612326 | ts_stats_transet_user_weekly_pkey                      | 24872 | RowExclusiveLock         | t
 26612554 | ts_stats_transet_user_weekly_transetindex              | 24872 | RowExclusiveLock         | t
 26612555 | ts_stats_transet_user_weekly_userincarnationidindex    | 24872 | RowExclusiveLock         | t
 26611743 | ts_stats_transet_user_weekly                           | 24872 | ShareUpdateExclusiveLock
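If cost-based vacuum delay turns out to be the culprit, one way to test the theory is to cancel the worker and run a manual vacuum with throttling disabled for just that session; a minimal sketch (vacuum_cost_delay is settable per session in 8.3):

  set vacuum_cost_delay = 0;
  vacuum analyze verbose ts_stats_transet_user_daily;

The VERBOSE output also shows how many pages were scanned, which gives some sense of progress.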
Re: [PERFORM] autovacuum hung?
Alvaro Herrera [alvhe...@commandprompt.com] wrote:
> What's vacuum_cost_delay?

#vacuum_cost_delay = 0        # 0-1000 milliseconds
#vacuum_cost_page_hit = 1     # 0-10000 credits
#vacuum_cost_page_miss = 10   # 0-10000 credits
#vacuum_cost_page_dirty = 20  # 0-10000 credits
#vacuum_cost_limit = 200      # 1-10000 credits

so, whatever the default happens to be.

Thanks,
Brian
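For reference, the effective values can be checked from a live session rather than by grepping postgresql.conf; a minimal sketch using the standard pg_settings view:

  SHOW vacuum_cost_delay;
  SHOW autovacuum_vacuum_cost_delay;
  -- or all cost-based vacuum settings at once, with where each came from
  SELECT name, setting, source FROM pg_settings WHERE name LIKE '%vacuum_cost%';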
[PERFORM] determining the locks that will be held by a query
I sent this out on 4/7 and either missed a response or didn't get one. If this is the wrong forum, I'd appreciate a redirect. I know that EXPLAIN will show the query plan. I know that pg_locks will show the locks currently held for active transactions. Is there a way to determine what locks a query will hold when it is executed?

Thanks,
Brian
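One way to approximate this, assuming the query is safe to run (or at least to start) in a throwaway transaction, is to execute it and then ask pg_locks what the current backend is holding before rolling back; a minimal sketch:

  BEGIN;
  SELECT ...;   -- the query of interest
  SELECT l.locktype, c.relname, l.mode, l.granted
    FROM pg_locks l
    LEFT JOIN pg_class c ON c.oid = l.relation
   WHERE l.pid = pg_backend_pid();
  ROLLBACK;

Locks are held until commit/rollback, so the second SELECT sees everything the query acquired.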
[PERFORM] multiple threads inserting into the same table
The application log shows that 99652 rows are being inserted into relation ts_stats_transet_user_daily. 5 threads are doing the inserts. The schema is lengthy, but it has a synthetic primary key (ts_id int8 not null) and the following constraints:

alter table ts_stats_transet_user_daily add constraint FK8ED105ED9DADA24 foreign key (ts_transet_id) references ts_transets;
alter table ts_stats_transet_user_daily add constraint K8ED105ED545ADA6D foreign key (ts_user_id) references ts_users;

This relation currently has 456532 rows and is not partitioned. The inserts have been going on now for almost 1 hour -- not exactly speedy. Here's what I find on the postgres side:

cemdb=# select current_query, procpid, xact_start from pg_stat_activity;
 current_query                                                     | procpid | xact_start
-------------------------------------------------------------------+---------+-------------------------------
 <IDLE> in transaction                                             |   15147 | 2009-03-23 12:08:31.604433-07
 <IDLE>                                                            |   15382 |
 select current_query, procpid, xact_start from pg_stat_activity;  |   15434 | 2009-03-23 12:10:38.913764-07
 <IDLE>                                                            |   15152 |
 <IDLE>                                                            |   15150 |
 <IDLE>                                                            |   15156 |
 <IDLE> in transaction                                             |   15183 | 2009-03-23 12:09:50.864992-07
 <IDLE> in transaction                                             |   15186 | 2009-03-23 12:10:07.955838-07
 <IDLE>                                                            |   15188 |
 <IDLE>                                                            |   15192 |
 <IDLE> in transaction                                             |   15193 | 2009-03-23 12:10:07.955859-07
 <IDLE> in transaction                                             |   15194 | 2009-03-23 12:08:59.940101-07
(12 rows)

cemdb=# select c.oid,c.relname,l.pid,l.mode,l.granted from pg_class c join pg_locks l on c.oid=l.relation order by l.mode;
   oid    | relname                      |  pid  | mode             | granted
----------+------------------------------+-------+------------------+---------
 26493289 | ts_users_pkey                | 15183 | AccessShareLock  | t
 26493267 | ts_transets_pkey             | 15186 | AccessShareLock  | t
     1259 | pg_class                     | 15434 | AccessShareLock  | t
 26493289 | ts_users_pkey                | 15147 | AccessShareLock  | t
    10969 | pg_locks                     | 15434 | AccessShareLock  | t
 26493267 | ts_transets_pkey             | 15193 | AccessShareLock  | t
 26493289 | ts_users_pkey                | 15194 | AccessShareLock  | t
     2662 | pg_class_oid_index           | 15434 | AccessShareLock  | t
 26493267 | ts_transets_pkey             | 15194 | AccessShareLock  | t
 26493289 | ts_users_pkey                | 15193 | AccessShareLock  | t
 26493267 | ts_transets_pkey             | 15147 | AccessShareLock  | t
 26493289 | ts_users_pkey                | 15186 | AccessShareLock  | t
 26493267 | ts_transets_pkey             | 15183 | AccessShareLock  | t
     2663 | pg_class_relname_nsp_index   | 15434 | AccessShareLock  | t
 26472890 | ts_stats_transet_user_daily  | 15147 | RowExclusiveLock | t
 26472890 | ts_stats_transet_user_daily  | 15186 | RowExclusiveLock | t
 26472890 | ts_stats_transet_user_daily  | 15193 | RowExclusiveLock | t
 26472890 | ts_stats_transet_user_daily  | 15194 | RowExclusiveLock | t
 26472890 | ts_stats_transet_user_daily  | 15183 | RowExclusiveLock | t
 26473252 | ts_users                     | 15194 | RowShareLock     | t
 26472508 | ts_transets                  | 15183 | RowShareLock     | t
 26472508 | ts_transets                  | 15193 | RowShareLock     | t
 26473252 | ts_users                     | 15193 | RowShareLock     | t
 26473252 | ts_users                     | 15183 | RowShareLock     | t
 26472508 | ts_transets                  | 15147 | RowShareLock     | t
 26473252 | ts_users                     | 15186 | RowShareLock     | t
 26472508 | ts_transets                  | 15186 | RowShareLock     | t
 26473252 | ts_users                     | 15147 | RowShareLock     | t
 26472508 | ts_transets                  | 15194 | RowShareLock     | t
(29 rows)

cemdb=# select c.oid,c.relname,l.pid,l.mode,l.granted from pg_class c join pg_locks l on c.oid=l.relation order by l.pid;
   oid    | relname                      |  pid  | mode             | granted
----------+------------------------------+-------+------------------+---------
 26493289 | ts_users_pkey                | 15147 | AccessShareLock  | t
 26473252 | ts_users                     | 15147 | RowShareLock     | t
 26493267 | ts_transets_pkey             | 15147 | AccessShareLock  | t
 26472508 | ts_transets                  | 15147 | RowShareLock     | t
 26472890 |
Re: [PERFORM] multiple threads inserting into the same table
Tom Lane [...@sss.pgh.pa.us] wrote:
> pg_stat_activity says those five threads are doing nothing except sitting around with open transactions. You sure you don't have a bug on the application side?
> regards, tom lane

This is a java app. A thread dump reveals that these 5 threads are all asleep on a socket read to postgres (see below). DbUtils.java:2265 is:

session.connection().createStatement().executeUpdate(((DatabaseInsert) insertObject).getInsertStmt(session));

This generates and executes a single SQL insert. Since, as you point out, postgres seems to think that this transaction isn't doing anything, it's hard to figure out what the read is doing.

Brian

"DatabasePool.Thread1" prio=10 tid=0x27f04c00 nid=0x3b38 runnable [0x29e27000..0x29e281b0]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:135)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:104)
        at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
        at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:259)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1182)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:194)
        - locked <0x8975c878> (a org.postgresql.core.v3.QueryExecutorImpl)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:282)
        at com.mchange.v2.c3p0.impl.NewProxyStatement.executeUpdate(NewProxyStatement.java:64)
        at com.timestock.tess.util.DbUtils$DatabaseInsertTask.insertObject(DbUtils.java:2265)
        at com.timestock.tess.util.DbUtils$DatabaseInsertTask.call(DbUtils.java:2200)
        at com.timestock.tess.util.DbUtils$DatabaseInsertTask.call(DbUtils.java:2157)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
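When backends look idle-in-transaction like this, it can help to see how long each transaction has been open and whether any backend is blocked waiting on a lock; a minimal sketch against 8.3's pg_stat_activity columns (procpid, current_query, waiting):

  select procpid, now() - xact_start as xact_age, waiting,
         substring(current_query from 0 for 40) as query
    from pg_stat_activity
   where xact_start is not null
   order by xact_start;

A long xact_age on an <IDLE> in transaction row usually points back at the application holding a connection open mid-transaction.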
Re: [PERFORM] multiple threads inserting into the same table
Scott Marlowe [scott.marl...@gmail.com] wrote:
> Might you have a firewall that's killing the connections? What does netstat -an on the client side say about these connections?

I don't think so: 1) app and postgres are on the same machine and 2) this has been the setup for months and I don't think anyone has diddled with the machine. netstat -an | fgrep 5432 shows:

[r...@rdl64xeoserv01 log]# netstat -an | fgrep 5432
tcp   0  0 0.0.0.0:5432          0.0.0.0:*              LISTEN
tcp   0  0 127.0.0.1:5432        127.0.0.1:35996        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:35999        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36017        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36019        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36018        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36005        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36001        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36013        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36008        ESTABLISHED
tcp   0  0 127.0.0.1:5432        127.0.0.1:36011        ESTABLISHED
tcp   0  0 130.200.163.80:5432   130.200.164.26:50481   ESTABLISHED
tcp   0  0 :::5432               :::*                   LISTEN
tcp   0  0 :::127.0.0.1:36001    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:36005    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:36008    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:36011    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:36013    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:36017    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:36019    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:36018    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:35996    :::127.0.0.1:5432      ESTABLISHED
tcp   0  0 :::127.0.0.1:35999    :::127.0.0.1:5432      ESTABLISHED
unix  2  [ ACC ] STREAM LISTENING 2640437 /tmp/.s.PGSQL.5432
Re: [PERFORM] multiple threads inserting into the same table
David Wilson [david.t.wil...@gmail.com] wrote:
> How many indexes are there on ts_stats_transet_user_daily?

11:

create index ts_stats_transet_user_daily_DayIndex on ts_stats_transet_user_daily (ts_day);
create index ts_stats_transet_user_daily_HourIndex on ts_stats_transet_user_daily (ts_hour);
create index ts_stats_transet_user_daily_LastAggregatedRowIndex on ts_stats_transet_user_daily (ts_last_aggregated_row);
create index ts_stats_transet_user_daily_MonthIndex on ts_stats_transet_user_daily (ts_month);
create index ts_stats_transet_user_daily_StartTimeIndex on ts_stats_transet_user_daily (ts_interval_start_time);
create index ts_stats_transet_user_daily_TranSetIncarnationIdIndex on ts_stats_transet_user_daily (ts_transet_incarnation_id);
create index ts_stats_transet_user_daily_TranSetIndex on ts_stats_transet_user_daily (ts_transet_id);
create index ts_stats_transet_user_daily_UserIncarnationIdIndex on ts_stats_transet_user_daily (ts_user_incarnation_id);
create index ts_stats_transet_user_daily_UserIndex on ts_stats_transet_user_daily (ts_user_id);
create index ts_stats_transet_user_daily_WeekIndex on ts_stats_transet_user_daily (ts_week);
create index ts_stats_transet_user_daily_YearIndex on ts_stats_transet_user_daily (ts_year);

> Are these rows being inserted in individual insert statements, or are they batched in some fashion?

individual insert stmts in a single transaction.

> What's the disk/cpu activity on your system look like?

The app is using 100% CPU and I haven't figured out why, but the insert threads are generally doing socket reads. But they can't be completely blocked, as 1 thread is doing a read in one thread dump and is doing processing (preparing for another insert) in a later thread dump. So, it looks as if the inserts are taking a l-o-n-g time. Here's the output of vmstat and iostat. I've never looked at this info before, so I'm not sure what it says.

[r...@rdl64xeoserv01 log]# vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b  swpd  free  buff  cache  si  so  bi  bo  in  cs  us sy id wa
 1 0 0 9194740 58676 2098026400 8211 2 2 0 98 0

[r...@rdl64xeoserv01 log]# iostat
Linux 2.6.9-42.ELsmp (rdl64xeoserv01.ca.com)    03/23/2009

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.71    0.00    0.09    0.02   98.18

Device:            tps   Blk_read/s   Blk_wrtn/s    Blk_read    Blk_wrtn
sda              10.46       126.23       343.38   304224943   827588034
sda1              0.00         0.00         0.00        1428          58
sda2             57.73       126.23       343.37   304221426   827576144
sda3              0.00         0.00         0.00        1073           0
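With 11 indexes to maintain, row-at-a-time inserts pay the full per-statement cost 99652 times over. A sketch of two common alternatives (column lists abbreviated and data values invented purely for illustration):

  -- batch many rows into one INSERT: one parse/plan/round trip per batch
  -- (multi-row VALUES is available in 8.2+)
  insert into ts_stats_transet_user_daily (ts_id, ts_user_id, ts_transet_id)
  values (1001, 17, 61), (1002, 18, 61), (1003, 19, 61);

  -- or bulk-load with COPY, which is faster still for large batches
  copy ts_stats_transet_user_daily (ts_id, ts_user_id, ts_transet_id)
  from '/tmp/daily_rows.csv' with csv;

For one-off loads, dropping nonessential indexes first and recreating them afterwards can also beat maintaining all 11 incrementally.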
Re: [PERFORM] slow queries
Tom Lane [...@sss.pgh.pa.us] wrote:
> [ shrug... ] You tell us. To me it sounds a whole lot like some client program sitting on an open transaction that has a nonexclusive lock on the table to be dropped. That transaction wasn't necessarily doing any useful work; it might have just been waiting on the client.

I wish I could... And, in any event, aren't all transactions listed in the pg_stat_activity select?

> At this point I suppose arguing about it is moot because the evidence is all gone. If it happens again, capture the contents of pg_locks and pg_stat_activity while things are still stuck.

This happened again last night. This time I'd added a lock (in the java code) to prevent inserts into other partitions of ts_defects while the drop is in progress. Below is the output from:

select xact_start,datid,datname,procpid,usesysid,substring(current_query from 0 for 40),waiting,client_addr from pg_stat_activity order by xact_start;

and

select locktype,database,relation,virtualxid,virtualtransaction,pid,mode from pg_locks order by mode;

As you can see there are only 3 transactions, and 1 starts 1 hour after the drop begins. I'm still trying to figure out how to interpret the pg_locks output, but (presumably) you/others on this forum have more experience at this than I.

Thanks,
Brian

cemdb=# select xact_start,datid,datname,procpid,usesysid,substring(current_query from 0 for 40),waiting,client_addr from pg_stat_activity order by xact_start;
          xact_start           |  datid   | datname | procpid | usesysid | substring                                | waiting | client_addr
-------------------------------+----------+---------+---------+----------+------------------------------------------+---------+----------------
 2009-03-01 14:10:42.606592-08 | 26472437 | cemdb   |   13833 |    16392 | <IDLE> in transaction                    | f       | 130.200.164.15
 2009-03-02 00:30:00.039977-08 | 26472437 | cemdb   |   13842 |    16392 | drop table ts_defects_20090227           | t       | 127.0.0.1
 2009-03-02 00:30:00.066728-08 | 26472437 | cemdb   |   13865 |    16392 | select transetdef0_.ts_id as ts1_85_0_,  | t       | 127.0.0.1
 2009-03-02 01:01:00.992486-08 | 26472437 | cemdb   |   13840 |    16392 | CREATE VIEW TranSetGroupSlaPerformanceA  | t       | 127.0.0.1
 2009-03-02 10:16:21.252969-08 | 26472437 | cemdb   |   29985 |    16392 | select xact_start,datid,datname,procpid  | f       |
                               | 26472437 | cemdb   |   13735 |    16392 | <IDLE>                                   | f       | 127.0.0.1
                               | 26472437 | cemdb   |   13744 |    16392 | <IDLE>                                   | f       | 127.0.0.1
                               | 26472437 | cemdb   |   13857 |    16392 | <IDLE>                                   | f       | 127.0.0.1
                               | 26472437 | cemdb   |   13861 |    16392 | <IDLE>                                   | f       | 127.0.0.1
                               | 26472437 | cemdb   |   13864 |    16392 | <IDLE>                                   | f       | 127.0.0.1
                               | 26472437 | cemdb   |   13855 |    16392 | <IDLE>                                   | f       | 127.0.0.1
                               | 26472437 | cemdb   |   13740 |    16392 | <IDLE>                                   | f       | 127.0.0.1
(12 rows)

cemdb=# select locktype,database,relation,virtualxid,virtualtransaction,pid,mode from pg_locks order by mode;
 locktype | database | relation | virtualxid | virtualtransaction |  pid  | mode
----------+----------+----------+------------+--------------------+-------+---------------------
 relation | 26472437 | 26592616 |            | 15/69749           | 13842 | AccessExclusiveLock
 relation | 26472437 | 26592608 |            | 15/69749           | 13842 | AccessExclusiveLock
 relation | 26472437 | 26592615 |            | 15/69749           | 13842 | AccessExclusiveLock
 relation | 26472437 | 26592613 |            | 15/69749           | 13842 | AccessExclusiveLock
 relation | 26472437 | 26472508 |            | 15/69749           | 13842 | AccessExclusiveLock
 relation | 26472437 | 26493706 |            | 11/131             | 13833 | AccessShareLock
 relation | 26472437 | 26473141 |            | 11/131             | 13833 | AccessShareLock
 relation | 26472437 |    10969 |            | 1/77414            | 29985 | AccessShareLock
 relation | 26472437 | 26473176 |            | 11/131             | 13833 | AccessShareLock
 relation | 26472437 | 26493307 |            | 11/131             | 13833 | AccessShareLock
 relation | 26472437 | 26493271 |            | 11/131             | 13833 | AccessShareLock
 relation | 26472437 | 26493704 |            | 11/131             | 13833 | AccessShareLock
 relation | 26472437 | 26493711 |            | 11/131             | 13833 | AccessShareLock
 relation | 26472437 |     2674 |            | 15/69749           | 13842 | AccessShareLock
 relation | 26472437 | 26493279 |            | 11/131             | 13833 |
Re: [PERFORM] slow queries
Tom Lane [...@sss.pgh.pa.us] wrote:
> If you hadn't left out the granted column we could be more sure, but what it looks like to me is the DROP (pid 13842) is stuck behind the IDLE transaction (pid 13833). In particular these two rows of pg_locks look like a possible conflict:
>
> relation | 26472437 | 26472508 | | 15/69749 | 13842 | AccessExclusiveLock
> relation | 26472437 | 26472508 | | 11/131   | 13833 | AccessShareLock

select c.oid,c.relname,l.pid,l.mode,l.granted from pg_class c join pg_locks l on c.oid=l.relation order by l.pid;

 26472508 | ts_transets | 13833 | AccessShareLock     | t
 26472508 | ts_transets | 13842 | AccessExclusiveLock | f

pid 13833 is the idle transaction and 13842 is doing the drop table. So, the idle transaction is the problem. Thanks to you, Scott Carey and Robert Haas for pointing this out. However, why does the drop of ts_defects_20090227 need exclusive access to ts_transets? I assume it must be due to this FK:

alter table ts_defects_20090227 add constraint FK34AA2B629DADA24 foreign key (ts_transet_id) references ts_transets;

Thanks again,
Brian
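A general way to pair waiters with potential blockers, rather than eyeballing the rows, is a self-join of pg_locks on the relation; a simplified sketch (it ignores the full lock-mode conflict matrix, so it can over-report holders):

  select w.pid as waiter_pid, w.mode as wanted_mode,
         h.pid as holder_pid, h.mode as held_mode, c.relname
    from pg_locks w
    join pg_locks h on h.relation = w.relation
                   and h.database = w.database
                   and h.granted and not w.granted
    join pg_class c on c.oid = w.relation;

Run against the output above, this would have paired 13842 (AccessExclusiveLock, not granted) with 13833 directly.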
Re: [PERFORM] slow queries
Tom Lane [...@sss.pgh.pa.us] wrote:
> Well, that's certainly a sufficient reason, if perhaps not the only reason. Dropping ts_defects_20090227 will require removal of FK triggers on ts_transets, and we can't do that concurrently with transactions that might be trying to fire those triggers. Now admittedly, it would probably be sufficient to take ExclusiveLock rather than AccessExclusiveLock when removing triggers, since we do not have triggers ON SELECT. Right now though, we just take AccessExclusiveLock for most any DDL on a table. There was a patch submitted last fall to reduce DDL locking in some cases, but it hasn't been reworked to fix the problems that were pointed out (and I disremember if it addressed DROP TRIGGER in particular anyway).
> regards, tom lane

Thanks for furthering my understanding of postgres (and probably other SQL servers as well). I can fix this problem easily.

Brian
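One shape the "easy fix" could take (a sketch, not necessarily the fix Brian had in mind): drop the partition's FK in its own short statement before dropping the table. Per Tom's explanation, removing the FK triggers still takes an AccessExclusiveLock on ts_transets, but only momentarily, and the subsequent DROP TABLE then no longer touches ts_transets at all; the constraint name here is the one quoted earlier in the thread:

  -- short transaction: removes the FK triggers on both tables
  alter table ts_defects_20090227 drop constraint FK34AA2B629DADA24;

  -- the drop itself now only needs locks on the partition
  drop table ts_defects_20090227;

The more fundamental fix is still to ensure no client sits idle-in-transaction holding locks on ts_transets.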
[PERFORM] slow queries
Actually, they're all deadlocked. The question is why?

Here's a brief background. The ts_defects table is partitioned by occurrence date; each partition contains the rows for 1 day. When the data gets old enough, the partition is dropped. Since the correct partition can be determined from the occurrence date, there is no trigger: inserts are done directly into the correct partition. Multiple threads may be inserting into a partition at the same time. The thread that checks for old data to be dropped runs at 00:30 each night. It also creates the partition for the next day.

Below is the output from:

select xact_start,query_start,substring(current_query from 0 for 40) from pg_stat_activity order by xact_start;

run at 18:40 on 28 Feb 2009 (i.e. these queries have been running for 6+ hours). The 1st select is not on any of the ts_defects partitions, nor is the CREATE VIEW. The SELECTs shown last are not (directly) generated by the java program that is running the drop table, the inserts, the 1st select and the CREATE VIEW.

Thanks for your ideas,
Brian

 2009-02-28 00:30:00.01572-08  | 2009-02-28 00:30:00.015758-08 | drop table ts_defects_20090225
 2009-02-28 00:30:00.693353-08 | 2009-02-28 00:30:00.69337-08  | select transetdef0_.ts_id as ts1_85_0_,
 2009-02-28 00:30:01.875671-08 | 2009-02-28 00:30:01.875911-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.875673-08 | 2009-02-28 00:30:01.875911-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.875907-08 | 2009-02-28 00:30:01.87611-08  | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.87615-08  | 2009-02-28 00:30:01.876334-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.87694-08  | 2009-02-28 00:30:01.877153-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.876952-08 | 2009-02-28 00:30:01.877171-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.876965-08 | 2009-02-28 00:30:01.87716-08  | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.877267-08 | 2009-02-28 00:30:01.877483-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:01.877928-08 | 2009-02-28 00:30:01.878101-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 00:30:06.822733-08 | 2009-02-28 00:30:06.822922-08 | insert into ts_defects_20090228 (ts_id,
 2009-02-28 01:01:00.95051-08  | 2009-02-28 01:01:00.950605-08 | CREATE VIEW TranSetGroupSlaPerformanceA
 2009-02-28 09:12:33.181039-08 | 2009-02-28 09:12:33.181039-08 | SELECT c.oid, c.relname, pg_get_userbyi
 2009-02-28 09:19:47.335621-08 | 2009-02-28 09:19:47.335621-08 | SELECT c.oid, c.relname, pg_get_userbyi
 2009-02-28 10:52:36.638467-08 | 2009-02-28 10:52:36.638467-08 | SELECT c.oid, c.relname, pg_get_userbyi
 2009-02-28 11:01:05.023126-08 | 2009-02-28 11:01:05.023126-08 | SELECT c.oid, c.relname, pg_get_userbyi
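For context, partition maintenance in this style under 8.3 (inheritance plus CHECK constraints, no routing trigger) typically looks like the sketch below; names follow the thread's convention, and column definitions are omitted:

  -- nightly at 00:30: create tomorrow's partition
  create table ts_defects_20090301 (
      check (ts_occur_date >= '2009-03-01' and ts_occur_date < '2009-03-02')
  ) inherits (ts_defects);

  -- and drop the partition that has aged out
  drop table ts_defects_20090225;

The DROP TABLE takes AccessExclusiveLock on the partition (and, as this thread goes on to establish, on FK-referenced tables), which is what the concurrent inserts end up queued behind.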
[PERFORM] Deleting millions of rows
I'm using 8.3.5. Table ts_defects has 48M rows. Through psql:

delete from ts_defects;

Result: out of memory/Can't allocate size: 32

I then did 10 or so deletes to get rid of the rows. Afterwards, inserts into or queries on this table performed significantly slower. I tried a vacuum analyze, but this didn't help. To fix this, I dumped and restored the database.

1) why can't postgres delete all rows in a table if it has millions of rows?
2) is there any other way to restore performance other than restoring the database?

Thanks,
Brian
Re: [PERFORM] Deleting millions of rows
Robert Haas [robertmh...@gmail.com] wrote:

Thanks for your response.

> Does the table have triggers on it? Does it have indexes? What is the result of pg_relation_size() on that table?

No triggers; 3 indexes:

cemdb=# select pg_relation_size('ts_defects');
 pg_relation_size
------------------
       9464971264
(1 row)

cemdb=# select pg_relation_size('ts_defects_DateIndex');
 pg_relation_size
------------------
       1299931136
(1 row)

cemdb=# select pg_relation_size('ts_defects_DefectIndex');
 pg_relation_size
------------------
       1217224704
(1 row)

cemdb=# select pg_relation_size('ts_defects_EventIndex');
 pg_relation_size
------------------
       1216528384

> How much memory do you have in your machine? What is work_mem set to?

32G; work_mem=64M

> Did you try VACUUM FULL instead of just plain VACUUM to recover performance? You might also need to REINDEX. Or you could TRUNCATE the table.

I didn't try FULL or REINDEX. In this case, TRUNCATE is the best option as I was just trying to reset the state of the table for another test. But this brings up another question: will autovacuum do the right thing to preserve performance on this table when many rows are deleted?

Thanks,
Brian
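As an aside, heap and index sizes can be totaled in one call instead of querying each index separately; a small sketch using functions available in 8.3:

  select pg_size_pretty(pg_relation_size('ts_defects'));        -- heap only
  select pg_size_pretty(pg_total_relation_size('ts_defects'));  -- heap + indexes + toast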
Re: [PERFORM] Deleting millions of rows
David Wilson [david.t.wil...@gmail.com] wrote:
> Is this table the target of any foreign keys?

There are 2 "on delete cascade" FKs that reference this table.

Brian
Re: [PERFORM] Deleting millions of rows
Tom Lane [...@sss.pgh.pa.us] wrote:
> It's the pending trigger list. He's got two trigger events per row, which at 40 bytes apiece would approach 4GB of memory. Apparently it's a 32-bit build of Postgres, so he's running out of process address space.

Yes, this is a 32 bit Postgres running on a 32 bit Linux. I assume that the 2 triggers are due to the 2 on delete cascade FKs. Thanks for explaining this bit of a mystery.

> TRUNCATE is the best solution if you want to get rid of the whole table contents. If you're deleting very many but not all rows, people tend to drop the FK constraints and re-establish them afterwards. Retail checking is just too slow.

Thanks also to you (and several others) for reminding me of TRUNCATE. This will definitely work for what I was trying to do: reset this table for more testing. In production, the table on which I ran DELETE FROM grows constantly, with old data removed in bunches periodically (say up to a few 100,000s of rows [out of several millions] in a bunch). I'm assuming that auto-vacuum/analyze will allow Postgres to maintain reasonable performance for INSERTs and SELECTs on it; do you think that this is a reasonable assumption?

Thanks,
Brian
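The drop-and-re-add pattern Tom describes could look like the following sketch (the constraint and referencing-table names here are illustrative, not from the real schema). In 8.3, re-adding the FK revalidates all existing rows in a single pass, which is the point: one bulk check instead of per-row "retail" checks queued during the DELETE:

  begin;
  alter table ts_events drop constraint ts_events_defect_fk;
  delete from ts_defects where ts_occur_date < '2009-01-01';
  -- re-adding the constraint validates every remaining row in one pass
  alter table ts_events add constraint ts_events_defect_fk
      foreign key (ts_defect_id) references ts_defects on delete cascade;
  commit;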
[PERFORM] limit + order by is slow if no rows in result set
There are 1.9M rows in ts_defects and indexes on b.ts_id (primary key), d.ts_biz_event_id and d.ts_occur_date. Both queries below return 0 rows. The 1st runs fast and the 2nd 400x slower. The 2nd query differs from the 1st only by the addition of "limit 1". Why the big difference in performance?

Thanks,
Brian

[EMAIL PROTECTED] jsp]$ time PGPASSWORD= psql -U admin -d cemdb -h 192.168.1.30 -c 'select * from ts_defects d join ts_biz_events b on b.ts_id = d.ts_biz_event_id where b.ts_status=3 order by d.ts_occur_date desc;'
(column list and separator row deleted)
(0 rows)

real    0m0.022s
user    0m0.003s
sys     0m0.003s

[EMAIL PROTECTED] jsp]$ time PGPASSWORD= psql -U admin -d cemdb -h 192.168.1.30 -c 'select * from ts_defects d join ts_biz_events b on b.ts_id = d.ts_biz_event_id where b.ts_status=3 order by d.ts_occur_date desc limit 1;'
(column list and separator row deleted)
(0 rows)

real    0m9.410s
user    0m0.005s
sys     0m0.002s
Re: [PERFORM] limit + order by is slow if no rows in result set
Hi Heikki,

Thanks for your response.

> Please run EXPLAIN ANALYZE on both queries, and send back the results.

[EMAIL PROTECTED] jsp]$ PGPASSWORD=quality psql -U admin -d cemdb -h 192.168.1.30 -c 'explain analyze select * from ts_defects d join ts_biz_events b on b.ts_id = d.ts_biz_event_id where b.ts_status=3 order by d.ts_occur_date desc;'
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=160400.01..160646.91 rows=98762 width=2715) (actual time=0.303..0.303 rows=0 loops=1)
   Sort Key: d.ts_occur_date
   ->  Hash Join  (cost=33.20..82567.14 rows=98762 width=2715) (actual time=0.218..0.218 rows=0 loops=1)
         Hash Cond: ("outer".ts_biz_event_id = "inner".ts_id)
         ->  Seq Scan on ts_defects d  (cost=0.00..71882.88 rows=1932688 width=1545) (actual time=0.022..0.022 rows=1 loops=1)
         ->  Hash  (cost=33.04..33.04 rows=65 width=1170) (actual time=0.135..0.135 rows=0 loops=1)
               ->  Bitmap Heap Scan on ts_biz_events b  (cost=2.23..33.04 rows=65 width=1170) (actual time=0.132..0.132 rows=0 loops=1)
                     Recheck Cond: (ts_status = 3)
                     ->  Bitmap Index Scan on ts_biz_events_statusindex  (cost=0.00..2.23 rows=65 width=0) (actual time=0.054..0.054 rows=61 loops=1)
                           Index Cond: (ts_status = 3)
 Total runtime: 0.586 ms
(11 rows)

[EMAIL PROTECTED] jsp]$ PGPASSWORD=quality psql -U admin -d cemdb -h 192.168.1.30 -c 'explain analyze select * from ts_defects d join ts_biz_events b on b.ts_id = d.ts_biz_event_id where b.ts_status=3 order by d.ts_occur_date desc limit 1;'
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..87.37 rows=1 width=2715) (actual time=17999.482..17999.482 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..8628543.77 rows=98762 width=2715) (actual time=17999.476..17999.476 rows=0 loops=1)
         ->  Index Scan Backward using ts_defects_dateindex on ts_defects d  (cost=0.00..227675.97 rows=1932688 width=1545) (actual time=0.047..3814.923 rows=1932303 loops=1)
         ->  Index Scan using ts_biz_events_pkey on ts_biz_events b  (cost=0.00..4.33 rows=1 width=1170) (actual time=0.005..0.005 rows=0 loops=1932303)
               Index Cond: (b.ts_id = "outer".ts_biz_event_id)
               Filter: (ts_status = 3)
 Total runtime: 17999.751 ms
(7 rows)

> Also, what indexes are there on the tables involved?

I tried to mention the relevant indexes in my original posting, but omitted one; here's a list of all indexes:

ts_defects: ts_id, ts_occur_date, ts_defect_def_id, ts_biz_event_id, ts_trancomp_id, ts_transet_incarnation_id, ts_transet_id, ts_tranunit_id, ts_user_incarnation_id, ts_user_id
ts_biz_events: ts_id, ts_defect_def_id, ts_status

Thanks,
Brian
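The plans above make the problem visible: the planner expects 98762 matching rows but finds 0, so under LIMIT 1 the backward scan of ts_defects_dateindex looks like it will stop almost immediately. Two things worth trying, offered as a sketch rather than a guaranteed fix: refresh statistics so the ts_status=3 estimate improves, and/or order by an expression so the plain index on ts_occur_date cannot satisfy the sort:

  analyze ts_biz_events;
  analyze ts_defects;

  -- adding a zero interval turns the sort key from a bare column into an
  -- expression, which disqualifies ts_defects_dateindex and should push
  -- the planner back toward the fast hash-join plan
  select * from ts_defects d
    join ts_biz_events b on b.ts_id = d.ts_biz_event_id
   where b.ts_status = 3
   order by d.ts_occur_date + interval '0' desc
   limit 1;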