[PERFORM] Slower query after psql changed it's execution plan
Hello all. To our surprise this morning, we found a query that used to return it's result in about 50 ~ 100ms now take about 7.000ms to run. After some investigation, we found out that the PostgreSQL server (8.1) changed the execution plan (I'm assuming because the number of rows increased). Since this query may be executed a few times per second, it caused some problem :) The query is the following: SELECT meta_id, meta_type, COUNT(M.post_id) as count FROM dc_meta M LEFT JOIN dc_post P ON M.post_id = P.post_id WHERE P.blog_id = 'b4c62627b3203e7780078cf2f6373ab5' AND M.blog_id = 'b4c62627b3203e7780078cf2f6373ab5' AND meta_type = 'tag' AND ((post_status = 1 AND post_password IS NULL )) GROUP BY meta_id,meta_type,P.blog_id ORDER BY count DESC LIMIT 40 The dc_post table has the following fields: - post_id bigint NOT NULL, - blog_id character varying(32) NOT NULL, - post_password character varying(32), - post_status smallint NOT NULL DEFAULT 0, - and some other not used for this query ;) Usefull indexes: - dc_pk_post PRIMARY KEY(post_id) - dc_fk_post_blog FOREIGN KEY (blog_id) - dc_idx_blog_post_post_status btree (blog_id, post_status) - dc_idx_post_blog_id btree (blog_id) dc_meta is as follow: - meta_id character varying(255) NOT NULL, - meta_type character varying(64) NOT NULL, - post_id bigint NOT NULL, - blog_id character varying(32) With indexes: - dc_pk_meta PRIMARY KEY(meta_id, meta_type, post_id) - dc_fk_meta_blog FOREIGN KEY (blog_id) - dc_fk_meta_post FOREIGN KEY (post_id) - dc_idx_meta_blog_id btree (blog_id) - dc_idx_meta_meta_type btree (meta_type) - dc_idx_meta_post_id btree (post_id) (Aren't the foreign keys and index redundant btw? :) I've attached the EXPLAIN ANALYZE that runs now, the one that runs on our test server (witch contains data from 10 days ago), and another one on the production server with nested loop disabled. The query plan for the test server is the same that the production server was last week. On production dc_meta contains approx 791756 rows dc_post contains approx 235524 rows On test : dc_meta contains approx 641398 rows dc_post contains approx 211295 rows The statistics are at the default value everywhere (10) The 'b4c6' blog is one of your biggest blogs, which contains 9326 tags and 3178 posts on the production server (9156 / 3132 in test) Does anyone have and idea why this happened and how we may be able to fix the problem ? Disabling nested loop falls back on the previous plan, but we can't really disable them since the application used (dotclear) and it's db layer is designed to work with mysql as well. For the moment I've changed the query to remove the P.blog_id = 'b4c6..' clause and it does the trick, but it's still slower than the previous one. Thank you for your time -- Romuald Brunet Limit (cost=34380.87..34380.97 rows=40 width=94) (actual time=96.698..96.753 rows=40 loops=1) - Sort (cost=34380.87..34381.16 rows=115 width=94) (actual time=96.696..96.712 rows=40 loops=1) Sort Key: count(m.post_id) - HashAggregate (cost=34375.50..34376.94 rows=115 width=94) (actual time=85.320..89.064 rows=3688 loops=1) - Hash Join (cost=10298.29..34374.35 rows=115 width=94) (actual time=35.783..71.702 rows=8957 loops=1) Hash Cond: (outer.post_id = inner.post_id) - Bitmap Heap Scan on dc_meta m (cost=336.93..24341.33 rows=9401 width=58) (actual time=3.812..26.720 rows=9156 loops=1) Recheck Cond: ((blog_id)::text = 'b4c62627b3203e7780078cf2f6373ab5'::text) Filter: ((meta_type)::text = 'tag'::text) - Bitmap Index Scan on dc_idx_meta_blog_id (cost=0.00..336.93 rows=9407 width=0) (actual time=3.168..3.168 rows=9156 loops=1) Index Cond: ((blog_id)::text = 'b4c62627b3203e7780078cf2f6373ab5'::text) - Hash (cost=9954.92..9954.92 rows=2577 width=44) (actual time=31.918..31.918 rows=3038 loops=1) - Bitmap Heap Scan on dc_post p (cost=35.51..9954.92 rows=2577 width=44) (actual time=2.508..28.751 rows=3038 loops=1) Recheck Cond: (((blog_id)::text = 'b4c62627b3203e7780078cf2f6373ab5'::text) AND (post_status = 1)) Filter: (post_password IS NULL) - Bitmap Index Scan on dc_idx_blog_post_post_status (cost=0.00..35.51 rows=2584 width=0) (actual time=1.650..1.650 rows=3038 loops=1) Index Cond: (((blog_id)::text = 'b4c62627b3203e7780078cf2f6373ab5'::text) AND (post_status = 1)) Total runtime: 97.191 ms Limit (cost=2740.03..2740.03 rows=1 width=96) (actual time=7712.844..7712.923 rows=40 loops=1) - Sort (cost=2740.03..2740.03 rows=1 width=96) (actual time=7712.840..7712.840 rows=40 loops=1) Sort Key: count(m.post_id) -
Re: [PERFORM] Slower query after psql changed it's execution plan
Romuald Brunet romuald.bru...@gmail.com wrote: The statistics are at the default value everywhere (10) Try setting that to 100 and running ANALYZE. The small size of the sample with the default of 10 happened to land you with a bad estimate this time. (If the numbers it has were actually representative of the data, the plan it's using would be reasonable.) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Slower query after psql changed it's execution plan
Romuald Brunet romuald.bru...@gmail.com writes: After some investigation, we found out that the PostgreSQL server (8.1) changed the execution plan (I'm assuming because the number of rows increased). The problem seems to be that the estimate for the number of rows fetched from dc_post changed drastically --- it was in the right ballpark, and now it's off by a factor of 30. Why is that? Maybe you had had a special statistics target setting, and it got dropped? The statistics are at the default value everywhere (10) Almost certainly not enough. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need help with one query
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Mar 20, 2009 at 1:16 PM, Anne Rosset aros...@collab.net wrote: The db version is 8.2.4 Something is wrong here. How can setting enable_seqscan to off result in a plan with a far lower estimated cost than the original plan? Planner bug no doubt ... given how old the PG release is, I'm not particularly interested in probing into it now. If Anne can still reproduce this behavior on 8.2.something-recent, we should look closer. regards, tom lane Thanks Tom, Richard. Here are our postgres conf : shared_buffers = 2000MB sort_mem = 15 vacuum_mem = 10 work_mem = 20MB# min 64kB maintenance_work_mem = 256MB# min 1MB max_fsm_pages = 204800 full_page_writes = off# recover from partial page writes wal_buffers = 1MB# min 32kB # (change requires restart) commit_delay = 2# range 0-10, in microseconds commit_siblings = 3# range 1-1000 checkpoint_segments = 128 checkpoint_warning = 240s enable_indexscan = on enable_mergejoin = on enable_nestloop = off random_page_cost = 2.0 effective_cache_size = 2500MB geqo = off default_statistics_target = 750 stats_command_string = on update_process_title = on stats_start_collector = on stats_row_level = on autovacuum = on autovacuum_vacuum_threshold = 500# min # of tuple updates before # vacuum autovacuum_analyze_threshold = 250# min # of tuple updates before # analyze autovacuum_vacuum_scale_factor = 0.2# fraction of rel size before # vacuum autovacuum_analyze_scale_factor = 0.1 Anne -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need help with one query
On Mon, Mar 23, 2009 at 1:08 PM, Anne Rosset aros...@collab.net wrote: enable_nestloop = off That may be the source of your problem. Generally setting enable_* to off is a debugging tool, not something you ever want to do in production. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] multiple threads inserting into the same table
The application log shows that 99652 rows are being inserted into relation ts_stats_transet_user_daily. 5 threads are doing the inserts. The schema is lengthy, but it has a synthetic primary key (ts_id int8 not null) and the following constraints: alter table ts_stats_transet_user_daily add constraint FK8ED105ED9DADA24 foreign key (ts_transet_id) references ts_transets; alter table ts_stats_transet_user_daily add constraint K8ED105ED545ADA6D foreign key (ts_user_id) references ts_users; This relation currently has 456532 rows and is not partitioned. The inserts have been going on now for almost 1 hour -- not exactly speedy. Here's what I find on the postgres side: cemdb= select current_query, procpid, xact_start from pg_stat_activity; current_query | procpid | xact_start --+-+--- IDLE in transaction| 15147 | 2009-03-23 12:08:31.604433-07 IDLE | 15382 | select current_query, procpid, xact_start from pg_stat_activity; | 15434 | 2009-03-23 12:10:38.913764-07 IDLE | 15152 | IDLE | 15150 | IDLE | 15156 | IDLE in transaction| 15183 | 2009-03-23 12:09:50.864992-07 IDLE in transaction| 15186 | 2009-03-23 12:10:07.955838-07 IDLE | 15188 | IDLE | 15192 | IDLE in transaction| 15193 | 2009-03-23 12:10:07.955859-07 IDLE in transaction| 15194 | 2009-03-23 12:08:59.940101-07 (12 rows) cemdb= select c.oid,c.relname,l.pid,l.mode,l.granted from pg_class c join pg_locks l on c.oid=l.relation order by l.mode; oid| relname | pid | mode | granted --+-+---+--+- 26493289 | ts_users_pkey | 15183 | AccessShareLock | t 26493267 | ts_transets_pkey| 15186 | AccessShareLock | t 1259 | pg_class| 15434 | AccessShareLock | t 26493289 | ts_users_pkey | 15147 | AccessShareLock | t 10969 | pg_locks| 15434 | AccessShareLock | t 26493267 | ts_transets_pkey| 15193 | AccessShareLock | t 26493289 | ts_users_pkey | 15194 | AccessShareLock | t 2662 | pg_class_oid_index | 15434 | AccessShareLock | t 26493267 | ts_transets_pkey| 15194 | AccessShareLock | t 26493289 | ts_users_pkey | 15193 | AccessShareLock | t 26493267 | ts_transets_pkey| 15147 | AccessShareLock | t 26493289 | ts_users_pkey | 15186 | AccessShareLock | t 26493267 | ts_transets_pkey| 15183 | AccessShareLock | t 2663 | pg_class_relname_nsp_index | 15434 | AccessShareLock | t 26472890 | ts_stats_transet_user_daily | 15147 | RowExclusiveLock | t 26472890 | ts_stats_transet_user_daily | 15186 | RowExclusiveLock | t 26472890 | ts_stats_transet_user_daily | 15193 | RowExclusiveLock | t 26472890 | ts_stats_transet_user_daily | 15194 | RowExclusiveLock | t 26472890 | ts_stats_transet_user_daily | 15183 | RowExclusiveLock | t 26473252 | ts_users| 15194 | RowShareLock | t 26472508 | ts_transets | 15183 | RowShareLock | t 26472508 | ts_transets | 15193 | RowShareLock | t 26473252 | ts_users| 15193 | RowShareLock | t 26473252 | ts_users| 15183 | RowShareLock | t 26472508 | ts_transets | 15147 | RowShareLock | t 26473252 | ts_users| 15186 | RowShareLock | t 26472508 | ts_transets | 15186 | RowShareLock | t 26473252 | ts_users| 15147 | RowShareLock | t 26472508 | ts_transets | 15194 | RowShareLock | t (29 rows) cemdb= select c.oid,c.relname,l.pid,l.mode,l.granted from pg_class c join pg_locks l on c.oid=l.relation order by l.pid; oid| relname | pid | mode | granted --+-+---+--+- 26493289 | ts_users_pkey | 15147 | AccessShareLock | t 26473252 | ts_users| 15147 | RowShareLock | t 26493267 | ts_transets_pkey| 15147 | AccessShareLock | t 26472508 | ts_transets | 15147 | RowShareLock | t 26472890 |
Re: [PERFORM] multiple threads inserting into the same table
On Mon, Mar 23, 2009 at 3:34 PM, Brian Cox brian@ca.com wrote: The application log shows that 99652 rows are being inserted into relation ts_stats_transet_user_daily. 5 threads are doing the inserts. The schema is lengthy, but it has a synthetic primary key (ts_id int8 not null) and the following constraints: How many indexes are there on ts_stats_transset_user_daily? Are these rows being inserted in individual insert statements, or are they batched in some fashion? What's the disk/cpu activity on your system look like? -- - David T. Wilson david.t.wil...@gmail.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need help with one query
Robert Haas wrote: On Mon, Mar 23, 2009 at 1:08 PM, Anne Rosset aros...@collab.net wrote: enable_nestloop = off That may be the source of your problem. Generally setting enable_* to off is a debugging tool, not something you ever want to do in production. ...Robert Thanks Robert. It seems to have solved the problem. Thanks a lot, Anne -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] multiple threads inserting into the same table
Brian Cox brian@ca.com writes: The application log shows that 99652 rows are being inserted into relation ts_stats_transet_user_daily. 5 threads are doing the inserts. pg_stat_activity says those five threads are doing nothing except sitting around with open transactions. You sure you don't have a bug on the application side? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] multiple threads inserting into the same table
Tom Lane [...@sss.pgh.pa.us] wrote: pg_stat_activity says those five threads are doing nothing except sitting around with open transactions. You sure you don't have a bug on the application side? regards, tom lane This is a java app. A thread dump reveals that these 5 threads are all asleep on a socket read to postgres (see below). DbUtils.java:2265 is: session.connection().createStatement() .executeUpdate(((DatabaseInsert) insertObject).getInsertStmt(session)); This generates and executes a single SQL insert. Since, as you point out, postgres seems to think that this transaction isn't doing anything, it's hard to figure out what the read is doing. Brian DatabasePool.Thread1 prio=10 tid=0x27f04c00 nid=0x3b38 runnable [0x29e27000..0x29e281b0] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:135) at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:104) at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73) at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:259) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1182) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:194) - locked 0x8975c878 (a org.postgresql.core.v3.QueryExecutorImpl) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:282) at com.mchange.v2.c3p0.impl.NewProxyStatement.executeUpdate(NewProxyStatement.java:64) at com.timestock.tess.util.DbUtils$DatabaseInsertTask.insertObject(DbUtils.java:2265) at com.timestock.tess.util.DbUtils$DatabaseInsertTask.call(DbUtils.java:2200) at com.timestock.tess.util.DbUtils$DatabaseInsertTask.call(DbUtils.java:2157) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] multiple threads inserting into the same table
On Mon, Mar 23, 2009 at 2:25 PM, Brian Cox brian@ca.com wrote: Tom Lane [...@sss.pgh.pa.us] wrote: pg_stat_activity says those five threads are doing nothing except sitting around with open transactions. You sure you don't have a bug on the application side? regards, tom lane This is a java app. A thread dump reveals that these 5 threads are all asleep on a socket read to postgres (see below). DbUtils.java:2265 is: session.connection().createStatement() .executeUpdate(((DatabaseInsert) insertObject).getInsertStmt(session)); This generates and executes a single SQL insert. Since, as you point out, postgres seems to think that this transaction isn't doing anything, it's hard to figure out what the read is doing. Might you have a firewall that's killing the connections? What does netstat -an on the client side say about these connections? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] multiple threads inserting into the same table
Scott Marlowe [scott.marl...@gmail.com] wrote: Might you have a firewall that's killing the connections? What does netstat -an on the client side say about these connections? I don't think so: 1) app and postgres are on the same machine and 2) this has been the set up for months and I don't think anyone has diddled with the machine. netstat -an | fgrep 5432 shows: [r...@rdl64xeoserv01 log]# netstat -an | fgrep 5432 tcp0 0 0.0.0.0:54320.0.0.0:* LISTEN tcp0 0 127.0.0.1:5432 127.0.0.1:35996 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:35999 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36017 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36019 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36018 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36005 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36001 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36013 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36008 ESTABLISHED tcp0 0 127.0.0.1:5432 127.0.0.1:36011 ESTABLISHED tcp0 0 130.200.163.80:5432 130.200.164.26:50481 ESTABLISHED tcp0 0 :::5432 :::* LISTEN tcp0 0 :::127.0.0.1:36001 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:36005 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:36008 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:36011 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:36013 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:36017 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:36019 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:36018 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:35996 :::127.0.0.1:5432 ESTABLISHED tcp0 0 :::127.0.0.1:35999 :::127.0.0.1:5432 ESTABLISHED unix 2 [ ACC ] STREAM LISTENING 2640437 /tmp/.s.PGSQL.5432 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] multiple threads inserting into the same table
David Wilson [david.t.wil...@gmail.com] wrote: How many indexes are there on ts_stats_transset_user_daily? 10: create index ts_stats_transet_user_daily_DayIndex on ts_stats_transet_user_daily (ts_day); create index ts_stats_transet_user_daily_HourIndex on ts_stats_transet_user_daily (ts_hour); create index ts_stats_transet_user_daily_LastAggregatedRowIndex on ts_stats_transet_user_daily (ts_last_aggregated_row); create index ts_stats_transet_user_daily_MonthIndex on ts_stats_transet_user_daily (ts_month); create index ts_stats_transet_user_daily_StartTimeIndex on ts_stats_transet_user_daily (ts_interval_start_time); create index ts_stats_transet_user_daily_TranSetIncarnationIdIndex on ts_stats_transet_user_daily (ts_transet_incarnation_id); create index ts_stats_transet_user_daily_TranSetIndex on ts_stats_transet_user_daily (ts_transet_id); create index ts_stats_transet_user_daily_UserIncarnationIdIndex on ts_stats_transet_user_daily (ts_user_incarnation_id); create index ts_stats_transet_user_daily_UserIndex on ts_stats_transet_user_daily (ts_user_id); create index ts_stats_transet_user_daily_WeekIndex on ts_stats_transet_user_daily (ts_week); create index ts_stats_transet_user_daily_YearIndex on ts_stats_transet_user_daily (ts_year); Are these rows being inserted in individual insert statements, or are they batched in some fashion? individual insert stmts in a single transaction. What's the disk/cpu activity on your system look like? The app is using 100% CPU and I haven't figured out why, but the insert threads are generally doing socket reads. But they can't be completely blocked as 1 thread is doing a read in one thread dump and is doing processing (preparing for another insert) in a later thread dump. So, it looks as if the inserts are taking a l-o-n-g time. Here's the output of vmstat and iostat. I've never looked at this info before, so I'm not sure what it says. [r...@rdl64xeoserv01 log]# vmstat procs ---memory-- ---swap-- -io --system-- cpu r b swpd free buff cache si sobibo incs us sy id wa 1 0 0 9194740 58676 2098026400 8211 2 2 0 98 0 [r...@rdl64xeoserv01 log]# iostat Linux 2.6.9-42.ELsmp (rdl64xeoserv01.ca.com)03/23/2009 avg-cpu: %user %nice%sys %iowait %idle 1.710.000.090.02 98.18 Device:tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 10.46 126.23 343.38 304224943 827588034 sda1 0.00 0.00 0.00 1428 58 sda2 57.73 126.23 343.37 304221426 827576144 sda3 0.00 0.00 0.00 1073 0 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] multiple threads inserting into the same table
Scott Marlowe scott.marl...@gmail.com writes: On Mon, Mar 23, 2009 at 2:25 PM, Brian Cox brian@ca.com wrote: This generates and executes a single SQL insert. Since, as you point out, postgres seems to think that this transaction isn't doing anything, it's hard to figure out what the read is doing. Might you have a firewall that's killing the connections? What does netstat -an on the client side say about these connections? netstat will probably say the connection is open on both sides --- otherwise the sockets would have closed. It looks like both sides still think the connection is open. A firewall timeout is still a possibility, but you'd have had to have a fairly long idle time for that to happen. Are any of the threads issuing commands that might have run for very long intervals (tens of minutes)? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] multiple threads inserting into the same table
Brian Cox brian@ca.com writes: This is a java app. A thread dump reveals that these 5 threads are all asleep on a socket read to postgres (see below). It seems clear that what you've got isn't a performance problem. May I suggest taking it to pgsql-jdbc? The folk there are more likely to be able to help you figure out what's wrong than most of us. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Why creating GIN table index is so slow than inserting data into empty table with the same index?
example: select version(); version PostgreSQL 8.3.6 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.3-3) 4.3.3 show maintenance_work_mem ; maintenance_work_mem -- 128MB create table a (i1 int, i2 int, i3 int, i4 int, i5 int, i6 int); insert into a select n, n, n, n, n, n from generate_series(1, 10) as n; INSERT 0 10 Время: 570,110 мс create index arr_gin on a using gin ( (array[i1, i2, i3, i4, i5, i6]) ); CREATE INDEX Время: 203068,314 мс truncate a; drop index arr_gin ; create index arr_gin on a using gin ( (array[i1, i2, i3, i4, i5, i6]) ); CREATE INDEX Время: 3,246 мс insert into a select n, n, n, n, n, n from generate_series(1, 10) as n; INSERT 0 10 Время: 2405,481 мс select pg_size_pretty(pg_total_relation_size('a')) as total, pg_size_pretty(pg_relation_size('a')) as table; total | table -+- 9792 kB | 5096 kB 203068.314 ms VS 2405.481 ms, is this behaviour normal ? Thanks ! -- Sergey Burladyan -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Why creating GIN table index is so slow than inserting data into empty table with the same index?
Sergey Burladyan eshkin...@gmail.com writes: show maintenance_work_mem ; maintenance_work_mem -- 128MB create table a (i1 int, i2 int, i3 int, i4 int, i5 int, i6 int); insert into a select n, n, n, n, n, n from generate_series(1, 10) as n; create index arr_gin on a using gin ( (array[i1, i2, i3, i4, i5, i6]) ); [ takes forever ] Seems the reason this is so awful is that the incoming data is exactly presorted, meaning that the tree structure that ginInsertEntry() is trying to build degenerates to a linear list (ie, every new element becomes the right child of the prior one). So the processing becomes O(N^2) up till you reach maintenance_work_mem and flush the tree. With a large setting for maintenance_work_mem it gets spectacularly slow. I think a reasonable solution for this might be to keep an eye on maxdepth and force a flush if that gets too large (more than a few hundred, perhaps?). Something like this: /* If we've maxed out our available memory, dump everything to the index */ + /* Also dump if the tree seems to be getting too unbalanced */ - if (buildstate-accum.allocatedMemory = maintenance_work_mem * 1024L) + if (buildstate-accum.allocatedMemory = maintenance_work_mem * 1024L || + buildstate-accum.maxdepth DEPTH_LIMIT) { The new fast-insert code likely will need a similar defense. Comments? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance