Re: [PERFORM] Planner selects different execution plans depending on limit
On 10/09/12 16:24, bill_mar...@freenet.de wrote:
> Hi All,
> I've created the index ft_simple_core_content_content_idx ON core_content
> USING gin (to_tsvector('simple'::regconfig, content)). If I'm searching for
> a word which is NOT in the column content, the query plan and the execution
> time differ with the given limit. [...]
> Is there any possibility to improve the performance even if the limit is
> only 10? Is it possible to make the query optimizer take only the fast
> bitmap heap scan instead of the slow seq scan?

The big hammer is: set enable_seqscan = off; but if you tell us which PG
version you're on, there may be something else to do. I suggest you start by
bumping the statistics target for the column and running ANALYZE to see what
that changes.

--
Jesper

Hi,

my email client deleted a lot of the content of the original thread message.
Here is the full content:

Hi All,

I've created the following table, which contains one million records:

    CREATE TABLE core_content (
      id bigint NOT NULL,
      content text NOT NULL,
      short_content text,
      CONSTRAINT core_content_pkey PRIMARY KEY (id)
    );

    CREATE INDEX ft_simple_core_content_content_idx
      ON core_content
      USING gin (to_tsvector('simple'::regconfig, content));

If I'm searching for a word which is not in the column content, the query plan
and the execution time differ with the given limit.
If I choose 3927 or any higher number, the query execution takes only a few
milliseconds:

    select *
    from core_content content
    where to_tsvector('simple', content.content)
          @@ tsquery(plainto_tsquery('simple', 'asdasdadas')::varchar || ':*') = true
    limit 3927;

    Limit  (cost=10091.09..19305.68 rows=3927 width=621) (actual time=0.255..0.255 rows=0 loops=1)
      ->  Bitmap Heap Scan on core_content content  (cost=10091.09..57046.32 rows=20011 width=621) (actual time=0.254..0.254 rows=0 loops=1)
            Recheck Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
            ->  Bitmap Index Scan on ft_simple_core_content_content_idx  (cost=0.00..10086.09 rows=20011 width=0) (actual time=0.251..0.251 rows=0 loops=1)
                  Index Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
    Total runtime: 0.277 ms

If I choose 3926 or any lower number (e.g. 10), the query execution takes more
than fifty seconds:

    select *
    from core_content content
    where to_tsvector('simple', content.content)
          @@ tsquery(plainto_tsquery('simple', 'asdasdadas')::varchar || ':*') = true
    limit 3926;

    Limit  (cost=0.00..19302.23 rows=3926 width=621) (actual time=52147.149..52147.149 rows=0 loops=1)
      ->  Seq Scan on core_content content  (cost=0.00..98384.34 rows=20011 width=621) (actual time=52147.147..52147.147 rows=0 loops=1)
            Filter: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
    Total runtime: 52147.173 ms

Is there any possibility to tune up the performance even if the limit is only
10? Is it possible to make the query optimizer take only the fast bitmap heap
scan instead of the slow seq scan?

I use PostgreSQL 9.1.5; Intel i5-2400 @ 3.1 GHz, 16 GB; Windows 7 64 Bit.

Regards,
Bill Martin
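[Editor's note: the two remedies Jesper suggests can be sketched as follows. This is a sketch only: the table and column names come from the original post, but the statistics target of 10000 is an illustrative value, not one given in the thread.]

```sql
-- Give the planner more detailed statistics for the column feeding the
-- tsvector expression, then re-gather stats. 10000 is the maximum target;
-- the right value here is an assumption, not taken from the thread.
ALTER TABLE core_content ALTER COLUMN content SET STATISTICS 10000;
ANALYZE core_content;

-- The "big hammer": disable sequential scans for the current session only,
-- pushing the planner toward the bitmap index scan.
SET enable_seqscan = off;
-- ... run the problematic LIMIT 10 query here ...
RESET enable_seqscan;
```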
[PERFORM] add column with default value is very slow
I have a table as follows:

    \d entity
                       Table "public.entity"
        Column    |            Type             |      Modifiers
    --------------+-----------------------------+--------------------
     crmid        | integer                     | not null
     smcreatorid  | integer                     | not null default 0
     smownerid    | integer                     | not null default 0
     modifiedby   | integer                     | not null default 0
     setype       | character varying(30)       | not null
     description  | text                        |
     createdtime  | timestamp without time zone | not null
     modifiedtime | timestamp without time zone | not null
     viewedtime   | timestamp without time zone |
     status       | character varying(50)       |
     version      | integer                     | not null default 0
     presence     | integer                     | default 1
     deleted      | integer                     | not null default 0
    Indexes:
        "entity_pkey" PRIMARY KEY, btree (crmid)
        "entity_createdtime_idx" btree (createdtime)
        "entity_modifiedby_idx" btree (modifiedby)
        "entity_modifiedtime_idx" btree (modifiedtime)
        "entity_setype_idx" btree (setype) WHERE deleted = 0
        "entity_smcreatorid_idx" btree (smcreatorid)
        "entity_smownerid_idx" btree (smownerid)
        "ftx_en_entity_description" gin (to_tsvector('vcrm_en'::regconfig, for_fts(description)))
        "entity_deleted_idx" btree (deleted)
    Referenced by:
        TABLE "service" CONSTRAINT "fk_1_service" FOREIGN KEY (serviceid) REFERENCES entity(crmid) ON DELETE CASCADE
        TABLE "servicecontracts" CONSTRAINT "fk_1_servicecontracts" FOREIGN KEY (servicecontractsid) REFERENCES entity(crmid) ON DELETE CASCADE
        TABLE "vantage_cc2entity" CONSTRAINT "fk_vantage_cc2entity_entity" FOREIGN KEY (crm_id) REFERENCES entity(crmid) ON UPDATE CASCADE ON DELETE CASCADE
        TABLE "vantage_emails_optout_history" CONSTRAINT "fk_vantage_emails_optout_history_crmid" FOREIGN KEY (crmid) REFERENCES entity(crmid) ON DELETE CASCADE
        TABLE "vantage_emails_optout_history" CONSTRAINT "fk_vantage_emails_optout_history_emailid" FOREIGN KEY (emailid) REFERENCES entity(crmid) ON DELETE CASCADE

I executed the query:

    ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL DEFAULT 'U';

The db is stuck.
The entity table has 2064740 records.

Watching locks:

    select pg_stat_activity.datname, pg_class.relname, pg_locks.mode,
           pg_locks.granted, pg_stat_activity.usename,
           substr(pg_stat_activity.current_query, 1, 10),
           pg_stat_activity.query_start,
           age(now(), pg_stat_activity.query_start) as age,
           pg_stat_activity.procpid
    from pg_stat_activity,
         pg_locks left outer join pg_class on (pg_locks.relation = pg_class.oid)
    where pg_locks.pid = pg_stat_activity.procpid
    order by query_start;

     datname |          relname          |        mode         | granted | usename |   substr   |          query_start          |       age       | procpid
    ---------+---------------------------+---------------------+---------+---------+------------+-------------------------------+-----------------+---------
     db_test | entity_modifiedtime_idx   | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test |                           | ExclusiveLock       | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test | entity_modifiedby_idx     | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test | entity_createdtime_idx    | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test | entity                    | ShareLock           | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test | entity                    | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test |                           | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test |                           | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test |                           | ExclusiveLock       | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test | entity_pkey               | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test |                           | ShareLock           | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test | ftx_en_entity_description | AccessExclusiveLock | t       | user    | ALTER TABL | 2012-09-11 12:26:20.269965+06 | 00:45:46.101971 |   13574
     db_test |                           | AccessShareLock     |
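[Editor's note: for readers who want a more compact view than the query above, a per-relation lock summary can be sketched like this. The column names match PostgreSQL 9.1 as used in this thread; note that pg_stat_activity.procpid and current_query were renamed to pid and query in 9.2.]

```sql
-- Summarize current locks by relation and mode; rows with a NULL relname
-- are non-relation locks (e.g. transaction-id locks).
SELECT c.relname, l.mode, l.granted, count(*) AS n
FROM pg_locks l
LEFT JOIN pg_class c ON c.oid = l.relation
GROUP BY c.relname, l.mode, l.granted
ORDER BY c.relname NULLS LAST, l.mode;
```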
Re: [PERFORM] add column with default value is very slow
AI Rumman wrote:
> I executed the query:
>     ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL DEFAULT 'U';
> The db is stuck. The entity table has 2064740 records.
> Watching locks:
> [all locks are granted]
> Any idea for the db stuck?

To add the column, PostgreSQL has to modify all rows in the table. But then
2064740 records is not very much, so it shouldn't take forever.

Do you see processor or I/O activity?

Yours,
Laurenz Albe

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] add column with default value is very slow
On Tue, Sep 11, 2012 at 07:20:28PM +0600, AI Rumman wrote:
> I have a table as follows:
> I executed the query:
>     ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL DEFAULT 'U';
> The db is stuck. The entity table has 2064740 records.

Such an ALTER TABLE has to rewrite the whole table, so it will take a while.

> Watching locks:

Output of this was perfectly unreadable, because your email client wrapped
lines at some random places. In future, please put such dumps on some paste
site, or just attach them to the mail, and don't copy/paste them into the
body of the message.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/
Re: [PERFORM] add column with default value is very slow
I added the excel file for locks data.

I was surprised to see that while I was updating a single column value for all
records in a table, all indexes were locked by the server.

On Tue, Sep 11, 2012 at 7:44 PM, hubert depesz lubaczewski <dep...@depesz.com> wrote:
> On Tue, Sep 11, 2012 at 07:20:28PM +0600, AI Rumman wrote:
> > I executed the query:
> >     ALTER TABLE entity ADD COLUMN owner_type char(1) NOT NULL DEFAULT 'U';
> > The db is stuck. The entity table has 2064740 records.
>
> Such an ALTER TABLE has to rewrite the whole table, so it will take a while.
> [...]

pglocks.xls
Description: MS-Excel spreadsheet
Re: [PERFORM] add column with default value is very slow
Table size is 1186 MB.

I split the command in three steps as you said, but the result was the same
during the update operation.

One more thing: I have just restored the db from a dump and analyzed it, and I
am using PostgreSQL 9.1 with 3 GB RAM on a dual core machine.

On Tue, Sep 11, 2012 at 7:59 PM, hubert depesz lubaczewski <dep...@depesz.com> wrote:
> On Tue, Sep 11, 2012 at 07:55:24PM +0600, AI Rumman wrote:
> > I added the excel file for locks data.
> [...]
> for the future: just don't do alter table with default and not null. Doing
> it via add column; set default; batch-backfill data; set not null will take
> longer, but will be done with much shorter locks.
Re: [PERFORM] add column with default value is very slow
On Tue, Sep 11, 2012 at 07:55:24PM +0600, AI Rumman wrote:
> I added the excel file for locks data.

Well, it worked, but why didn't you just make it a text file, in notepad or
something like this?

> I was surprised to see that while I was updating a single column value for
> all records in a table, all indexes were locked by the server.

The alter table is not blocked (at least looking at the pg_locks data you
showed); all its locks are granted. This means it just takes a long time.
Please do:

    select pg_total_relation_size('entity');

to see how much data it has to rewrite.

For the future: just don't do alter table with default and not null. Doing it
via add column; set default; batch-backfill data; set not null will take
longer, but will be done with much shorter locks.

Best regards,

depesz
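[Editor's note: depesz's four-step pattern can be sketched as follows. A sketch only: the batch range over crmid and the batch size are illustrative assumptions, not values from the thread.]

```sql
-- 1. Add the column without DEFAULT/NOT NULL; on PostgreSQL of this era
--    this is a catalog-only change and does not rewrite the table.
ALTER TABLE entity ADD COLUMN owner_type char(1);

-- 2. Set the default for future rows (existing rows stay NULL).
ALTER TABLE entity ALTER COLUMN owner_type SET DEFAULT 'U';

-- 3. Backfill existing rows in small batches so each UPDATE holds its row
--    locks only briefly (the crmid range here is an illustrative choice;
--    repeat with subsequent ranges until no NULL rows remain).
UPDATE entity
   SET owner_type = 'U'
 WHERE owner_type IS NULL
   AND crmid BETWEEN 1 AND 100000;

-- 4. Once every row is filled in, add the constraint. This scans the table
--    to verify, but the exclusive lock is much shorter than a full rewrite.
ALTER TABLE entity ALTER COLUMN owner_type SET NOT NULL;
```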
Re: [PERFORM] add column with default value is very slow
On Tue, Sep 11, 2012 at 08:04:06PM +0600, AI Rumman wrote:
> Table size is 1186 MB.

If it takes long, it just means that your IO is slow.

> I split the command in three steps as you said, but the result was the same
> during the update operation.

Three? I was showing four steps, and one of them usually consists of
hundreds, if not thousands, of queries.

> One more thing, I have just restored the db from a dump and analyzed it and
> I am using PostgreSQL 9.1 with 3 GB RAM on a dual core machine.

So it looks like your IO channel is slow.

Best regards,

depesz
Re: [PERFORM] add column with default value is very slow
On 09/11/2012 09:55 AM, AI Rumman wrote:
> I added the excel file for locks data.
> I was surprised to see that while I was updating a single column value for
> all records in a table, all indexes were locked by the server.

Any ALTER TABLE command locks the whole table in ACCESS EXCLUSIVE mode,
indexes included. See the description of the ACCESS EXCLUSIVE lock at
http://www.postgresql.org/docs/current/static/explicit-locking.html

cheers

andrew
Re: [PERFORM] Planner selects different execution plans depending on limit
Bill Martin <bill.mar...@communote.com> writes:
> I've created the following table which contains one million records.
> ...
>     Limit  (cost=10091.09..19305.68 rows=3927 width=621) (actual time=0.255..0.255 rows=0 loops=1)
>       ->  Bitmap Heap Scan on core_content content  (cost=10091.09..57046.32 rows=20011 width=621) (actual time=0.254..0.254 rows=0 loops=1)
>             Recheck Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
>             ->  Bitmap Index Scan on ft_simple_core_content_content_idx  (cost=0.00..10086.09 rows=20011 width=0) (actual time=0.251..0.251 rows=0 loops=1)
>                   Index Cond: (to_tsvector('simple'::regconfig, content) @@ '''asdasdadas'':*'::tsquery)
>     Total runtime: 0.277 ms
> Is there any possibility to tune up the performance even if the limit is
> only 10?

The problem is the way-off rowcount estimate (20011 rows when it's really
none); with a smaller estimate there, the planner wouldn't decide to switch
to a seqscan. Did you take the advice to increase the column's statistics
target?

Because 20011 looks suspiciously close to the default estimate that
tsquery_opr_selec will fall back on if it hasn't got enough stats to come up
with a trustworthy estimate for a *-pattern query. (I think there are
probably some bugs in tsquery_opr_selec's estimate for this, as I just posted
about on pgsql-hackers. But this number looks like you're not even getting to
the estimation code, for lack of enough statistics entries.)

The other thing that seems kind of weird here is that the cost estimate for
the bitmap index scan seems out of line even given the 20011-rows-to-fetch
estimate. I'd have expected a cost estimate of a few hundred for that, not
10000. Perhaps this index is really bloated, and it's time to REINDEX it?

			regards, tom lane
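[Editor's note: Tom's two suggestions can be sketched as follows. A sketch only: the statistics target of 10000 is an illustrative value (the maximum), and note that REINDEX holds an exclusive lock on the index's table while it runs.]

```sql
-- Raise the statistics target for the column feeding the tsvector
-- expression, then re-analyze so the *-pattern selectivity estimate has
-- enough entries to work with (10000 is an illustrative choice).
ALTER TABLE core_content ALTER COLUMN content SET STATISTICS 10000;
ANALYZE core_content;

-- Rebuild the possibly-bloated GIN index so its cost estimate comes back
-- in line.
REINDEX INDEX ft_simple_core_content_content_idx;
```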